A story about data manipulation in science

Posted by Stuart on August 23, 2015 · 3 mins read

Science has been having an interesting time of it recently, as it comes under increasing and intelligent scrutiny.

So I’ll tell this brief story. I was running a timed-response experiment. It was a fairly simple experiment, which looked at how people read email. The hypothesis was that layout shapes how quickly people can classify an email message, so that a formatted message is easier to classify even when it contains exactly the same words.

Each participant was shown a set of messages on the screen, sometimes formatted, and sometimes not. I also had a condition where I changed all the letters to an “X”, to see if people could actually classify emails when they couldn’t read any of the words at all.

Anyway, the design of the experiment was simple. Messages flashed on the screen, and we asked participants to press a button to classify the message as fast as they could.

Enter one participant, who we’ll call “Fran” because that isn’t their name. This participant was strikingly fast. Afterwards, as we debriefed the participant (I so love that phrase), “Fran” said that they’d actually just pressed any button as soon as the message appeared on the screen. They didn’t attempt to classify the message at all!

So now the ethical quandary. Should this data stand? “Fran” had not followed the experimental protocol, but would omitting the data count as misconduct? Would it go against this?

“Falsification is manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.” (U.S. Federal Policy on Research Misconduct)

By today’s standards, I should probably have specified, in advance, the precise criteria I’d use to eliminate data. That would be okay, except that it’s impossible because of the Qualification Problem: I can’t list in advance every way a participant might depart from the protocol.

Of course, I could include the data anyway, and flag it on the basis of what the participant said during the debrief. It is data, I suppose, just data that should never be analysed together with the rest. My analogy would be experimental equipment that failed to work as expected. You’d be okay ignoring that data too, I’d hope.
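
For what it’s worth, here is a minimal sketch of that “keep it, but mark it” approach in Python. The records, the participant labels, the reaction times and the split_by_flag helper are all made up for illustration; none of it is the actual data or analysis code from the experiment.

```python
from statistics import median

# Hypothetical trial records: (participant, reaction time in ms, exclusion note).
# Every value here is invented for illustration, not taken from the experiment.
trials = [
    ("P01", 812, None),
    ("P01", 764, None),
    ("P02", 901, None),
    ("P02", 847, None),
    ("Fran", 203, "debrief: did not attempt classification"),
    ("Fran", 187, "debrief: did not attempt classification"),
]


def split_by_flag(records):
    """Keep every record, but separate flagged ones from the main analysis."""
    kept = [r for r in records if r[2] is None]
    flagged = [r for r in records if r[2] is not None]
    return kept, flagged


kept, flagged = split_by_flag(trials)

# The main analysis uses only the unflagged trials...
main_median = median(rt for _, rt, _ in kept)

# ...while the flagged trials stay in the record, with the reason attached.
for participant, rt, note in flagged:
    print(f"{participant}\t{rt} ms\texcluded ({note})")
print(f"median RT, main analysis: {main_median} ms")
```

The point of doing it this way is that nothing is deleted: the flagged trials and the reason for setting them aside remain visible to anyone who reads the record.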

All in all, intent has to remain central to judgements on academic misconduct.