The latest news from the meaning blog


Bad data can mean more than bad research

Bending the rules a bit on who meets the quota should never be seen as a ‘technical’ breach of any research code of conduct, no matter what pressure we find we are under to meet the target. Karen Forcade has admitted that she passed off research findings in the USA that had been manipulated to meet the client’s sampling criteria and that, in some cases, she simply fabricated interviews (see “Youth Research boss pleads guilty to federal fraud charges”). It is all the more reprehensible because the study in question concerned child safety in relation to lighters. Forcade and another researcher at this now-defunct agency are due to be sentenced: it could mean jail for them.

It’s an extreme case, but it serves to remind us that our combined labours are about delivering the truth. Often, what emerges from our inquiries is indistinct and contradictory, and getting to the truth may legitimately involve editing both data and findings to bring it out. We also know that some respondents provide false data – and it is not a problem that only afflicts online surveys. Discretion is called for in choosing what to edit and how to edit it, and wherever discretion and professional judgement enter a process, so too does the potential for abuse. Respondents fake interviews because they’ve been promised trinkets and tokens. Forcade faked interviews because her firm gained around $15,000 for each safety study it did: higher stakes and also greater consequences, though quite why she did this is still hard to comprehend.

Yet most errors are more mundane in origin. From my own observations, and conversations I’ve had with those who work with data on the technical side, data quality is an issue that large areas of the industry have become complacent about. Execs and project directors look far less now at actual respondent data than they used to. And eyeballing the data will only ever uncover some problems: error detection is really best done using automated methods. Yet few research firms seem to be putting effort into innovating in this area. Manual processes for error checking seem to abound, focused on checking tables, while other parts of the research process that can introduce error (scripting, editing, data processing, coding and report preparation) are largely left to their own devices.
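To make “automated methods” a little more concrete, here is a minimal sketch in Python of the kind of respondent-level check that could run before any tables are produced. It is an illustration only, not a description of any particular firm’s tooling: the question names, the 1–5 codeframe in `VALID_CODES` and the functions `check_respondent` and `check_dataset` are all hypothetical, and a real project would take its rules from its own questionnaire.

```python
from collections import Counter

# Hypothetical codeframe (an assumption for this sketch):
# question name -> set of valid answer codes.
VALID_CODES = {
    "q1_satisfaction": {1, 2, 3, 4, 5},
    "q2_recommend": {1, 2, 3, 4, 5},
    "q3_value": {1, 2, 3, 4, 5},
}


def check_respondent(record):
    """Return a list of readable problems found in one respondent record."""
    problems = []
    answers = []
    for question, valid in VALID_CODES.items():
        value = record.get(question)
        if value is None:
            problems.append(f"{question}: missing")
        elif value not in valid:
            problems.append(f"{question}: out-of-range code {value!r}")
        else:
            answers.append(value)
    # Straight-lining: every rating question answered with the same code.
    if len(answers) == len(VALID_CODES) and len(set(answers)) == 1:
        problems.append("straight-lined all rating questions")
    return problems


def check_dataset(records):
    """Map respondent id -> list of problems; clean respondents are omitted."""
    id_counts = Counter(record["id"] for record in records)
    report = {}
    for record in records:
        problems = check_respondent(record)
        if id_counts[record["id"]] > 1:
            problems.append("duplicate respondent id")
        if problems:
            report.setdefault(record["id"], []).extend(problems)
    return report
```

Calling `check_dataset` on a list of respondent dictionaries returns only the records that need a second look, so someone can review the exceptions rather than eyeball every row. A check like this flags candidates for inspection; deciding what to edit, and how, still calls for the professional judgement discussed above.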

Yet every time I’ve been involved in the migration of a tracker from one supplier or one technology to another, errors have emerged, and published findings have been found retrospectively to be just plain wrong. Only yesterday, I was talking with a technology provider about automated error detection, and we both found we had several anecdotes that illustrated this very situation. In one case, it was simply that the labels for one regional report had been inverted – the more the manager tried to get his low satisfaction score up, the lower the scores went. He was about to lose his job when someone spotted the error. It seems he’d actually been doing rather well.

Research does not need to be wilfully misrepresentative to do real damage to lives and reputations.

I’m curious to know if others have observed this problem too, or how they police their data.