In Greek mythology, it was Prometheus who stole the secret of fire from Zeus and gave it to humankind. The new non-profit organisation being launched by rival research giants GfK and Kantar to address industry-wide concerns about online survey quality seems to nod to the myth in its chosen name, the Promedius Group.
The industry’s concerns about online research are many and various, but a common complaint is the lack of transparency of sample providers in the composition of their samples and the extent to which these overlap. It’s worrying enough that, as response rates dwindle, research firms are probably already relying on less than 20% of the population to answer more than 80% of their surveys. But what if it is the hyperactive 0.1% of the population that turn out to be answering 10%, or 20%, as some fear, turning survey data into junk? Without the vantage point of the gods, no-one can really tell what is happening.
Good research is always a balance of art, craft and science. The risk is that if survey results are no longer generally reproducible, any claims to scientific validity are lost. Those who spend the big money in research, like General Mills and P&G, have noticed this, and are highly likely to start putting their money into consumer intelligence gathering elsewhere unless research can be made more accountable again.
The solution is staring at us from the problem. There is a vast trail of data that can be associated with every web-based interaction – put it all together and it becomes possible to pinpoint individuals and identify, within reasonable probabilities, that they do seem to be taking 20 surveys a day, or that they are very unlikely to be a gynaecologist because the digital wake emanating from the same PC speaks more of a college student. Getting at this data, however, is much more difficult. If you are a big player, with a large volume of interactions, you can do this – but even the industry’s own demi-gods face a major hindrance, in that most of the panel providers don’t reveal the key information you need to start putting this picture together, like respondent IDs or IP addresses.
Promedius will, it appears, be making use of a lot of technology to match data and perform checks on data, and they will be making this technology available for other research companies to use. This is welcome news, as the problem has been proving too big for anyone to solve on their own. There are already commercial services – MarketTools’ TrueSample and Peanut Labs’ Optimus, to name two – and these have gained some traction. They also add cost, and are restricted to some extent by only ever showing part of the picture: the samples and providers that have opted in.
With three major players backing this initiative (Ipsos was involved in the development of the technology behind Promedius), it is likely to have the critical mass it needs to become established. What the technology does, and how affordable and convenient it is (the announcements do not say that this will be offered to the industry for free), remain to be seen. I’ll be looking to secure a software review as soon as it becomes available. But there is a good chance that Promedius will be putting fire into the hands of the people, as far as panel and online survey transparency is concerned.
Hopefully Promedius will enjoy a better fate than its near namesake, who, after several other skirmishes, found himself bound to a rock by the vengeful Zeus, with an eagle visiting him every day into eternity to peck away his liver.
What it does
Web-based suite of interview fraud detection measures for online surveys which can be applied to any online panel source, including panel providers or your own samples.
Ease of use
Compatibility with other software
Value for money
From $2,500 to scan 5,000 completes, with discounts for higher volumes
Pros
- Highly accurate detection of the most common types of internet fraud
- User can determine the level of policing
- Interfaces directly with Confirmit, MarketTools Ztelligence and SPSS Dimensions
- Works with most browsers, Windows or Mac
Cons
- Some programming involved if using an unsupported interviewing package
- Does not detect all kinds of fraud, such as straightlining and ‘satisficing’
- Rules are system-wide: cannot vary them by project or client
- Fraud not detected during scheduled or unscheduled downtime of the Optimus server
Optimus is a standalone software-as-a-service or ASP solution for tackling fraudulent respondents that will work with any sample source and, effectively, any internet interviewing system. It comes from Peanut Labs, an online sample provider, though the service is not in any way tied to their samples.
If you happen to use Confirmit, SPSS Dimensions or Ztelligence, it is easy to set a command at the beginning and end of your interview to link your survey to the Optimus service. If you use other software, you will need to do a small amount of ad hoc web programming to link it in each time. Essentially, the link is achieved using a ‘redirect’, where the survey momentarily hands control over to the Optimus server, which then probes the respondent’s browser, gathers some information and then hands back to the server running the survey. None of this to-and-fro is visible to the respondent, and no personally identifiable data is involved. All that Optimus holds on your behalf is your respondent ID, so you can later identify problem respondents. It does not use email addresses or cookies.
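For readers linking in an unsupported interviewing package, the redirect hand-off described above can be sketched in a few lines of Python. This is only an illustration of the general technique: the endpoint URL and the parameter names (`rid`, `sid`, `return`) are invented for the example and are not taken from the Optimus documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the review does not give the real service URL.
CHECK_ENDPOINT = "https://fraud-check.example.com/probe"


def build_check_redirect(respondent_id: str, survey_id: str, return_url: str) -> str:
    """Build the URL the survey redirects to at the start of an interview.

    Only the respondent ID and survey ID are passed: no email address
    or other personally identifiable data. Parameter names are assumed.
    """
    query = urlencode({"rid": respondent_id, "sid": survey_id, "return": return_url})
    return f"{CHECK_ENDPOINT}?{query}"


def parse_query(url: str) -> dict:
    """Read the query parameters back out of a redirect URL."""
    params = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in params.items()}


if __name__ == "__main__":
    url = build_check_redirect("R123", "S9", "https://survey.example.com/start?page=1")
    print(url)
    print(parse_query(url)["return"])
```

The survey software would send the respondent's browser to the first URL, and the checking service would send it back to the `return` address once its probe is complete.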
The real strength of the software, and the single reason you would wish to use it, is the firm’s proprietary digital fingerprinting technology, through which it is able to build up a database of every individual PC it has ever encountered, in your samples and in anyone else’s too. It relies on the fact that any web browser will reveal a large amount of information about the configuration and resources available on the PC – and there is enough variation for this to be as good as having the manufacturer’s serial number. None of this information is personally identifiable. But once a fingerprint is logged against a panellist ID, Optimus is able to start pointing the finger at some respondents for various reasons.
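The Peanut Labs method is proprietary and is not described in detail, so the following is only a generic sketch of the idea: hash a set of browser-reported attributes into one identifier, then log which respondent IDs have appeared on that machine. The attribute names and the `FingerprintRegistry` class are assumptions made for illustration.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Reduce browser-reported attributes to a single stable identifier."""
    # Sorting the keys makes the hash independent of attribute order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FingerprintRegistry:
    """Remember which respondent IDs have been seen on each machine."""

    def __init__(self):
        self._seen = {}  # fingerprint -> set of respondent IDs

    def record(self, attributes: dict, respondent_id: str) -> set:
        """Log this visit and return any OTHER IDs already seen on the machine."""
        ids = self._seen.setdefault(fingerprint(attributes), set())
        others = ids - {respondent_id}
        ids.add(respondent_id)
        return others


if __name__ == "__main__":
    pc = {"user_agent": "ExampleBrowser/1.0", "screen": "1280x1024", "timezone": "UTC+1"}
    registry = FingerprintRegistry()
    print(registry.record(pc, "panelA-001"))  # nothing seen on this PC yet
    print(registry.record(pc, "panelB-417"))  # same PC, different panel ID
```

A non-empty result on the second call is the kind of signal that could feed a duplicate or multiple-panel flag.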
Optimus collects two other factual measures: interview completion times and IP location. Speed is detected as the time taken to complete against the anticipated time, set by the researcher, and short interviews are logged as potential speeding violations.
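A speeding check of this kind amounts to comparing the actual completion time with the researcher's anticipated time. The sketch below assumes the threshold is expressed as a fraction of the anticipated time; the review does not say how Optimus defines "short", so the `min_fraction` parameter and its default are illustrative only.

```python
def is_speeding(actual_seconds: float, expected_seconds: float,
                min_fraction: float = 0.5) -> bool:
    """Flag an interview completed in less than a set fraction of the expected time.

    `min_fraction` is an assumed, researcher-adjustable threshold.
    """
    if expected_seconds <= 0:
        raise ValueError("expected_seconds must be positive")
    return actual_seconds < expected_seconds * min_fraction


if __name__ == "__main__":
    # A 10-minute survey finished in 3 minutes 20 seconds.
    print(is_speeding(200, 600))
```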
The IP address of the ISP or company network the respondent uses to access the internet contains some useful high-level geographical information, which will pin the respondent down to a country, if not to a city. This can then be used or ignored as you choose. A panellist on a consumer survey in France is unlikely to be using an ISP in the Philippines, for example, though a business executive could be, if using the wireless network in their hotel bedroom, which could as easily be in Manila as Manchester.
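As a rough sketch, a Geo-IP rule is a lookup from IP address to country followed by a comparison with the countries you expect for the project. The lookup table below is a toy: it uses address ranges reserved for documentation and assigns them to countries arbitrarily, whereas a real service would rely on a maintained geolocation database.

```python
import ipaddress

# Illustrative only: reserved documentation ranges, mapped to countries arbitrarily.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "FR",
    ipaddress.ip_network("198.51.100.0/24"): "PH",
}


def country_for(ip: str):
    """Return the country code for an IP address, or None if it is unknown."""
    address = ipaddress.ip_address(ip)
    for network, country in GEO_TABLE.items():
        if address in network:
            return country
    return None


def geo_ip_violation(ip: str, allowed_countries: set) -> bool:
    """Flag a respondent whose IP resolves to a country outside the allowed set.

    Unknown locations are not flagged, mirroring the point that the
    measure can be used or ignored as the researcher chooses.
    """
    country = country_for(ip)
    return country is not None and country not in allowed_countries


if __name__ == "__main__":
    # A French consumer survey reached from an address mapped to the Philippines.
    print(geo_ip_violation("198.51.100.7", {"FR"}))
```

For a business sample of travelling executives, the allowed set could simply be widened, or the rule switched off.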
From this raw data, Peanut Labs deduces six measures of suspect behaviour: duplicates, Geo-IP violators, hyperactive respondents, respondents belonging to multiple panels, speeding and a sin-bin category of ‘repeat offenders’, where the respondent has repeatedly transgressed in the past.
When you log into the system, you have options to register new surveys and also the different panel sources or companies you wish to use. The ‘controls’ area is where you define your own rules of what constitutes suspect behaviour. You can switch on or off any of the rules for your own samples, and also you have considerable flexibility over adjusting the threshold for each one. For example, for hyperactive respondents, you can set an absolute limit on how much multiple participation is acceptable to you, set a period, and choose whether you restrict this just to your projects or across all projects by all users of the service. It is a pity that you can only have one set of rules for all your projects: the rules for a B2B survey could be very different to what you allow in consumer research, for example.
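To make the hyperactivity settings concrete, here is a minimal sketch of a rule with an on/off switch, an absolute limit, a period and a scope. The field names, defaults and data layout are assumptions for illustration and do not reflect how Optimus stores its rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HyperactivityRule:
    enabled: bool = True
    max_completes: int = 5                  # absolute limit within the period
    period: timedelta = timedelta(days=30)  # look-back window
    across_all_users: bool = False          # count every account's surveys, or only your own


def is_hyperactive(rule: HyperactivityRule, completes: list,
                   own_account: str, now: datetime) -> bool:
    """Apply the rule to one machine's history.

    `completes` is a list of (timestamp, account) pairs, one per completed
    survey recorded against the same fingerprint.
    """
    if not rule.enabled:
        return False
    cutoff = now - rule.period
    count = sum(
        1
        for when, account in completes
        if when >= cutoff and (rule.across_all_users or account == own_account)
    )
    return count > rule.max_completes


if __name__ == "__main__":
    now = datetime(2008, 7, 1)
    history = [(now - timedelta(days=d), "my-agency") for d in range(1, 7)]
    print(is_hyperactive(HyperactivityRule(), history, "my-agency", now))
```

Per-project or per-client rules, which the review notes are not available, would amount to keeping more than one `HyperactivityRule` and choosing between them at check time.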
There are two principal outputs from the system: reports and files containing the IDs of violators, determined by your rules, together with the type of violation recorded, either to update your own panel database or to seek replacements and refunds from sample providers.
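The layout of the violator files is not described in this review, so the following sketch assumes a simple three-column CSV (respondent ID, sample source, violation type) to show how such a file might be used when seeking replacements from providers.

```python
import csv
import io
from collections import Counter


def write_violations(rows, out) -> None:
    """Write (respondent_id, sample_source, violation) rows as CSV. Column names are assumed."""
    writer = csv.writer(out)
    writer.writerow(["respondent_id", "sample_source", "violation"])
    writer.writerows(rows)


def replacements_owed(rows) -> Counter:
    """Count distinct flagged respondents per sample source.

    A respondent flagged for several violations is counted only once.
    """
    unique = {(respondent_id, source) for respondent_id, source, _violation in rows}
    return Counter(source for _respondent_id, source in unique)


if __name__ == "__main__":
    flagged = [
        ("R1", "PanelA", "duplicate"),
        ("R1", "PanelA", "speeding"),
        ("R2", "PanelB", "geo_ip"),
    ]
    buffer = io.StringIO()
    write_violations(flagged, buffer)
    print(buffer.getvalue())
    print(replacements_owed(flagged))
```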
The range of largely graphical reports is well presented. The main ones chart each type of violation every day, which you can filter by project or sample source. But reporting choices are limited, and there really need to be more options available – for example, to allow comparisons between different surveys or between different sample sources.
It is also worth considering the effect of scheduled maintenance on the service. Though minimal, it tends to be scheduled for prime-time Monday morning in Europe, and while the service is down, your interviewing will be unprotected.
Ultimately, the success of the solution will depend on the volume of traffic passing through it, so that it achieves the critical mass of fingerprinted PCs needed to differentiate clearly between the responsible and the abusive survey-taker.
Customer Viewpoint: Kristin Luck, Decipher Inc
Decipher started to use Optimus in April of this year, to control sample quality when using sample from multiple sources on client projects.
“The system is designed to track respondents from any sample source. Where it really comes in handy is where you are using a multiple source sample approach and you want to track people who are trying to enter the survey multiple times, either from a single source or from multiple sources.”
“Some of the other solutions on the market are tied to a particular sample provider. What was appealing to us about Optimus was that it was a technology we could use even if we were not working with Peanut Labs for sample on a particular study.”
Decipher uses Optimus with its own in-house web interviewing solution. Although this means Decipher does not benefit from a direct program interface, as with some mainstream packages, linking a new survey in takes very little time. “We currently have to use a programmer to connect into Optimus,” Kristin explains, “and the first time it was about an hour’s work, but it is a pretty short learning curve, and we now have it down to about 15 minutes on a new project. In the future we will be able to implement without the use of a programmer.”
Another attraction was that the web-based interface can give Decipher’s clients controlled access to the data, so that the entire quality control process is transparent to everyone. “It is really easy to use,” says Kristin. By using the service, Decipher has identified and removed around 11% of the sample from multiple sources.
“We have found some panel providers where 21% or more of their sample has a problem and we have others where it is 8% or less,” Kristin states. “We tend to see lower percentages from the companies that have been making a lot of noise about panel quality, and higher percentages from those that have been largely silent about this.”
Being able to specify their own rules to determine fraud is another advantage for Kristin, as Decipher tends not to exclude hyperactive respondents. However, Kristin would like more granularity in how rules are applied, so that a client or a project can have its own particular rules applied; currently this is not possible without a manual programming process.
A version of this review first appeared in Research, the magazine of the Market Research Society, July 2008, Issue 505