What it does
Intelligent verbatim content management system and coding environment for researchers and coders, with options for either manually-assisted coding or machine-learning automated coding for higher volumes. Delivered as either web browser-based or web-enabled desktop software modules.
Ease of use
Compatibility with other software
Value for money
Conventional coding: between 3 and 5 US cents per verbatim coded. Automated coding: between 10 and 30 US cents per verbatim coded.
Pros
- Automated coding option will code thousands of open-ends in seconds
- Machine learning mimics human coders and produces comparable and highly consistent results
- Many tools to optimise effort when coding manually
- Web based environment makes it easy to distribute coding work to satellite offices and outworkers
Cons
- Automated coding only saves time on larger projects such as trackers
- Web interface is in need of a refresh
- Windows only – requires Microsoft Internet Explorer
A little while ago, Language Logic estimated that their Ascribe online coding product was probably handling over fifty per cent of all of the open-ended coding generated by research agencies in the United States, and a decent proportion from the rest of the world too. The challenge is where do you go next, when you have half the market and no real rivals? One direction is to grow the market for verbatims, by making it possible to code the vast number of open-ends that never get coded – and the new Ascribe Automated Coding Module, or ACM, promises to do just that.
I happen to know something about the technology behind this tool, because I worked on a prototype with the online bank Egg (and even co-presented a paper on it at the 2007 Research conference). Language Logic has subsequently worked with its creators, the Italian government-run research foundation ISTI-CNR, to integrate their technology into Ascribe. Though I am often hesitant to state anything is the best, the ISTI-CNR engine is easily the best I have found, as it is the most MR-savvy of any automated text-processing technology. This is not a discovery or text mining tool – it is a coding department in a box.
ACM closely mimics the normal human-intervention coding process, and fits seamlessly into the traditional Ascribe workflow. By using machine learning, it does not attempt to interpret, or extract meaning by looking up words in dictionaries – in fact, it actually does not use dictionaries at all. Instead, you provide it with examples of how you would classify your data into a codeframe, and then set it to learn from this. In Ascribe, this means you simply start coding the data in the way you normally would. As you code, you are creating the training set that ACM needs. When you have coded enough to create a decent training set, you take your foot off the pedal, and let ACM accelerate through the rest.
First, you build the ‘classifiers’ that will identify matching answers. These work by looking for telltale features of the examples you coded. For any individual answer, ACM could create thousands of these unique features – patterns of words, letters and so on. So many, in fact, that it easily overcomes problems of poorly spelt words, synonyms and so on. When the classifiers have been built, you can then apply them to your uncoded data, and ACM will categorise those answers too, applying a confidence score to each coding decision it takes – you can adjust the confidence threshold to make it more or less sensitive. It takes just a few seconds to zip through thousands of verbatims. There is a process for validating the coding decisions the ACM has made, and it will helpfully present validation examples in order, starting with those where it was least confident of its coding decision.
This validation step makes the system very manageable, as you can understand what it is doing and you can improve its performance by correcting any assignment errors, and even react to changes over time. It feels uncanny, too, as the marginal decisions it identifies are often the ones that have the human coders debating where the answer should go.
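To make the train, apply and validate cycle concrete, here is a minimal Python sketch. It is an illustration only and not the ISTI-CNR engine: it assumes a very simple stand-in technique (word and character-trigram counts compared by cosine similarity), and every name in it (`features`, `train`, `classify`, `auto_code`, `EXAMPLES`) and the sample verbatims are hypothetical.

```python
import math
from collections import Counter, defaultdict


def features(text, n=3):
    """Turn an answer into counts of words and overlapping character n-grams.

    Character n-grams are what let a misspelling such as 'expensiv'
    still share most of its features with 'expensive'.
    """
    text = " ".join(text.lower().split())
    padded = f" {text} "
    feats = Counter(("word", w) for w in text.split())
    feats.update(("gram", padded[i:i + n]) for i in range(len(padded) - n + 1))
    return feats


def train(examples):
    """Build one feature profile per code from (verbatim, code) pairs."""
    profiles = defaultdict(Counter)
    for text, code in examples:
        profiles[code].update(features(text))
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two feature-count dictionaries."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(profiles, text):
    """Return (best_code, confidence); confidence is the similarity score."""
    feats = features(text)
    scores = {code: cosine(feats, prof) for code, prof in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def auto_code(profiles, verbatims, threshold=0.3):
    """Accept decisions at or above the threshold; queue the rest for review.

    The review queue is sorted so the least confident decisions come first.
    """
    coded, review = [], []
    for text in verbatims:
        code, conf = classify(profiles, text)
        (coded if conf >= threshold else review).append((text, code, conf))
    review.sort(key=lambda item: item[2])
    return coded, review


# Hypothetical training set: answers a human coder has already coded.
EXAMPLES = [
    ("The price is too high", "PRICE"),
    ("Too expensive for what you get", "PRICE"),
    ("Staff were friendly and helpful", "SERVICE"),
    ("Great customer service", "SERVICE"),
]

if __name__ == "__main__":
    profiles = train(EXAMPLES)
    coded, review = auto_code(profiles, ["very expensiv, prices too hi", "helpful staff"])
    for text, code, conf in coded + review:
        print(f"{code:8s} {conf:.2f}  {text}")
```

Raising `threshold` sends more answers to the review queue and lowering it accepts more automatically, which is the trade-off the adjustable confidence threshold describes. A production engine would use a trained statistical model rather than this similarity shortcut.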
Not that you have to use the ACM with Ascribe – it does command a premium in pricing over manual coding and it is only really suitable for larger volumes. The overhead of training and validation is comparable to manually coding a couple of thousand interviews. However, it can also be applied to qualitative projects and web content, such as blogs.
Even manual coding in Ascribe is highly optimised, with tools to let you find similar answers, code by word or phrase matching, and if you wish, re-categorise items at any point. You use it both to create your codeframe and assign answers to it in one integrated step. It’s a multi-user system, and you can assign responsibilities among the team: some can build codeframes, others only code, and others only analyse. Ascribe also has a surprisingly rich set of analytical tools – even cross-tabbing capabilities. You are not restricted to uploading only the verbatim texts, but the entire survey can go in. It can handle data from SPSS Dimensions now with ease, and it is totally integrated into Confirmit using the Confirmit Web Services interface. Upload routes are provided for most other MR packages.
It’s not the prettiest of tools to use: the interface may be on the web but is hardly of the web and is in need of a makeover. Language Logic are redesigning some modules as thin client Windows apps, which have a better-looking interface, but it would improve the approachability of Ascribe if its web interface were better structured and designed. True, it is productive to use, but it does not help you get there as a novice, and the documentation (which is being redone at present) is not as comprehensive as it needs to be. It’s a pity as both make it a challenge to harness all of the power that is in this otherwise remarkable system.
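The two manual shortcuts mentioned above, coding by phrase match and finding similar answers, can be pictured with a short Python sketch. This is not how Ascribe implements them: the function names `code_by_phrase` and `find_similar` and the sample answers are hypothetical, and the similarity measure is simply the ratio from Python's standard `difflib.SequenceMatcher`.

```python
import difflib


def code_by_phrase(answers, phrase, code, assignments=None):
    """Assign `code` to every answer containing `phrase` (case-insensitive).

    Returns a new {answer_index: set_of_codes} mapping; any existing
    assignments passed in are copied, not modified.
    """
    result = {idx: set(codes) for idx, codes in (assignments or {}).items()}
    needle = phrase.lower()
    for idx, text in enumerate(answers):
        if needle in text.lower():
            result.setdefault(idx, set()).add(code)
    return result


def find_similar(answers, target, cutoff=0.8):
    """Return indices of answers whose similarity to `target` meets `cutoff`."""
    target = target.lower()
    return [
        idx
        for idx, text in enumerate(answers)
        if difflib.SequenceMatcher(None, text.lower(), target).ratio() >= cutoff
    ]


# Hypothetical open-ended answers.
ANSWERS = [
    "Too expensive",
    "too expensiv",
    "Friendly staff",
    "It is expensive and slow",
]

if __name__ == "__main__":
    print(code_by_phrase(ANSWERS, "expensive", "PRICE"))
    print(find_similar(ANSWERS, "too expensive"))
```

Exact phrase matching misses the misspelt answer, while the similarity search picks it up, which is why the two tools complement each other when coding by hand.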
Customer viewpoint: Joy Boggio, C&R Research Services, Chicago
Joy Boggio is Director of Coding at C&R Research Services, a full service agency in Chicago. Joy introduced Ascribe to C&R in 2004, having used it previously elsewhere. Ascribe is used for all verbatim coding on quant studies at C&R and also some of their qual projects. She explains: “Within a day or two of introducing Ascribe, we immediately cut down the delivery time on projects by, in some cases, a week. The features of Ascribe that are the most attractive are it being web based – you can easily hand out the work very easily to many different people in many different places; if you have had the study before, you can merge it with the previous study and autocode a part of it; you are not restricted in the formats of data you can input, nor are you restricted in how you export the data out, and we can do some rudimentary data processing within the tool.”
Although C&R has a research staff of around 60, Joy is able to support all of the verbatim coding activities with a team of just three coders. But it is not only the coders that use Ascribe – many of the researchers also use it to access the verbatim responses, using its filtering and analytical capabilities to identify examples to include in reports and presentations. “It means they can dive down a little deeper into the data. The problem you have with the process of coding data is that you can flatten out the data – the challenge is always to make sure you can retain the richness that is there. With Ascribe you can keep the data vibrant and alive – because the analytical staff can still dive into the data and bring some of that richness to the report in a qualitative way.”
Joy notes that using Ascribe telescopes the coding process, saving precious time at the start. “It’s now a one-step process, instead of having to create the codebook first, before getting everyone working on it. With this, as you work through the verbatims you are automatically creating codes and coding at the same time, so you don’t have to redo that work. When you are happy with the codebook, you can put others onto the project to code the rest. This is where the efficiency comes in.”
Joy estimates that it reduces the hours of coding effort required on a typical ad hoc project by around 50 per cent, but due to the ease of allocating work, and the oversight the system provides, she remarks: “You are also likely to save at least a day of work on each project in management time too.”
C&R Research makes extensive everyday use of the manual coding optimisation tools Ascribe offers, such as to search for similar words and phrases, but so far has only experimented with using the new automated machine learning coding in ACM. Joy comments: “It seems to be more appropriate for larger volumes of work – more than we typically handle. There is a bit of work up front to train it, but once you get it going, I can see this would rapidly increase your efficiency. It would really lend itself to the larger tracking study, and result in a lot less people-time being required.”
A version of this review first appeared in Research, the magazine of the Market Research Society, December 2009, Issue 523