A paper co-authored by Tim Macer of meaning, Mark Pearson of Creation Insight (and formerly of Egg), and Fabrizio Sebastiani of the Italian National Council for Research in Pisa, and presented at Research 2007 in March, was announced as an award winner in the category ‘Best New Thinking’ at the 2007 Research Excellence and Effectiveness Awards ceremony at the Royal Lancaster Hotel, London, on the 10th December 2007.
The paper documents the trio’s successful application of human language processing technology to the task of analysing and interpreting open-ended comments written by survey respondents when completing online questionnaires.
The Research 2007 Conference Award for Best New Thinking
The VCS (verbatim coding system) was developed for online bank Egg and is based on the technology pioneered by Dr Sebastiani at the Italian National Council for Research in Pisa. Tim Macer, from meaning, was engaged by Egg to facilitate and manage the project, and to work on the design with developers from ISTI-CNR.
What it does
Windows-based verbatim answer classification management for coding and/or transcription of handwritten scanned images from self-completion surveys or for coding open text responses from CATI, CAPI and Web interviews.
Ease of use
Compatibility with other software
Value for money
€7,500 one-off for base 2-user system, additional users €500 or less, according to volume; support and maintenance 18% of purchase price annually. Special terms for public sector and academic users.
- Well-crafted system full of practical features for coding
- ‘Packages’ option allows coding work to be distributed to outworkers with a standalone PC
- Seamless integration with Readsoft Forms (formerly Eyes and Hands)
- Powerful administrative features to manage workflow and simplify tasks for coders
- Windows based only – not web-enabled
- Automation features for typed texts are limited in current version
- Documentation not yet in English (due April with version 3)
It’s been a while since anyone attempted to provide better software to manage the coding of open-ended questions. Most data collection suites offer rudimentary tools that cope with verbatim responses, but do little to automate the work. One early attempt, Verbastat, has now sadly disappeared from the market – probably shaded out by web-based Ascribe. But neither of these products is much help if the open-ended responses originate as handwritten items on paper. This is the gap that streamBASE GmbH, a German software provider, has plugged with its Coding-Modul.
The program actually consists of two modules – a ‘Coding Control’ for administrators, and a ‘Coding Station’ for the coder to use. The Control module neatly strikes a balance between offering a clean, simple-to-learn interface and packing in a lot of options to provide flexibility in the way coding workflows are managed. A panel to the left provides a tree view of the work, split logically into Surveys and Users. Everything can be found within this tree. A survey contains entries for questions, coding rules and transaction rules, each of which can be added to or altered simply by right-clicking and selecting from a context-sensitive menu.
It is within ‘Questions’ that codeframes are defined, and these can be as simple or as complex as you like, with multiple hierarchies allowed and a wealth of tools for managing codeframe changes over time on continuous studies. You can also preview the quality of scanned images here, to check that coders will be able to work with what they are being given.
At the core of the system are ‘rules’, which define the work to be done. A rule will select and filter verbatims from the pool of work to be done in any survey, and send them to the coders you designate. So one rule could assign one question to one coder, another could assign several questions to one coder, a third could assign one question to several coders, and so on. Rules have a rich set of options associated with them which you can switch on to do fancier things. For instance, when working with scanned images, you can have the system sort and deliver the open-ended boxes to coders in order of the density of the response, so that non-responses such as a dash or the word ‘nothing’ tend to come at the end, and the coder can decide when no more real data is coming up, without having to plough through all of them. Transaction rules determine how the data get exported for analysis.
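To make the idea of a rule concrete, here is a minimal Python sketch of how such a rule might select verbatims and hand them out. It is an illustration only, not Coding-Modul's actual implementation: the `Verbatim` and `Rule` classes, the `ink_density` measure and the round-robin distribution between coders are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Verbatim:
    respondent_id: int
    question: str
    # Hypothetical measure: share of dark pixels in the scanned answer box (0.0 to 1.0).
    ink_density: float


@dataclass
class Rule:
    questions: set[str]            # which questions this rule selects
    coders: list[str]              # who receives the selected work
    sort_by_density: bool = False  # optional: densest responses first

    def assign(self, pool):
        """Filter the pool by question, optionally sort, then share out the work."""
        selected = [v for v in pool if v.question in self.questions]
        if self.sort_by_density:
            # Near-empty boxes (a dash, 'nothing') sink to the end of the queue.
            selected.sort(key=lambda v: v.ink_density, reverse=True)
        queues = {coder: [] for coder in self.coders}
        # Assumption: work is dealt out round-robin between the designated coders.
        for i, verbatim in enumerate(selected):
            queues[self.coders[i % len(self.coders)]].append(verbatim)
        return queues


pool = [
    Verbatim(1, "Q5", 0.01),
    Verbatim(2, "Q5", 0.30),
    Verbatim(3, "Q7", 0.20),
    Verbatim(4, "Q5", 0.12),
]
rule = Rule(questions={"Q5"}, coders=["anna"], sort_by_density=True)
for verbatim in rule.assign(pool)["anna"]:
    print(verbatim.respondent_id, verbatim.ink_density)
```

With density sorting switched on, the coder sees the fullest answer boxes first and can stop once only near-blank boxes remain.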
Graphical displays and a reporting tool give you some lovely snapshots of all the work in progress and work still to be done – it is through having this kind of management information available that real productivity gains can be achieved.
The coder’s interface is very simple and obvious to use. Administrators can allow more experienced coders greater flexibility to manage their own workflows and even to add to the codeframe as they go. The software also provides support for satellite workers using a standalone PC or laptop – on the move, or even on the kitchen table – by creating a ‘package’ of work which you despatch by CD or DVD to the remote coder. Email is not normally an option, due to the size of the bitmap images.
Mid-stream software like this needs to be versatile with its inputs and outputs. A variety of imports, exports and more tightly coupled ‘plug-ins’ are offered, best demonstrated with Readsoft Documents for Forms (formerly Eyes and Hands), where Coding-Modul communicates directly with the Forms database, avoiding the need for any intermediate file transfer. ODBC is a widely accepted open standard for exchanging data between databases, and the software offers an ODBC interface, which makes it easy to integrate with any other database-oriented data collection platform. Already, Streambase offers plug-ins to Confirmit, NIPO’s ODIN and even non-database Quancept, and the firm expects to add other interfaces as customers request them.
At this point, it might be worth waiting a month or two until version 3 is out. Alongside a much-needed English translation of the Help system, Streambase is also promising to increase the support for handling electronic data, using word counts for sorting texts prior to classification and other automation techniques for the coding of texts. In my view, this is a must. Where verbatim texts are in a machine-readable form, as they are from CATI or Web, the bottom-up approach followed by this software, which may be a necessity for scanned bitmaps, is not really good enough. Word searching and some mass aggregation techniques will be required to provide a measurable advantage over the built-in coding module you are likely to find in any existing data collection suite.
How far version 3 will take us up the escalator to coding heaven remains to be seen, though something that should make CATI centre managers sit up is the planned support for coding of audio snippets where verbatims have been digitally recorded.
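As a rough illustration of what word-count sorting, word searching and mass aggregation could look like for typed verbatims, here is a short Python sketch. It is not a description of what version 3 will contain: the function names and the normalisation rule are hypothetical, chosen only to show the general techniques.

```python
import re
from collections import defaultdict


def normalise(text):
    """Lower-case the text and strip punctuation so trivially different answers match."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def group_identical(verbatims):
    """Mass aggregation: map each normalised text to the respondents who gave it,
    so that one coding decision can be applied to the whole group."""
    groups = defaultdict(list)
    for respondent_id, text in verbatims:
        groups[normalise(text)].append(respondent_id)
    return dict(groups)


def sort_by_word_count(groups):
    """Longest answers first, so short non-responses collect at the end."""
    return sorted(groups, key=lambda text: len(text.split()), reverse=True)


def search(groups, keyword):
    """Word searching: every distinct answer that contains the keyword as a whole word."""
    wanted = keyword.lower()
    return [text for text in groups if wanted in text.split()]


verbatims = [
    (1, "Nothing"),
    (2, "nothing."),
    (3, "Fees are too high"),
    (4, "-"),
    (5, "High fees"),
]
groups = group_identical(verbatims)
print(sort_by_word_count(groups))
print(search(groups, "fees"))
```

Even this crude grouping means the two ‘nothing’ answers need coding only once, which is the kind of saving that hand-coding each response in turn cannot offer.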
Customer viewpoint: IPSOS Germany
Ipsos Germany has been using Coding Module for the bulk of its scanned paper-based surveys since 2003. It is used in combination with the company’s Readsoft Documents for Forms data capture system. Britta Dorn, manager of the coding and data entry department, says: “It is a lot faster using this – you save a lot of time with it. You don’t have to go through the questionnaire twice. For example, if you have a questionnaire with 75 pages, you do not have to waste time going through all these pages looking for the next question to code. Everything is there on the screen – it is much better. We use it also when we need verbatims to send directly to the client. Rather than go through the questionnaires, we can use the coding module and transcribe from there. So it is quicker here too.”
Often Ipsos will transcribe verbatims from the images on screen into actual text as well as coding them. But if the client only wants to have a snapshot of what people are saying, without classifying the answers, Britta will request a verbatim report which simply lists the images.
Britta continues: “That way, it does not take a lot of time. And it is also very quick getting the data out, for quality control. When you need to output data for evaluation purposes, you can do this in just a few minutes.
“We don’t use it for every survey. For example, if you have questionnaires with semi-open questions, it may be quicker to code these manually. I would say we use it for 90% of our paper and pencil interviewing – but we look at each job separately.
“It is very easy to operate. It does not take a long time to train coders – it takes about an hour and then they know everything they need for Coding. And the administration is very easy to handle. Our coders are very experienced – many of them have worked here for 15 or 20 years – and they work well with the software.
“We have also had very good experiences with the company. If you want some modifications to the software they will usually do this quickly if it’s possible for them.”
A version of this review first appeared in Research, the magazine of the Market Research Society, March 2007, Issue 490