As a researcher, you are aware that planning studies, designing materials and collecting data each take a lot of work. So when you get your hands on a new dataset, the first thing you want to do is start testing your ideas. Were your hypotheses supported? Which marketing campaign should you launch? How do consumers feel about your new product? These are the types of questions you want answered. But before you can draw conclusions from your dataset, you need to inspect and clean it, which entails identifying and removing problem participants.
Most data quality problems in online studies stem from a lack of participant attention or effort. It isn’t always easy to distinguish between the two types of problems, but researchers have a number of tools at their disposal to identify low-quality respondents and remove them from a dataset.
Speeders work through studies as quickly as they can and often engage in what is known as satisficing — skimming questions and answer options until they find a response that meets some minimum threshold of acceptability.
Speeders can sometimes be identified by examining the time it takes them to complete parts of a study (or the entire study). Other times, speeders can be identified by their failure to pass simple attention checks or by the overall quality of their data.
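As a rough illustration of the timing approach, the sketch below flags respondents whose total completion time falls well below the sample median. The function name `flag_speeders`, the data layout and the 0.4 cutoff are all assumptions made for this example. They are not a CloudResearch standard, and any real cutoff should be chosen with the length and difficulty of the specific study in mind.

```python
from statistics import median

def flag_speeders(durations, fraction=0.4):
    """Return IDs whose completion time is below `fraction` of the median time.

    `durations` maps a respondent ID to total completion time in seconds.
    The 0.4 default is an illustrative assumption, not a recommended standard.
    """
    if not durations:
        return []
    cutoff = fraction * median(durations.values())
    return sorted(rid for rid, secs in durations.items() if secs < cutoff)

# Hypothetical completion times (seconds) for five respondents.
durations = {"p1": 610, "p2": 580, "p3": 95, "p4": 640, "p5": 720}
speeders = flag_speeders(durations)
print(speeders)
```

A flag like this is best treated as a prompt to look more closely at a respondent's data rather than as grounds for automatic exclusion.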
Straightliners select the same answer for nearly every question in the study (e.g., selecting “Agree” for all questions). Straightlining is less common than some other forms of inattentive responding because it is relatively easy to spot, and respondents may worry about having their submission rejected.
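One simple way to screen for straightlining is to compute, for each respondent, the share of items that received that respondent's single most common answer. The sketch below assumes responses are stored as a list of scale values per respondent. The names `straightlining_share` and `flag_straightliners` and the 0.9 threshold are hypothetical choices for illustration only.

```python
from collections import Counter

def straightlining_share(responses):
    """Share of items on which the respondent gave their most common answer."""
    if not responses:
        return 0.0
    most_common_count = Counter(responses).most_common(1)[0][1]
    return most_common_count / len(responses)

def flag_straightliners(data, threshold=0.9):
    """Return IDs whose most common answer covers at least `threshold` of items.

    The 0.9 default is an illustrative assumption; short scales or scales
    without reverse-coded items may need a different rule.
    """
    return sorted(
        rid for rid, answers in data.items()
        if straightlining_share(answers) >= threshold
    )

# Hypothetical answers to ten 5-point agreement items.
data = {
    "p1": [4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
    "p2": [1, 3, 4, 2, 5, 4, 3, 2, 4, 1],
    "p3": [5, 5, 5, 5, 5, 5, 5, 5, 5, 4],
}
straightliners = flag_straightliners(data)
print(straightliners)
```

Note that a high share is not proof of inattention: a respondent can sincerely agree with every item, which is one reason this check works better on scales that include reverse-coded questions.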
Slackers, or shirkers, lack the proper motivation to fully engage with your study. Several things may contribute to participant slacking: the pay for your study, the difficulty of the tasks, situational aspects of the participants’ environment, and an individual participant’s level of commitment. Slackers are usually identified by overall measures of data quality.
In the world of online research, some people create scripts, or “bots,” that can automatically fill in question bubbles, essay boxes and other simple questions. While bots or scripts represent an extreme type of poor respondent, they nevertheless remain a large concern for researchers. Fortunately, the goal of a person using scripts to complete online studies is incompatible with providing quality data. The unusual responses or low-effort answers to open-ended questions mean bots are often easy to spot.
There are several terms — imposters, fraudsters — for people who provide false demographic information in order to gain access to studies targeting specific populations. Regardless of what you call them, you want to keep these people out of your studies.
The most effective way to keep imposters out of your study is to remove the opportunity for people to misrepresent themselves. Dissociate study screeners from the actual study (e.g., establish a system to examine the consistency of people’s self-reported demographics over time or test people’s relative knowledge in the domain of interest) to prevent imposters from ruining your study. Although only a small percentage of people provide incorrect information in order to gain access to a study, the sheer number of respondents in online panels means even a small percentage can translate into a large number of imposters in your study (see Chandler & Paolacci, 2017).
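To make the idea of checking demographic consistency over time concrete, here is a minimal sketch that compares what the same participant reported on two occasions. The function name `inconsistent_fields`, the field names and the record layout are assumptions for illustration and do not describe any particular platform's system.

```python
def inconsistent_fields(earlier, later, fields=("birth_year", "gender", "country")):
    """List the fields whose self-reported values differ between two occasions.

    Only fields present in both records are compared. The default field names
    are hypothetical; use attributes that should not change between sessions.
    """
    return [
        field for field in fields
        if field in earlier and field in later and earlier[field] != later[field]
    ]

# Hypothetical answers from a separate screener and from the study itself.
screener = {"birth_year": 1988, "gender": "female", "country": "US"}
study = {"birth_year": 1975, "gender": "female", "country": "US"}
mismatches = inconsistent_fields(screener, study)
print(mismatches)
```

Because the screener is collected separately from the study, participants have less opportunity to tailor their answers to the eligibility criteria, and a mismatch on a stable attribute such as birth year is a reason to examine that participant more closely.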
On online participant platforms, some people complete many studies and remain on the platform for a long period of time. Participants involved in many studies may become increasingly familiar with the flow of studies, typical instructions and measures commonly used by researchers. As these participants become less naive, they may pose a threat to data quality.
Researchers can do one of two things to address the problems discussed above: select different participants or change study design practices.
Not all online panels are created the same, and some data quality problems can be solved by selecting participants suited to the demands of the research task. For many areas of research, this is described as finding participants who are “fit for purpose.”
When there is little researchers can do to select different participants, they may choose to focus their energies on changing aspects of study design to improve data quality. Some changes to study design are simple. For example, researchers can spend more time piloting materials and ensuring the study instructions are clear and easily understood. In other cases, changes to survey design are more extensive and require more effort. In particular, researchers can add measures designed to catch various types of inattentive respondents.
One of the most common tools for detecting inattentive participants is the attention check question and its many varieties (e.g., trap questions, red herrings, instructional manipulation checks). Attention check questions seek to identify participants who are not paying attention through such questions as: “Have you ever had a fatal heart attack while watching TV?” Anyone reading this question should be able to easily indicate their attention by selecting “No.”
Although researchers have traditionally used attention check questions to identify participants who should be removed from datasets, recent research has uncovered problems with the construction of certain attention check questions and found a number of negative consequences of relying on attention checks as the sole indication of data quality.
Specifically, some researchers construct long and elaborate attention check questions that require participants to read large chunks of text and to ignore “lure” questions that lead participants to believe they should respond to the question by answering the lure. What research shows, however, is that these sorts of attention check questions are not strong measures of attention. By requiring more attention from participants than most other portions of the study, these questions introduce bias into decisions about which participants to keep and which ones to exclude. Online participants with less education and lower socioeconomic status are more likely to fail such checks than those with more education and higher socioeconomic standing.
Another problem with attention check questions is that they often lack validity. Research shows that passing one attention check question in a study does not reliably predict passing the other attention checks in the same study. In addition, a participant who passes an attention check question in one study may not pass the same attention check in a different study. Therefore, the current best practice appears to be to include multiple brief attention check questions and to recognize that attention checks are one measure, not the sole measure, of data quality and of which participants to include in data analyses.
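The sketch below shows one way this recommendation could be put into practice: several brief attention checks are combined with other indicators (completion time and response variability), and a participant is marked for review only when more than one indicator points to a problem. The record layout, the function names `quality_flags` and `should_review` and every threshold are assumptions chosen for illustration, not an established scoring rule.

```python
def quality_flags(record, min_seconds=120, max_failed_checks=1):
    """Return the names of the data quality indicators a participant trips.

    `record` is assumed to hold:
      - "attention_checks": list of booleans (True means the check was passed)
      - "seconds": total completion time in seconds
      - "ratings": list of answers to a block of scale items
    All thresholds are illustrative assumptions.
    """
    flags = []
    failed = sum(1 for passed in record["attention_checks"] if not passed)
    if failed > max_failed_checks:
        flags.append("attention_checks")
    if record["seconds"] < min_seconds:
        flags.append("too_fast")
    if record["ratings"] and len(set(record["ratings"])) == 1:
        flags.append("straightlining")
    return flags

def should_review(record, min_flags=2):
    """Mark a participant for review only when several indicators agree."""
    return len(quality_flags(record)) >= min_flags

# Hypothetical participants: one who missed a single check, one who tripped everything.
careful = {"attention_checks": [True, True, False], "seconds": 540, "ratings": [4, 2, 5, 3, 4]}
careless = {"attention_checks": [False, False, True], "seconds": 75, "ratings": [3, 3, 3, 3, 3]}
print(quality_flags(careful), should_review(careful))
print(quality_flags(careless), should_review(careless))
```

Requiring agreement between indicators means a single missed attention check does not by itself remove an otherwise conscientious participant.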
Identifying and controlling for the various types of poor respondents in online surveys is not an easy task. To make this work easier, CloudResearch engages in a number of practices to help researchers select appropriate participants for their projects. We prescreen participants in our online panels using patent-pending technology to ensure they pass basic language comprehension tests and are likely to pay attention during your study. We regularly publish research investigating participant behavior in different online platforms, best practices for screening participants and instructions on gathering high-quality data for various types of research projects. Contact us today to learn how you can draw on our expertise to improve your data collection methods.