Solving the Challenges of Managing Data Quality in Online Research


The CloudResearch Guide to Data Quality, Part 3:
The Advantages and Unique Challenges of Managing Data Quality in Online Studies

Technology has transformed behavioral science research. Researchers today can quickly access participants from all over the world and collect data in ways not possible in the past. Key to this transformation has been online participant recruitment platforms like Mechanical Turk (MTurk) and market research panels. Although these panels make it possible to conduct research quickly and efficiently, they also bring distinct advantages and challenges for managing data quality. Not all platforms are built the same way or are equally valid for different types of research. As we will see, finding participants “fit for purpose” for a research study is a big part of managing data quality in online studies.

Amazon’s Mechanical Turk: Driving a Scientific Revolution

In academia and for many businesses, Amazon’s Mechanical Turk (MTurk) has been at the heart of the web-based research revolution. MTurk was launched in 2005 as a platform where “requesters” could post small jobs to people known as “workers.” All the tasks posted on MTurk require human intelligence — transcribing data, categorizing images, moderating website content and completing behavioral science studies. Initially, many requesters were interested in using MTurk to build model data that could be used to train machine-learning algorithms. Then, a few years later, behavioral scientists co-opted MTurk as a research tool.

The popularity of MTurk among behavioral scientists increased dramatically in 2011. That year, researchers demonstrated that high-quality data for human behavioral research could be collected on MTurk quickly and inexpensively (Buhrmester et al., 2011). In addition, companies like Qualtrics and SurveyMonkey dramatically lowered the technical skills required to program web-based experiments. Thus, the stage was set for the research revolution: MTurk made it possible for researchers to locate and pay participants, while a whole host of online tools made it easy for nearly anyone to create sophisticated online surveys and experiments.

Within a few years of MTurk’s adoption by academic researchers, it was clear much of social science was at a tipping point. MTurk made it possible for researchers to collect data in a fraction of the time required for lab-based experiments, and the participants on MTurk often were more diverse than students in university subject pools. In addition, because MTurk was more affordable than other online alternatives, academics invested resources in understanding issues significant to MTurk: data quality; participant representativeness; replicability of established experimental findings; factors associated with participant availability; and characteristics associated with non-naive MTurk participants. Shortly after MTurk’s adoption by academic researchers, a majority of articles published in the top journals of some disciplines contained data collected from Mechanical Turk.

How MTurk Works to Improve Response Quality

Perhaps one of the strongest tools for maintaining data quality on MTurk is something Mechanical Turk built into its platform at the start: a reputation mechanism. On MTurk, requesters have complete discretion over whether to accept submissions from workers. When a worker’s submission is rejected, the worker is not paid and their reputation suffers. Because requesters often place reputation restrictions on their tasks, workers are incentivized to submit quality work and avoid rejections.
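To make the reputation mechanism concrete, here is a minimal sketch of how a reputation restriction could be applied to a list of workers. It is an illustration only, not MTurk's actual implementation: the WorkerRecord class, the eligible_workers function, the sample worker IDs and the thresholds (a 95% approval rate and at least 100 approved submissions) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkerRecord:
    """Hypothetical record of one worker's submission history."""
    worker_id: str
    approved: int
    rejected: int

    @property
    def approval_rate(self) -> float:
        # Share of all submissions that requesters accepted.
        total = self.approved + self.rejected
        return self.approved / total if total else 0.0


def eligible_workers(workers, min_rate=0.95, min_approved=100):
    """Return IDs of workers who meet both reputation thresholds.

    The default thresholds are illustrative assumptions.
    """
    return [
        w.worker_id
        for w in workers
        if w.approval_rate >= min_rate and w.approved >= min_approved
    ]


workers = [
    WorkerRecord("W1", approved=980, rejected=20),  # 98% approval
    WorkerRecord("W2", approved=450, rejected=50),  # 90% approval
    WorkerRecord("W3", approved=40, rejected=0),    # 100% approval, short history
]

print(eligible_workers(workers))
```

In this sketch only W1 qualifies. W2 falls below the approval-rate threshold, and W3 has too short a history to meet the minimum number of approved submissions, which is why each rejection matters to a worker who wants continued access to restricted tasks.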

In addition to MTurk’s reputation mechanism, a number of other features allow researchers to get more participant engagement than is typical on online platforms. One is the ability of researchers to set compensation amounts for each task. When a researcher needs participants to engage in an arduous task, the researcher can pay more or offer a bonus. Another platform feature useful for researchers is that each worker has a unique worker ID. Worker IDs can be used to recontact workers, making longitudinal or follow-up studies possible.
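As a rough illustration of why stable worker IDs make follow-up studies possible, the sketch below pairs responses from two waves of a study by worker ID and lists who still needs to be recontacted. The function names, the worker IDs and the scores are hypothetical, and the data are assumed to be simple dictionaries keyed by worker ID.

```python
def match_waves(wave1, wave2):
    """Pair each worker's wave 1 and wave 2 responses, keyed by worker ID."""
    return {wid: (wave1[wid], wave2[wid]) for wid in wave1 if wid in wave2}


def needs_recontact(wave1, wave2):
    """Return the sorted IDs of workers who completed wave 1 but not wave 2."""
    return sorted(set(wave1) - set(wave2))


# Hypothetical scores keyed by worker ID.
wave1 = {"A1": 4, "A2": 2, "A3": 5}
wave2 = {"A1": 5, "A3": 5}

print(match_waves(wave1, wave2))
print(needs_recontact(wave1, wave2))
```

Without a persistent identifier, neither step would be possible: the researcher could not link a person's earlier answers to their later ones or know whom to invite back.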

Although MTurk makes it possible for researchers to conduct a wide variety of online studies, speed and efficiency are not always possible. Because MTurk was not built as a platform for social science research, there are a number of common research tasks that are challenging, time-consuming, or impossible to accomplish without interacting with MTurk’s application programming interface. CloudResearch’s MTurk Toolkit helps researchers manage MTurk studies and ensure data quality by simplifying the setup and execution of MTurk studies.

MTurk Alternatives: The Expanding World of Online Participant Panels

As online data collection practices expanded among basic researchers in academia, they were also being developed in applied areas of behavioral science, like market research. However, because applied research often takes place within industry, where researchers are better funded, these industry researchers developed platforms for online participant recruitment faster than the academics who relied on MTurk.

Many of the online panels used in industry began in the early 2000s and were developed to meet the demands of market researchers. Market researchers often want to collect data from large samples or segment the population to learn what specific groups think about specific brands or products. To serve them, online panel providers created large participant pools. These pools are similar to MTurk in providing access to people willing to take online studies, but different from MTurk in their focus and structure.

In terms of focus, online panels sought to sign up as many potential participants as possible. Building panels with tens of millions of people worldwide gave panel providers the ability to fill requests for studies seeking thousands of participants. In addition, large panels allowed researchers to reach diverse groups of people, home in on specific demographic criteria and create samples stratified by important demographic characteristics like ethnicity or wealth.

These online panels also differed from MTurk in terms of structure. Whereas participants on MTurk are people who opt in to work on the platform, participants in online panels might opt in, but they also might have been recruited while engaged in other activities on the internet. Because participants in online panels are recruited in various ways, not all participants are integrated into the platform or possess the same level of motivation to complete studies. This has a big effect on the type of data researchers can expect.

The biggest challenge of using online panels for research is that many participants are inattentive. Research studies show that the trade-off for recruiting tens of millions of participants is that many of these people are not motivated to provide quality data. Fortunately, research has also shown that inattentive participants in online panels can be screened and directed away from a study before they enter the study and contribute low-quality data.
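The idea of screening participants before they enter a study can be sketched in a few lines. This is a generic illustration, not CloudResearch's technology: the screener items, the answer key, the function names and the zero-miss threshold are all assumptions made for the example.

```python
# Hypothetical answer key for two attention-check items shown before the study.
SCREENER_KEY = {
    "q1": "blue",               # e.g. "Select the color blue from the list."
    "q2": "strongly disagree",  # e.g. "I have never used the internet."
}


def passes_screener(answers, key=SCREENER_KEY, max_misses=0):
    """Return True if the participant misses no more than max_misses items."""
    misses = sum(
        1
        for item, correct in key.items()
        if answers.get(item, "").strip().lower() != correct
    )
    return misses <= max_misses


def route(answers):
    """Send attentive participants to the study and everyone else to an exit page."""
    return "study" if passes_screener(answers) else "exit"


print(route({"q1": "Blue ", "q2": "Strongly disagree"}))
print(route({"q1": "red", "q2": "strongly disagree"}))
```

Because routing happens before the study begins, a participant who fails the screener never contributes responses to the dataset, so there is nothing to clean out afterward.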

Finally, researchers on MTurk choose the amount participants will be compensated, while on other online platforms, panel providers are in charge of participant compensation. Although research shows compensation does not affect data quality when participants are asked to answer survey questions, compensation does affect people’s willingness to engage in long, challenging or complex tasks requiring effort and ability. Therefore, researchers are often able to get stronger participant engagement on MTurk than through online panels.

CloudResearch is the only platform in the world that integrates MTurk and market research panels, bringing the power and flexibility of both to academia, marketing, government and nonprofits. Our patent-pending technology identifies inattentive participants before they reach your study. The result is data quality that is unmatched in the industry.
