Five Things You Should Be Doing in Online Data Collection


By Aaron Moss, PhD & Leib Litman, PhD

Researchers are expected to be experts, or at least knowledgeable, in several areas: the topic of their research, the methods common within their discipline, best practices for open science, and the mediums used to communicate about their work, to name a few. For many researchers, online data collection has been revolutionary, making it possible to collect data faster and more affordably than ever before.

Yet with the emergence of online research, there is now one more domain to master. Because learning to run online studies well involves a steep learning curve, we put together this post to highlight five practices you should adopt in your online research if you haven’t already. These practices apply primarily to online research on Amazon’s Mechanical Turk (MTurk) when using CloudResearch’s MTurk Toolkit, but some can be applied to other platforms as well.

1. Use Titles That Avoid Selection Bias

One good practice is to give your studies generic titles so as not to introduce self-selection bias. Consider, for example, a study title like, “A survey to examine the relationship between body image and eating disorders.” While the title provides a clear description of the study’s aims, it also has the potential to attract workers who may be interested in the topic—perhaps because they have a history of eating disorders—or repel workers who have little interest in the topic.

Rather than descriptive study names, we recommend generic study titles such as, “psychology survey” or “public health study.” Generic titles let workers know they will be asked to fill out questionnaires and to participate in a typical academic study on MTurk without giving enough information to introduce self-selection bias.

2. Pay a Minimum of 10 Cents a Minute

For standard behavioral science studies, we recommend paying a minimum of 10 cents per minute. Although pay on MTurk is a controversial topic, we believe a minimum of 10 cents per minute is advantageous for two reasons. First, 10 cents per minute equates to $6.00 for an hour-long study. Second, a minimum of 10 cents per minute keeps studies affordable for most academic researchers while leaving ample room to increase pay for more demanding tasks (e.g., lots of open-ended responses, dyadic studies, or longitudinal studies) or when the researchers’ ethics require paying more. MTurk leaves compensation entirely up to requesters, so researchers who wish to pay more should simply pay more.
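As a quick illustration of the arithmetic, here is a minimal Python sketch that turns an expected study length into a minimum payment at the 10-cents-per-minute floor. The function name and the rounding choice are our own assumptions for the example, not part of MTurk or the MTurk Toolkit.

```python
from decimal import Decimal, ROUND_HALF_UP

# The 10-cents-per-minute floor recommended above.
MIN_RATE_PER_MINUTE = Decimal("0.10")


def minimum_payment(expected_minutes, rate_per_minute=MIN_RATE_PER_MINUTE):
    """Return the minimum payment in dollars for a study of the given length.

    Hypothetical helper for illustration; raise rate_per_minute for more
    demanding tasks or when you simply want to pay more.
    """
    if expected_minutes <= 0:
        raise ValueError("expected_minutes must be positive")
    total = Decimal(str(expected_minutes)) * rate_per_minute
    # Round to whole cents.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(minimum_payment(60))  # 6.00, the hour-long study from the text
print(minimum_payment(15))  # 1.50
```

Using `Decimal` rather than floats keeps the amounts exact to the cent, which matters once payments are summed across many workers.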

3. Set Study Completion Times to 3 or 4 Times the Expected Length of the Survey

On MTurk, the “Time allotted per assignment” is the maximum amount of time a participant has to complete the HIT (Human Intelligence Task). If a participant does not submit the HIT within the time allotted, they will be locked out of the HIT and not paid. For this reason, requesters should make sure workers have enough time to complete the study. At the same time, giving workers too much time (e.g., 24 hours) can be problematic: some workers may not complete the study in one sitting, some may accept the HIT but wait until later to complete it, and, for technical reasons, the HIT may be unavailable for a long period of time after a worker abandons it.

To balance these concerns, we recommend setting the time allotted per assignment to three or four times the expected length of the survey. If the survey is expected to take 15 minutes, you should give workers 45 or 60 minutes to complete it.
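The rule of thumb is simple enough to compute by hand, but a small Python sketch makes the range explicit. The function name is hypothetical, and the multipliers are just the three-to-four-times guideline from this section.

```python
def time_allotted_range(expected_minutes, low_multiplier=3, high_multiplier=4):
    """Return (shortest, longest) recommended time allotted, in minutes.

    Illustrative helper only: it applies the 3x-4x guideline described
    above to the expected length of the survey.
    """
    if expected_minutes <= 0:
        raise ValueError("expected_minutes must be positive")
    return expected_minutes * low_multiplier, expected_minutes * high_multiplier


shortest, longest = time_allotted_range(15)
print(f"Allow between {shortest} and {longest} minutes")
# Allow between 45 and 60 minutes
```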

4. Collect Worker IDs

Collecting Worker IDs should be standard practice. Without Worker IDs, it is impossible to match workers to their data. Worker IDs can also be used to determine who is eligible or ineligible for your studies. If some workers are not attentive, do not properly follow the instructions, or in some other way provide bad data, their Worker IDs can be used to exclude them from future studies. In addition, Worker IDs are essential for longitudinal research, where data cannot be analyzed without matching Worker IDs across data sets.

If you’re hesitant to collect Worker IDs due to privacy concerns, you can collect CloudResearch Worker IDs within the MTurk Toolkit. Our “Anonymize Worker IDs” feature encrypts Amazon Worker IDs to give workers extra privacy while allowing you to use the encrypted ID for all the same purposes you would use an MTurk ID (e.g., Including or Excluding workers from a study).
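To show why Worker IDs matter for longitudinal work, here is a minimal Python sketch that matches two waves of data on Worker ID and drops workers flagged for exclusion. The IDs, scores, and function name are made up for the example; the same logic applies whether the key is an Amazon Worker ID or an anonymized CloudResearch one.

```python
# Hypothetical data: each wave maps a Worker ID to that worker's responses.
wave1 = {"W_A1": {"score": 4}, "W_B2": {"score": 7}, "W_C3": {"score": 5}}
wave2 = {"W_B2": {"score": 6}, "W_C3": {"score": 9}, "W_D4": {"score": 3}}

# Workers flagged for inattentive or otherwise unusable data.
excluded = {"W_C3"}


def match_waves(first, second, exclude=frozenset()):
    """Pair each worker's responses across two waves, keyed by Worker ID.

    Only workers present in both waves and not in `exclude` are kept.
    """
    shared = sorted((first.keys() & second.keys()) - set(exclude))
    return {worker_id: (first[worker_id], second[worker_id]) for worker_id in shared}


matched = match_waves(wave1, wave2, excluded)
print(matched)
# {'W_B2': ({'score': 7}, {'score': 6})}
```

Without the Worker ID as a shared key, there would be no way to tell which wave 2 row belongs to which wave 1 row.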

5. Use Dynamic Approval Codes

Workers submit approval codes to show they have completed the study and to get paid. There are multiple ways to set up approval codes, but the most efficient and secure method is to use dynamic approval codes. Dynamic approval codes ensure that each worker receives a unique code. In addition, with dynamic codes, you can set your study up to auto-approve workers once they submit the correct code. To learn more about completion codes, see here.
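The general idea behind a dynamic code can be sketched in a few lines of Python. This is a generic illustration under our own assumptions (the `CodeBook` class, the 8-character code format, and the in-memory storage are all hypothetical); it is not how the MTurk Toolkit implements the feature.

```python
import secrets


class CodeBook:
    """Issue one unique completion code per worker and check submissions."""

    def __init__(self):
        self._code_by_worker = {}
        self._issued = set()

    def issue(self, worker_id):
        """Return the worker's code, creating a new unique one if needed."""
        if worker_id not in self._code_by_worker:
            code = secrets.token_hex(4).upper()
            # Regenerate on the rare collision so that no two workers share a code.
            while code in self._issued:
                code = secrets.token_hex(4).upper()
            self._issued.add(code)
            self._code_by_worker[worker_id] = code
        return self._code_by_worker[worker_id]

    def is_valid(self, worker_id, submitted):
        """True only if `submitted` is the code issued to this worker."""
        expected = self._code_by_worker.get(worker_id)
        if expected is None:
            return False
        cleaned = submitted.strip().upper()
        return secrets.compare_digest(expected.encode(), cleaned.encode())


book = CodeBook()
code = book.issue("W_A1")  # "W_A1" is a made-up Worker ID
print(book.is_valid("W_A1", code))        # True
print(book.is_valid("W_B2", code))        # False: code belongs to another worker
print(book.is_valid("W_A1", "12345678"))  # almost certainly False
```

Because each code is tied to a single worker, a code shared on a forum is useless to anyone else, which is what makes automatic approval safe.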

Bonus Tip: Examine Your Study’s Stats

When you use the MTurk Toolkit, we provide a number of stats about your study that you can use to make important decisions. For example, when you click on your study title from the main Dashboard, you will see a dropdown box that includes information like the average, median, and expected time to complete the study. From these three numbers alone you can see whether it is taking workers longer to complete the study than you estimated and adjust the pay accordingly. In addition, the dropdown box shows information like the number of workers who have started your HIT, been approved or rejected, and the study’s bounce rate. This information can help you determine whether something about your study is unattractive to workers (using the bounce rate) or whether an aspect of your study setup is making it hard for workers to complete the study (using the numbers who have started and been approved for the study).
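If you export completion times and want to run the same comparison yourself, a minimal Python sketch might look like this. The function name, the sample times, and the choice to base the adjusted pay on the median are our own assumptions for illustration, not something the Dashboard computes for you.

```python
from statistics import mean, median


def summarize_times(completion_minutes, expected_minutes, rate_per_minute=0.10):
    """Compare observed completion times with the expected study length.

    Illustrative only: flags a study as running long when the median
    exceeds the estimate, and suggests pay based on the larger of the two.
    """
    observed_median = median(completion_minutes)
    return {
        "average": mean(completion_minutes),
        "median": observed_median,
        "expected": expected_minutes,
        "runs_long": observed_median > expected_minutes,
        "suggested_pay": round(max(observed_median, expected_minutes) * rate_per_minute, 2),
    }


# Hypothetical completion times (in minutes) for a study estimated at 15 minutes.
times = [12, 18, 20, 22, 25]
summary = summarize_times(times, expected_minutes=15)
print(summary["runs_long"])  # True: the median (20 minutes) exceeds the estimate
```

The median is used here because a few workers who leave the survey open for a long time can inflate the average.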
