Three weeks ago, we published a blog post explaining five things you should be doing in your online data collection. In this post, we follow up with five things you should NOT be doing when collecting data on MTurk.
This advice may sound obvious, but some researchers do not pilot studies before launching them. Piloting studies makes sense both so you can catch any errors in the instructions, study programming, or technical details (e.g., redirecting workers to an external webpage) and so you can accurately estimate the time it will take workers to complete the study. With an accurate estimate of how long the study takes, you can set a fair compensation rate.
“Piloting” your study may be as simple as walking through the survey yourself, asking a friend or research assistant to take the survey, or running a small portion of the study (say 10%) before launching the full version. Whichever way you choose to pilot is fine; just make sure you plan time to do it.
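To make the link between pilot timing and compensation concrete, here is a minimal sketch of the arithmetic. The function name, the pilot times, and the hourly rate are all hypothetical values chosen for illustration, not a recommended pay level. The sketch uses the median rather than the mean so that one unusually slow pilot run does not inflate the estimate.

```python
from statistics import median


def fair_payment(pilot_minutes, hourly_rate):
    """Return a per-HIT payment in dollars from pilot completion times.

    pilot_minutes: completion times (in minutes) observed during piloting.
    hourly_rate: the hourly wage you are targeting (an assumption you choose).
    """
    if not pilot_minutes:
        raise ValueError("need at least one pilot completion time")
    typical_minutes = median(pilot_minutes)  # robust to a single slow outlier
    return round(typical_minutes / 60 * hourly_rate, 2)


# Hypothetical pilot data: five people took the survey, one was much slower.
pilot_minutes = [11.5, 12.0, 14.0, 12.5, 30.0]
payment = fair_payment(pilot_minutes, hourly_rate=12.00)
print(f"Pay per HIT: ${payment:.2f}")
```

If your pilot times vary widely, that is itself worth investigating before launch, since it may point to confusing instructions or a technical problem.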
For most academic HITs, workers expect to complete a survey or experiment. Anytime you ask workers to do something extra or out of the ordinary, you should let them know in the HIT instructions so they can decide whether it is something they want to do before accepting the HIT. Examples of things that should be disclosed in the HIT instructions are: a) if workers will be asked to download a file or app onto their device, b) if workers will be asked to engage in a dyadic study with a partner, c) if workers will be asked to engage in a video chat or to create a recording of themselves, or d) if the study is overly tedious. There are, of course, other situations where workers should be informed about what they will be asked to do. As a rule, we recommend disclosing anything that may catch workers off guard or cause multiple people to return the HIT after they find out what they will be asked to do. Give people the information up front and let them decide whether they want to participate or not.
Time is money, and that’s certainly true on MTurk. If you set up a demographic questionnaire or other screener to determine worker eligibility, you should not terminate ineligible workers without paying them. While it may not seem like a big deal to have workers answer four or five questions for your study, when many requesters engage in the same behavior, workers spend a lot of time doing things for free.
Instead of terminating workers who do not meet your eligibility criteria, you can run a screener study to determine worker eligibility before launching your actual study. By separating the screener from your actual study, you can compensate workers a small amount for answering your screener and remove the temptation for workers to “guess in” to your study by answering the screener multiple times until they find the right eligibility criteria. Once you have a list of workers who meet your criteria, you can place the eligible workers on an Include list and send workers an email to let them know the study is available.
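As a rough illustration of the screener-first approach, the sketch below turns screener responses into a list of eligible Worker IDs that you could then add to an Include list. The field names (`worker_id`, `gender`, `age`) and the sample rows are hypothetical; your screener export will have its own column names. Only each worker's first response is counted, so repeated attempts cannot be used to "guess in."

```python
def eligible_worker_ids(screener_rows, gender, min_age, max_age):
    """Return Worker IDs whose first screener response meets the criteria."""
    seen = set()
    eligible = []
    for row in screener_rows:
        worker_id = row["worker_id"]
        if worker_id in seen:
            continue  # ignore repeat attempts; only the first response counts
        seen.add(worker_id)
        if row["gender"] == gender and min_age <= row["age"] <= max_age:
            eligible.append(worker_id)
    return eligible


# Hypothetical screener export (every worker here was paid for the screener).
screener_rows = [
    {"worker_id": "A1", "gender": "female", "age": 34},
    {"worker_id": "A2", "gender": "male", "age": 41},
    {"worker_id": "A3", "gender": "female", "age": 27},
    {"worker_id": "A3", "gender": "female", "age": 35},  # second attempt, ignored
    {"worker_id": "A4", "gender": "female", "age": 50},
]

include_list = eligible_worker_ids(screener_rows, "female", 30, 50)
print(include_list)
```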
As a second option, you can use TurkPrime’s survey tools to recruit workers who meet your eligibility criteria. With our Panel features, you can target workers who meet your demographic criteria without having to first run a screener. We determine workers’ demographic information by asking questions that are not associated with any particular HIT. In addition, we ask the same questions over time so that we can determine workers’ response consistency. People who consistently give the same responses are granted a qualification, which can then be used to target them for specific studies.
When using TurkPrime’s Panel features, you should not add all the demographic features necessary to ensure your sample matches all sampling criteria. Instead, you should use Panel features to tailor your subject selection to the most important features of your sample. The reason for this is that each additional Panel feature you add limits the number of workers you can draw from for your study because workers must have answered ALL the demographic questions you specify in order to qualify for the study.
To make this idea concrete, imagine you are running a study in which you want to sample women between the ages of 30 and 50 who were born in the US. When setting up your Panel features, you should select women and ages between 30 and 50, but NOT people born in the US. If your study is limited to US participants (the default for TurkPrime studies), most workers will have been born in the US. Adding a third Panel feature severely limits the pool of available workers because workers must have answered all three demographic questions, rather than just two, enough times for us to determine response consistency.
To many researchers, this advice may sound a bit controversial. However, there is currently a debate within the academic community about the value of attention checks and how they should be used (see this article for a quick overview and links to several academic papers). At this stage of the debate, what seems clear is that attention checks should not be your sole measure of data quality, and that failing an attention check does not necessarily mean a participant did not spend time or effort on the rest of your survey. Because rejecting work has severe consequences for workers, we do not recommend rejecting work just because a worker failed attention checks.
If you have clear evidence that workers have not given an honest effort in your study, you can reject their work. However, if you are unsure how much effort a worker gave and you do not want to respond with a rejection, you can place the worker on an exclude list to ensure they do not continue to show up in your studies.