The internet has the reputation of being a place where people can hide in anonymity, and present as being very different people than who they actually are. Is this a problem on Mechanical Turk? Is the self-reported information provided by Mechanical Turk workers reliable? These are important questions which have been addressed with several different methods. Researchers have examined a) consistency of responding to the same questions over time and across studies b) the validity of responses, or the degree to which the items capture responses that represent the truth from participants. It turns out that there are certain situations in which MTurk workers are likely to lie, but they are who they say they are in almost all cases.
One way of measuring truthfulness of responses is to examine how individual workers respond at different times to the same questions. In one study, data collected from over 80,000 MTurk workers examined the reliability in reported demographic information over time. This large study found that participants were overwhelmingly consistent when reporting demographic variables across different studies over time, with gender identification being 98.9% consistent, race 98.2% consistent, and birth year being 96.2% consistent, with this slightly lower score being largely due to technical issues rather than Turkers not being truthful (Rosenzweig, Robinson, & Litman, 2017).
Various forms of validity have been examined in data collected through MTurk, with results showing that data are by and large valid. We will focus on convergent validity, which refers to how a measure is correlated with other measures of known related constructs. Convergent validity of self-reported information at a group level can be established by examining whether workers are providing logically consistent information. Data collected on TurkPrime show that associations between variables are consistent with what is found in the general population. For example, older Mechanical Turk workers tend to be more religious and more conservative, a pattern that is consistent with the general US population. The reported number of children correlates strongly with age and family status, as do divorce rates. Self-reported time of day preference is correlated with the time of day that workers are actually active, which is also correlated with a cluster of clinical, personality, and behavioral variables that have been previously reported in the literature in studies of the general population (Unpublished Data). Similar consistent patterns have been observed in health information collected from Mechanical Turk workers. TurkPrime profiled over 10,000 Mechanical Turk workers on over 50 questions relating to physical health, with a factor analysis revealing that symptoms clustered around underlying conditions in the expected way. For example hypertension, high cholesterol, and diabetes formed a single factor. This factor, interpreted to be metabolic syndrome, correlated with other variables such as age and gender in the expected way. The rate of metabolic syndrome increases with age and was higher among men. BMI also correlated with self-reported exercise (See also Litman et al., 2015). Some other examples include the fact that rates of chronic illnesses are significantly higher among smokers compared to non-smokers, and strongly associated with BMI, with both higher and lower than average BMI being predictive of chronic illnesses.
Video tools that are currently in beta testing at TurkPrime are starting to be used to verify participants’ reported demographic characteristics such as gender, and race, and the presence of a second person for dyadic research, with promising initial results indicating that participants are highly truthful.
Research has additionally examined the reliability of data collected when selection criteria were listed as a prerequisite to enter a study (e.g.“only open to males”). Data show that when participants are incentivized to not be truthful, such as when they are only able to take a lucrative study if they identify as a particular demographic group, they lie (Chandler & Paolacci, 2017; Rosenzweig et al., 2017). For example, in a study with a HIT title that said it was open “for men only”, 44% of participants who entered had previously consistently reported their gender as “female”.
When researchers want to selectively recruit participants on MTurk, they have several options. Some researchers recruit for a specific demographic group by including such specifications/selection criteria in a study open to all workers, and rely on workers to tell the truth when opting in to such a study. Based on the data, this is a mistake that will lead untruthful participants to opt-in. There are, however, several ways to selectively recruit participants that are who they say they are. One option is to use the qualifications system with characteristics already verified by MTurk, or by using TurkPrime’s qualification system. Another option is to run a study open to all workers, ask a series of initial demographic questions, and only have participants who match the desired demographic criteria proceed to the next round in the study, paying even those who were of the wrong demographic for their time. You can create worker groups on TurkPrime based on such pre-screenings which can help you track and subsequently recruit participants who match your criteria of interest.
Chandler, J. J., & Paolacci, G. (2017). Lie for a Dime: When Most Prescreening Responses Are Honest but Most Study Participants Are Impostors. Social Psychological and Personality Science,8 (5), 500-508.
Litman, L., Rosen, Z., Spierer, D., Weinberger-Litman, S., Goldschein, A., & Robinson, J. (2015). Mobile exercise apps and increased leisure time exercise activity: a moderated mediation analysis of the role of self-efficacy and barriers.Journal of medical Internet research,(8).
Rosenzweig, C., Robinson, J., & Litman, L. (2017, January). Are They Who They Say They Are?: Reliability and Validity of Web-Based Participants’ Self-Reported Demographic Information. Poster presented at the 18th Society for Personality and Social Psychology Annual Convention, San Antonio, TX.
Last week, the research community was struck with concern that “bots” were contaminating data collection on Amazon’s Mechanical Turk (MTurk). We wrote about the issue and conducted our own preliminary...Read More >