By Aaron Moss, PhD, Cheskie Rosenzweig, PhD, & Leib Litman, PhD
Among public pollsters, the year 1936 lives in infamy. That year, the magazine Literary Digest conducted what remains one of the worst public opinion polls in history.
After correctly predicting the previous five presidential contests, the Digest decided to mail questionnaires to more than 10 million Americans asking who they planned to vote for in the 1936 presidential race. Based on more than 2 million responses, the Digest confidently predicted that Alf Landon would win the presidency with 62% of the vote. Yet on election day, it was Franklin Roosevelt who won in a landslide, leaving the Digest to wonder how it got the outcome so wrong.
As it turned out, the answer was a classic example of sampling bias.
Because the Digest identified potential voters using telephone and automobile records, the people they sampled tended to be wealthy. Although wealthy Americans preferred Landon over Roosevelt, poor Americans favored Roosevelt by a strong margin. Undersampling poor people caused the Digest to get the outcome of the election completely backward.
As the Literary Digest example demonstrates, sampling bias occurs when the range of people who can participate in a study is restricted. Several factors can lead to sampling bias.
First, the people who participate in a study may be systematically different from the population being studied. When this occurs, it is known as coverage bias. Second, some people who have the opportunity to participate in a study may choose not to. When the people who decline to participate share a common characteristic, the result is non-response bias. Finally, the opposite of non-response bias, self-selection bias, occurs when the people who choose to participate in a study share a characteristic that makes them systematically different from those who do not participate.
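To see how coverage bias can flip a result, consider a small simulation in the spirit of the Literary Digest poll. All the proportions below are illustrative assumptions, not historical figures: a minority of wealthy voters lean toward Landon, the poor majority leans toward Roosevelt, and only people reachable by phone or automobile records can enter the sample.

```python
import random

random.seed(0)

# Hypothetical electorate: 30% wealthy, 70% poor (illustrative numbers,
# not the actual 1936 figures).
population = []
for _ in range(100_000):
    wealthy = random.random() < 0.30
    # Assume wealthy voters lean toward Landon, poor voters toward Roosevelt.
    votes_landon = random.random() < (0.65 if wealthy else 0.30)
    population.append((wealthy, votes_landon))

def landon_share(sample):
    return sum(votes for _, votes in sample) / len(sample)

# True preference across the whole electorate.
print(f"Full population: {landon_share(population):.1%} vote Landon")

# Coverage-biased sampling frame: only phone/car owners are reachable,
# and they are mostly wealthy (we assume 10% of poor voters are reachable).
frame = [p for p in population if p[0] or random.random() < 0.10]
biased_sample = random.sample(frame, 2_000)
print(f"Biased sample:   {landon_share(biased_sample):.1%} vote Landon")
```

Even though Landon loses decisively in the full population, the biased frame over-represents his supporters enough to predict a Landon win, mirroring the Digest's error.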
Whether or not sampling bias affects the interpretation of a study’s results depends on both how researchers gather their data and how they use the data. Generally speaking, behavioral scientists conduct three types of research: descriptive studies that measure the frequency of a behavior, correlational studies that examine associations between variables, and experiments that test causal effects.
Returning to our conversation about sampling bias, the strength of random sampling methods is that they eliminate most sources of bias. When researchers use random sampling, they can reasonably apply their results to the population being studied, regardless of whether the research aims to describe the frequency of behavior, investigate the association between variables, or test experimental effects.
When researchers use non-random samples, they have to think more carefully about potential sources of bias. For example, one form of non-random sampling is known as convenience sampling. When researchers use convenience sampling, they gather data from whoever is readily available, allowing participants to enter the study on a first-come, first-served basis with no effort to control sample composition (see Baker et al., 2013). Because this approach leaves a study open to a number of sampling biases, it is common for researchers running studies online to try to control or eliminate sources of sampling bias.
When researchers take this middle ground, they gather what may be called controlled samples, which are a kind of hybrid, falling somewhere between random and non-random samples. Although controlled samples are gathered from sources that are based on convenience, researchers take active measures to eliminate sources of bias and maximize the generalizability of their results. Research suggests that these hybrid samples often provide a reasonable trade-off between the high cost, impracticality, and slow pace of random sampling and the unconstrained sources of bias inherent to convenience samples; researchers often find similar results in both types of samples.
People typically find it easiest to think about sampling bias within the realm of random samples when the researcher’s goal is to apply the findings from a sample to the population. However, most research conducted by academics, market researchers and businesses isn’t conducted with random samples. Instead, researchers in these domains often run studies with controlled online samples that are purposefully gathered to correct for known sources of bias. How might sampling bias affect online studies?
Simply put, sampling bias within non-random samples can distort research findings. When the demographic variables or other characteristics of the people in a sample are systematically related to the topic being investigated, researchers will obtain findings that are either stronger or weaker than what exists within other groups.
For example, consider a researcher on the East Coast who launches an online study early in the morning. Because most online studies fill up in a matter of hours, people who live on the West Coast are unlikely to take part in the study due to the time difference (Casey et al., 2017).
If, as research suggests, people on the West Coast have different attitudes toward things like social norms (e.g., Plaut et al., 2012) and climate than those on the East Coast, a study investigating behaviors to combat climate change might have a region-specific sampling bias. This sampling bias could distort the researcher’s findings by leading them to believe the relationship between social norms and behaviors to combat climate change is either weaker or stronger than it actually is within a broader population.
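A rough simulation can make this concrete. Assume, purely for illustration, that region shifts both norm endorsement and climate-related behavior; sampling only one coast then changes the correlation a researcher would estimate (the coefficients below are invented, not estimates from the cited studies):

```python
import random

random.seed(1)

# Hypothetical model: living on the West Coast raises both norm
# endorsement and climate-friendly behavior (made-up effect sizes).
def person(region):
    base = random.gauss(0, 1)
    norms = base + (0.8 if region == "west" else 0.0) + random.gauss(0, 1)
    behavior = 0.5 * norms + (0.6 if region == "west" else 0.0) + random.gauss(0, 1)
    return region, norms, behavior

population = [person(random.choice(["east", "west"])) for _ in range(20_000)]

def corr(xs, ys):
    """Pearson correlation, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)

full = [(n, b) for _, n, b in population]
east = [(n, b) for r, n, b in population if r == "east"]

print("r (full population):", round(corr(*zip(*full)), 2))
print("r (East Coast only):", round(corr(*zip(*east)), 2))
```

Because region raises both variables together under these assumptions, the East-Coast-only sample understates the correlation a broader sample would show; with other assumed effects the bias could just as easily run the other way.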
Fortunately for researchers who conduct studies online, many causes of sampling bias are well known. They include participant demographics, characteristics of the research platform, and the environment in which participants complete the study.
Participant demographics are a common source of sampling bias. For some research questions, participant age, gender, ethnicity, religion, political ideology, socioeconomic status, or other demographic characteristics might be related to the outcome under study. If so, the researcher may want to control for these variables.
Research shows that if you ask most Americans to consider how God feels about abortion before reporting their own attitudes, they will become significantly more opposed to abortion than if you ask them to report their own attitudes before considering God’s attitude (Epley, Converse, Delbosc, Monteleone, & Cacioppo, 2009). As you might suspect, however, this effect depends on how religious people are. People on Mechanical Turk are, as a whole, less religious than the general U.S. population. Thus, when trying to replicate the “God on Our Side” effect or other questions related to religion, researchers might want to control for the religion of participants.
Depending on where the researcher gathers data, there may be factors about the platform or website that introduce bias. For example, most online panels work hard to sign up a large number of potential participants. Some participants within these panels complete more studies than others and become familiar with common research procedures. For some studies, participant experience or tenure might introduce a form of sampling bias that researchers want to control.
Some people on Amazon’s Mechanical Turk have a lot of experience with research studies. At times, this experience may reduce the strength of common experimental manipulations. To control for this potential biasing effect, researchers may choose to sample in a way that ensures participants are inexperienced.
A benefit of online studies is that they are often available anytime that’s convenient for research participants. This convenience, however, may also have a drawback. For some research questions, having most participants complete the study at an unusual time of day (e.g., 3:00 a.m.) or on specific days of the week may introduce sampling bias. In addition, the ability of participants to easily quit a study may produce problems for experiments that rely on random assignment.
As originally reported by Kouchaki and Smith (2014), the morning morality effect describes a general tendency for people to act more ethically early in the day and less ethically later in the day, as resources for self-control are taxed. Although the original research reported a general effect of time of day, subsequent research suggested that accounting for people’s circadian typology, in addition to time of day, might better predict moral behavior (Gunia, Barnes, & Sah, 2014). Thus, studies that investigate phenomena that vary across the day and fail to account for variation in people’s circadian typology may yield incomplete or inaccurate findings.
Avoiding sampling bias requires thoughtful planning and careful execution during data collection. When running studies online, researchers need to think about potential sources of bias and how much of a threat each poses to their research before starting data collection.
If researchers think participant gender, age, ethnicity or some other demographic characteristic is a potential source of bias within their study, then there are several potential solutions. Perhaps the easiest solution is to set up quotas for each identified demographic. Quotas allow researchers to evenly sample people from different demographic groups within the study.
In fact, a commonly used quota system for many online studies is a census-matched template. With census matching, quotas are automatically applied to a study so that the final sample has participants of different ages and ethnicities that are based on each group’s representation in the U.S. census. Similar quotas can be used for a variety of other demographic variables.
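As a sketch of how a quota system works, a screener can simply close each demographic group once its target count is filled. The quotas and arrival rates below are made up for illustration, not actual census figures:

```python
import random
from collections import Counter

random.seed(2)

# Hypothetical target counts per age group (illustrative, not census data).
quotas = {"18-29": 50, "30-44": 60, "45-59": 55, "60+": 35}
filled = Counter()
accepted, rejected = [], []

def admit(participant_id, age_group):
    """Accept a participant only if their demographic quota is still open."""
    if filled[age_group] < quotas[age_group]:
        filled[age_group] += 1
        accepted.append(participant_id)
        return True
    rejected.append(participant_id)
    return False

# Simulated arrival stream, skewed toward younger participants --
# exactly the situation quotas are meant to correct.
groups = ["18-29"] * 5 + ["30-44"] * 3 + ["45-59"] * 2 + ["60+"]
for pid in range(1_000):
    admit(pid, random.choice(groups))

print(dict(filled))  # each group is capped at its quota despite the skew
```

Census matching follows the same logic, with the target counts derived from each group’s share of the census rather than set by hand.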
Within any one research platform, it usually becomes easier to control potential sources of bias as more research accumulates examining the characteristics of the platform. For example, one of the best-studied research platforms is Amazon Mechanical Turk. The accumulation of public knowledge about MTurk is part of the reason why the platform became so popular among academic researchers.
One fact that researchers have uncovered about MTurk is that participants on Mechanical Turk are less religious than people in the U.S. population. Because researchers know this about MTurk, a researcher examining the influence of religion on attitudes toward capital punishment would know to target people with a wide range of religious beliefs so that the sample would have enough diversity to test the idea. Knowledge about the platform can help avoid bias.
Some online platforms give researchers limited control over the data collection process, while others give researchers nearly complete control. On platforms that give researchers control, such as Amazon’s Mechanical Turk, researchers can choose what level of participant experience is acceptable for their study. Using reputation qualifications, researchers can sample people who have lots of experience completing studies or people who have very little prior experience. Each of these groups of participants can be useful for different types of projects.
Similar to a researcher’s ability to control factors related to the research platform, the ability to curb the influence of the environment on a participant’s responses depends on how much control the platform gives the researcher. For example, with many online platforms, researchers do not have control over when the study is launched or when it is available to participants. Yet within a few platforms, researchers can choose when to launch their study or when to pause it. In addition, some services, such as CloudResearch’s MTurk Toolkit, provide researchers with the means of slowing down data collection so that time of day and day of the week do not bias a study’s results.
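The idea behind slowing down data collection can be sketched generically: instead of releasing every slot at launch, release slots in evenly spaced batches across the day. This is an illustrative sketch, not the CloudResearch API; the function name, hours, and batch counts are all invented for the example.

```python
import datetime

def release_schedule(total_slots, start_hour=8, end_hour=20, batches=6):
    """Split total_slots into evenly spaced batches between two hours,
    so no single time window dominates the sample (illustrative only)."""
    per_batch = total_slots // batches
    span = (end_hour - start_hour) / batches
    schedule = []
    for i in range(batches):
        hour = start_hour + i * span
        when = datetime.time(int(hour), int((hour % 1) * 60))
        schedule.append((when, per_batch))
    return schedule

for when, n in release_schedule(600):
    print(f"release {n} slots at {when:%H:%M}")
```

Spreading 600 slots across six batches between 8:00 and 20:00 means early-morning respondents can no longer fill the entire study in one burst.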
CloudResearch makes it easy to trust your data by giving you the knowledge and tools to control sources of sampling bias. You can use our demographic targeting tools to control sample composition, gather a census-matched sample, or minimize the effect of environmental factors by controlling when your data collection occurs. Contact us to learn how you can get the research sample you need.