Sampling Guide

What Is the Purpose of Sampling in Research?

Aaron Moss, PhD

By Aaron Moss, PhD, Cheskie Rosenzweig, PhD, & Leib Litman, PhD

Online Researcher's Sampling Guide, Part 1:
What Is the Purpose of Sampling in Research?

Every ten years, the U.S. government conducts a census—a count of every person living in the country—as required by the constitution. It's a massive undertaking.

The Census Bureau sends a letter or a worker to every U.S. household and tries to gather data that will allow each person to be counted. After the data are gathered, they have to be processed, tabulated and reported. The entire operation takes years of planning and billions of dollars, which begs the question: Is there a better way?

As it turns out, there is.

Instead of contacting every person in the population, researchers can answer most questions by sampling people. In fact, sampling is what the Census Bureau does to gather other information like the average household income, the level of education people have, and the kind of work people do for a living. But what, exactly, is sampling, and how does it work?

At its core, a research sample is like any other sample: It's a small piece or part of something that represents a larger whole.

So, just like the sample of glazed salmon you eat at Costco or the double chocolate brownie ice cream you taste at the ice cream shop, behavioral scientists often gather data from a small group (a sample) to understand a larger whole (a population). The process of selecting this sample, known as a sampling technique, can take many forms. Even when the population being studied is as large as the U.S., researchers only need to sample just a few thousand people to understand everyone.

Now, you may be wondering how that works. How can researchers accurately understand hundreds of millions of people by gathering data from just a few thousand of them? Your answer comes from Valery Ivanovich Glivenko and Francesco Paolo Cantelli.

Glivenko and Cantelli were mathematicians who studied probability. At some point during the early 1900s, they discovered that several observations randomly drawn from a population will naturally take on the shape of the population distribution. In plain English this means that as long as researchers randomly sample from a population and obtain a sufficiently large sample, then the sample will contain characteristics that roughly mirror those of the population. Thanks to this quality of probability, researchers can understand large populations by sampling small groups, and effective sampling relies on this mathematical principle.

Bell curve distribution showing random sampling — **Figure 1.** The blue line represents a normal distribution, also commonly known as a bell curve. Each red circle represents an observation, or a person sampled from the population. If each observation is selected randomly, then the sample will naturally reflect the qualities of the population. Thanks to this quality of probability, researchers are able to understand large populations by sampling small groups from the population.

“Ok. That's great,” you say. But what does it mean to randomly sample people, and how does a researcher do that?

Defining Random vs. Non-Random Sampling

Random sampling occurs when a researcher ensures every member of the target population has an equal chance of being selected to participate in the study. This process involves random selection, often using tools like a random number generator or random digit dialing. Although simple random sampling is the most basic form, other probability methods like systematic sampling, stratified sampling, and cluster sampling are suited to different research needs.

Importantly, ‘the target population' for a study is not necessarily all the inhabitants of a country or a region. Instead, a population can refer to people who share a common quality or characteristic. So, everyone who has purchased a Ford in the last five years can be a population and so can registered voters within a state or college students at a city university. A population is the group that researchers want to understand.

To understand a population using random sampling, researchers begin by identifying a sampling frame—a list of all the people in the target population. For example, a database of all landline and cell phone numbers in the U.S. is a sampling frame. Once the researcher has a sampling frame, he or she can randomly select people from the list to participate in the study.

However, as you might imagine, it is not always practical or even possible to gather a sampling frame. There is not, for example, a list of all the people who use the internet, purchase coffee at Dunkin', have grieved the death of a parent in the last year, or consider themselves fans of the New York Yankees. Nevertheless, there are very good reasons why researchers may want to study people in each of these groups.

When it isn't possible or practical to gather a random sample, researchers often gather a non-random sample using the methods of convenience sampling. A non-random sample is one in which every member of the target population does not have an equal chance of being selected into the study.

Quick Reference: Common Sampling Methods
While this guide focuses on the practical aspects of online sampling, you may encounter these types of sampling methods in other contexts. Understanding the basic types of sampling used in a study helps you evaluate its findings.
Probability methods (where each person in the target population has an equal chance of being selected):
Simple random sampling: Using a random number generator to select participants from a complete list. This is the gold standard that involves random selection of each participant
Systematic sampling: Selecting every nth person from a list after a random starting point
Stratified sampling: Dividing the target population into subgroups (like age brackets), then randomly sampling from each group to ensure representation
Cluster sampling: Randomly selecting geographic areas or organizational groups, then surveying everyone within those clusters
Non-probability methods (used in most online research):
Convenience sampling: Gathering data from whoever is readily available. This is the most common sampling technique for online studies
For online research, most studies use convenience sampling with controls to reduce bias, which is exactly what the rest of this guide helps you do. The key is understanding which sampling technique best serves your research goals.

Because non-random samples do not select participants based on probability, it is often difficult to know how well the sample represents the population of interest. Despite this limitation, a wide range of behavioral science studies conducted within academia, industry and government rely on non-random samples. When researchers use non-random samples, it is common to control for any known sources ofsampling bias during data collection. By controlling for possible sources of bias, researchers can maximize the usefulness and generalizability of their data.

Why Is Sampling Important for Researchers?

Everyone who has ever worked on a research project knows that resources are limited; time, money and people never come in an unlimited supply. For that reason, most research projects aim to gather data from a sample of people, rather than from the entire population (the census being one of the few exceptions). This is because sampling allows researchers to:

Save Time

Contacting everyone in a population takes time. And, invariably, some people will not respond to the first effort at contacting them, meaning researchers must invest more time for follow-up. Random sampling is much faster than surveying everyone in a population, and obtaining a non-random sample is almost always faster than random sampling. Thus, sampling saves researchers lots of time.

Save Money

The number of people a researcher contacts is directly related to the cost of a study. Sampling saves money by allowing researchers to gather the same answers from a sample that they would receive from the population. Whether you need a sample size of 100 or 10,000, sampling is almost always more cost-effective than a full census.

Non-random sampling is significantly cheaper than random sampling, because it lowers the cost associated with finding people and collecting data from them. Because all research is conducted on a budget, saving money is important.

Collect Richer Data

Sometimes, the goal of research is to collect a little bit of data from a lot of people (e.g., an opinion poll). At other times, the goal is to collect a lot of information from just a few people (e.g., a user study or ethnographic interview). Either way, sampling allows researchers to ask participants more questions and to gather richer data than does contacting everyone in a population.

The Importance of Knowing Where to Sample

Efficient sampling has benefits for researchers. But just as important as knowing how to sample is knowingwhere to sample. Some research participants are better suited for the purposes of a project than others. The sampling technique you choose and the type of sampling method you employ should match your research objectives. Finding participants that are fit for the purpose of a project is crucial, because it allows researchers to gather high-quality data.

For example, consider a team of researchers who decides to conduct a study online. The team has several different sources of participants to choose from. Some sources provide a random sample, and many more provide a non-random sample. When selecting a non-random sample, some studies are especially well-suited to an online panel that offers access to millions of different participants worldwide and others are better suited to a crowdsourced site that generally has fewer participants but more participant engagement.

To make these options more tangible, let's look at examples of when researchers might use different kinds of online samples.

Different Use Cases of Online Sampling

Academic Research

Academic researchers gather lots of samples online. Some projects require random samples based on probability sampling methods. Most rely on non-random samples. In these non-random samples, researchers may sample a general audience from crowdsourcing websites or selectively target members of specific groups using online panels. The variety of research projects conducted within academia lends itself to many different types of online samples.

Market Research

Market researchers often want to understand the thoughts, feelings and purchasing decisions of customers or potential customers. For that reason, most online market research is conducted in online panels that provide access to tens of millions of people and allow for complex demographic targeting. For some projects, crowdsourcing sites, such as Amazon Mechanical Turk, allow researchers to get more participant engagement than is typically available in online panels, because they allow researchers to select participants based on experience and to award bonuses.

Public Polling

Public polling is most accurate when it is conducted on a random sample of the population. Hence, lots of public polling is conducted with nationally representative samples. There are, however, an increasing number of opinion polls conducted with non-random samples. When researchers poll people using non-random methods, it is common to adjust for known sources of bias after the data are gathered.

User Testing

User testing requires people to engage with a website or product. For this reason, user testing is best done on platforms that allow researchers to get participants to engage deeply with their study. Crowdsourcing platforms are ideal for user testing studies, because researchers can often control participant compensation and reward people who are willing to make the effort in a longer study.

There are hundreds of companies that provide researchers with access to online participants, but only a few facilitate research across different types of online panels or direct you to the right panel for your project. At CloudResearch, we are behavioral and computer science experts with the knowledge to connect you with the right participants for your study and provide expert advice to ensure your project's successful conclusion. Check out our Connect participant platform, our Prime Panels tool orcontact us today to learn more about how we can help with your research.

Part 2: How to Reduce Sampling Bias in Research

Part 3: How to Build a Sampling Process for Marketing Research

Part 4: Pros and Cons of Different Sampling Methods

What Is Data Quality and Why Is It Important?

Aaron Moss, PhD · 08/27/19

If you were a researcher studying human behavior 30 years ago, your options for identifying participants for your studies were limited. If you worked at a university, you might be...

How to Identify and Handle Invalid Responses to Online Surveys