What’s the Secret to CloudResearch’s Data Quality? Sentry®

Aaron Moss, PhD

At CloudResearch, we work with behavioral scientists across industries and organizations. Whether in academia, government agencies, think tanks, market and corporate research companies, or non-profit organizations, anyone who relies on human subjects data and intends to gather that data online can turn to us for their participant recruitment needs. There is little we haven’t seen, and our reputation for the highest quality data is hard won.

But maybe you’ve wondered: what’s the secret to CloudResearch’s data quality? In an online environment increasingly challenged by fraud, artificial intelligence, and plain old respondent inattention, how does CloudResearch deliver high-quality data? The answer is Sentry.

What Is CloudResearch’s Sentry Vetting System?

Sentry is a cutting-edge tool designed to improve the quality of online data from any sample source. It contains a variety of measures crafted to identify fraud, bots, and inattentive respondents. The measures typically take ~30 seconds to complete and allow CloudResearch to judge with a high degree of accuracy whether a respondent will provide high-quality survey data or not.

Sentry is typically used as part of the participant onboarding process or as a pre-survey screener. People who pass Sentry are allowed to participate in studies while those who fail are blocked from your online projects. A schematic of the workflow is below.

Each of the measures within Sentry is based on years of research and development by our team of expert behavioral and computer scientists. What separates Sentry from other data quality tools on the market is its mix of behavioral assessment with technical checks for fraud.

The behavioral measures within Sentry are pulled from large libraries with different question types. Some questions check people’s proficiency in the language of the survey, some measure attention or the ability to follow instructions, and some assess honesty or participant misrepresentation. All of the items within Sentry rotate so it is harder for bad actors to learn how to game the system, and our team has extensively tested each item to ensure it removes undesirable respondents without harming the demographic representation of the underlying sample.
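The rotation idea described above can be illustrated with a small sketch. This is not Sentry’s implementation: the library contents, the item names, and the `draw_screener` function are all hypothetical, and the sketch assumes only that one item is drawn at random from each question type.

```python
import random

# Hypothetical item libraries, one per question type named in the text.
# The item identifiers are placeholders, not real Sentry items.
ITEM_LIBRARY = {
    "language": ["lang_item_1", "lang_item_2", "lang_item_3"],
    "attention": ["attn_item_1", "attn_item_2", "attn_item_3"],
    "honesty": ["hon_item_1", "hon_item_2", "hon_item_3"],
}


def draw_screener(library, rng):
    """Pick one item per question type so different participants see different mixes."""
    return {qtype: rng.choice(items) for qtype, items in library.items()}


# Each call can return a different combination, which makes any single
# item harder to memorize and share.
screener = draw_screener(ITEM_LIBRARY, random.Random())
print(screener)
```

Because every participant may see a different combination, learning the answer to one item is of limited use to a bad actor.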

In addition to robust behavioral vetting, Sentry contains technical checks that look for suspicious behavior on a participant’s device as they complete the study. For instance, Sentry can examine each person’s IP address, geolocation, and device type, and look for duplicate entries among different participant accounts. It can also “see” when people translate a survey into another language, copy and paste text from outside sources, or use automation to fill in forms, bubbles, or other question types. The videos below show bad actors caught by Sentry.

A participant caught by Sentry translating the survey into another language.
A participant using their mobile device’s text suggestions to complete an open-ended item. Sentry catches this type of behavior as well as copied and pasted responses.
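One of the technical checks mentioned above, looking for duplicate entries among different participant accounts, can be sketched roughly as follows. This is an illustration only and says nothing about how Sentry works internally: the session fields (`account_id`, `ip`, `device`), the sample data, and the `find_duplicate_accounts` function are assumptions made for the example.

```python
from collections import defaultdict


def find_duplicate_accounts(sessions):
    """Group account IDs by (ip, device) and return groups with more than one account."""
    accounts_by_fingerprint = defaultdict(set)
    for session in sessions:
        key = (session["ip"], session["device"])
        accounts_by_fingerprint[key].add(session["account_id"])
    # Keep only fingerprints shared by two or more distinct accounts.
    return {
        key: sorted(accounts)
        for key, accounts in accounts_by_fingerprint.items()
        if len(accounts) > 1
    }


# Hypothetical session records using documentation-reserved IP addresses.
sessions = [
    {"account_id": "a1", "ip": "203.0.113.7", "device": "android-chrome"},
    {"account_id": "a2", "ip": "203.0.113.7", "device": "android-chrome"},
    {"account_id": "a3", "ip": "198.51.100.4", "device": "windows-firefox"},
]
flagged = find_duplicate_accounts(sessions)
print(flagged)
```

A shared IP address and device type is only a signal, since households and campuses legitimately share networks, so a check like this would normally be combined with other evidence rather than used on its own.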

While each measure within Sentry contributes to its overall effectiveness, perhaps its most important feature is versatility. Sentry has been designed to improve data quality from any online source, including all CloudResearch products.

Interested in adding Sentry to your next data collection project? Our Managed Research team can run the project for you, ensuring that you get an accurate sample protected by Sentry.

How Does CloudResearch Deploy Sentry Across Products?

CloudResearch offers access to participants from multiple online sources with the understanding that any one source will not meet the needs of all projects. Over time, we’ve adapted Sentry so it is compatible with all of our services.

When it comes to Connect, our premier participant recruitment platform and the only platform CloudResearch has built from the ground up, Sentry is a key component of the onboarding process. Shortly after creating an account, participants complete a version of Sentry. We then pair the Sentry data with data from internal projects, flags and rejections submitted by researchers, and other back-end assessments of suspicious activity. Together, these measures enable us to offer the best participant recruitment platform anywhere online, with fantastic quality for both closed- and open-ended measures.

“I’ve collected data for a decade and won’t go back to any other online panel. I love Connect’s UI, ability to target specific samples, and its consistent delivery of the highest quality data.”

Nick Rosemarino, Columbia University

“This platform is great, it’s super fast and almost everyone passes attention checks!”

Dr. Grant E. Donnelly, Ohio State University

Sentry functions similarly in the process used to create our CloudResearch Approved group of participants on Amazon’s Mechanical Turk (MTurk). The CloudResearch Approved group is a subset of participants on MTurk that is only available through our MTurk Toolkit or an API licensing agreement. Each of the more than 150,000 participants in this group has been vetted by a version of Sentry. Peer-reviewed research, both our own and that of others, shows that this vetting improves data quality on MTurk.

Finally, the third source of participants CloudResearch offers access to is Prime Panels. Unlike Connect and MTurk, Prime Panels taps into participant sources often used for market research. These platforms are far larger and built on aggregation across multiple participant providers, making respondent-level vetting at the point of sign-up impossible. Therefore, we deploy Sentry as a pre-survey screener. Participants who fail our measures are blocked before they ever enter your project, while those who pass are allowed to take your survey. In a peer-reviewed paper, we showed that a significantly pared-down version of Sentry improves data quality from market research panels.

Although each section above details how we use Sentry to improve data quality from CloudResearch, Sentry can be used with any online sample source. That means that even if you get participants from somewhere else, you can direct participants through Sentry before they reach your survey. Doing so delivers the same benefits as if you sampled from CloudResearch.
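The idea of routing participants from any source through a screener before they reach the survey can be sketched as a simple gate. This is a hypothetical illustration, not Sentry’s integration API: the `ScreenerOutcome` type, its `passed` flag, the `next_destination` function, and both URLs are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ScreenerOutcome:
    participant_id: str
    passed: bool  # hypothetical pass/fail flag reported by a screener


def next_destination(outcome: ScreenerOutcome, survey_url: str, exit_url: str) -> str:
    """Send participants who passed to the survey and everyone else to an exit page."""
    return survey_url if outcome.passed else exit_url


# Placeholder URLs for illustration only.
SURVEY_URL = "https://example.org/survey"
EXIT_URL = "https://example.org/not-eligible"

outcomes = [ScreenerOutcome("p1", True), ScreenerOutcome("p2", False)]
routes = {o.participant_id: next_destination(o, SURVEY_URL, EXIT_URL) for o in outcomes}
print(routes)
```

The key design point is that the decision happens before the survey link is ever issued, so failed respondents never contribute data that has to be cleaned out later.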

What Evidence Is There for Sentry’s Effectiveness?

It is often hard to find evidence of disasters that didn’t happen (i.e., the mistakes that Sentry can prevent). Fortunately, however, we’ve compiled several convincing, real-world examples of how conducting online research without Sentry leads to one set of (inaccurate) conclusions while using Sentry leads to a very different set of conclusions (and decisions). Three of these examples are:

  • A CDC Case Study
  • Our partnership with Kellogg’s, and
  • An investigation into survey fraud

CDC Case Study

Early in the COVID-19 pandemic, we saw a paper published by researchers at the Centers for Disease Control and Prevention claiming that 40% of Americans were engaging in dangerous cleaning practices, including a portion who reported doing things like gargling bleach. We were skeptical of these findings. Knowing what we know about fraud in online participant recruitment platforms, we decided to re-run the study exactly as the CDC ran it. The only difference was that we applied Sentry.

The results of our study showed that without Sentry we replicated the CDC’s findings almost perfectly. With Sentry, however, we found that most reports of dangerous cleaning practices, including nearly all of those related to gargling dangerous cleaners like bleach, were false. The CDC’s findings created headlines that were covered by hundreds of news organizations around the world and read by countless people. But the headlines were wrong, and the reason was fraud. Sentry could have prevented the consequences of bad data quality.

Kellogg’s Partnership

Every large brand needs data to understand how consumers feel and how they might react to new products. Often, an easy place to acquire this data is online. But, if the data gathered online is infected with fraud, brands can get burned by bad decisions.

In partnership with Kellogg’s, we presented at IIEX 2023 in Austin (watch here) to show how inattentive and fraudulent online participants created the potential for researchers at Kellogg’s to misread consumer sentiment. The project conducted by Kellogg’s sought to understand how consumers would react to a new product using the theme of ‘fire eaters’ in its marketing. When the team at Kellogg’s uncovered evidence of an unsavory history to this term, they wanted to know how consumers felt about it. An online research project seemed like a good way to find out.

Even though the Kellogg’s team shelved the idea for other reasons, our work with them showed that fraud and inattentive respondents significantly undersold the actual risks of using the concept of ‘fire eaters’ in a marketing campaign. When online data are low quality, companies cannot make accurate decisions and the consequences of a misstep may be high.

Finding Fraud

In addition to our internal work, Sentry is regularly used by several market research agencies, corporations conducting online research, and the panels providing market research participants. That means each month millions of online respondents pass through our system, giving us a good look at data quality and fraud.

We took the data gleaned from Sentry and launched an investigation into the sources of online survey fraud. Our investigation led us to conduct hundreds of interviews with people in countries all over the world. By conducting video interviews with fraudsters, we were able to learn about their methods and ensure Sentry stays one step ahead. You can see a presentation of our findings from an Insights Association webinar here.

What Benefits Can I Expect When Adopting Sentry?

Because Sentry yields significantly cleaner data than unscreened sample sources, it delivers two primary benefits: time saved screening data and confidence in quality.

Every researcher and every research team has their own process for assessing quality in online research studies. Sentry is not a replacement for these processes (no system is 100% perfect). Instead, Sentry results in substantially less time devoted to assessing quality because most of the responses are already high quality.  

The second benefit of Sentry is confidence. No one feels confident when looking at a dataset that is full of fraud. Sentry keeps these responses out of your project, and because CloudResearch’s team members are the foremost experts on survey fraud anywhere in the world, you can also be confident that we will change and adapt our measures as the nature of survey fraud changes. You don’t need to worry about how to stop fraud from ruining your research because we’re already doing that for you!

So, there you have it: a summary of why Sentry is the backbone of data quality at CloudResearch and the gold standard for data quality in online research. When you are ready to run your next project, consider whether a CloudResearch sample source is right for you; it comes with the benefit of Sentry protection. If your sample needs take you elsewhere, don’t forget that you can take Sentry with you and rest easy knowing you have the best in data quality protection regardless of your sample source.
