
Low Incidence + Poor Data Quality Practices = False Insights


For the next post of our blog series about the new e-book, Overcoming the Biggest Threats to Market Research: Bad Data & Bad Actors, we’re moving from survey farms to incidence rates with tangible examples of how, even at a small percentage, survey farms and other bad actors can drastically change survey results. If you’ve missed our previous posts, you can start back at the beginning of this series with “The Modern Slump in Data Quality.”

Collecting informative survey data can be a challenge, and this challenge grows as the target audience shrinks. Low-incidence studies, in which target respondents represent less than 5% of the general population, are among the most informative and the most challenging to manage. On the one hand, the specificity of the target population gives researchers the opportunity to ask extremely detailed and nuanced questions that can be transformative for a business. On the other hand, low-incidence studies are especially vulnerable to inauthentic or dishonest respondents. Larger incentives sometimes accompany these studies, which risks attracting an influx of bad actors.

Let’s take a look at how this happens and what you can do to prevent bad data from influencing important decisions.

The Impact of Incidence

An incidence rate is the percentage of all respondents who end up qualifying to participate in a survey.

An incidence rate of 100%, for example, means that all respondents passed the screening criteria and qualified as part of the desired audience. A 50% incidence rate, on the other hand, means that only half qualified.

The challenge for getting good data is that as incidence decreases, the share of cheaters among qualified respondents often increases.

Let’s assume an average of 5% of respondents are cheaters. Motivated by an easy reward, these malicious actors are focused on qualifying rather than answering honestly. They’re good at spotting and correctly guessing the qualifying answers to screener questions.

When surveying a niche topic, say one with only 10% incidence, researchers must screen a larger number of respondents to find enough eligible panelists. If a survey needs 200 panelists to represent a reasonable sample, 2,000 individuals would need to be screened for eligibility.

If 5% of the 2,000 initial participants are cheaters looking to sneak into the survey, that is 100 cheaters. If every one of them guesses their way past the screener, as many as half of the 200 panelists could, in the end, be dishonest actors. When 50% of your respondents are cheaters, it’s nearly impossible to generate insights that reflect reality.
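To make the arithmetic above easier to rerun at other incidence rates, here is a minimal Python sketch. It assumes, as the example does, that every cheater passes the screener, so the result is a worst case rather than a forecast. The function name `cheater_share` and its parameters are illustrative and do not come from the e-book.

```python
def cheater_share(target_completes: int, incidence: float, cheater_rate: float) -> dict:
    """Worst-case share of cheaters in the final sample.

    Assumption (hypothetical simplification): every cheater who is
    screened guesses the qualifying answers and gets into the survey.
    """
    if not 0 < incidence <= 1:
        raise ValueError("incidence must be in (0, 1]")
    if not 0 <= cheater_rate <= 1:
        raise ValueError("cheater_rate must be in [0, 1]")

    # People who must be screened to reach the target at this incidence.
    screened = round(target_completes / incidence)
    # Cheaters among everyone screened.
    cheaters = round(screened * cheater_rate)
    # Cheaters cannot fill more seats than the sample has.
    cheaters_in_sample = min(cheaters, target_completes)

    return {
        "screened": screened,
        "cheaters": cheaters,
        "worst_case_share": cheaters_in_sample / target_completes,
    }


if __name__ == "__main__":
    # 200 completes and a 5% cheater rate, at three incidence rates.
    for incidence in (1.0, 0.5, 0.1):
        result = cheater_share(200, incidence, 0.05)
        print(f"incidence {incidence:.0%}: {result}")
```

With 200 completes, 10% incidence, and a 5% cheater rate, the sketch reproduces the figures above: 2,000 screened, 100 cheaters, and a worst-case share of one half.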

How to Rescue a Low Incidence Survey

Because the impact of cheaters grows significantly as incidence decreases, it’s critical to leverage best practices and advanced data-cleaning techniques on lower-incidence surveys.

Clean, actionable data starts with good survey design. Best-practice survey design includes adding knowledge checks and hard-to-guess screener questions to weed out dishonest respondents before and during the survey. For low-incidence surveys, knowledge checks could be questions that only your truly eligible demographic would be able to answer.

Reducing the impact of scammers in a low-incidence survey also means being more diligent with data cleaning. During and after the survey, researchers should use a combination of expertise and technology to remove bad data from the results. This should be more than a simple algorithm that looks for specific patterns. Each response should be assessed against multiple rubrics and assigned a score based on how it performs.
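As a rough illustration of scoring a response against several rubrics rather than a single pattern, here is a short Python sketch. The three checks (speeding, straightlining, and a failed knowledge check), the 0.33 speed threshold, and all field names are assumptions chosen for illustration. They are not the rubrics described in the e-book.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # All fields are hypothetical; adapt them to your own survey export.
    respondent_id: str
    seconds_to_complete: float
    grid_answers: list[int]
    passed_knowledge_check: bool


def quality_flags(response: Response, median_seconds: float) -> int:
    """Count how many illustrative rubrics a response fails (0 to 3)."""
    flags = 0
    # Rubric 1 (assumed threshold): finished in under a third of the median time.
    if response.seconds_to_complete < 0.33 * median_seconds:
        flags += 1
    # Rubric 2: gave the same answer to every item in a grid question.
    if len(response.grid_answers) > 1 and len(set(response.grid_answers)) == 1:
        flags += 1
    # Rubric 3: failed a knowledge check only eligible respondents should pass.
    if not response.passed_knowledge_check:
        flags += 1
    return flags


if __name__ == "__main__":
    careful = Response("r1", 600.0, [1, 3, 2, 4], True)
    suspect = Response("r2", 100.0, [3, 3, 3, 3], False)
    for r in (careful, suspect):
        print(r.respondent_id, quality_flags(r, median_seconds=600.0))
```

A score like this works best as a prompt for expert review, for example by routing responses with two or more flags to a researcher, rather than as an automatic rejection rule.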

For a deep dive into how innovative researchers use advanced data-cleaning techniques to improve survey results and how to maximize the quality of your data in low-incidence studies, check out our comprehensive e-book that includes further examples and calculations of the impacts at various incidence rates.



