Consistency Testing of Online Research Panels
Is your data measuring opinion or a change in the sampling frame?
Consistency of online samples is the core issue for market researchers. The ARF has said it, we all know
it intuitively, and recent studies, including those of Mktg, Inc., have shown it.
After all, much of the value we provide is in the tracking studies we perform; even one-time studies
should relate to some reference and not float in a sea of variability between panels. If your data changes,
it is essential to know whether the changes are real or the inadvertent product of sample inconsistency.
Where once research was well grounded in a probabilistic framework supported by an underlying census
of the population, online market research has moved into a new era, from a probabilistic framework to
“working without a net.” In the absence of a probabilistic net to anchor samples, non-probabilistic
samples can drift without our knowing.
One now-historic example was presented by Ron Gailey (IIR 2008), now of Coca-Cola and previously of
Washington Mutual, who disclosed how 29 studies representing 40,000 online interviews had gone
astray due to panel inconsistency. The lingering question, now that WaMu is gone, is how the tainted
research affected critical business decisions.
Hidden in all of this is the concept of consistency. After all, if we measure bias and can’t anticipate its
shifts over time, then we will not understand whether changes are coming from our data or from
background noise in the sample. Thus, as the ARF announced in June of 2009, the issue of consistency is
the most important area of concern. We must learn to measure not only what the constituent elements of
our data sources are but also how they change over time. In other words, we have to enter a new world of
Mktg, Inc. has moved onward from its initial study of the American markets and has expanded its
research to include 140 panels in 35 nations. In each, a standard instrument is used in a tracking study
that includes a diversity of measures but mostly focuses on buying behavior segmentations. By
conducting repeat waves of this consistency study, a local Grand Mean is calculated for each market. In
addition, using standard quality control techniques, an analysis of the consistency of each panel is
The Grand Mean is an aggregate statistic. It is a measurement of consistency that should be reliable and
yield a sense of predictable change. No panel represents the universe as well as the sum of many panels
together. Think of the Grand Mean as a group of indices measured from each panel's sample over and
over again, tracking panel quality through time, and then merged to create a composite reference
of the online universe.
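As a rough illustration of the idea, the sketch below averages segment shares across waves within each panel and then across panels. The panel names, segment names, and counts are hypothetical, and the equal weighting of panels and waves is an assumption; the actual weighting used for the Grand Mean is not described here.

```python
from statistics import mean

def segment_shares(counts):
    """Convert raw segment counts from one panel wave into proportions."""
    total = sum(counts.values())
    return {seg: n / total for seg, n in counts.items()}

def grand_mean(panel_waves):
    """Average segment shares over waves within each panel, then over panels.

    panel_waves maps a panel name to a list of per-wave segment count dicts.
    Equal weighting of waves and panels is an assumption made for this sketch.
    """
    panel_means = {}
    for panel, waves in panel_waves.items():
        shares = [segment_shares(w) for w in waves]
        panel_means[panel] = {
            seg: mean(s[seg] for s in shares) for seg in shares[0]
        }
    segments = list(next(iter(panel_means.values())).keys())
    return {seg: mean(pm[seg] for pm in panel_means.values()) for seg in segments}

# Hypothetical data: two panels, two waves each, two buyer segments.
panel_waves = {
    "panel_a": [
        {"brand_loyal": 40, "price_driven": 60},
        {"brand_loyal": 50, "price_driven": 50},
    ],
    "panel_b": [
        {"brand_loyal": 30, "price_driven": 70},
        {"brand_loyal": 30, "price_driven": 70},
    ],
}
reference = grand_mean(panel_waves)
```

Each panel's own wave-by-wave shares can then be compared against `reference` to see how far it sits from the composite.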
We test panels regularly for consistency: each participates in at least four waves of audit per year. As the
panel companies can share this data with as many customers as they choose, this provides end users with
assurances regarding the stability of panel output. The panels' combined consistency data is compiled to
generate a Grand Mean within a market as a new metric to anchor panels and tracking studies alike. The
composite data provides rich insights into shifts in the sample universe and inconsistencies within
individual panels. If needed, source blending using optimization modeling serves to correct drift. The
Grand Mean metric itself is anchored to a battery of outside benchmarks (we have collected data on over
thirty such measures).
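One way to picture source blending is as a search for panel weights whose blended segment distribution sits closest to a reference such as the Grand Mean. The sketch below uses a coarse grid search and a squared-distance objective. Both are assumptions chosen for simplicity, not a description of the optimization model actually used, and the panel distributions and target are hypothetical.

```python
from itertools import product

def blend(weights, panels):
    """Weighted mix of the panels' segment shares."""
    segments = panels[0].keys()
    return {s: sum(w * p[s] for w, p in zip(weights, panels)) for s in segments}

def squared_gap(dist, target):
    """Sum of squared differences between a distribution and the target."""
    return sum((dist[s] - target[s]) ** 2 for s in target)

def best_blend(panels, target, steps=20):
    """Grid-search non-negative weights summing to 1 that minimize the gap."""
    best_weights, best_gap = None, float("inf")
    k = len(panels)
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) > steps:
            continue  # the last weight would be negative
        weights = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        gap = squared_gap(blend(weights, panels), target)
        if gap < best_gap:
            best_weights, best_gap = weights, gap
    return best_weights, best_gap

# Hypothetical segment shares for two panels and a hypothetical target.
panels = [
    {"brand_loyal": 0.5, "price_driven": 0.5},
    {"brand_loyal": 0.3, "price_driven": 0.7},
]
target = {"brand_loyal": 0.4, "price_driven": 0.6}
weights, gap = best_blend(panels, target)
```

A grid search is only practical for a handful of sources; with many panels or segments, a constrained least-squares solver would be the more natural choice.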
Are changes in data real or sample shifts?
This “radar” plot summarizes changes over time in a multiple wave consistency analysis. Also shown is
the expected sampling error. Values greater than the expected error are viewed as potentially important
issues of inconsistency.
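The comparison against expected sampling error can be sketched as follows. This assumes simple random sampling within each wave and a 95% margin for the difference between two independent proportions; the text does not say how the expected error in the plot is actually computed. Segment names, shares, and sample sizes are hypothetical.

```python
import math

def expected_error(p1, n1, p2, n2, z=1.96):
    """Approximate margin for the difference of two independent proportions."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def flag_shifts(wave1, wave2, n1, n2):
    """Flag segments whose wave-to-wave change exceeds the expected error."""
    report = {}
    for seg in wave1:
        change = wave2[seg] - wave1[seg]
        limit = expected_error(wave1[seg], n1, wave2[seg], n2)
        report[seg] = {
            "change": change,
            "limit": limit,
            "flag": abs(change) > limit,
        }
    return report

# Hypothetical segment shares from two waves of the same panel.
wave1 = {"brand_loyal": 0.40, "price_driven": 0.60}
wave2 = {"brand_loyal": 0.48, "price_driven": 0.52}

large_sample = flag_shifts(wave1, wave2, n1=500, n2=500)
small_sample = flag_shifts(wave1, wave2, n1=200, n2=200)
```

The same eight-point shift is flagged with 500 interviews per wave but falls within the expected error at 200 per wave, which is why sample size has to be read alongside any apparent change.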
Buying behavior, our most important measure.
Changes in the panel population are measured by the distribution of structural segments, including, as here,
buyer behavior segments. This segmentation scheme captures the overall effective changes in over 30