Monday, April 9, 2012

How and When Should I Use Statistical Testing?

Statistical testing is a common deliverable provided by market research vendors. But in some cases the users of the research findings may be uncertain about what the statistical testing really means and whether or not it should influence the way they use the data. Below are six key questions to keep in mind when using statistical testing.
1. What kind of data am I dealing with? Statistical testing applies to quantitative data, such as survey data. Qualitative findings, such as those from focus groups and in-depth interviews, are not suited to statistical testing.
2. What am I trying to learn? Most statistical testing is used primarily to help decide which of the differences we see in our data are real in terms of the population we are interested in. For example, if your findings show that 45% of men like a new product concept and 55% of women like the concept, you need to decide whether that difference is real: that is, whether the difference seen in your survey accurately reflects a difference between men and women that exists in the larger population of target consumers.
3. How certain do I need to be? Confidence intervals are the most common way of deciding whether percentage differences of this sort are meaningful. The size of a confidence interval is determined by the level of certainty we demand – usually 90% or 95% in market research, 95% or 99% in medical research – and the size of our sample. (The size of the population matters very little unless the sample makes up a large share of it.) The higher the level of certainty we demand, the wider the confidence interval will be – with a very high standard of certainty, we need a wide interval to be sure we have captured the true population percentage. Conversely, the bigger the sample, the narrower the confidence interval – a bigger sample gives a more precise estimate of the target population, so we become more certain that the differences we see are valid.
4. How good is my sample? Most statistical tests rely on key assumptions about how you selected the sample of people from whom you collected your data. For tests like the confidence intervals described above, this key assumption is having some element of random selection built into your sample that makes it mathematically representative of the population you are studying. The further your sampling procedure strays from this assumption, the less valid your statistical testing will be. If you can make the case that your sample is not biased in any important ways relevant to your research questions, you can rely on your stats tests to identify meaningful differences. If you have doubts about your sample, use the tests with caution.
5. Does my data meet other key assumptions about the test? Some stats tests assume particular data distributions, such as the bell-shaped (normal) curve that underlies the standard confidence intervals. If your data are distributed in some other way – lop-sided toward the high or low end of the scale, or polarized – the stats test can be worse than worthless: it can actually be misleading!
6. Does the stats testing seem to align with other things I know about the research topic? Stats tests should supplement your overall understanding of the data. They are not a substitute for common sense. Keep in mind that most data analysis software will produce stats tests automatically, whether or not the tests are appropriate for the particular data set you are using. Almost every experienced researcher has watched someone (or been someone) trying to explain a “finding” that was nothing more than a meaningless software output.
If you can provide honest, satisfactory answers to these six questions, stats testing can hugely improve your understanding of your data and help you identify its key themes. Likewise, these key questions can keep you from wasting your time analyzing differences that aren’t really there.
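To make questions 2, 3 and 5 concrete, here is a minimal Python sketch of the 45% versus 55% example. It is an illustration under stated assumptions, not a recommended analysis procedure: it uses the simple normal-approximation interval for proportions, it assumes a simple random sample, and the group sizes (100 and 500 per group) and the function names are hypothetical, since the example above gives no sample sizes.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def z_value(confidence):
    """Two-sided critical value, e.g. about 1.96 for 95% confidence."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation interval for one survey percentage."""
    margin = z_value(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def difference_ci(p1, n1, p2, n2, confidence=0.95):
    """Normal-approximation interval for the difference p2 - p1
    between two independent groups (e.g. men and women)."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_value(confidence) * se
    diff = p2 - p1
    return diff - margin, diff + margin


def normal_approx_ok(p_hat, n, min_count=10):
    """Common rule of thumb: expect at least `min_count` 'yes' and
    'no' answers before trusting the bell-curve approximation."""
    return n * p_hat >= min_count and n * (1 - p_hat) >= min_count


# Hypothetical group sizes: the same 45% vs 55% split, small and large.
for n in (100, 500):
    low, high = difference_ci(0.45, n, 0.55, n, confidence=0.95)
    verdict = "could be zero" if low <= 0 <= high else "excludes zero"
    print(f"n={n} per group: difference between {low:+.3f} and {high:+.3f} ({verdict})")

# A higher confidence level gives a wider interval for the same data.
for level in (0.90, 0.95, 0.99):
    low, high = proportion_ci(0.55, 200, confidence=level)
    print(f"{level:.0%} interval for 55% with n=200: {low:.3f} to {high:.3f}")
```

With 100 respondents per group, the interval for the difference includes zero, so the ten-point gap could be sampling noise; with 500 per group it excludes zero. The `normal_approx_ok` check is only a rough screen for question 5 and says nothing about whether the sample itself is biased (question 4).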
