There’s an old saying that goes, “figures don’t lie, but liars can figure.” But sometimes even the figures can spin a confusing story. That’s why I’ve always appreciated the power of understanding statistics. I even remember encouraging my sons to take a statistics class over a calculus course because I’ve used statistics almost every day, whereas I haven’t used calculus since I happily departed my differential equations class in college.
Many organizations fall into a common, fundamental error when it comes to analyzing and attributing their own successes and failures: they confuse correlation with causation.
Correlation Versus Causation
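Let’s start by defining our terms. Correlation is when two things are statistically linked: when one changes, the other tends to change along with it. In other words, we can use stats to show, with a high degree of confidence, that the two move together. Causation is the stronger claim that one of them actually produces the other.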
Let’s say that when I eat too much pizza, I feel pain in my stomach. The action of eating a lot of pizza is highly correlated with me feeling sick.
While these two events are correlated, that doesn’t necessarily mean that the first caused the second. What if, for example, it turned out that I had celiac disease, an autoimmune reaction to gluten, a protein found in wheat? In that case, it was the wheat in the pizza crust that caused my illness, not the fact that I ate too much.
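To make the pizza example concrete, here is a minimal sketch in Python. The meal log is entirely made up for illustration, and the names (`meals`, `sick_rate`) are my own. It compares how often I feel sick after big versus small meals, then repeats the comparison looking only at meals that contained gluten.

```python
# Made-up meal log, for illustration only:
# (slices_eaten, crust_has_gluten, felt_sick)
meals = [
    (6, True, True), (5, True, True), (7, True, True), (6, True, True),
    (5, False, False),
    (1, True, True),
    (2, False, False), (1, False, False), (2, False, False), (1, False, False),
]

def sick_rate(rows):
    """Fraction of meals in `rows` that were followed by feeling sick."""
    return sum(1 for _, _, sick in rows if sick) / len(rows)

# Naive view: split meals by how much was eaten (5+ slices counts as "big").
big_rate = sick_rate([m for m in meals if m[0] >= 5])
small_rate = sick_rate([m for m in meals if m[0] < 5])

# Closer look: hold the crust fixed and compare big vs. small again.
with_gluten = [m for m in meals if m[1]]
big_gluten_rate = sick_rate([m for m in with_gluten if m[0] >= 5])
small_gluten_rate = sick_rate([m for m in with_gluten if m[0] < 5])

print(f"sick after big meals:   {big_rate:.0%}")
print(f"sick after small meals: {small_rate:.0%}")
print(f"gluten crust, big:      {big_gluten_rate:.0%}")
print(f"gluten crust, small:    {small_gluten_rate:.0%}")
```

In this invented log, big meals look far more dangerous than small ones, but only because big meals happened to be the ones with a wheat crust. Once the crust is held fixed, portion size makes no difference.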
The Correlation Trap
A more classic example of how easy it is to confuse correlation with causation is that we can statistically show that 97% of people who got into a car accident drank at least one glass of water in the 24 hours before the accident. On paper, drinking a glass of water is highly correlated with car accidents. But we all know there is no causation in this ridiculous relationship.
While that example is obvious, the identical mistake is made thousands of times a day in businesses. A statistical analysis is run between a factor and an outcome, a high degree of correlation is found, and the factor is then assumed to cause the outcome. That is confusing correlation with causation. The mistake surfaces when the business scales up based on those statistics and the expected outcome doesn’t materialize. Something else was causing the outcome, not the correlated but non-causal factor.
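Here is a small simulation of that trap, sketched in Python under stated assumptions: a hidden driver (my invention, called `hidden` below) pushes both the factor and the outcome, while the factor itself has no effect on the outcome at all. The function names and the numbers are hypothetical, chosen only to illustrate the idea.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(n, rng, force_factor=None):
    """A hidden driver moves both factor and outcome.

    The outcome never depends on the factor. If force_factor is given,
    we set the factor ourselves, the way a business would when scaling up.
    """
    factor, outcome = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)
        if force_factor is None:
            f = hidden + rng.gauss(0, 0.5)  # factor tracks the hidden driver
        else:
            f = force_factor                # factor set by decree
        o = hidden + rng.gauss(0, 0.5)      # outcome depends on hidden only
        factor.append(f)
        outcome.append(o)
    return factor, outcome

rng = random.Random(42)

# Step 1: observe the business as it is. The correlation looks strong.
factor, outcome = simulate(5000, rng)
observed_r = pearson(factor, outcome)

# Step 2: "scale" by pushing the factor high for everyone.
_, scaled_outcome = simulate(5000, rng, force_factor=3.0)
baseline_mean = sum(outcome) / len(outcome)
scaled_mean = sum(scaled_outcome) / len(scaled_outcome)

print(f"observed correlation:     {observed_r:.2f}")
print(f"average outcome before:   {baseline_mean:.2f}")
print(f"average outcome after:    {scaled_mean:.2f}")
```

The observed correlation comes out strong, yet forcing the factor up leaves the average outcome essentially where it was, because the factor was never doing the work.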
Thanks to the famous philosopher and writer John Stuart Mill, who studied the relationship between correlation and causation in his book A System of Logic, Ratiocinative and Inductive (1843), there are three questions you can ask to determine causality.
- Do X