Causation indicates that an event affects an outcome. Do fatty diets cause heart problems? If you study for a test, does it cause you to get a higher score?
In statistics, causation is a bit tricky. As you’ve no doubt heard, correlation doesn’t necessarily imply causation. An association or correlation between variables simply indicates that the values vary together. It does not necessarily suggest that changes in one variable cause changes in the other variable. Proving causality can be difficult.
If correlation does not prove causation, what statistical test do you use to assess causality? That’s a trick question because no statistical analysis can make that determination. In this post, learn about why you want to determine causation and how to do that.
Relationships and Correlation vs. Causation
The expression is, “correlation does not imply causation.” Consequently, you might think that it applies to things like Pearson’s correlation coefficient. And, it does apply to that statistic. However, we’re really talking about relationships between variables in a broader context. Pearson’s is for two continuous variables. However, a relationship can involve different types of variables such as categorical variables, counts, binary data, and so on.
For example, in a medical experiment, you might have a categorical variable that defines which treatment group subjects belong to: control group, placebo group, and several different treatment groups. If the health outcome is a continuous variable, you can assess the differences between group means. If the means differ by group, then you can say that mean health outcomes depend on the treatment group. There’s a correlation, or relationship, between the type of treatment and health outcome. Or, maybe the outcome is binary, say infected and not infected. In that case, we’d compare the proportion of infected subjects across treatment groups to determine whether treatment correlates with infection rates.
Throughout this post, I’ll refer to correlation and relationships in this broader sense: not just literal correlation coefficients, but relationships between variables such as differences between group means and proportions, regression coefficients, associations between pairs of categorical variables, and so on.
Why Determining Causality Is Important
What is the big deal in the difference between correlation and causation? For example, if you observe that as one variable increases, the other variable also tends to increase—isn’t that good enough? After all, you’ve quantified the relationship and learned something about how they behave together.
If you’re only predicting events, not trying to understand why they happen, and do not want to alter the outcomes, correlation can be perfectly fine. For example, ice cream sales correlate with shark attacks. If you just need to predict the number of shark attacks, ice cream sales might be a good thing to measure even though they aren’t causing the shark attacks.
However, if you want to reduce the number of attacks, you’ll need to find something that genuinely causes a change in the attacks. As far as I know, sharks don’t like ice cream!
There are many occasions where you want to affect the outcome. For example, you might want to do the following:
- Improve health by taking medicine, exercising, or getting flu vaccinations.
- Reduce the risk of adverse outcomes, such as by using procedures that reduce manufacturing defects.
- Improve outcomes, such as by studying for a test.
For intentional changes in one variable to affect the outcome variable, there must be a causal relationship between the variables. After all, if studying does not cause an increase in test scores, there’s no point in studying. If the medicine doesn’t cause an improvement in your health or ward off disease, there’s no reason to take it.
Before you can state that some course of action will improve your outcomes, you must be sure that a causal relationship exists between your variables.
Confounding Variables and Their Role in Causation
How does it come to be that variables are correlated but do not have a causal relationship? A common reason is a confounding variable that creates a spurious correlation. A confounding variable correlates with both of your variables of interest. It’s possible that the confounding variable might be the real causal factor! Let’s go through the ice cream and shark attack example.
In this example, the number of people at the beach is a confounding variable: it correlates with both ice cream sales and shark attacks.
Imagine that as the number of people at the beach increases, ice cream sales also tend to increase. Likewise, more people at the beach cause shark attacks to increase. This structure creates an apparent, or spurious, correlation between ice cream sales and shark attacks, but it isn’t causation.
Confounders are common reasons for associations between variables that are not causally connected.
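To make the confounding structure concrete, here is a minimal Python sketch that simulates the beach example. Every number in it (crowd sizes, slopes, noise levels) is an invented assumption for illustration, and `simulate_beach`, `pearson`, and `residuals` are hypothetical helper names rather than library functions. Ice cream sales and the shark attack index are each generated from crowd size alone, so any correlation between them is spurious by construction.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate_beach(days=2000, seed=1):
    """Toy data: both outcomes depend on crowd size, never on each other."""
    rng = random.Random(seed)
    people = [rng.uniform(100, 1000) for _ in range(days)]
    ice_cream = [0.3 * p + rng.gauss(0, 40) for p in people]
    attacks = [0.002 * p + rng.gauss(0, 0.3) for p in people]  # attack index
    return people, ice_cream, attacks


people, ice_cream, attacks = simulate_beach()

# Raw association between the two variables of interest.
raw_r = pearson(ice_cream, attacks)

# Association after removing the crowd-size trend from both variables.
partial_r = pearson(residuals(ice_cream, people), residuals(attacks, people))

print(f"raw correlation:            {raw_r:.2f}")
print(f"after accounting for crowd: {partial_r:.2f}")
```

Under these made-up settings, the raw correlation should come out strongly positive, while the correlation that remains after accounting for crowd size should sit close to zero.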
Related post: Confounding Variables Can Bias Your Results
Causation and Hypothesis Tests
Before moving on to determining whether a relationship is causal, let’s take a moment to reflect on why statistically significant hypothesis test results do not signify causation.
Hypothesis tests are inferential procedures. They allow you to use relatively small samples to draw conclusions about entire populations. For the topic of causation, we need to understand what statistical significance means.
When you see a relationship in sample data, whether it is a correlation coefficient, a difference between group means, or a regression coefficient, hypothesis tests help you determine whether your sample provides sufficient evidence to conclude that the relationship exists in the population. You can see it in your sample, but you need to know whether it exists in the population. It’s possible that random sampling error (i.e., luck of the draw) produced the “relationship” in your sample.
Statistical significance indicates that you have sufficient evidence to conclude that the relationship you observe in the sample also exists in the population.
That’s it. It doesn’t address causality at all.
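As a rough illustration of what a significance test does and does not tell you, the sketch below runs a permutation test on simulated study-time data. The permutation test is just one convenient way to get a p-value without extra libraries; it is a stand-in chosen here, not a method the post prescribes. The data, the helper names `pearson` and `permutation_p_value`, and the settings are all assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Share of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical sample: 30 students, hours studied and test scores.
rng = random.Random(42)
hours = [rng.uniform(0, 10) for _ in range(30)]
scores = [60 + 3 * h + rng.gauss(0, 5) for h in hours]

p = permutation_p_value(hours, scores)
print(f"sample r = {pearson(hours, scores):.2f}, permutation p = {p:.4f}")
```

A small p-value here says only that luck of the draw is an unlikely explanation for the sample relationship. The code never asks why hours and scores move together, which is exactly the gap between significance and causation.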
Related post: Understanding P-values and Statistical Significance
Hill’s Criteria of Causation
Determining whether a causal relationship exists requires far more in-depth subject area knowledge and contextual information than you can include in a hypothesis test. In 1965, Austin Hill, a medical statistician, tackled this question in a paper* that’s become the standard. While he introduced it in the context of epidemiological research, you can apply the ideas to other fields.
Hill describes nine criteria to help establish causal connections. The goal is to satisfy as many criteria as possible. No single criterion is sufficient. However, it’s often impossible to meet all the criteria. These criteria are an exercise in critical thought. They show you how to think about determining causation and highlight essential qualities to consider.
Studies can take steps to increase the strength of their case for a causal relationship, which statisticians call internal validity. To learn more about this, read my post about internal and external validity.
Strength and causation
A strong, statistically significant relationship is more likely to be causal. The idea is that causal relationships are likely to produce statistical significance. If you have significant results, at the very least you have reason to believe that the relationship in your sample also exists in the population, which is a good thing. After all, if the relationship only appears in your sample, you don’t have anything meaningful! Correlation still does not imply causation, but a statistically significant relationship is a good starting point.
However, there are many more criteria to satisfy! There’s a critical caveat for this criterion as well. Confounding variables can mask a correlation that actually exists. They can also create the appearance of correlation where causation doesn’t exist, as shown with the ice cream and shark attack example. A strong relationship is simply a hint.
Consistency and causation
When there is a real, causal connection, the result should be repeatable. Other experimenters in other locations should be able to produce the same results. It’s not one and done. Replication builds up confidence that the relationship is causal. Preferably, the replication efforts use other methods, researchers, and locations.
In my post with five tips for using p-values without being misled, I emphasize the need for replication.
Specificity and causation
It’s easier to determine that a relationship is causal if you can rule out other explanations. I write about ruling out other explanations in my posts about randomized experiments and observational studies. In a more general sense, it’s essential to study the literature, consider other plausible hypotheses, and, hopefully, be able to rule them out or otherwise control for them. You need to be sure that what you’re studying is causing the observed change rather than something else of which you’re unaware.
It’s important to note that you don’t need to prove that your variable of interest is the only factor that affects the outcome. For example, smoking causes lung cancer, but it’s not the only thing that causes it. However, you do need to perform experiments that account for other relevant factors and be able to attribute some causation to your variable of interest specifically.
For example, in regression analysis, you control for other factors by including them in the model.
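Here is a small sketch of what "including them in the model" can look like. It assumes a made-up data-generating process in which the outcome `y` depends on both the variable of interest `x` (true effect 2) and another relevant factor `z`, and in which `x` and `z` are correlated. The `ols` helper is a hypothetical, bare-bones least-squares solver written for this example; in practice you would likely use a statistics package.

```python
import random


def ols(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * y for row, y in zip(X, ys))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


rng = random.Random(3)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]              # other relevant factor
x = [0.8 * zi + rng.gauss(0, 0.6) for zi in z]       # variable of interest, tied to z
y = [2 * xi + 3 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]  # assumed truth

naive = ols([[xi] for xi in x], y)[1]                    # z left out
adjusted = ols([[xi, zi] for xi, zi in zip(x, z)], y)[1]  # z included

print(f"slope on x without z: {naive:.2f}")
print(f"slope on x with z:    {adjusted:.2f}")
```

With these assumed settings, leaving `z` out should inflate the slope on `x` well above 2, while including `z` should bring the estimate back near the true value. Note that this only works for factors you know about and measure.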
Temporality and causation
Causes should precede effects. Ensure that what you consider to be the cause occurs before the effect. Sometimes it can be challenging to determine which way causality runs. Hill uses the following example. It’s possible that a particular diet leads to an abdominal disease. However, it’s also possible that the disease leads to specific dietary habits.
The Granger Causality Test assesses potential causality by determining whether earlier values in one time series predict later values in another time series. Analysts say that time series A Granger-causes time series B when significant statistical tests indicate that values in series A predict future values of series B.
Despite being called a “causality test,” it really is only a test of prediction. After all, an increase in Christmas card sales Granger-causes Christmas!
Temporality is just one aspect of causality!
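The idea behind the Granger test can be sketched in a few lines. The version below is a deliberate simplification that uses a single lag and reports only an F statistic; real implementations handle multiple lags and return p-values. The series, the coefficients, and the helper names `ols`, `rss`, and `granger_f` are all assumptions made up for this illustration.

```python
import random


def ols(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * y for row, y in zip(X, ys))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def rss(rows, ys):
    """Residual sum of squares of the least-squares fit."""
    coefs = ols(rows, ys)
    return sum((y - coefs[0] - sum(c * v for c, v in zip(coefs[1:], r))) ** 2
               for r, y in zip(rows, ys))


def granger_f(cause, effect):
    """F statistic for adding one lag of `cause` to a one-lag model of `effect`."""
    target = effect[1:]
    restricted = [[effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    df_resid = len(target) - 3  # intercept plus two slopes
    return (rss_r - rss_u) / (rss_u / df_resid)


# Simulated series: yesterday's value of a helps predict today's value of b.
rng = random.Random(7)
a = [rng.gauss(0, 1) for _ in range(300)]
b = [0.0]
for t in range(1, 300):
    b.append(0.5 * b[t - 1] + 0.8 * a[t - 1] + rng.gauss(0, 1))

f_ab = granger_f(a, b)  # does a help predict b?
f_ba = granger_f(b, a)  # does b help predict a?
print(f"F (a -> b): {f_ab:.1f}")
print(f"F (b -> a): {f_ba:.1f}")
```

For one added lag and roughly 300 observations, the 5% critical value of the F distribution is about 3.9, so a large F in one direction means that series adds predictive information. It is still a statement about prediction and timing only, as the Christmas card example shows.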
Biological gradient and causation
Hill worked in medical statistics, hence the focus on biological questions. He suggests that for a genuinely causal relationship, there should be a dose-response type of relationship. If a little bit of exposure causes a little bit of change, a larger exposure should cause more change. Hill uses cigarette smoking and lung cancer as an example: greater amounts of smoking are linked to a greater risk of lung cancer. You can apply the same type of thinking in other fields. Does more studying lead to even higher scores?
However, be aware that the relationship might not remain linear. As the dose increases beyond a threshold, the response can taper off. You can check for this by modeling curvature in regression analysis.
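One simple way to model curvature is to add a squared term to the regression and look at its coefficient. The sketch below assumes a made-up dose-response curve that rises quickly and then levels off. The data, the saturating formula, and the `ols` helper are illustrative assumptions, not results from any study.

```python
import math
import random


def ols(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * y for row, y in zip(X, ys))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


rng = random.Random(11)
doses = [rng.uniform(0, 8) for _ in range(200)]
# Assumed truth: the response climbs steeply at low doses, then tapers off.
responses = [10 * (1 - math.exp(-d / 2)) + rng.gauss(0, 0.5) for d in doses]

# Straight-line model versus a model with an added squared term.
linear = ols([[d] for d in doses], responses)
quadratic = ols([[d, d * d] for d in doses], responses)
curvature = quadratic[2]  # coefficient on dose squared

print(f"linear slope:        {linear[1]:.2f}")
print(f"squared-term coef.:  {curvature:.3f}")
```

A clearly negative coefficient on the squared term suggests that the response flattens as the dose grows, so a straight line would overstate the effect of high doses. A quadratic is only one possible shape; other curve forms may fit a real dose-response relationship better.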
Plausibility and causation
If you can find a plausible mechanism that explains the causal nature of the relationship, it supports the notion of a causal relationship. For example, biologists understand how antibiotics inhibit microbes on a biological level. However, Hill points out that you have to be careful because there are limits to scientific knowledge at any given moment. A causal mechanism might not be known at the time of the study even if one exists. Consequently, Hill says, “we should not demand” that a study meet this requirement.
Coherence and causation
The probability that a relationship is causal is higher when it is consistent with related causal relationships that are generally known and accepted as facts. If your results outright disagree with accepted facts, the relationship is more likely to be a mere correlation. Assess causality in the broader context of related theory and knowledge.
Experiments and causation
Randomized experiments are the best way to identify causal relationships. Experimenters control the treatment (or factors involved), randomly assign the subjects, and help manage other sources of variation. Hill calls satisfying this criterion the strongest support for causation. However, randomized experiments are not always possible as I write about in my post about observational studies. Learn more about Experimental Design: Definition, Types and Examples.
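The following sketch illustrates why random assignment helps. It simulates a treatment with an assumed true effect of 2 and an unmeasured baseline health factor that also drives the outcome. In the observational version, healthier subjects are more likely to end up treated; in the randomized version, a coin flip decides. All numbers and the helper names `simulate` and `diff_in_means` are assumptions for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def diff_in_means(treated_flags, outcomes):
    """Mean outcome of treated subjects minus mean outcome of controls."""
    treated = [y for t, y in zip(treated_flags, outcomes) if t]
    control = [y for t, y in zip(treated_flags, outcomes) if not t]
    return mean(treated) - mean(control)


def simulate(randomize, n=4000, true_effect=2.0, seed=5):
    """Return the estimated treatment effect under one assignment scheme."""
    rng = random.Random(seed)
    flags, outcomes = [], []
    for _ in range(n):
        health = rng.gauss(0, 1)  # unmeasured baseline factor
        if randomize:
            treated = rng.random() < 0.5               # coin flip
        else:
            treated = health + rng.gauss(0, 1) > 0     # healthier subjects opt in
        effect = true_effect if treated else 0.0
        flags.append(treated)
        outcomes.append(effect + 3 * health + rng.gauss(0, 1))
    return diff_in_means(flags, outcomes)


observational_diff = simulate(randomize=False)
randomized_diff = simulate(randomize=True)

print(f"observational estimate: {observational_diff:.2f}")
print(f"randomized estimate:    {randomized_diff:.2f}")
```

Under these assumptions, the observational comparison should overstate the effect because treated subjects were healthier to begin with, while the randomized comparison should land near the true value of 2. Randomization balances the hidden factor across groups on average, even though it is never measured.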
Analogy and causation
If there is an accepted, causal relationship that is similar to a relationship in your research, it supports causation for the current study. Hill writes, “With the effects of thalidomide and rubella before us we would surely be ready to accept slighter but similar evidence with another drug or another viral disease in pregnancy.”
Determining whether a correlation also represents causation requires much deliberation. Properly designing experiments and using statistical procedures can help you make that determination. But there are many other factors to consider.
Use your critical thinking and subject-area expertise to think about the big picture. If there is a causal relationship, you’d expect to see consistent results that have been replicated, other causes that have been ruled out, results that fit with established theory and other findings, a plausible mechanism, and a cause that precedes the effect.