What Does a Negative Correlation Mean?
A negative correlation exists when two variables change in opposing directions: as one variable increases, the other tends to decrease. Statisticians also refer to it as an inverse correlation or inverse relationship. This type of correlation has a negative correlation coefficient.
Unsurprisingly, a negative correlation is the opposite of a positive relationship, where the variables move in the same direction—for example, height and weight increase together.
A negative correlation sounds suspiciously like saying a relationship does not exist between two variables. However, that’s not true. A relationship between variables indicates that knowing the value of one variable tells you something about the value of the other. That’s the case with a negative correlation, where you know that an above-average X-value tends to correspond with a below-average Y-value and vice versa.
In contrast, when a relationship indeed doesn’t exist between two variables, you can increase one variable and the other variable’s change is random. It won’t tend to either increase or decrease. For example, you don’t expect the amount of exercise to correlate with the stock market. A true lack of a relationship has a correlation coefficient of zero—neither negative nor positive.
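To see the difference between a negative correlation and no relationship in numbers, here is a minimal Python sketch. The data are simulated, and the slope, noise level, seed, and the helper name pearson_r are assumptions chosen for illustration rather than values from a real study.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means. It turns negative when
    # above-average x-values tend to pair with below-average y-values.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]

# Assumed inverse relationship: y falls as x rises, plus random noise.
y_negative = [-0.8 * xi + rng.gauss(0, 0.6) for xi in x]
# No relationship: y is generated independently of x.
y_unrelated = [rng.gauss(0, 1) for _ in range(n)]

r_negative = pearson_r(x, y_negative)
r_unrelated = pearson_r(x, y_unrelated)
print(f"inverse relationship: r = {r_negative:.2f}")
print(f"no relationship:      r = {r_unrelated:.2f}")
```

The first coefficient should come out clearly negative, while the second should land close to zero, although random sampling error keeps it from being exactly zero.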
To determine whether a negative correlation exists conceptually, ask yourself, “What do I expect to happen if the value of Variable X increases?” If you expect Variable Y to decrease, it’s an inverse or negative relationship.
The usual caveat about how correlation doesn’t necessarily imply causation applies to these inverse relationships too.
We’ll look at negative correlations graphically shortly.
Learn more about Interpreting Correlation Coefficients and the Correlation Formula Walkthrough.
Negative Correlation Examples
Here are five intuitive examples of negative correlation:
- Car’s age and its value: As a car gets older, its resale value generally decreases.
- Outdoor temperature and heating costs: As the temperature outside increases, the cost to heat a home tends to decrease.
- Study time and number of errors on a test: As the time spent studying increases, the number of errors made on a test typically decreases.
- Driving speed and travel time: As the speed at which one drives increases, the time it takes to reach a destination generally decreases.
- Amount of exercise and resting heart rate: As a person’s amount of exercise increases, their resting heart rate tends to decrease.
In all these negative correlation examples, knowing the value of one variable helps you predict the value of the other. As one variable increases, the other tends to decline. The variables are related but move in opposite directions.
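The driving speed example is easy to check numerically because travel time is just distance divided by speed. The sketch below assumes a hypothetical 120 km trip and a handful of made-up speeds, and the helper pearson_r is my own illustration rather than a library function.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


distance_km = 120  # hypothetical trip length
speeds_kmh = [40, 50, 60, 80, 100, 120]  # hypothetical driving speeds

# Travel time falls as speed rises: time = distance / speed.
times_h = [distance_km / speed for speed in speeds_kmh]

r = pearson_r(speeds_kmh, times_h)
print(f"correlation between speed and travel time: r = {r:.2f}")
```

Every increase in speed shortens the trip, so the coefficient is strongly negative. It does not reach -1 because the relationship is curved rather than a straight line, and Pearson's coefficient measures the linear part of a relationship.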
Graphical Examples
Previously, I showed you how to identify inverse relationships conceptually. Now we’ll look at graphical negative correlation examples. When you graph the data on a standard scatterplot, you can identify them by looking for a downward slope going from left to right.
These relationships, like their positive counterparts, can have varying strengths, which the correlation coefficient indicates. The strongest possible negative correlation is -1, and as the coefficient moves closer to zero, the inverse relationship weakens. The graphs below illustrate that. Rho (ρ) is the correlation coefficient.
And here are a positive and a zero correlation for comparison.
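If you want to generate data at these different strengths yourself, one common approach is to mix a variable with independent noise so that the population correlation equals a chosen rho. The sketch below is only an illustration: the rho values, sample size, seed, and the function names simulate_pairs and pearson_r are all assumptions.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_pairs(rho, n, rng):
    """Draw n (x, y) pairs whose population correlation is rho."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    # Scaling the noise by sqrt(1 - rho^2) keeps the variance of y at 1,
    # so the covariance rho is also the correlation.
    noise_scale = (1 - rho ** 2) ** 0.5
    ys = [rho * x + noise_scale * rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(7)  # fixed seed so the run is repeatable
sample_rs = {}
for rho in (-1.0, -0.8, -0.4, 0.0):
    xs, ys = simulate_pairs(rho, 2000, rng)
    sample_rs[rho] = pearson_r(xs, ys)
    print(f"target rho = {rho:+.1f}   sample r = {sample_rs[rho]:+.2f}")
```

Plotting each set of pairs on a scatterplot should show the downward slope becoming fuzzier as rho moves from -1 toward zero.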
Charlotte says
Hi, Jim. Thanks for another great post!
You have said that a correlation coefficient of 0 means that there is no relationship – which I understand – but I wondered what the nature of the correlation is when the correlation coefficient is 0.1 or -0.1, for example. Would that mean that there is a very weak positive or negative correlation, or that there is no correlation because it is so close to 0? I am not sure if there is a cutoff point that helps determine this – perhaps this varies by discipline. Any help would be appreciated.
Jim Frost says
Hi Charlotte,
That’s a great question and there are actually several things going on here.
First, it’s important to remember that the correlation you observe in a random sample is an estimate of the population correlation and that it includes random sampling error. So, you can actually have a zero correlation in the population and still get a non-zero correlation in your sample. You can perform a hypothesis test to help you make that determination.
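Here is a rough simulation of that sampling error. It repeatedly draws small samples from two variables that are generated independently, so the population correlation is zero by construction. The sample size of 30, the 1,000 repetitions, the 0.2 threshold, and the seed are arbitrary assumptions for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
sample_size = 30
sample_rs = []
for _ in range(1000):
    # x and y are independent, so the population correlation is zero.
    x = [rng.gauss(0, 1) for _ in range(sample_size)]
    y = [rng.gauss(0, 1) for _ in range(sample_size)]
    sample_rs.append(pearson_r(x, y))

mean_r = sum(sample_rs) / len(sample_rs)
share_beyond_02 = sum(abs(r) > 0.2 for r in sample_rs) / len(sample_rs)
print(f"average sample r: {mean_r:.3f}")
print(f"range of sample r: {min(sample_rs):.2f} to {max(sample_rs):.2f}")
print(f"share of samples with |r| > 0.2: {share_beyond_02:.0%}")
```

The sample correlations average out near zero, yet individual samples can stray noticeably in either direction, which is why a hypothesis test is useful.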
Next, suppose you have a low correlation of 0.1 or -0.1, as you suggest, and its p-value is statistically significant. You can reject the null hypothesis, which is that the population correlation is zero. Your sample provides enough evidence to conclude that there is a non-zero correlation in the population. However, an estimate of, say, 0.1 is extremely weak. If you were to graph it, it would still look like a blob of data points with no apparent relationship. If you fit a regression model to it, you’d see that the R-squared was 0.1^2 = 0.01. That model explains only 1% of the variance! Almost nothing. That’s a case of being statistically significant but not significant in a practical, real-world sense.
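To put numbers on that, the sketch below tests a correlation of 0.1 at two hypothetical sample sizes. It uses the Fisher z approximation for the p-value because it needs only Python's standard library; statistical software typically reports a t-test instead, so treat the p-values as approximate. The function name fisher_z_p_value and the sample sizes of 50 and 1,000 are assumptions for illustration.

```python
import math


def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: population correlation = 0.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal test statistic under the null hypothesis.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


r = 0.1
for n in (50, 1000):  # hypothetical sample sizes
    p = fisher_z_p_value(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:4d}: p = {p:.4f} ({verdict} at the 0.05 level)")

# In simple regression with one predictor, R-squared equals r squared.
r_squared = r ** 2
print(f"R-squared = {r_squared:.2f}")
```

The same weak correlation is not significant in the small sample but is significant in the large one, and in both cases it accounts for only about 1% of the variance.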
What researchers consider to be strong, medium, and weak relationships does vary by discipline, although I think all disciplines would consider anything around 0.1 or -0.1 to be extremely weak! Some disciplines have tried to establish benchmark values for those levels. But I don’t like that approach at all because it depends so much on the subject area.
For example, consider a study looking at a natural phenomenon: a physics-type relationship with high-quality, low-noise measurements. You’d expect a correlation near 0.99! Anything lower might indicate a problem somewhere. However, if you’re looking at psychology research that uses a variable to predict human behavior and you get a correlation of 0.5, that would be considered pretty good! It’s harder to predict human behavior than a natural, physical phenomenon.
When you get a correlation, it’s crucial to compare it to similar studies to see if it’s consistent or not. Use your subject-area knowledge to see how yours fits in with related research. That’s better than comparing it to some arbitrary benchmark value determined in an entirely different context.