• Skip to secondary menu
  • Skip to main content
  • Skip to primary sidebar
  • My Store
  • Glossary
  • Home
  • About Me
  • Contact Me

Statistics By Jim

Making statistics intuitive

  • Graphs
  • Basics
  • Hypothesis Testing
  • Regression
  • ANOVA
  • Probability
  • Time Series
  • Fun

Using Excel to Calculate Correlation

By Jim Frost Leave a Comment

Excel can calculate correlation coefficients and a variety of other statistical analyses. Even if you don’t use Excel regularly, this post is an excellent introduction to calculating and interpreting correlation.

In this post, I provide step-by-step instructions for having Excel calculate Pearson’s correlation coefficient, and I’ll show you how to interpret the results. Additionally, I include links to relevant statistical resources I’ve written that provide intuitive explanations. Together, we’ll analyze and interpret an example dataset!

Before proceeding, check that Excel’s Data Analysis ToolPak is installed. On the Data tab, find Data Analysis, as shown below.

Excel menu with Data Analysis ToolPak.

If you don’t see Data Analysis, install that ToolPak. Learn how to add it in my post about using Excel to perform t-tests. It’s free!

Correlation Basics

When a correlation between a pair of variables exists, it indicates that as one variable changes, the other variable tends to change in a particular direction. There are two important properties to evaluate, direction and strength.

The direction of the correlation corresponds to the coefficient’s sign, positive or negative. As one variable increases, does the other variable tend to increase or decrease?

  • A positive correlation indicates that as one variable increases, the other tends to increase.
  • A negative correlation signifies that as one variable increases, the other tends to decrease.

The values indicate the strength of the correlation and they can range from -1 to +1. The closer the value is to -1 or 1, the stronger the relationship. Values near zero represent no relationship.

For a more thorough understanding of correlations, please read my post about Interpreting Correlation Coefficients. I provide examples, graphs that display negative and positive relationships of different strengths, important caveats and warnings, and so on.

Example Height and Weight Dataset for Correlation

This height and weight dataset is the same one I use for my post about Descriptive Statistics in Excel, so you’ll see that information in the Excel file as well. Download the Excel file that contains the data for this example: HeightWeightCorr.

Before performing statistical analyses, it’s always helpful to graph your data. Below is the scatterplot of the Height and Weight dataset.

Excel's scatterplot of the height and weight dataset that displays a positive correlation.

The chart displays an upward slope. In other words, as a person’s height increases, their weight also tends to increase. We have a positive correlation.

How strong is the relationship?

If you pick an individual height on the scatterplot, say near 1.5M, you’ll see weights that range from about 35kg to nearly 60kg. This spread indicates that the correlation is not perfect. You can also find taller individuals that weigh less than shorter individuals. For example, just by looking at data points on the graph, I see an individual at about 1.63M who weighs nearly 30kg less than a shorter person at 1.52M.

While it’s not a perfect association, there’s a definite tendency for taller people to weigh more. Visually, it appears to be a moderately strong association. That’s a subjective assessment.

Let’s calculate the correlation using Excel to get an objective measure!

Related post: Guide to Data Types and How to Graph Them

Using Excel to Calculate Correlation

For this example, we’ll assess two variables, the height and weight of preteen girls. I collected these data during an actual experiment. To use the correlation feature in Excel, arrange your data in columns or rows. I have my data in columns, as shown in the snippet below.

Portion of the height and weight dataset for the correlation example.

In Excel, click Data Analysis on the Data tab, as shown above. In the Data Analysis popup, choose Correlation, and then follow the steps below.

Excel's popup menu of data analysis tools with correlation highlighted.

Step-by-Step Instructions for Excel’s Correlation Analysis

  1. Under Input Range, select the range for the variables that you want to analyze. You must include at least two variables for correlation and they must form a contiguous block in your worksheet. Excel performs all possible pairwise correlations for the variables you specify.
  2. In Grouped By, choose how your variables are organized. I always include one variable per column because this format is standard across statistical software. Alternatively, you can include one variable per row.
  3. Check the Labels in first row checkbox when you have meaningful variable names in row 1. This option makes the output easier to interpret.
  4. In Output options, choose where you want Excel to display the results. I like to put them in a separate worksheet.
  5. Click OK.

For our example, fill in the dialog box as shown below.

Excel's correlation dialog box.

Interpreting Excel’s Correlation Results

After Excel creates the statistical output, I autofit the columns for clarity.

Excel's correlation results that display a correlation between height and weight of 0.705.

We’re interested in the relationship between Weight kg and Height M. The two perfect values of 1 are each variable correlated with itself!

The height by weight correlation is 0.705. The positive coefficient is consistent with the upward slope we saw on the scatterplot. The strength of the association is 0.705, which is moderate. It definitely exists. Taller people tend to weigh more. However, as we saw on the graph, it’s not close to being a perfect correlation. A tendency exists, but there are exceptions to the trend.

For the height and weight data, there’s undoubtedly an overall trend, or tendency, for taller people to weigh more.

In this post, I use Excel to calculate Pearson’s correlation coefficient. To see how I use Excel to calculate Spearman’s rank order correlation, read my post about Spearman’s Correlation Explained.

Share this:

  • Tweet

Related

Filed Under: Basics Tagged With: analysis example, Excel, graphs, interpreting results

Reader Interactions

Comments and Questions Cancel reply

Primary Sidebar

Meet Jim

I’ll help you intuitively understand statistics by focusing on concepts and using plain English so you can concentrate on understanding your results.

Read More...

Buy My Introduction to Statistics Book!

Cover of my Introduction to Statistics: An Intuitive Guide ebook.

Buy My Hypothesis Testing Book!

Cover image of my Hypothesis Testing: An Intuitive Guide ebook.

Buy My Regression Book!

Cover for my ebook, Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models.

Subscribe by Email

Enter your email address to receive notifications of new posts by email.

    I won't send you spam. Unsubscribe at any time.

    Follow Me

    • FacebookFacebook
    • RSS FeedRSS Feed
    • TwitterTwitter

    Top Posts

    • How to Interpret P-values and Coefficients in Regression Analysis
    • How To Interpret R-squared in Regression Analysis
    • Multicollinearity in Regression Analysis: Problems, Detection, and Solutions
    • Mean, Median, and Mode: Measures of Central Tendency
    • How to Find the P value: Process and Calculations
    • How to do t-Tests in Excel
    • Z-table
    • Choosing the Correct Type of Regression Analysis
    • One-Tailed and Two-Tailed Hypothesis Tests Explained
    • How to Interpret the F-test of Overall Significance in Regression Analysis

    Recent Posts

    • Slope Intercept Form of Linear Equations: A Guide
    • Population vs Sample: Uses and Examples
    • How to Calculate a Percentage
    • Control Chart: Uses, Example, and Types
    • Monte Carlo Simulation: Make Better Decisions
    • Principal Component Analysis Guide & Example

    Recent Comments

    • Jim Frost on Monte Carlo Simulation: Make Better Decisions
    • Gilberto on Monte Carlo Simulation: Make Better Decisions
    • Sultan Mahmood on Linear Regression Equation Explained
    • Sanjay Kumar P on What is the Mean and How to Find It: Definition & Formula
    • Dave on Control Variables: Definition, Uses & Examples

    Copyright © 2023 · Jim Frost · Privacy Policy