
Statistics By Jim

Making statistics intuitive


Variance: Definition, Formulas & Calculations

By Jim Frost

Variance is a measure of variability in statistics. It assesses the average squared difference between data values and the mean. Unlike some other statistical measures of variability, it incorporates all data points in its calculations by comparing each value to the mean.

When there is no variability in a sample, all values are the same, and the variance equals zero. As the data values spread out further, variability increases.
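
To make that concrete, here is a minimal Python sketch using `statistics.pvariance` from the standard library. The three small datasets are made up for illustration (they are not from this post), and all of them share a mean of 15.

```python
from statistics import pvariance

# Hypothetical datasets, all with a mean of 15.
no_spread = [15, 15, 15, 15]     # every value is identical
small_spread = [14, 15, 15, 16]  # values sit close to the mean
large_spread = [5, 10, 20, 25]   # values sit far from the mean

var_none = pvariance(no_spread)
var_small = pvariance(small_spread)
var_large = pvariance(large_spread)

print("No spread:   ", var_none)
print("Small spread:", var_small)
print("Large spread:", var_large)
```

The identical values produce a variance of zero, and the result grows as the values move further from the shared mean.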

For example, these two distributions have the same mean. However, the dataset on the right has greater variability and, hence, a higher variance.

Graph that shows two distributions with more and less variability.

In this post, learn how to calculate both population and sample variance and how to interpret them.

Related post: Measures of Variability

Variance Formulas

There are two formulas for the variance. The correct formula depends on whether you are working with the entire population or using a sample to estimate the population value. In other words, decide which formula to use depending on whether you are performing descriptive or inferential statistics.

The equations are below, and then I work through an example of finding the variance to help bring it to life.

Population variance formula

Use the population form of the equation when you have values for all members of the group of interest. In this case, you are not using the sample to estimate the population. Instead, you have measured all people or items and need the variance for that specific group. For example, if you have measured test scores for all class members and need to know the value for that class, use the population variance formula.

The population variance formula is the following:

$$\sigma^2 = \frac{\sum_{i=1}^{n} (X_i - \mu)^2}{n}$$

In the population variance formula:

  • σ² is the population variance.
  • Xi is the ith data point.
  • µ is the population mean.
  • n is the number of observations.

To find the variance, take a data point, subtract the population mean, and square that difference. Repeat this process for all data points. Then, sum all of those squared values and divide by the number of observations. Hence, it’s the average squared difference.

Statisticians refer to the numerator portion of the variance formula as the sum of squares.
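
The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `population_variance` and the test scores are my own hypothetical choices, and the standard library's `statistics.pvariance` is used as a cross-check.

```python
from statistics import pvariance


def population_variance(values):
    """Average squared difference from the population mean (divides by n)."""
    n = len(values)
    mu = sum(values) / n
    # Numerator of the formula: the sum of squares.
    sum_of_squares = sum((x - mu) ** 2 for x in values)
    return sum_of_squares / n


# Hypothetical test scores for every member of one small class.
scores = [72, 85, 90, 66, 87]

manual_result = population_variance(scores)
library_result = pvariance(scores)  # standard-library cross-check

print("Manual: ", manual_result)
print("Library:", library_result)
```

Both approaches should agree, apart from possible floating-point rounding.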

Sample variance formula

Use the sample variance formula when you’re using a sample to estimate the value for a population. For example, if you have taken a random sample of statistics students, recorded their test scores, and need to use the sample as an estimate for the population of statistics students, use the sample variance formula.

The population formula tends to underestimate variability when you use it with a sample. The sample formula below corrects for that bias.

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{x})^2}{n - 1}$$

In the sample variance formula:

  • s² is the sample variance.
  • Xi is the ith data point.
  • x̅ is the sample mean.
  • n–1 is the degrees of freedom.
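
Here is a minimal Python sketch of the sample calculation. The function name `sample_variance` and the scores are hypothetical, chosen only for illustration. The standard library's `statistics.variance` also divides by n–1, so it serves as a cross-check.

```python
from statistics import variance


def sample_variance(values):
    """Sum of squared differences from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(values) / n
    sum_of_squares = sum((x - x_bar) ** 2 for x in values)
    return sum_of_squares / (n - 1)  # n - 1 is the degrees of freedom


# Hypothetical random sample of test scores.
sample_scores = [72, 85, 90, 66, 87]

manual_s2 = sample_variance(sample_scores)
library_s2 = variance(sample_scores)  # standard-library cross-check

print("Manual: ", manual_s2)
print("Library:", library_s2)
```

Because the denominator is n–1 rather than n, the result is always a bit larger than what the population formula gives for the same numbers.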

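The calculation process for samples is very similar to the population method. However, you’re working with a sample instead of a population, and you’re dividing by n–1 instead of n. The differences are measured from the sample mean, which is calculated from the same data and therefore sits closer to the sample values, on average, than the true population mean does. As a result, the sum of squares tends to be too small, and dividing by the slightly smaller n–1 counteracts that bias.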
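
If you would like to see that bias for yourself, a small simulation can help. The sketch below is my own illustration, not part of the original analysis: it assumes a normally distributed population with a mean of 50 and a standard deviation of 10 (so a true variance of 100), draws many samples of 5, and averages the results from dividing by n and by n–1.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

TRUE_VARIANCE = 100.0  # assumed population: standard deviation of 10
SAMPLE_SIZE = 5
NUM_SAMPLES = 20_000

total_div_n = 0.0
total_div_n_minus_1 = 0.0

for _ in range(NUM_SAMPLES):
    sample = [random.gauss(50, 10) for _ in range(SAMPLE_SIZE)]
    x_bar = sum(sample) / SAMPLE_SIZE
    sum_of_squares = sum((x - x_bar) ** 2 for x in sample)
    total_div_n += sum_of_squares / SAMPLE_SIZE                # population formula
    total_div_n_minus_1 += sum_of_squares / (SAMPLE_SIZE - 1)  # sample formula

avg_div_n = total_div_n / NUM_SAMPLES
avg_div_n_minus_1 = total_div_n_minus_1 / NUM_SAMPLES

print(f"True variance:            {TRUE_VARIANCE:.1f}")
print(f"Average dividing by n:    {avg_div_n:.1f}")
print(f"Average dividing by n-1:  {avg_div_n_minus_1:.1f}")
```

Under these assumptions, the average from dividing by n should land noticeably below 100, while the average from dividing by n–1 should land close to it.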

Let’s work through an example calculation!

How to Find Variance

Here’s an example of how to calculate the variance using the sample formula. The dataset has 17 observations in the table below. The numbers in parentheses correspond to table columns.

To calculate the statistic, take each data value (1) and subtract the mean (2) to calculate the difference (3), and then square the difference (4).

At the bottom of the worksheet, I sum the squared values and divide the total by 17 – 1 = 16 because we’re finding the sample value.

The variance for this dataset is 201.

Table illustrating how to find the variance.
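
The same worksheet layout can be sketched in Python. Please note that the six values below are hypothetical and are NOT the 17 observations from my table, so the result will not match 201. The column numbers in the printout follow the same (1) to (4) scheme described above.

```python
# Hypothetical data: NOT the 17 observations from the article's table.
data = [4, 8, 6, 5, 3, 10]
n = len(data)
mean = sum(data) / n

# Build one worksheet row per observation:
# (1) value, (2) mean, (3) difference, (4) squared difference.
rows = []
for value in data:
    difference = value - mean
    rows.append((value, mean, difference, difference ** 2))

print(f"{'(1) Value':>10}{'(2) Mean':>10}{'(3) Diff':>10}{'(4) Sq diff':>13}")
for value, row_mean, diff, squared in rows:
    print(f"{value:>10}{row_mean:>10.2f}{diff:>10.2f}{squared:>13.2f}")

# Bottom of the worksheet: sum column (4) and divide by n - 1.
sum_of_squares = sum(squared for _, _, _, squared in rows)
sample_var = sum_of_squares / (n - 1)

print(f"Sum of squares:  {sum_of_squares:.2f}")
print(f"Sample variance: {sum_of_squares:.2f} / {n - 1} = {sample_var:.2f}")
```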

You can also use my Variance Calculator that finds the answer, shows you how to calculate it, and graphs your data!

Interpreting the Variance

The variance in statistics is the average squared distance between the data points and the mean. Because it uses squared units rather than the natural data units, the interpretation is less intuitive. Higher values indicate greater variability, but there is no intuitive interpretation for specific values. Despite this drawback, some statistical hypothesis tests use it in their calculations. For example, read about the F-test and ANOVA.

Squaring the differences serves several purposes.

Squaring the differences prevents values above and below the mean from canceling each other out. Consequently, variance is always greater than or equal to zero. It is almost always a positive value because only datasets containing one repeated value (e.g., all values equal 15) have a value of zero.
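
A short Python sketch shows the canceling effect. The five values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical values with a mean of 15.
values = [10, 12, 15, 18, 20]
mean = sum(values) / len(values)

deviations = [x - mean for x in values]

# Raw differences above and below the mean cancel each other out.
sum_raw = sum(deviations)

# Squared differences are never negative, so they cannot cancel.
squared = [d ** 2 for d in deviations]
sum_squared = sum(squared)

print("Differences:             ", deviations)
print("Sum of raw differences:  ", sum_raw)
print("Sum of squared differences:", sum_squared)
```

The raw differences add up to zero no matter how spread out the data are, which is why they cannot measure variability on their own.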

Additionally, squaring differences disproportionately increases the impact of data points that are further from the mean. This additional weight mirrors the properties of the normal distribution, where values become substantially less likely the further they fall from the mean. The probability of extreme values does not fall off linearly.
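
To illustrate the extra weight, the sketch below compares how much one distant point contributes to the total when you use absolute differences versus squared differences. The dataset is hypothetical, with 70 deliberately placed far from the other values.

```python
# Hypothetical data: five clustered values and one distant value (70).
values = [48, 49, 50, 51, 52, 70]
far_point = 70
mean = sum(values) / len(values)

abs_devs = [abs(x - mean) for x in values]
sq_devs = [(x - mean) ** 2 for x in values]

# Share of the total that comes from the distant point alone.
share_abs = abs(far_point - mean) / sum(abs_devs)
share_squared = (far_point - mean) ** 2 / sum(sq_devs)

print(f"Share of total absolute differences: {share_abs:.1%}")
print(f"Share of total squared differences:  {share_squared:.1%}")
```

With these numbers, the distant point accounts for a clearly larger share of the squared total than of the absolute total.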

If you take the square root of the variance, you obtain the standard deviation, which does use the intuitive natural data units. The mean absolute deviation is another measure of variability that also uses natural units, but its formula does not square the differences.
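
Here is a minimal Python sketch comparing the three measures. The heights are hypothetical, and I'm assuming the sample formulas for the variance and standard deviation; `statistics.stdev` is used only as a cross-check on the square root.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical sample of heights in centimeters.
heights_cm = [160, 165, 170, 175, 180]

s2 = variance(heights_cm)  # sample variance, in squared centimeters
s = sqrt(s2)               # standard deviation, back in centimeters

# Mean absolute deviation: average of the unsquared absolute differences.
x_bar = mean(heights_cm)
mean_abs_dev = mean([abs(x - x_bar) for x in heights_cm])

print(f"Variance:                {s2} cm^2")
print(f"Standard deviation:      {s:.2f} cm")
print(f"Mean absolute deviation: {mean_abs_dev} cm")
```

The standard deviation and mean absolute deviation are both in centimeters, so you can compare them directly with the data, while the variance is in squared centimeters.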


Filed Under: Basics Tagged With: conceptual, distributions, interpreting results


Comments

  1. Bob E. says

    February 8, 2022 at 8:43 am

    I think what Jim Frost meant is that there is more variability in the population than in a sample of that population. Therefore, the sample variance with (n) as a denominator underestimates the population variance. However, if you divide by the smaller number (n-1) instead of (n), the resultant variance is otherwise larger and now is a good estimate of the population variance.

  2. Darshan Goswami says

    February 1, 2022 at 4:55 am

    Thanks for your explanation. Is there any specific reason behind having n-1 (DF) as denominator for calculating sample variance? I couldn’t get the logic “to counteract a bias where samples tend to underestimate the population value”. Can you pl. elaborate.


    Copyright © 2026 · Jim Frost · Privacy Policy
