Chebyshev’s Theorem estimates the minimum proportion of observations that fall within a specified number of standard deviations from the mean. This theorem applies to a broad range of probability distributions. Chebyshev’s Theorem is also known as Chebyshev’s Inequality.

If you have a mean and standard deviation, you might need to know the proportion of values that lie within, say, plus and minus two standard deviations of the mean. If your data follow the normal distribution, that’s easy using the Empirical Rule! However, what if you don’t know the distribution of your data or you know that it *doesn’t* follow the normal distribution? In that case, Chebyshev’s Theorem can help you out!

In this post, learn why Chebyshev’s theorem is valuable and how to use it to solve problems. Additionally, I’ll compare the theorem to the Empirical Rule, which serves a similar purpose.

## Equation for Chebyshev’s Theorem

Chebyshev’s Theorem helps you determine where most of your data fall within a distribution of values. This theorem provides helpful results when you have only the mean and standard deviation. You do not need to know the distribution your data follow.

There are two forms of the equation. One determines how close to the mean the data lie and the other calculates how far away from the mean they fall:

| Quantity | Formula |
|---|---|
| Maximum proportion of observations that are more than k standard deviations from the mean | $\frac{1}{k^2}$ |
| Minimum proportion of observations that are within k standard deviations of the mean | $1 - \frac{1}{k^2}$ |

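Here, k equals the number of standard deviations in which you are interested. The value of k must be greater than 1. For k of 1 or less, the minimum proportion works out to zero or below, which tells you nothing.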

As you can see, it’s a fairly straightforward equation.
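To make the calculation concrete, here is a minimal Python sketch. It assumes the standard form of the inequality, where at most 1/k² of the observations lie more than k standard deviations from the mean. The function name `chebyshev_bounds` is my own choice for illustration.

```python
def chebyshev_bounds(k: float) -> tuple[float, float]:
    """Return (minimum proportion within, maximum proportion outside) k SDs."""
    if k <= 1:
        # For k <= 1 the bound 1 - 1/k**2 is zero or negative, so it is uninformative.
        raise ValueError("k must be greater than 1")
    max_outside = 1 / k**2
    min_within = 1 - max_outside
    return min_within, max_outside


within, outside = chebyshev_bounds(2)
print(f"k=2: at least {within:.0%} within, at most {outside:.0%} outside")
# prints: k=2: at least 75% within, at most 25% outside
```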

For more information about the mean and standard deviation, read my posts about Measures of Central Tendency and Measures of Variability.

## Using Chebyshev’s Theorem

By entering values for k into the equation, I’ve created the table below that displays proportions for various standard deviations.

| Standard Deviations | Minimum % within | Maximum % outside |
|---|---|---|
| 1.41 | 50% | 50% |
| 1.5 | 56% | 44% |
| 2 | 75% | 25% |
| 3 | 89% | 11% |
| 4 | 94% | 6% |
| 5 | 96% | 4% |

For example, if you’re interested in a range of three standard deviations around the mean, Chebyshev’s Theorem states that at least 89% of the observations fall inside that range, and no more than 11% fall outside that range.

A crucial point to notice is that Chebyshev’s Theorem produces minimum and maximum proportions. For example, at least 56% of the observations fall inside 1.5 standard deviations, and a maximum of 44% fall outside.

The theorem does not provide exact answers, but it places limits on the possible proportions. For the example above, more than 56% of the observations can lie within 1.5 standard deviations of the mean.

The minimum and maximum proportions arise due to uncertainties about the data’s distribution. The theorem is valuable because it applies to all distributions, but that generality also limits the precision of its results.
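A small simulation can illustrate how conservative the limits are. The sketch below uses a made-up, strongly right-skewed sample (exponential draws from Python’s standard library) purely for illustration. It compares the observed proportion within k standard deviations to the guaranteed minimum. The helper name `proportion_within` and the choice of distribution are assumptions, not part of the theorem.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical, right-skewed sample chosen only for illustration.
data = [random.expovariate(1.0) for _ in range(10_000)]

mean = statistics.fmean(data)
sd = statistics.pstdev(data)  # population standard deviation of the sample


def proportion_within(values, center, spread, k):
    """Observed share of values strictly within k standard deviations."""
    inside = sum(1 for x in values if abs(x - center) < k * spread)
    return inside / len(values)


for k in (1.5, 2, 3):
    bound = 1 - 1 / k**2
    observed = proportion_within(data, mean, sd, k)
    print(f"k={k}: guaranteed >= {bound:.3f}, observed {observed:.3f}")
```

For any dataset, the observed proportion can never drop below the guaranteed minimum when you use the dataset’s own mean and population standard deviation. It is often much higher.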

## Example Problems

Suppose you know a dataset has a mean of 100 and a standard deviation of 10, and you’re interested in a range of ± 2 standard deviations. Two standard deviations equal 2 × 10 = 20. Consequently, Chebyshev’s Theorem tells you that at least 75% of the values fall within 100 ± 20, which is the range from 80 to 120. Conversely, no more than 25% fall outside that range.
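The same arithmetic can be sketched in a few lines of Python. The function name `chebyshev_interval` is hypothetical, and the numbers are the ones from the example above.

```python
def chebyshev_interval(mean: float, sd: float, k: float) -> tuple[float, float, float]:
    """Return (lower, upper, minimum proportion) for mean ± k standard deviations."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    half_width = k * sd
    return mean - half_width, mean + half_width, 1 - 1 / k**2


low, high, min_prop = chebyshev_interval(mean=100, sd=10, k=2)
print(f"At least {min_prop:.0%} of values fall between {low} and {high}")
# prints: At least 75% of values fall between 80 and 120
```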

An interesting range is ± 1.41 standard deviations. With that range, you know that at least half the observations fall within it, and no more than half fall outside of it. If we use a mean of 100 and a standard deviation of 10 again, 1.41 standard deviations equal 14.1. Hence, at least half the values lie in the range 100 ± 14.1, or 85.9 – 114.1.
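You can also work backward from a proportion to the number of standard deviations. Solving 1 − 1/k² = p for k gives k = 1/√(1 − p). The sketch below assumes that rearrangement, and the function name `k_for_min_proportion` is made up for illustration.

```python
import math


def k_for_min_proportion(p: float) -> float:
    """Number of standard deviations that guarantees at least proportion p."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return 1 / math.sqrt(1 - p)


k = k_for_min_proportion(0.5)
print(round(k, 2))       # 1.41
print(round(k * 10, 1))  # 14.1, the half-width when the standard deviation is 10
```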

Suppose a class takes a test. The average score is 75 and the standard deviation is 5. What is the minimum proportion of scores that fall between 65 and 85?

The mean is 75. 65 is 10 points below the mean and 85 is 10 points above the mean. The standard deviation is 5. Consequently, you want to determine the proportion of scores that fall within 10 / 5 = 2 standard deviations of the mean. Using the table above, you know that at least 75% of the scores will fall within the range of 65 – 85.
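Here is a hedged sketch of that reasoning in Python. The function name `min_proportion_between` is hypothetical. For ranges that are not centered on the mean, it uses the endpoint closer to the mean. That is a conservative choice of mine rather than something covered above.

```python
def min_proportion_between(mean: float, sd: float, low: float, high: float) -> float:
    """Chebyshev lower bound on the share of values between low and high."""
    # The bound is symmetric around the mean, so use the closer endpoint.
    distance = min(mean - low, high - mean)
    k = distance / sd
    if k <= 1:
        return 0.0  # the theorem gives no informative bound here
    return 1 - 1 / k**2


print(min_proportion_between(mean=75, sd=5, low=65, high=85))  # 0.75
```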

## Chebyshev’s Theorem Compared to the Empirical Rule

The Empirical Rule also describes the proportion of data that fall within a specified number of standard deviations from the mean. However, there are several crucial differences between Chebyshev’s Theorem and the Empirical Rule.

Chebyshev’s Theorem applies to all probability distributions where you can calculate the mean and standard deviation. On the other hand, the Empirical Rule applies only to the normal distribution.

As you saw above, Chebyshev’s Theorem provides bounds rather than exact proportions. Conversely, the Empirical Rule gives specific proportions because the data are known to follow the normal distribution, although its 68%, 95%, and 99.7% figures are rounded values.

**Related post**: Identifying the Distribution of Your Data

The table below compares the results from both methods for the proportions of data falling within the specified number of standard deviations.

| Standard Deviations | Empirical Rule | Chebyshev’s Theorem |
|---|---|---|
| 1 | 68% | NA |
| 2 | 95% | ≥75% |
| 3 | 99.7% | ≥88.9% |

Again, notice that the Empirical Rule provides specific proportions while Chebyshev’s Theorem gives only a guaranteed minimum.
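To see the two approaches side by side, the sketch below computes the normal-distribution proportions with `statistics.NormalDist` from Python’s standard library and sets them next to the Chebyshev minimums. The helper names `normal_within` and `chebyshev_within` are my own. The Empirical Rule’s 68%, 95%, and 99.7% are rounded versions of the normal values this produces (about 68.27%, 95.45%, and 99.73%).

```python
from statistics import NormalDist


def normal_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


def chebyshev_within(k: float):
    """Chebyshev minimum proportion, or None when k <= 1 (no useful bound)."""
    return 1 - 1 / k**2 if k > 1 else None


for k in (1, 2, 3):
    cheb = chebyshev_within(k)
    cheb_text = "NA" if cheb is None else f">= {cheb:.1%}"
    print(f"{k} SD: normal {normal_within(k):.2%}, Chebyshev {cheb_text}")
```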

If you know that your data follow the normal distribution, use the Empirical Rule. Otherwise, Chebyshev’s Theorem might be your best choice!

For more information about the Empirical Rule, read my post about The Normal Distribution, which discusses it.

## Comments

Sameer Sippy says

Appreciate Jim for your reply…!!! Indeed, your comments do help in giving better clarity to the aforesaid concept. Thanks once again Jim…

Jim Frost says

You’re very welcome, Sameer! 🙂

Sameer Sippy says

Very interesting article. Liked it. Thanks for sharing a write-up on Chebyshev’s theorem.

However, certain doubts still remain in my mind, as follows:

a) Is Chebyshev’s theorem used in Non-Parametric Tests?

b) If the Sampling Distribution remains unknown, or, simply put, the data Distribution is NOT Normal, how do we use it for making inferences on Population Parameters (i.e. proportions)?

c) If this is the case as stated in b) vis-a-vis Chebyshev’s Theorem, how do we establish the following:

1) Confidence Intervals

2) Hypothesis Testing

& 3) p-values

Could you elaborate regarding the same?

Jim Frost says

Hi Sameer,

Chebyshev’s Theorem is just a quick and simple way to determine the proportion of observations that fall within different ranges of your data’s distribution. As far as I know, it’s not used in any hypothesis testing or confidence intervals. It works with the distribution of the data, not sampling distributions.

John McGurk says

🙏🏼 Jim. Great clear insight on a very useful technique.

Elish Sharmin says

Thanks and gratitude for publishing this article in an elucidated manner. I got to understand it.

Abdul kayum says

Absolutely great! Thank you Jim for sharing such an informative article. When I first opened it I thought, damn, it is hard! But I appreciate the way you explain it.

Bill says

Thanks for this. Timing was appropriate as I just taught this to the students in my Business Statistics course. I think this article will help them immensely.