Chebyshev’s Theorem estimates the minimum proportion of observations that fall within a specified number of standard deviations from the mean. This theorem applies to a broad range of probability distributions. Chebyshev’s Theorem is also known as Chebyshev’s Inequality.

If you have a mean and standard deviation, you might need to know the proportion of values that lie within, say, plus and minus two standard deviations of the mean. If your data follow the normal distribution, that’s easy using the Empirical Rule! However, what if you don’t know the distribution of your data or you know that it *doesn’t* follow the normal distribution? In that case, Chebyshev’s Theorem can help you out!

In this post, learn why Chebyshev’s theorem is valuable and how to use it to solve problems. Additionally, I’ll compare the theorem to the Empirical Rule, which serves a similar purpose.

## Equation for Chebyshev’s Theorem

Chebyshev’s Theorem helps you determine where most of your data fall within a distribution of values. This theorem provides helpful results when you have only the mean and standard deviation. You do not need to know the distribution your data follow.

There are two forms of the equation. One determines how close to the mean the data lie and the other calculates how far away from the mean they fall:

| Quantity | Formula |
|---|---|
| Maximum proportion of observations that are more than k standard deviations from the mean | $\frac{1}{k^2}$ |
| Minimum proportion of observations that are within k standard deviations of the mean | $1 - \frac{1}{k^2}$ |

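Here, k is the number of standard deviations in which you are interested. The value of k must be greater than 1, because for k ≤ 1 the bounds are uninformative: the minimum proportion within is 0 or less.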

As you can see, it’s a fairly straightforward equation.
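As a minimal sketch of the calculation, the Python function below evaluates both forms of the bound, using the standard statement of the inequality: at most 1/k² of observations lie more than k standard deviations from the mean, and at least 1 − 1/k² lie within. The function name `chebyshev_bounds` is an assumption made for illustration.

```python
def chebyshev_bounds(k: float) -> tuple[float, float]:
    """Return (minimum proportion within, maximum proportion outside)
    k standard deviations of the mean, per Chebyshev's Theorem."""
    if k <= 1:
        # For k <= 1 the bound says nothing useful (at least 0% within).
        raise ValueError("k must be greater than 1")
    max_outside = 1.0 / k**2
    min_within = 1.0 - max_outside
    return min_within, max_outside


if __name__ == "__main__":
    for k in (1.5, 2, 3, 4, 5):
        within, outside = chebyshev_bounds(k)
        print(f"k = {k}: at least {within:.2f} within, at most {outside:.2f} outside")
```

Raising an error for k ≤ 1 is a design choice in this sketch; returning a bound of 0 would be equally defensible.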

For more information about the mean and standard deviation, read my posts about Measures of Central Tendency and Measures of Variability.

## Using Chebyshev’s Theorem

By entering values for k into the equation, I’ve created the table below that displays proportions for various standard deviations.

| Standard deviations (k) | Minimum proportion within | Maximum proportion outside |
|---|---|---|
| 1.41 | 0.50 | 0.50 |
| 1.5 | 0.56 | 0.44 |
| 2 | 0.75 | 0.25 |
| 3 | 0.89 | 0.11 |
| 4 | 0.94 | 0.06 |
| 5 | 0.96 | 0.04 |

For example, if you’re interested in a range of three standard deviations around the mean, Chebyshev’s Theorem states that at least 89% of the observations fall inside that range, and no more than 11% fall outside that range.

A crucial point to notice is that Chebyshev’s Theorem produces minimum and maximum proportions. For example, at least 56% of the observations fall inside 1.5 standard deviations, and a maximum of 44% fall outside.

The theorem does not provide exact answers, but it places limits on the possible proportions. For the example above, more than 56% of the observations can lie within 1.5 standard deviations of the mean.

The minimum and maximum proportions arise due to uncertainties about the data’s distribution. While the theorem is valuable because it applies to all distributions, it also limits the precision of the results.
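To see that the bound holds for a clearly non-normal distribution, here is a small simulation sketch. It draws a right-skewed sample from an exponential distribution (an arbitrary choice for illustration, as are the seed and sample size) and compares the observed proportion within k standard deviations to the Chebyshev minimum. The helper name `proportion_within` is hypothetical.

```python
import random
import statistics


def proportion_within(data: list[float], k: float) -> float:
    """Observed share of values within k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # standard deviation of the data as given
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)


# Assumption: a skewed exponential sample stands in for "non-normal data".
rng = random.Random(42)
sample = [rng.expovariate(1.0) for _ in range(10_000)]

if __name__ == "__main__":
    for k in (1.5, 2, 3):
        observed = proportion_within(sample, k)
        bound = 1 - 1 / k**2
        print(f"k = {k}: observed {observed:.3f}, Chebyshev minimum {bound:.3f}")
```

The observed proportions should sit at or above the Chebyshev minimum, often well above it, which illustrates why the theorem gives limits rather than exact answers.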

## Example Problems

Suppose you know a dataset has a mean of 100 and a standard deviation of 10, and you’re interested in a range of ± 2 standard deviations. Two standard deviations equal 2 × 10 = 20. Consequently, Chebyshev’s Theorem tells you that at least 75% of the values fall between 100 ± 20, equating to a range of 80 – 120. Conversely, no more than 25% fall outside that range.

An interesting range is ± 1.41 standard deviations. With that range, you know that at least half the observations fall within it, and no more than half fall outside of it. If we use a mean of 100 and a standard deviation of 10 again, 1.41 standard deviations equal 14.1. Hence, at least half the values lie in the range 100 ± 14.1, or 85.9 – 114.1.
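The value 1.41 is not arbitrary. As a short worked derivation, setting the minimum proportion within k standard deviations equal to a target proportion p and solving for k gives:

$$
1 - \frac{1}{k^2} = p
\quad\Longrightarrow\quad
k = \frac{1}{\sqrt{1 - p}},
\qquad
p = 0.5 \;\Longrightarrow\; k = \sqrt{2} \approx 1.41
$$

The same rearrangement works for any target proportion between 0 and 1. For example, p = 0.75 gives k = 2, matching the earlier example.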

Suppose a class takes a test. The average score is 75 and the standard deviation is 5. What is the minimum proportion of scores that fall between 65 and 85?

The mean is 75. 65 is 10 points below the mean and 85 is 10 points above the mean. The standard deviation is 5. Consequently, you want to determine the proportion of scores that fall within 10 / 5 = 2 standard deviations of the mean. Using the table above, you know that at least 75% of the scores will fall within the range of 65 – 85.
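The steps in this example can be sketched in Python: convert the range to a number of standard deviations, then apply 1 − 1/k². The function name `min_proportion_in_range` and its parameters are assumptions for illustration. For a range that is not centered on the mean, this sketch uses the shorter distance to the mean, which keeps the bound valid but conservative.

```python
def min_proportion_in_range(
    mean: float, sd: float, lower: float, upper: float
) -> tuple[float, float]:
    """Return (k, minimum proportion) for values between lower and upper."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    if not lower < mean < upper:
        raise ValueError("the range must contain the mean")
    # Use the shorter side so the symmetric interval mean ± k*sd
    # fits entirely inside [lower, upper].
    k = min(mean - lower, upper - mean) / sd
    if k <= 1:
        return k, 0.0  # the theorem gives no useful bound for k <= 1
    return k, 1 - 1 / k**2


if __name__ == "__main__":
    k, proportion = min_proportion_in_range(mean=75, sd=5, lower=65, upper=85)
    print(f"k = {k}, at least {proportion:.0%} of scores fall in the range")
```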

## Chebyshev’s Theorem compared to The Empirical Rule

The Empirical Rule also describes the proportion of data that fall within a specified number of standard deviations from the mean. However, there are several crucial differences between Chebyshev’s Theorem and the Empirical Rule.

Chebyshev’s Theorem applies to all probability distributions where you can calculate the mean and standard deviation. On the other hand, the Empirical Rule applies only to the normal distribution.

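As you saw above, Chebyshev’s Theorem provides bounds: a minimum proportion inside the range and a maximum proportion outside it. Conversely, the Empirical Rule provides specific proportions because the data are known to follow the normal distribution, although its values of 68%, 95%, and 99.7% are themselves rounded.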

**Related post**: Identifying the Distribution of Your Data

The table below compares the results from both methods for the proportions of data falling within the specified number of standard deviations.

| Standard deviations | Empirical Rule | Chebyshev’s Theorem |
|---|---|---|
| 1 | 68% | NA |
| 2 | 95% | ≥75% |
| 3 | 99.7% | ≥88.9% |

Again, notice that the Empirical Rule provides specific proportions while Chebyshev’s Theorem gives only minimum values.
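The comparison can be reproduced with a short Python sketch. It assumes the standard result that, for a normal distribution, the proportion within k standard deviations of the mean is erf(k/√2). The function names are hypothetical.

```python
import math


def normal_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))


def chebyshev_within(k: float) -> float:
    """Chebyshev minimum proportion within k standard deviations (k > 1)."""
    return 1 - 1 / k**2


if __name__ == "__main__":
    print(f"k = 1: normal {normal_within(1):.1%}, Chebyshev gives no useful bound")
    for k in (2, 3):
        print(
            f"k = {k}: normal {normal_within(k):.1%}, "
            f"Chebyshev at least {chebyshev_within(k):.1%}"
        )
```

The normal proportions are always at least as large as the Chebyshev minimums, because the theorem has to hold for every distribution, including far less concentrated ones.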

If you know that your data follow the normal distribution, use the Empirical Rule. Otherwise, Chebyshev’s Theorem might be your best choice!

For more information, read my post, Empirical Rule: Definition, Formula, and Uses.

Annie says

My Q is that we know, we can convert any distribution to Normal Distribution and then after converting it, we can use the empirical rule. Why do we need Chebyshev when we can do this ?

Jim Frost says

Hi Annie,

You’re correct. If you have the data, there’s a lot more you can do with it. But, I open this post asking, if you have the mean and standard deviation and know that your data don’t follow the normal distribution, what conclusions can you draw about the distribution of values?

Chebyshev’s theorem is more for quick, back-of-the-envelope calculations, for those cases where you probably don’t have the dataset but want to draw some rough conclusions about how the values are distributed.

For cases where you *do* have the full data, there’s a lot more you can do! You mention transforming the data to the normal distribution. That is one possible method. You’d need to transform the data, pick out the values you need, and then back transform so they’re in the original data units.

However, I would not recommend doing that. For one thing, a transformation won’t always produce a normal distribution for all datasets. More importantly, you lose the intuitive sense of the data. The values are all transformed values (not in the original data units) and the distribution looks like a bell-curve normal distribution even though it is not.

Instead, I’d recommend identifying the probability distribution that your data follow natively. No transformation is required. The data values use the natural units and the probability distribution function actually looks like the true distribution. You’ll be able to calculate all the probabilities and distribution properties that you want without needing to back transform to the natural data units.

To see what I mean, read my post about understanding probability distributions. Scroll down to the part where I talk about continuous distributions. One of the examples is a skewed lognormal distribution and I illustrate the benefits that I mention in this comment.

Sameer Sippy says

Appreciate Jim for your reply…!!! Indeed, your comments do help in giving better clarity to the aforesaid concept. Thanks once again Jim…

Jim Frost says

You’re very welcome, Sameer! 🙂

Sameer Sippy says

Very interesting article. Liked it. Thanks for sharing a write-up on Chebyshev’s theorem.

However, with this certain doubts still remain in my mind as follows:—

a) Is Chebyshev’s theorem used in Non-Parametric Tests?

b) If the Sampling Distribution remains unknown, or to put it simply, the data Distribution is NOT Normal, how do we use it for making inferences on Population Parameters (i.e. proportions)?

c) If this is case as stated in b vis-a-vis Chebyshev’s Theorem, how do we establish the following viz.: —

1) Confidence Intervals

2) Hypothesis Testing

& 3) p-values

Could you elaborate regarding the same?

Jim Frost says

Hi Sameer,

Chebyshev’s Theorem is just a quick and simple way to determine the proportion of observations that fall within different ranges of your data’s distribution. As far as I know, it’s not used in any hypothesis testing or confidence intervals. It works with the distribution of the data, not sampling distributions.

John McGurk says

🙏🏼 Jim. Great clear insight on a very useful technique.

Elish Sharmin says

Thanks and gratitude for publishing this article in an elucidated manner. I got to understand it.

Abdul kayum says

Absolutely great! Thank you, Jim, for sharing such an informative article. When I first opened it, I thought, damn, it is hard, but I appreciate the way you explain it.

Bill says

Thanks for this. Timing was appropriate as I just taught this to the students in my Business Statistics course. I think this article will help them immensely.