What is the Mean?
The mean in math and statistics summarizes an entire dataset with a single number representing the data’s center point or typical value. It is also known as the arithmetic mean, and it is the most common measure of central tendency. It is frequently called the “average.”
Learn how to find the mean and know when it is and is not a good statistic to use!
How to Find the Mean
Finding the mean is very simple. Just add all the values and divide by the number of observations. The mean formula is below:
Mean = (sum of all values) / (number of values)
For example, suppose the heights of five people are 48, 51, 52, 54, and 56 inches. Here’s how to find the mean:
(48 + 51 + 52 + 54 + 56) / 5 = 261 / 5 = 52.2
Their average height is 52.2 inches.
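The height example above can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative, and `statistics.mean` from the standard library is shown only as a cross-check on the manual calculation.

```python
from statistics import mean

heights = [48, 51, 52, 54, 56]  # heights in inches, from the example above

# Add all the values, then divide by the number of observations.
manual_mean = sum(heights) / len(heights)

# The standard library's statistics.mean performs the same calculation.
library_mean = mean(heights)

print(manual_mean)   # 52.2
print(library_mean)  # 52.2
```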
Mean Formula
There are two versions of the mean formula in math—the sample and population formulas. In each case, the process for how to find the mean mathematically does not change. Add the values and divide by the number of values. However, the formula notation differs between the two types.
Sample Mean Formula
The sample mean formula is the following:
x̄ = (∑xᵢ) / n
Where:
- x̄ is the sample average of variable x.
- ∑xᵢ = sum of all n values.
- n = number of values in the sample.
Typically, the sample formula notation uses lowercase letters.
Population Mean Formula
The population mean formula is the following:
µ = (∑Xᵢ) / N
Where:
- µ is the population average.
- ∑Xᵢ = sum of all N values.
- N = number of values in the population.
Typically, the population mean formula notation uses Greek and uppercase letters.
When Do You Use the Average?
Ideally, the mean in math (aka the average) indicates the region where most values in a distribution fall. Statisticians refer to it as the central location of a distribution. You can think of it as the tendency of data to cluster around a middle value. The histogram below illustrates the average accurately finding the center of the data’s distribution.
However, the average does not always find the center of the data. It is sensitive to skewed data and extreme values. For example, when the data are skewed, it can miss the mark. In the histogram below, the average is outside the area with the most common values.
This problem occurs because outliers have a substantial impact on the mean. Extreme values in an extended tail pull it away from the center. As the distribution becomes more skewed, the average is drawn further away from the center.
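To see how a single extreme value pulls the average, here is a small sketch. The data are hypothetical values chosen for illustration (they are not from the histograms), and `count_below_mean` is a helper name made up for this example.

```python
from statistics import mean

def count_below_mean(data):
    """Return (mean, number of values strictly below the mean)."""
    avg = mean(data)
    return avg, sum(1 for x in data if x < avg)

# Hypothetical data for illustration only.
symmetric = [48, 51, 52, 54, 56]
skewed = symmetric + [120]  # one extreme value in the right tail

for label, data in [("symmetric", symmetric), ("skewed", skewed)]:
    avg, below = count_below_mean(data)
    print(f"{label}: mean = {avg}, values below the mean = {below} of {len(data)}")
```

With the extreme value added, the mean moves from 52.2 to 63.5, and five of the six values now sit below it, so it no longer describes a typical observation.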
In these cases, the average can be misleading because it might not be near the most common values. Consequently, it’s best to use the average to measure the central tendency when you have a symmetric distribution.
For skewed distributions, it’s often better to use the median or trimmed mean, which use different methods to find the central location. Note that the average provides no information about the variability present in a distribution. To evaluate that characteristic, assess the standard deviation.
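As a rough illustration of these alternatives, the sketch below compares the mean, median, and a trimmed mean on hypothetical right-skewed data. The `trimmed_mean` helper is written here for illustration and assumes the same proportion is cut from each tail; `statistics.stdev` computes the sample standard deviation.

```python
from statistics import mean, median, stdev

def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from each end (illustrative helper)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values to drop from each tail
    return mean(ordered[k:len(ordered) - k])

data = [48, 51, 52, 54, 56, 120]  # hypothetical right-skewed data

print(mean(data))               # 63.5, pulled toward the extreme value
print(median(data))             # 53.0
print(trimmed_mean(data, 0.2))  # 53.25, after dropping 48 and 120
print(stdev(data))              # sample standard deviation (variability, not center)
```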
Related post: Measures of Central Tendency
Using Sample Means to Estimate Population Means
In statistics, analysts often use a sample average to estimate a population mean. For small samples, the sample average can differ greatly from the population mean. However, as the sample size grows, the law of large numbers states that the sample average is likely to be close to the population value.
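A quick simulation can make this tendency concrete. The sketch below assumes a made-up population, values drawn uniformly between 0 and 100 with a true mean of 50, and uses a fixed seed so the run is repeatable; the exact numbers printed depend on that seed.

```python
import random

rng = random.Random(42)   # fixed seed so the run is repeatable
population_mean = 50.0    # true mean of a uniform distribution on [0, 100]

def sample_mean(n):
    """Draw n values uniformly from [0, 100] and return their average."""
    return sum(rng.uniform(0, 100) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    estimate = sample_mean(n)
    error = abs(estimate - population_mean)
    print(f"n = {n:>7}: sample mean = {estimate:.3f}, error = {error:.3f}")
```

Larger samples tend to land closer to 50, although any individual small sample can still be close by chance.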
Hypothesis tests, such as t-tests and ANOVA, use samples to determine whether population means are different. Statisticians refer to this process of using samples to estimate the properties of entire populations as inferential statistics.
Related post: Descriptive Statistics Vs. Inferential Statistics
In statistics, we usually use the arithmetic average, which is the type I focus on in this post. However, there are other types of averages, including the geometric version. Read my post about the geometric mean to learn more. There is also a weighted mean.
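For a brief taste of these other averages, here is a hedged sketch. `statistics.geometric_mean` is part of the Python standard library (3.8 and later); `weighted_mean` is a helper written for this example, and all of the numbers are hypothetical.

```python
from statistics import geometric_mean

def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights (illustrative helper)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Geometric mean: the n-th root of the product of n positive values.
print(geometric_mean([1, 4, 16]))  # approximately 4, since 1 * 4 * 16 = 64

# Weighted mean: the second score counts three times as much as the first.
print(weighted_mean([80, 90], [1, 3]))  # 87.5
```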