What is a Trimmed Mean?
The trimmed mean is a statistical measure that calculates a dataset’s average after removing a certain percentage of extreme values from both ends of the distribution. By excluding outliers, this statistic can provide a more accurate representation of a dataset’s typical or central values. Usually, you’ll trim a percentage of values, such as 10% or 20%.
For example, a 10% trimmed mean excludes the highest 10% of values and the lowest 10%. In other words, it uses the middle 80%.
When summarizing a dataset, the mean is often the go-to statistic. It’s simple to calculate, giving us a quick idea of our data’s “average” value. However, outliers can significantly distort the mean, causing it to misrepresent the typical value.
The trimmed mean helps us tame outliers and obtain a robust measure of central tendency. By removing extreme values, this statistic can better represent typical dataset values.
A famous example of a trimmed mean occurs in Olympic figure skating, where officials remove the highest score and lowest score. This method helps limit the effects of a biased judge. Statisticians also refer to removing only the minimum and maximum values as the modified mean.
Trimming the lowest 25 percent and the highest 25 percent of the dataset produces the interquartile mean—the average of the middle half of the dataset.
The median is an extreme form of a trimmed mean because it removes all values except one or two, depending on whether the dataset contains an odd or even number of values. Learn more about the Median Definition and Uses.
For comparison, a winsorized mean takes extreme values and replaces them with less extreme values instead of removing them like a trimmed mean.
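To make the difference concrete, here is a minimal Python sketch that computes both statistics on the same data. The function names, the sample values, and the choice to specify the number of values per side (`k`) rather than a percentage are assumptions made for illustration.

```python
def trimmed_mean_k(values, k):
    """Mean after removing the k smallest and k largest values."""
    data = sorted(values)
    if 2 * k >= len(data):
        raise ValueError("k removes every observation")
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)


def winsorized_mean_k(values, k):
    """Mean after replacing the k smallest and k largest values
    with the nearest value that is kept."""
    data = sorted(values)
    n = len(data)
    if 2 * k >= n:
        raise ValueError("k replaces every observation")
    low, high = data[k], data[n - k - 1]           # replacement values
    clipped = [min(max(x, low), high) for x in data]
    return sum(clipped) / n                        # still divides by n


# Hypothetical sample with one large value.
sample = [3, 5, 6, 7, 8, 12, 40]
print(sum(sample) / len(sample))      # regular mean
print(trimmed_mean_k(sample, 1))      # drops 3 and 40
print(winsorized_mean_k(sample, 1))   # 3 becomes 5, 40 becomes 12
```

Notice that the winsorized mean keeps the original sample size, whereas the trimmed mean averages fewer values.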
In this post, learn how to calculate the trimmed mean, work through an example, see how it improves statistical analyses, and get practical usage tips.
Related post: What are Robust Statistics?
Step-by-Step Guide to Calculate the Trimmed Mean
Follow these steps to find the trimmed mean:
- Sort the dataset: Arrange the data in ascending order to facilitate trimming.
- Determine the percentage of values to trim: Choose the percentage of extreme values you want to exclude from each end of the dataset.
- Calculate the number of observations to trim: Multiply the percentage by the total number of observations. If the result is not a whole number, round it to the nearest integer to determine how many observations you must discard from each end. Be aware that some software rounds down instead.
- Trim the dataset: Remove the designated number of observations from both ends of the sorted dataset.
- Calculate the trimmed mean: Add the values and divide by the number of remaining observations.
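The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `trimmed_mean` and the sample values are hypothetical, and the rounding rule is an assumption (Python's `round` sends exact halves to the nearest even integer, while other tools may round down).

```python
def trimmed_mean(values, proportion):
    """Average of `values` after trimming `proportion` from each end.

    `proportion` is the per-side fraction (0.2 means a 20% trimmed mean).
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and less than 0.5")

    data = sorted(values)                  # sort in ascending order
    k = round(proportion * len(data))      # observations to drop per side
    kept = data[k:len(data) - k]           # trim both ends
    if not kept:
        raise ValueError("trimming removed every observation")
    return sum(kept) / len(kept)           # mean of what remains


# Hypothetical data: a 10% trim drops one value from each end.
print(trimmed_mean([1, 3, 4, 5, 6, 7, 8, 9, 11, 40], 0.10))
```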
Excel provides a built-in formula for the trimmed mean: TRIMMEAN. To use this function, enter the range of cells containing your data and the percentage of values to trim. However, note that Excel’s trimming percentage definition differs from the standard statistical definition. The percentage in the standard definition relates to the amount of data removed from each individual side of the distribution. In contrast, Excel’s percentage refers to the total amount removed from both sides.
For example, suppose your data are in cells A1:A10, and you want to calculate the 20% trimmed mean. In Excel, you need to double the percentage to 40% so that it removes 20% from each side of the distribution. The formula is the following:
=TRIMMEAN(A1:A10, 0.4)
Example of Calculating a Trimmed Mean
Suppose we have the following dataset of 10 values:
2, 3, 4, 5, 7, 8, 9, 10, 12, 15
Let’s find the 20% trimmed mean. Because there are ten values, we need to remove the smallest two values (2 and 3) and the largest two values (12 and 15), leaving us with the following six values:
4, 5, 7, 8, 9, 10
Then we take the mean of these six values to find the trimmed mean, which is 7.1667.
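You can check this arithmetic with a short Python snippet. As an added illustration, the sketch also swaps the largest value (15) for a hypothetical outlier of 150 to show how each statistic responds; that altered dataset and the helper name `middle_mean` are assumptions for demonstration only.

```python
from statistics import mean

data = [2, 3, 4, 5, 7, 8, 9, 10, 12, 15]

# Hypothetical variant: the largest value is replaced by an outlier.
with_outlier = data[:-1] + [150]


def middle_mean(values):
    """20% trimmed mean for ten values: drop two from each end."""
    return mean(sorted(values)[2:-2])


for label, values in [("original", data), ("with outlier", with_outlier)]:
    print(f"{label}: mean = {mean(values):.4f}, "
          f"20% trimmed mean = {middle_mean(values):.4f}")
```

With the outlier, the regular mean jumps from 7.5 to 21, while the 20% trimmed mean stays at about 7.1667 because the extreme value falls in the trimmed portion.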
Practical Applications of the Trimmed Mean
Now that we understand its importance, let’s explore some real-world scenarios where the trimmed mean proves its mettle.
Financial Analysis: When examining stock returns, extreme values (such as unusually high or low returns) can skew the mean, potentially leading to misinterpretation. By applying the trimmed mean, we obtain a more reliable estimate of the central tendency of returns.
Education Assessment: In an examination where the scores of a few students are significantly higher or lower than the rest, the trimmed mean helps assess the overall performance by reducing the impact of outliers.
Retail Pricing Analysis: In the world of retail, pricing strategies play a crucial role in determining profitability and customer satisfaction. When analyzing price data, outliers can arise due to occasional promotions, errors, or unique product features. By applying the trimmed mean, retailers can obtain a more accurate representation of the typical price, enabling them to make informed decisions about setting competitive prices and maximizing revenue.
Climate Studies: Understanding climate patterns and trends is essential for predicting future weather conditions, assessing environmental impact, and formulating mitigation strategies. However, climate datasets often contain extreme values due to weather anomalies, rare events, or measurement errors. These outliers can distort statistical measures like the mean, hindering accurate trend analysis. Using the trimmed mean reduces their influence and gives a steadier picture of typical conditions.
By harnessing the power of the trimmed mean in these diverse fields, we can delve deeper into our data, unraveling meaningful insights while reducing the influence of outliers.
When using the trimmed mean in statistical analyses, discarding 5 to 25 percent of the dataset’s extreme values is common. But other percentages are possible. As you remove more of the dataset, the trimmed mean becomes more robust to outliers.
However, robustness comes with a tradeoff because you use less of the original data, reducing your effective sample size and information about the original data. While outliers can be problematic, they sometimes represent legitimate variability in your subject matter. Inappropriately removing data points can distort your understanding of the phenomenon you’re studying. For more information, read my Guidelines for Handling Outliers.
Typically, the goal of using a trimmed mean is to minimize the standard error of the estimate when a dataset contains outliers and small deviations from normality. Reducing the standard error increases the precision of the estimate and the statistical power of hypothesis tests.
In this context, trimmed means help you navigate a tradeoff between the regular mean and the median.
On the one hand, the regular mean provides optimal performance with normal distributions and no outliers. On the other hand, the median provides better performance for datasets with numerous outliers and highly skewed distributions.
Trimmed means provide an effective compromise for scenarios falling between these two conditions, where neither the regular mean nor the median is optimal. Based on computer simulations, Wilcox and Keselman (2003) suggest that a 20% trim is a good default choice for minimizing the standard error in these cases.
Use Yuen’s t-test to evaluate trimmed means. This test can handle outliers and nonconstant variance.
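For readers curious about what the test computes, here is a rough Python sketch of the usual textbook form of Yuen's statistic for two independent groups. It combines the trimmed means with winsorized variances and returns the test statistic and its approximate degrees of freedom. The function names are hypothetical, and rounding the per-side count down is an assumption. The sketch does not compute a p-value; in practice, you would rely on statistical software (for example, recent versions of SciPy accept a `trim` argument in `scipy.stats.ttest_ind`).

```python
import math


def _trimmed_summary(values, proportion):
    """Return (trimmed mean, winsorized variance, n, h) for one group."""
    data = sorted(values)
    n = len(data)
    g = math.floor(proportion * n)        # values trimmed per side (assumed rule)
    h = n - 2 * g                         # observations left after trimming
    if h < 2:
        raise ValueError("too few observations remain after trimming")
    trimmed_mean = sum(data[g:n - g]) / h
    low, high = data[g], data[n - g - 1]
    wins = [min(max(x, low), high) for x in data]      # winsorized sample
    wins_mean = sum(wins) / n
    wins_var = sum((x - wins_mean) ** 2 for x in wins) / (n - 1)
    return trimmed_mean, wins_var, n, h


def yuen_statistic(a, b, proportion=0.2):
    """Yuen's t statistic and approximate degrees of freedom."""
    mean_a, var_a, n_a, h_a = _trimmed_summary(a, proportion)
    mean_b, var_b, n_b, h_b = _trimmed_summary(b, proportion)
    d_a = (n_a - 1) * var_a / (h_a * (h_a - 1))   # squared standard error, group a
    d_b = (n_b - 1) * var_b / (h_b * (h_b - 1))   # squared standard error, group b
    if d_a + d_b == 0:
        raise ValueError("no variability remains after winsorizing")
    t = (mean_a - mean_b) / math.sqrt(d_a + d_b)
    df = (d_a + d_b) ** 2 / (d_a ** 2 / (h_a - 1) + d_b ** 2 / (h_b - 1))
    return t, df


# Hypothetical groups, each containing one extreme value.
group_1 = [4, 5, 5, 6, 7, 7, 8, 9, 10, 45]
group_2 = [1, 2, 3, 3, 4, 5, 5, 6, 7, 30]
t_stat, dof = yuen_statistic(group_1, group_2)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

With no trimming (`proportion=0`), this calculation reduces to Welch's t-test for unequal variances, which is a handy sanity check.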
Wilcox, R. R., & Keselman, H. J. (2003). Modern Robust Data Analysis Methods: Measures of Central Tendency. Psychological Methods, Vol. 8, No. 3, 254–274.