The coefficient of variation (CV) is a relative measure of variability that indicates the size of a standard deviation in relation to its mean. It is a standardized, unitless measure that allows you to compare variability between disparate groups and characteristics. It is also known as the relative standard deviation (RSD).
In this post, you will learn what the coefficient of variation is, how to calculate it, when it is particularly useful, and when to avoid it.
How to Calculate the Coefficient of Variation
The coefficient of variation is a simple ratio: take the standard deviation and divide it by the mean.
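Written as a formula, the ratio looks like this. The symbols are my notation for this sketch rather than anything the calculation requires: s is the sample standard deviation and x̄ is the sample mean. Multiplying by 100 gives the percentage form.

```latex
\[
\mathrm{CV} = \frac{s}{\bar{x}},
\qquad
\mathrm{CV}_{\%} = \frac{s}{\bar{x}} \times 100\%
\]
```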
Higher values indicate that the standard deviation is relatively large compared to the mean.
For example, a pizza restaurant measures its delivery time in minutes. The mean delivery time is 20 minutes and the standard deviation is 5 minutes.
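Here is a minimal Python sketch of that calculation using the two summary statistics from the pizza example. The function name `coefficient_of_variation` is just a label I chose for illustration.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Return the standard deviation divided by the mean."""
    if mean == 0:
        raise ValueError("The CV is undefined when the mean is zero")
    return std_dev / mean


# Summary statistics from the pizza delivery example (in minutes).
mean_delivery = 20.0
sd_delivery = 5.0

cv = coefficient_of_variation(sd_delivery, mean_delivery)
print(f"CV = {cv:.2f} ({cv:.0%})")  # CV = 0.25 (25%)
```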
Interpreting the Coefficient of Variation
For the pizza delivery example, the coefficient of variation is 0.25. This value tells you the relative size of the standard deviation compared to the mean. Analysts often report the coefficient of variation as a percentage. In this example, the standard deviation is 25% the size of the mean.
If the value equals one or 100%, the standard deviation equals the mean. Values less than one indicate that the standard deviation is smaller than the mean (typical), while values greater than one occur when the standard deviation is greater than the mean.
In general, higher values represent a greater degree of relative variability.
Absolute versus Relative Measures of Variability
In another post, I talk about the standard deviation, interquartile range, and range. These statistics are absolute measures of variability. They use the variable’s unit of measurement to describe the variability.
For the five-minute standard deviation in the pizza delivery example, we know that deliveries typically occur about five minutes before or after the mean delivery time.
That information is very useful! It tells us the variability in our data using, conveniently, the original measurement units. We can conceivably compare this delivery time variability to another pizza restaurant.
For more information, read my post about the standard deviation and other absolute measures of variability.
On the other hand, relative measurements use a standardization process that removes the original units of measurement. In the CV ratio, both the standard deviation and the mean use the same units, which cancels them out and produces a unitless statistic.
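To see the units cancel, here is a small sketch that expresses the same deliveries in minutes and in seconds. The delivery times are hypothetical values made up for illustration. The standard deviation changes with the unit, while the coefficient of variation does not.

```python
import statistics


def cv(values):
    """Sample coefficient of variation: sample SD divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical delivery times in minutes (illustrative values only).
minutes = [14.0, 18.0, 20.0, 22.0, 26.0]
seconds = [m * 60 for m in minutes]  # the same deliveries in a different unit

sd_minutes = statistics.stdev(minutes)
sd_seconds = statistics.stdev(seconds)
cv_minutes = cv(minutes)
cv_seconds = cv(seconds)

print(f"SD: {sd_minutes:.2f} minutes vs. {sd_seconds:.2f} seconds")
print(f"CV: {cv_minutes:.4f} vs. {cv_seconds:.4f}")
```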
When would you want to use the coefficient of variation? Its unitless nature provides it with some advantages. Specifically, the coefficient of variation facilitates meaningful comparisons in scenarios where absolute measures cannot.
Use the coefficient of variation when you want to compare variability between:
- Groups that have means of very different magnitudes.
- Characteristics that use different units of measurement.
In these two cases, absolute measures can be problematic. Let’s learn more!
Using the Coefficient of Variation when Means are Vastly Different
When you measure a characteristic that has a wide range of values, you’d often expect the mean and standard deviation to change together. This phenomenon frequently occurs in cross-sectional data. In these cases, you want to know how large each standard deviation is relative to its own mean.
Suppose you’re measuring household expenditures and want to compare the variability of spending among high-income and low-income households. These data are fictional.
|Expenditures||High Income||Low Income|
These values use the same unit of measurement (U.S. dollars), allowing you to compare the standard deviations. The variability in high-income household expenses is much greater than in low-income households ($125,000 S.D. vs. $10,000 S.D.). However, given the vast difference in mean expenses, that’s not surprising.
If you want to compare variability while accounting for the disparate means, you need to use a relative measure of variability, such as the coefficient of variation. The table below shows that when you account for the differences in expenses, the two groups actually have equal relative variability.
|Coefficient of Variability||High Income||Low Income|
Analysts frequently use the coefficient of variation when their dataset has a broad range of means, as shown in the previous example.
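The household comparison can be sketched in a few lines of Python. The two standard deviations are the ones quoted in the example. The means are assumptions I picked purely for illustration, chosen so that the result agrees with the equal-variability conclusion; treat them as placeholders rather than the example's actual figures.

```python
# Standard deviations are taken from the household example in the text.
# The means are ASSUMED values for illustration only.
groups = {
    "High income": {"mean": 500_000.0, "sd": 125_000.0},
    "Low income": {"mean": 40_000.0, "sd": 10_000.0},
}

# Divide each group's standard deviation by its own mean.
cvs = {name: g["sd"] / g["mean"] for name, g in groups.items()}

for name, value in cvs.items():
    sd = groups[name]["sd"]
    print(f"{name}: SD = ${sd:,.0f}, CV = {value:.0%}")
```

With these assumed means, the absolute spread differs by a factor of 12.5 while the relative spread is identical.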
Researchers use the CV for assessing the inequality of incomes across different countries. Average incomes by country vary greatly. There are affluent countries and impoverished countries. To consider inequality within each country while accounting for the vastly different mean incomes, analysts use the coefficient of variation. In this context, when a country has a larger coefficient of variation, it represents a greater degree of income disparity.
Similarly, financial analysts use the coefficient of variation to assess the volatility of returns for financial investments across a wide range of valuations. In this context, higher coefficients indicate greater risk.
The coefficient of variation is particularly helpful when your data follow a lognormal distribution. In these distributions, the standard deviation grows in proportion to the mean, so it changes depending on the portion of the distribution you are assessing. However, the coefficient of variation remains constant throughout a lognormal distribution because it depends only on the spread of the logged values, not on their location.
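The lognormal property can be checked with the standard closed-form mean and variance of a lognormal distribution. In this sketch, `mu` and `sigma` are the mean and standard deviation of the logged values. The specific parameter values are arbitrary choices for illustration.

```python
import math


def lognormal_mean_sd(mu: float, sigma: float) -> tuple[float, float]:
    """Closed-form mean and SD of a lognormal with log-scale mu and sigma."""
    mean = math.exp(mu + sigma**2 / 2)
    variance = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
    return mean, math.sqrt(variance)


sigma = 0.5  # same log-scale spread for every case (arbitrary choice)
results = []
for mu in (0.0, 2.0, 5.0):  # shift the location of the distribution
    mean, sd = lognormal_mean_sd(mu, sigma)
    results.append((mu, mean, sd, sd / mean))
    print(f"mu={mu}: mean={mean:.2f}, SD={sd:.2f}, CV={sd / mean:.4f}")

# The CV depends only on sigma: sqrt(exp(sigma^2) - 1).
expected_cv = math.sqrt(math.exp(sigma**2) - 1)
```

The mean and standard deviation both change with `mu`, but their ratio stays at the same value each time.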
Using the Coefficient of Variation to Compare Measurements that Use Different Units
When measurements use different scales, you can’t compare them directly. Suppose you want to compare the variability in SAT scores to ACT scores. While these college entrance exams are similar in nature and purpose, they use different scales. Consequently, you can’t compare their standard deviations directly.
However, the coefficient of variation standardizes the raw data, which means you can compare the relative variability of SAT and ACT scores.
Furthermore, any time you want to assess the variability of inherently different characteristics, you’ll need to use a relative measure of variability, such as the coefficient of variation. For example, you might want to assess the variability of the operating temperature and speed of rockets. Or compare the variability of the weight and strength of material samples. You can’t meaningfully compare standard deviations that use different units, such as kilograms for weight and megapascals for strength!
However, if your kilograms variable has a higher coefficient of variation than megapascals, then you know weight is relatively more variable than strength.
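As a rough sketch of the weight versus strength comparison, the code below computes the coefficient of variation for two sets of measurements. Both samples are hypothetical numbers I made up for illustration, not real material data.

```python
import statistics


def cv(values):
    """Sample coefficient of variation: sample SD divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical material samples (illustrative values only).
weight_kg = [2.10, 2.25, 1.95, 2.40, 2.05]
strength_mpa = [410.0, 415.0, 405.0, 420.0, 400.0]

cv_weight = cv(weight_kg)
cv_strength = cv(strength_mpa)

# The SDs are in different units, but the CVs are unitless and comparable.
more_variable = "weight" if cv_weight > cv_strength else "strength"
print(f"Weight CV:   {cv_weight:.1%}")
print(f"Strength CV: {cv_strength:.1%}")
print(f"Relatively more variable: {more_variable}")
```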
These examples measure entirely different characteristics using different units. However, you can use the coefficient of variation to compare their relative variability!
Cautions About When Not to Use the Coefficient of Variation
While the coefficient of variation is extremely useful in some contexts, there are cases when you should not use it.
Do not use when the mean is close to zero
If the mean equals zero, the denominator of the ratio is zero, which is problematic! Fortunately, you’re not likely to have a mean that equals zero exactly. But when the mean is close to zero, the coefficient of variation can approach infinity, and even small changes in the mean produce large swings in its value!
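A quick sketch shows how the ratio behaves as the mean shrinks toward zero. The fixed standard deviation of 1 and the list of means are arbitrary values chosen for illustration.

```python
sd = 1.0  # hold the standard deviation fixed (arbitrary value)
means = [10.0, 1.0, 0.1, 0.01, 0.001]  # means shrinking toward zero

cvs = [sd / m for m in means]
for m, c in zip(means, cvs):
    print(f"mean = {m:>6}: CV = {c:,.1f}")

# Sensitivity near zero: a shift of only 0.01 in the mean halves the CV.
cv_at_001 = sd / 0.01
cv_at_002 = sd / 0.02
print(f"CV at mean 0.01: {cv_at_001:.0f}, at mean 0.02: {cv_at_002:.0f}")
```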
Do not use with interval scales
Use the coefficient of variation only when your data use a ratio scale. Don’t use it for interval scales.
Ratio scales have an absolute zero that represents a total lack of the characteristic. For example, zero weight (using the Imperial or metric system) indicates a complete absence of weight. Weight is a ratio scale.
However, temperatures in Fahrenheit and Celsius are interval scales. These measurement systems have a zero value, but those zeros don’t indicate an absence of temperature. (Kelvin has an absolute zero that does represent a lack of temperature. Kelvin is a ratio scale.)
Interval scales do not allow you to divide measurements in a meaningful fashion. For example, 10°C is not 1/3 the temperature of 30°C! Because the coefficient of variation involves division, this statistic is meaningless for interval scales.
Let’s see an example of the problem that occurs when using the coefficient of variation with interval scales!
The table below displays pairs of equivalent temperatures. You’d expect their coefficients of variation to be equal. Let’s check!
The CVs are quite different! That occurs because we are assessing interval scales.
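You can reproduce this kind of check yourself. The sketch below uses a set of hypothetical temperatures of my own choosing and converts them with the standard formulas. Celsius and Fahrenheit are interval scales, while Kelvin and Rankine both start at absolute zero.

```python
import statistics


def cv(values):
    """Sample coefficient of variation: sample SD divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical temperatures (illustrative values only).
celsius = [0.0, 10.0, 20.0, 30.0, 40.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # interval scale
kelvin = [c + 273.15 for c in celsius]          # ratio scale
rankine = [k * 9 / 5 for k in kelvin]           # ratio scale

cv_c = cv(celsius)
cv_f = cv(fahrenheit)
cv_k = cv(kelvin)
cv_r = cv(rankine)

print(f"Celsius CV:    {cv_c:.4f}")
print(f"Fahrenheit CV: {cv_f:.4f}")
print(f"Kelvin CV:     {cv_k:.4f}")
print(f"Rankine CV:    {cv_r:.4f}")
```

The same temperatures give different coefficients of variation in Celsius and Fahrenheit because each scale places its zero arbitrarily. The two ratio scales agree with each other because converting between them is a pure multiplication.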
Use the coefficient of variation only when you have a true absolute zero on a ratio scale!
Absolute versus Relative Measures in other Statistical Contexts
The need to choose between using an absolute measure (e.g., standard deviation) versus a relative, standardized measure (e.g., coefficient of variation) occurs elsewhere in statistics. For more information on this topic, read my posts about:
- Standardizing values using the normal distribution.
- Using standardized regression coefficients in regression analysis.
- R-squared versus the standard error of the regression.