The coefficient of variation (CV) is a relative measure of variability that indicates the size of a standard deviation in relation to its mean. It is a standardized, unitless measure that allows you to compare variability between disparate groups and characteristics. It is also known as the relative standard deviation (RSD).
In this post, you will learn what the coefficient of variation is, how to calculate it, when it is particularly useful, and when to avoid it.
How to Calculate the Coefficient of Variation
Calculating the coefficient of variation involves a simple ratio: take the standard deviation and divide it by the mean.
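Written out, the ratio looks like the following. The symbols are conventional notation assumed here, not defined elsewhere in this post: σ and μ are the population standard deviation and mean, while s and x̄ are their sample counterparts.

```latex
\mathrm{CV} = \frac{\sigma}{\mu}
\qquad \text{or, for a sample,} \qquad
\widehat{\mathrm{CV}} = \frac{s}{\bar{x}}
```

Multiply the result by 100 to express it as a percentage.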
Higher values indicate that the standard deviation is relatively large compared to the mean.
For example, a pizza restaurant measures its delivery time in minutes. The mean delivery time is 20 minutes and the standard deviation is 5 minutes.
Interpreting the Coefficient of Variation
For the pizza delivery example, the coefficient of variation is 0.25. This value tells you the relative size of the standard deviation compared to the mean. Analysts often report the coefficient of variation as a percentage. In this example, the standard deviation is 25% the size of the mean.
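As a minimal sketch, the pizza calculation can be reproduced in a few lines of Python. The function name `coefficient_of_variation` is made up for illustration, and it assumes you want the sample standard deviation when working from raw data.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Summary statistics from the pizza delivery example.
mean_minutes = 20
sd_minutes = 5

pizza_cv = sd_minutes / mean_minutes
print(f"CV = {pizza_cv:.2f} ({pizza_cv:.0%})")  # CV = 0.25 (25%)
```

If you have the raw delivery times rather than the summary statistics, pass them to `coefficient_of_variation` as a list.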
If the value equals one or 100%, the standard deviation equals the mean. Values less than one indicate that the standard deviation is smaller than the mean (typical), while values greater than one occur when the standard deviation is greater than the mean.
In general, higher values represent a greater degree of relative variability.
Absolute versus Relative Measures of Variability
In another post, I talk about the standard deviation, interquartile range, and range. These statistics are absolute measures of variability. They use the variable’s unit of measurement to describe the variability.
For the five-minute standard deviation in the pizza delivery example, we know that the typical delivery occurs about five minutes before or after the mean delivery time.
That information is very useful! It tells us the variability in our data using, conveniently, the original measurement units. We can conceivably compare this delivery time variability to another pizza restaurant.
For more information, read my post about the standard deviation and other absolute measures of variability.
On the other hand, relative measurements use a standardization process that removes the original units of measurement. In the CV ratio, both the standard deviation and the mean use the same units, which cancels them out and produces a unitless statistic.
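Here is a small sketch of that cancellation. The delivery times are invented for illustration. Converting them from minutes to seconds multiplies the standard deviation by 60 but leaves the CV unchanged.

```python
import math
import statistics

# Hypothetical delivery times in minutes, invented for illustration.
times_min = [14, 17, 20, 23, 26]
times_sec = [t * 60 for t in times_min]

sd_min = statistics.stdev(times_min)
sd_sec = statistics.stdev(times_sec)

cv_min = sd_min / statistics.mean(times_min)
cv_sec = sd_sec / statistics.mean(times_sec)

print(sd_min, sd_sec)  # the SD is 60 times larger when measured in seconds
print(cv_min, cv_sec)  # the CV matches in either unit (up to float rounding)
```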
When would you want to use the coefficient of variation? Its unitless nature provides it with some advantages. Specifically, the coefficient of variation facilitates meaningful comparisons in scenarios where absolute measures cannot.
Use the coefficient of variation when you want to compare variability between:
- Groups that have means of very different magnitudes.
- Characteristics that use different units of measurement.
In these two cases, absolute measures can be problematic. Let’s learn more!
Using the Coefficient of Variation when Means are Vastly Different
When you measure a characteristic that has a wide range of values, you’d often expect the mean and standard deviation to change together. This phenomenon frequently occurs in cross-sectional data. In these cases, you want to know how large the standard deviation is relative to the vastly different means.
Suppose you’re measuring household expenditures and want to compare the variability of spending among high-income and low-income households. These data are fictional.
| Expenditures | High Income | Low Income |
| Standard deviation | $125,000 | $10,000 |
These values use the same unit of measurement (U.S. dollars), allowing you to compare the standard deviations. The variability in high-income household expenses is much greater than in low-income households ($125,000 S.D. vs. $10,000 S.D.). Given the vast difference in mean expenses, that’s not surprising.
However, if you want to compare variability while accounting for the disparate means, you need to use a relative measure of variability, such as the coefficient of variation. The table below shows that when you account for the differences in mean expenses, the two groups actually have equal relative variability.
| Coefficient of Variation | High Income | Low Income |
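Here is a quick check of the arithmetic behind this comparison. The standard deviations are the ones quoted in the text. The means are assumed values, picked only so that the CVs come out equal as described; they are not figures from the original data.

```python
# Standard deviations are quoted in the text.
# The means are ASSUMED for illustration, chosen so the two CVs match.
groups = {
    "High income": {"mean": 500_000, "sd": 125_000},
    "Low income": {"mean": 40_000, "sd": 10_000},
}

cvs = {name: g["sd"] / g["mean"] for name, g in groups.items()}

for name, cv in cvs.items():
    print(f"{name}: SD = ${groups[name]['sd']:,}, CV = {cv:.0%}")
```

With these assumed means, the standard deviations differ by a factor of 12.5 while the relative variability is identical.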
Analysts frequently use the coefficient of variation when their dataset has a broad range of means, as shown in the previous example.
Researchers use the CV for assessing the inequality of incomes across different countries. Average incomes by country vary greatly. There are affluent countries and impoverished countries. To consider inequality within each country while accounting for the vastly different mean incomes, analysts use the coefficient of variation. In this context, when a country has a larger coefficient of variation, it represents a greater degree of income disparity.
Similarly, financial analysts use the coefficient of variation to assess the volatility of returns for financial investments across a wide range of valuations. In this context, higher coefficients indicate a more significant risk.
The coefficient of variation is particularly helpful when your data follow a lognormal distribution. In lognormal data, the standard deviation scales with the mean, so groups with larger means also have larger standard deviations. The coefficient of variation, however, depends only on the distribution’s shape parameter and stays constant as the mean changes.
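One way to see this is with the standard formulas for a lognormal distribution whose underlying normal distribution has mean `mu` and standard deviation `sigma`. The sketch below uses an arbitrary `sigma` and a few arbitrary values of `mu`. The mean and standard deviation grow together, while their ratio stays at sqrt(exp(sigma²) − 1).

```python
import math


def lognormal_mean_sd(mu, sigma):
    """Mean and SD of a lognormal whose underlying normal has mean mu and SD sigma."""
    mean = math.exp(mu + sigma**2 / 2)
    sd = mean * math.sqrt(math.exp(sigma**2) - 1)
    return mean, sd


sigma = 0.5  # arbitrary value chosen for illustration

for mu in (0.0, 2.0, 4.0):
    mean, sd = lognormal_mean_sd(mu, sigma)
    print(f"mu={mu}: mean={mean:.2f}, sd={sd:.2f}, cv={sd / mean:.4f}")
```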
Using the Coefficient of Variation to Compare Measurements that Use Different Units
When measurements use different scales, you can’t compare them directly. Suppose you want to compare the variability in SAT scores to ACT scores. While these college entrance exams are similar in nature and purpose, they use different scales. Consequently, you can’t compare their standard deviations directly.
However, the coefficient of variation expresses each standard deviation relative to its own mean, which means you can compare the relative variability of SAT and ACT scores.
Furthermore, any time you want to assess the variability of inherently different characteristics, you’ll need to use a relative measure of variability, such as the coefficient of variation. For example, you might want to assess the variability of the operating temperature and speed of rockets. Or compare the variability of the weight and strength of material samples. You can’t meaningfully compare standard deviations that use different units, such as kilograms for weight and megapascals for strength!
However, if your kilograms variable has a higher coefficient of variation than megapascals, then you know weight is relatively more variable than strength.
These examples measure entirely different characteristics using different units. However, you can use the coefficient of variation to compare their relative variability!
Cautions About When Not to Use the Coefficient of Variation
While the coefficient of variation is extremely useful in some contexts, there are cases when you should not use it.
Do not use when the mean is close to zero
If the mean equals zero, the denominator of the ratio is zero, which is problematic! Fortunately, you’re not likely to have a mean that equals zero exactly. But when the mean is close to zero, the coefficient of variation can approach infinity, and its value is susceptible to small changes in the mean!
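A short sketch makes the sensitivity concrete. The standard deviation is held at an arbitrary value of 1 while the mean shrinks toward zero, and the same small shift in the mean is then compared far from zero and close to zero.

```python
sd = 1.0  # held fixed; an arbitrary illustrative value

# The CV grows without bound as the mean approaches zero.
for mean in (10.0, 1.0, 0.1, 0.01):
    print(f"mean={mean}: CV={sd / mean}")

# A shift of 0.05 in the mean barely moves the CV far from zero...
far_change = sd / 10.00 - sd / 10.05
# ...but the same shift swings the CV dramatically close to zero.
near_change = sd / 0.05 - sd / 0.10

print(f"CV change near mean 10:  {far_change:.4f}")
print(f"CV change near mean 0.1: {near_change:.1f}")
```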
Do not use with interval scales
Use the coefficient of variation only when your data use a ratio scale. Don’t use it for interval scales.
Ratio scales have an absolute zero that represents a total lack of the characteristic. For example, zero weight (using the Imperial or metric system) indicates a complete absence of weight. Weight is a ratio scale.
However, temperatures in Fahrenheit and Celsius are interval scales. These measurement systems have a zero value, but those zeros don’t indicate an absence of temperature. (Kelvin has an absolute zero that does represent a lack of temperature. Kelvin is a ratio scale.)
Interval scales do not allow you to divide measurements in a meaningful fashion. For example, 10C is not 1/3 the temperature of 30C! Because the coefficient of variation involves division, this statistic is meaningless for interval scales.
Let’s see an example of the problem that occurs when using the coefficient of variation with interval scales!
The table below displays pairs of equivalent temperatures. You’d expect their coefficients of variation to be equal. Let’s check!
The CVs are quite different! That occurs because we are assessing interval scales.
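Here is a minimal sketch of that check, using temperatures invented for illustration rather than values from the table. The same three readings are expressed in Celsius, Fahrenheit, and Kelvin.

```python
import statistics


def cv(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


celsius = [10.0, 20.0, 30.0]  # hypothetical readings
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
kelvin = [c + 273.15 for c in celsius]

print(f"Celsius CV:    {cv(celsius):.3f}")
print(f"Fahrenheit CV: {cv(fahrenheit):.3f}")
print(f"Kelvin CV:     {cv(kelvin):.3f}")
```

The three lists describe identical physical temperatures, yet each scale produces a different CV because each places its zero at a different point. Only Kelvin, with its absolute zero, gives a CV that can be interpreted as a ratio.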
Use the coefficient of variation only when you have a true absolute zero on a ratio scale!
Absolute versus Relative Measures in other Statistical Contexts
The need to choose between using an absolute measure (e.g., standard deviation) versus a relative, standardized measure (e.g., coefficient of variation) occurs elsewhere in statistics. For more information on this topic, read my posts about:
- Standardizing values using the normal distribution.
- Using standardized regression coefficients in regression analysis.
- R-squared versus the standard error of the regression.
Dalia Feltman says
Thanks so much for your helpful and quick response!
This explanation of COVs is very helpful. Once I have the COVs, how can I compare them to say they are significantly different? Is there a test I should use? Or do I just say one is greater than the other?
Jim Frost says
Assuming that you’re using sample statistics and want to draw conclusions about the coefficients of variation for entire populations (i.e., inferential statistics), then, yes, you’d need to perform a hypothesis test to make that determination. Each sample estimate of the coefficient of variation has some sampling error. A hypothesis test accounts for that. Some statistical software packages include a hypothesis test for comparing coefficients of variation. The test will calculate a p-value, which you compare to your significance level. Forkman’s test is the one I’m most familiar with.
Forkman (2009), “Estimator and Tests for Common Coefficients of Variation in Normal Distributions”, Communications in Statistics – Theory and Methods, Vol. 38, pp. 233-251.
However, if you’re simply performing descriptive statistics and don’t need to generalize to populations, you can just compare the sample statistics and say that one sample has a higher COV than the other. In this case, you’re explicitly restricting your conclusions to the samples themselves.
I hope that helps!
I would also love to hear thoughts on this topic. I also work in market research and often use Likert scales (1-7) to evaluate brands on various attributes. I want to make insights based on homogeneity of the respondents’ perception of a brand (i.e. lower CV means respondents agree on how well a brand is performing on a specific attribute).
However, I’m afraid to commit to CV calculations because Likert scale isn’t a ratio, and even though the values are non-negative, there is no absolute zero.
Can anyone help me understand better if CV is meaningful for comparing means derived from Likert scale scores?
Hasan Abdel says
Thanks for your help. I’d like to ask a question please. Does it make sense to use the CV when the mean is negative?
FEKADU BEYENE CHEMO says
This really is quality teaching from an extremely knowledgeable person! Thank you very much, I tip my hat to you, boss.
Jim Frost says
Thanks so much! 🙂
David A Gutting says
I’m trying to determine if CV is a useful measure in a market research project I’ve been doing. We have consumers “grade” brands on a 100 point scale for how good a job they do in their product, their customer service, customer experience, how easy they are to find, and how relevant and memorable their advertising is. Some brands do well on some things, not so well on others.
Our mean scores in all these categories are about 75 on a 100 scale--low scores are in the 50s, highs in the mid 80s, with SDs ranging from 9 to 16 and averaging about 11. CVs range from 9% to 24%. These scores are very good predictors of how well a brand will perform in the marketplace, with correlations between r=.75 and r=.80 with measurements like brand preference and brands bought most often.
What I’ve found is that higher scoring brands (think Nike, Amazon, etc) tend to have lower CVs–meaning less variability in their performance. Weaker brands (think Spirit Airlines or Macy’s) tend to have higher CVs, sometimes twice as high as the stronger brands.
Here’s my question: is this level of variance notable in any way? I know in the investment world, financial analysts use CVs to analyze risk–higher CVs are seen as more volatile and therefore riskier, lower CVs are more stable.
Our theory in all this is that more holistic brands–ones that score high on multiple measures and do so consistently–are significantly more valuable than others (and the data seems to prove this). Low CV scores seem to indicate more consistent performance. This is important because traditionally in the marketing world, brands are told to focus on one or two things–such as product and advertising, and let other concerns go.
What the CV data for the higher performing brands suggests to me is that it may prove that your brand will get demonstrably better results if it operates at full throttle on all dimensions. But--in your view, does that make sense? Are variances ranging from 9% to 24% notable enough? Here’s an example--Nike, with a CV of 9% and brand scores in the high 80s, is five times more effective in the external market than Spirit Airlines, with brand scores in the low 60s and a CV of 24%.
Sorry for the lengthy post here. I’ve talked about this to many of my colleagues (including Ph.D.s with strong statistical backgrounds), and they think the pattern is interesting but don’t seem to want to commit to a strong view. Would love to hear your thoughts.
Gloria Lasu says
For example, it is said that the CV should not exceed 15 in lab experiments and 20 in field experiments. Why is it important that it not exceed these percentages?
Jim Frost says
There is no universal value that is appropriate for all contexts. Instead, analysts need to determine thresholds that shouldn’t be crossed for each subject area. That can become an analysis all its own! Presumably, someone did that for the lab experiments to which you refer. Problems probably start occurring when variability exceeds those values. The nature of the problems and why they occur at those values is specific to the context. You’ll need to contact people familiar with the reasoning behind those values to understand why they were chosen.
Hi Jim, what does it mean when the coefficient of variation of a dataset of demand is higher than 1? I know that the standard deviation is higher than the mean in that case, but does this mean that the data set has a high variability? (in comparison with a coefficient of variation below 1)
Jim Frost says
Any CV greater than one just means that the standard deviation is greater than the mean. In isolation, it’s hard to interpret. Whether that is large or small depends on the context, subject area, or comparison to another measure. So, I can tell you that the standard deviation is greater than the mean, but I don’t know if it’s considered large for your context. In comparison to a measure with a CV less than one, yes, you’d say that the relative variability is greater for the measure with a CV above 1 than for the measure with a CV below 1.
The coefficient of variation is a relative measure, which means you really need to be able to compare it to another measure to interpret it as high.
David M says
there’s any convention on how to normalize or relativize CV of a measurement (in this case is always the same one but as CV is unitless it shouldn’t matter) with respect to the CV of a reference case? i’m thinking on evaluating intrinsic noise and CV should capture total noise, so assuming that extrinsic noise is constant the CV should tell me something about it. but i have doubts on if i should calculate the ratio or the difference between the sample and the reference CV, and which implications could derive from either of this options. any thoughts?
david powell says
I am for means VERY close to zero but for means in the 2-5% range I’m also seeing CVs range from 50%-150%. Should I be concerned about that?
Andrés Limas says
Hi, would you mind sharing with us some of the literature that we could use to reference this value? Kind regards from Colombia!!
Jim Frost says
I don’t have a good reference offhand. However, it’s a fairly basic value so I’d imagine most Intro textbooks will cover it.
Are there any guidelines that suggest how much above zero means should be before CVs become more reliable? I’m working with percentage data and just eyeballing my data it seems the CVs are quite high until the means reach about 10%. Any references would be useful here. thanks
Jim Frost says
I’m not aware of any specific guidelines. Just be on the lookout for some dramatically high CVs! It seems like you’re seeing that in action. Sorry, but I don’t have a reference.
Thank you so much, Jim! I just found your blog and I can see it becoming my “go-to” guide to Stats.
Jim Frost says
I’m glad you found my blog and found it to be helpful! 🙂
Is there any cutoff value for a good coefficient of variation?
Is it possible to classify acceptable and not acceptable CV ranges?
Scott Richlen says
Sorry for the confusion. In reading this article in the second line the word “mean” is hyper-linked (or whatever the term is where additional text is shown). The apple example is in that link. You’ll see the error there.
Jim Frost says
Thanks for catching the typo. I’ll fix it! That’s a glossary term. I’d forgotten I used that example there!
Scott Richlen says
I don’t understand your definition of “mean”. Isn’t the average of 5,5,6,7,8 equal to 6.2? You have “For example, if the weights of five apples are 5, 5, 6, 7, and 8, the average apple weight is 6.4.”
Jim Frost says
I’m not sure what you’re referring to. I don’t write about the mean of apple weights in this post, which is about the coefficient of variation. The mean is the same as the average. The mean of the numbers you provide is 6.2. For more information about the mean, read my post about the measures of central tendency, of which this mean is one.
If you have more questions about the mean, please post them in the comments section of that post about central tendency so we can keep the comments relevant to each post. Thanks.
José Francisco dos Reis Neto says
Liked it! Very good and practical for my students of Agronomy. Thank you.
Awesome post! Thank you sir
Marian Martin says
Never seen such crystal clear presentations. I am referring to all your posts and books, not only this one.