
Statistics By Jim

Making statistics intuitive

Coefficient of Variation in Statistics

By Jim Frost 28 Comments

The coefficient of variation (CV) is a relative measure of variability that indicates the size of a standard deviation in relation to its mean. It is a standardized, unitless measure that allows you to compare variability between disparate groups and characteristics. It is also known as the relative standard deviation (RSD).

In this post, you will learn what the coefficient of variation is, how to calculate it, when it is particularly useful, and when to avoid it.

How to Calculate the Coefficient of Variation

Calculating the coefficient of variation involves a simple ratio: divide the standard deviation by the mean.

CV = standard deviation / mean = σ / μ

Higher values indicate that the standard deviation is relatively large compared to the mean.

For example, a pizza restaurant measures its delivery time in minutes. The mean delivery time is 20 minutes and the standard deviation is 5 minutes.

CV = 5 minutes / 20 minutes = 0.25

Interpreting the Coefficient of Variation

For the pizza delivery example, the coefficient of variation is 0.25. This value tells you the relative size of the standard deviation compared to the mean. Analysts often report the coefficient of variation as a percentage. In this example, the standard deviation is 25% of the size of the mean.
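The calculation is easy to script. Here's a minimal Python sketch (the function name is my own, not a standard library routine):

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Return the CV: the standard deviation relative to the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean

# Pizza delivery example: mean = 20 minutes, standard deviation = 5 minutes.
cv = coefficient_of_variation(std_dev=5, mean=20)
print(f"CV = {cv:.2f} ({cv:.0%})")  # CV = 0.25 (25%)
```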

If the value equals one or 100%, the standard deviation equals the mean. Values less than one indicate that the standard deviation is smaller than the mean (the typical case), while values greater than one occur when the standard deviation is greater than the mean.

In general, higher values represent a greater degree of relative variability.

Absolute versus Relative Measures of Variability

In another post, I talk about the standard deviation, interquartile range, and range. These statistics are absolute measures of variability. They use the variable’s unit of measurement to describe the variability.

For the five-minute standard deviation in the pizza delivery example, we know that a typical delivery occurs within five minutes of the mean delivery time.

That information is very useful! It tells us the variability in our data using, conveniently, the original measurement units. We can conceivably compare this delivery time variability to another pizza restaurant.

For more information, read my post about the standard deviation and other absolute measures of variability.

On the other hand, relative measurements use a standardization process that removes the original units of measurement. In the CV ratio, both the standard deviation and the mean use the same units, which cancels them out and produces a unitless statistic.

When would you want to use the coefficient of variation? Its unitless nature provides it with some advantages. Specifically, the coefficient of variation facilitates meaningful comparisons in scenarios where absolute measures cannot.

Use the coefficient of variation when you want to compare variability between:

  • Groups that have means of very different magnitudes.
  • Characteristics that use different units of measurements.

In these two cases, absolute measures can be problematic. Let’s learn more!

Using the Coefficient of Variation when Means are Vastly Different

When you measure a characteristic that has a wide range of values, you’d often expect the mean and standard deviation to change together. This phenomenon frequently occurs in cross-sectional data. In these cases, you want to know how the standard deviations compare relative to their vastly different means.

Suppose you’re measuring household expenditures and want to compare the variability of spending among high-income and low-income households. These data are fictional.

Expenditures          High Income    Low Income
Mean                  $500,000       $40,000
Standard Deviation    $125,000       $10,000

These values use the same unit of measurement (U.S. dollars), allowing you to compare the standard deviations. The variability in high-income household expenses is much greater than that of low-income households ($125,000 vs. $10,000). However, given the vast difference in mean expenses, that’s not surprising.

However, if you want to compare variability while accounting for the disparate means, you need to use a relative measure of variability, such as the coefficient of variation. The table below shows that when you account for the differences in mean expenses, the two groups actually have equal relative variability.

                            High Income    Low Income
Coefficient of Variation    25%            25%
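Here's a quick Python check of the household expenditure figures above:

```python
# Household expenditure example (fictional data from the table above).
groups = {
    "High income": {"mean": 500_000, "sd": 125_000},
    "Low income":  {"mean": 40_000,  "sd": 10_000},
}

for name, stats in groups.items():
    cv = stats["sd"] / stats["mean"]
    print(f"{name}: CV = {cv:.0%}")  # both groups print CV = 25%
```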

Real-world examples

Analysts frequently use the coefficient of variation when their dataset has a broad range of means, as in the previous example.

Researchers use the CV to assess income inequality across different countries. Average incomes vary greatly by country: there are affluent countries and impoverished countries. To measure inequality within each country while accounting for the vastly different mean incomes, analysts use the coefficient of variation. In this context, a larger coefficient of variation represents a greater degree of income disparity.

Similarly, financial analysts use the coefficient of variation to assess the volatility of returns for financial investments across a wide range of valuations. In this context, higher coefficients indicate greater risk.

The coefficient of variation is particularly helpful when your data follow a lognormal distribution. In these distributions, the standard deviation changes depending on the portion of the distribution you are assessing. However, the coefficient of variation remains constant throughout a lognormal distribution.
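Why the CV is constant for a lognormal distribution follows from its moments: for log-scale parameters μ and σ, the mean is exp(μ + σ²/2) and the standard deviation is that mean times √(exp(σ²) − 1), so the mean cancels in the ratio. A short sketch:

```python
import math

def lognormal_cv(mu: float, sigma: float) -> float:
    """CV of a lognormal distribution with log-scale parameters mu, sigma."""
    mean = math.exp(mu + sigma**2 / 2)
    sd = mean * math.sqrt(math.exp(sigma**2) - 1)
    return sd / mean  # the mean cancels: CV = sqrt(exp(sigma^2) - 1)

# Two lognormal distributions with very different means but the same sigma
# have identical CVs:
print(lognormal_cv(mu=0, sigma=0.5))
print(lognormal_cv(mu=5, sigma=0.5))
```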

Using the Coefficient of Variation to Compare Measurements that Use Different Units

When measurements use different scales, you can’t compare them directly. Suppose you want to compare the variability of SAT scores to that of ACT scores. While these college entrance exams are similar in nature and purpose, they use different scales. Consequently, you can’t compare their standard deviations directly.

However, the coefficient of variation standardizes the raw data, which means you can compare the relative variability of SAT and ACT scores.
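As a rough illustration, assume the made-up summary figures below (round numbers of my own, not official exam statistics); the point is that the CVs are comparable even though the raw scales are not:

```python
# Illustrative (not official) score summaries for two exams on different scales.
sat_mean, sat_sd = 1050, 210   # hypothetical SAT figures
act_mean, act_sd = 21, 5.5     # hypothetical ACT figures

sat_cv = sat_sd / sat_mean
act_cv = act_sd / act_mean
print(f"SAT CV = {sat_cv:.0%}, ACT CV = {act_cv:.0%}")  # SAT CV = 20%, ACT CV = 26%
```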

Furthermore, any time you want to assess the variability of inherently different characteristics, you’ll need to use a relative measure of variability, such as the coefficient of variation. For example, you might want to assess the variability of the operating temperature and speed of rockets. Or compare the variability of the weight and strength of material samples. You can’t meaningfully compare standard deviations that use different units, such as kilograms for weight and megapascals for strength!

However, if your weight variable (kilograms) has a higher coefficient of variation than your strength variable (megapascals), you know that weight is relatively more variable than strength.

These examples measure entirely different characteristics using different units. However, you can use the coefficient of variation to compare their relative variability!

Cautions About When Not to Use the Coefficient of Variation

While the coefficient of variation is extremely useful in some contexts, there are cases when you should not use it.

Do not use when the mean is close to zero

If the mean equals zero, the denominator of the ratio is zero, which is problematic! Fortunately, you’re not likely to have a mean that equals zero exactly. But when the mean is close to zero, the coefficient of variation can approach infinity, and its value is susceptible to small changes in the mean!
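A tiny Python sketch makes the instability visible:

```python
# With a fixed standard deviation, the CV explodes as the mean nears zero.
sd = 1.0
for mean in (1.0, 0.1, 0.01, 0.001):
    cv = sd / mean
    print(f"mean = {mean:>6}: CV = {cv:,.0f}")
# CV climbs from 1 to 1,000 as the mean shrinks toward zero.
```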

Do not use with interval scales

Use the coefficient of variation only when your data use a ratio scale. Don’t use it for interval scales.

Ratio scales have an absolute zero that represents a total lack of the characteristic. For example, zero weight (using the Imperial or metric system) indicates a complete absence of weight. Weight is a ratio scale.

However, temperatures in Fahrenheit and Celsius are interval scales. These measurement systems have a zero value, but those zeros don’t indicate an absence of temperature. (Kelvin has an absolute zero that does represent a lack of temperature. Kelvin is a ratio scale.)

Interval scales do not allow you to divide measurements in a meaningful fashion. For example, 10 °C is not one-third the temperature of 30 °C! Because the coefficient of variation involves division, this statistic is meaningless for interval scales.

Let’s see an example of the problem that occurs when using the coefficient of variation with interval scales!

The table below displays the coefficients of variation for equivalent temperatures on both scales. You’d expect them to be equal. Let’s check, using the equivalent temperatures 0, 10, and 20 °C, which convert to 32, 50, and 68 °F:

Scale         Mean    Standard Deviation    CV
Celsius       10      10                    100%
Fahrenheit    50      18                    36%

The CVs are quite different! That occurs because we are assessing interval scales.
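As a quick check, here's a minimal Python sketch that computes the CV for the same temperatures expressed on both scales (using the sample standard deviation):

```python
import statistics

# Equivalent temperatures on two interval scales (degrees F = C * 9/5 + 32).
celsius = [0, 10, 20]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # [32.0, 50.0, 68.0]

for name, temps in (("Celsius", celsius), ("Fahrenheit", fahrenheit)):
    cv = statistics.stdev(temps) / statistics.mean(temps)
    print(f"{name}: CV = {cv:.0%}")
# Celsius: CV = 100%, Fahrenheit: CV = 36% -- same temperatures, different CVs.
```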

Use the coefficient of variation only when you have a true absolute zero on a ratio scale!

Absolute versus Relative Measures in other Statistical Contexts

The need to choose between using an absolute measure (e.g., standard deviation) versus a relative, standardized measure (e.g., coefficient of variation) occurs elsewhere in statistics. For more information on this topic, read my posts about:

  • Standardizing values using the normal distribution.
  • Using standardized regression coefficients in regression analysis.
  • R-squared versus the standard error of the regression.

Filed Under: Basics Tagged With: conceptual, distributions

Comments

  1. Dalia Feltman says

    December 11, 2022 at 10:35 pm

    Hi, Jim,
    Thanks so much for your helpful and quick response!
    Dalia

  2. Dalia says

    December 11, 2022 at 10:21 am

    Hi, Jim,
    This explanation of COVs is very helpful. Once I have the COV’s, how can I compare them to say they are significantly different? Is there a test I should use? or do I just say one is greater than the other?
    Thank you,
    Dalia

    • Jim Frost says

      December 11, 2022 at 7:57 pm

      Hi Dalia,

      Assuming that you’re using sample statistics and want to draw conclusions about the coefficient of variations for entire populations (i.e., inferential statistics), then, yes, you’d need to perform a hypothesis test to make that determination. Each sample estimate of the coefficient of variation has some sampling error. A hypothesis test accounts for that. Some statistical software include a hypothesis test for comparing coefficients of variation. The test will calculate a p-value, which you compare to your significance level. Forkman’s test is the one I’m most familiar with.

      Forkman (2009), “Estimator and Tests for Common Coefficients of Variation in Normal Distributions”, Communications in Statistics – Theory and Methods, Vol. 38, pp. 233-251.

      However, if you’re simply performing descriptive statistics and don’t need to generalize to populations, you can just compare the sample statistics and say that one sample has a higher COV than the other. In this case, you’re explicitly restricting your conclusions to the samples themselves.

      I hope that helps!

  3. Ruta says

    May 4, 2022 at 4:34 am

    I would also love to hear thoughts on this topic. I also work in market research and often use Likert scales (1-7) to evaluate brands on various attributes. I want to make insights based on homogeneity of the respondents’ perception of a brand (i.e. lower CV means respondents agree on how well a brand is performing on a specific attribute).
    However, I’m afraid to commit to CV calculations because Likert scale isn’t a ratio, and even though the values are non-negative, there is no absolute zero.
    Can anyone help me understand better if CV is meaningful for comparing means derived from Likert scale scores?

  4. Hasan Abdel says

    November 28, 2021 at 12:02 pm

Thanks for your help. I’d like to ask a question, please. Does it make sense to use the CV when the mean is negative?

  5. FEKADU BEYENE CHEMO says

    November 16, 2021 at 7:40 am

This really is quality teaching from an extremely knowledgeable person! Thank you very much, I tip my hat to you, boss.

    • Jim Frost says

      November 16, 2021 at 2:14 pm

      Thanks so much! 🙂

  6. David A Gutting says

    June 14, 2021 at 10:57 pm

    I’m trying to determine if CV is a useful measure in a market research project I’ve been doing. We have consumers “grade” brands on a 100 point scale for how good a job they do in their product, their customer service, customer experience, how easy they are to find, and how relevant and memorable their advertising is. Some brands do well on some things, not so well on others.

Our mean scores in all these categories are about 75 on a 100-point scale–low scores are in the 50s, highs in the mid 80s, with SDs ranging from 9 to 16 and averaging about 11. CVs range from 9% to 24%. These scores are very good predictors of how well a brand will perform in the marketplace, with correlations between r=.75 and r=.80 with measurements like brand preference and brands bought most often.

    What I’ve found is that higher scoring brands (think Nike, Amazon, etc) tend to have lower CVs–meaning less variability in their performance. Weaker brands (think Spirit Airlines or Macy’s) tend to have higher CVs, sometimes twice as high as the stronger brands.

    Here’s my question: is this level of variance notable in any way? I know in the investment world, financial analysts use CVs to analyze risk–higher CVs are seen as more volatile and therefore riskier, lower CVs are more stable.

    Our theory in all this is that more holistic brands–ones that score high on multiple measures and do so consistently–are significantly more valuable than others (and the data seems to prove this). Low CV scores seem to indicate more consistent performance. This is important because traditionally in the marketing world, brands are told to focus on one or two things–such as product and advertising, and let other concerns go.

What the CV data for the higher performing brands suggests to me is that it may prove that your brand will get demonstrably better results if it operates at full throttle on all dimensions. But–in your view, does that make sense? Are variances ranging from 9% to 24% notable enough? Here’s an example–Nike, with a CV of 9% and brand scores in the high 80s, is five times more effective in the external market than Spirit Airlines, with brand scores in the low 60s and a CV of 24%.

    Sorry for the lengthy post here. I’ve talked about this to many of my colleagues (including Ph.D.s with strong statistical backgrounds), and they think the pattern is interesting but don’t seem to want to commit to a strong view. Would love to hear your thoughts.

    David Gutting

  7. Gloria Lasu says

    June 1, 2021 at 10:31 am

    For example it is said that the CV in the lab experiments should not exceed 15 and 20 in the field experiment. I want to know the importance of why does it not exceed this percentage?

    • Jim Frost says

      June 3, 2021 at 2:16 pm

      Hi Gloria,

      There is no universal value that is appropriate for all contexts. Instead, analysts need to determine thresholds that shouldn’t be crossed for each subject area. That can become an analysis all its own! Presumably, someone did that for the lab experiments to which you refer. Problems probably start occurring when variability exceeds those values. The nature of the problems and why they occur at those values is specific to the context. You’ll need to contact people familiar with the reasoning behind those values to understand why they were chosen.

  8. Jill says

    May 19, 2021 at 1:33 pm

    Hi Jim, what does it mean when the coefficient of variation of a dataset of demand is higher than 1? I know that the standard deviation is higher than the mean in that case, but does this mean that the data set has a high variability? (in comparison with a coefficient of variation below 1)

    • Jim Frost says

      May 20, 2021 at 1:29 am

      Hi Jill,

Any CV greater than one just means that the variability is greater than the mean. In isolation, it’s hard to interpret. Whether that is large or small depends on the context, subject area, or comparison to another measure. So, I can tell you that the variability is greater than the mean, but I don’t know if for your context it’s considered large. In comparison to a measure with a CV less than one, yes, you’d say that the relative variability is greater for the greater-than-one measure than for the less-than-one measure.

      The coefficient of variation is a relative measure, which means you really need to be able to compare it to another measure to interpret it as high.

  9. David M says

    May 17, 2021 at 1:43 pm

    there’s any convention on how to normalize or relativize CV of a measurement (in this case is always the same one but as CV is unitless it shouldn’t matter) with respect to the CV of a reference case? i’m thinking on evaluating intrinsic noise and CV should capture total noise, so assuming that extrinsic noise is constant the CV should tell me something about it. but i have doubts on if i should calculate the ratio or the difference between the sample and the reference CV, and which implications could derive from either of this options. any thoughts?
    David

  10. david powell says

    April 2, 2021 at 10:43 am

    I am for means VERY close to zero but for means in the 2-5% range I’m also seeing CVs range from 50%-150%. Should I be concerned about that?

  11. Andrés Limas says

    March 30, 2021 at 2:30 pm

Hi, would you mind sharing with us some of the literature that we could use to reference this value? Kind regards from Colombia!

    • Jim Frost says

      March 30, 2021 at 3:25 pm

      Hi Andrés,

      I don’t have a good reference offhand. However, it’s a fairly basic value so I’d imagine most Intro textbooks will cover it.

  12. DP says

    March 27, 2021 at 9:53 pm

    Are there any guidelines that suggest how much above zero means should be before CVs become more reliable? I’m working with percentage data and just eyeballing my data it seems the CVs are quite high until the means reach about 10%. Any references would be useful here. thanks

    • Jim Frost says

      March 29, 2021 at 3:07 pm

      Hi DP,

      I’m not aware of any specific guidelines. Just be on the lookout for some dramatically high CVs! It seems like you’re seeing that in action. Sorry, but I don’t have a reference.

  13. Amruta says

    December 21, 2020 at 12:04 am

    Thank you so much, Jim! I just found your blog and I can see it becoming my “go-to” guide to Stats.

    • Jim Frost says

      December 21, 2020 at 2:45 am

      Hi Amruta,

      I’m glad you found my blog and found it to be helpful! 🙂

      Happy reading!

  14. finnstats says

    December 8, 2020 at 10:43 am

    Nice,
Is there any cutoff value for a good coefficient of variation?
Is it possible to classify acceptable and unacceptable CV ranges?

  15. Scott Richlen says

    November 3, 2020 at 4:51 pm

    Sorry for the confusion. In reading this article in the second line the word “mean” is hyper-linked (or whatever the term is where additional text is shown). The apple example is in that link. You’ll see the error there.

    • Jim Frost says

      November 3, 2020 at 5:06 pm

      Thanks for catching the typo. I’ll fix it! That’s a glossary term. I’d forgotten I used that example there!

  16. Scott Richlen says

    November 3, 2020 at 8:46 am

    I don’t understand your definition of “mean”. Isn’t the average of 5,5,6,7,8 equal to 6.2? You have “For example, if the weights of five apples are 5, 5, 6, 7, and 8, the average apple weight is 6.4.”

    • Jim Frost says

      November 3, 2020 at 1:28 pm

      Hi Scott,

      I’m not sure what you’re referring to. I don’t write about the mean of apple weights in this post, which is about the coefficient of variation. The mean is the same as the average. The mean of the numbers you provide is 6.2. For more information about the mean, read my post about the measures of central tendency, of which this mean is one.

      If you have more questions about the mean, please post them in the comments section of that post about central tendency so we can keep the comments relevant to each post. Thanks.

  17. José Francisco dos Reis Neto says

    October 27, 2020 at 8:00 am

    Liked it! Very good and practical for my students of Agronomy. Thank you.

  18. Anoop says

    October 27, 2020 at 7:44 am

    Awesome post! Thank you sir

  19. Marian Martin says

    October 27, 2020 at 5:43 am

I’ve never seen such crystal-clear presentations. I am referring to all your posts and books, not only this one.
Marian


    Copyright © 2023 · Jim Frost · Privacy Policy