
Statistics By Jim

Making statistics intuitive


Mean, Median, and Mode: Measures of Central Tendency

By Jim Frost 117 Comments

What is Central Tendency?

Measures of central tendency are summary statistics that represent the center point or typical value of a dataset. Examples of these measures include the mean, median, and mode. These statistics indicate where most values in a distribution fall and are also referred to as the central location of a distribution. You can think of central tendency as the propensity for data points to cluster around a middle value.

In statistics, the mean, median, and mode are the three most common measures of central tendency. Each one calculates the central point using a different method. Choosing the best measure of central tendency depends on the type of data you have. In this post, I explore the mean, median, and mode as measures of central tendency, show you how to calculate them, and how to determine which one is best for your data.


Locating the Measures of Central Tendency

Most articles about the mean, median, and mode focus on how you calculate these measures of central tendency. I’ll certainly do that, but I’m going to start with a slightly different approach. My philosophy throughout my blog is to help you intuitively grasp statistics by focusing on concepts. Consequently, I’m going to start by illustrating the central point of several datasets graphically so you understand the goal. Then, we’ll move on to choosing the best measure of central tendency for your data and the calculations.

The three distributions below represent different data conditions. In each distribution, look for the region where the most common values fall. Even though the shapes and type of data are different, you can find that central tendency. That’s the area in the distribution where the most common values are located. These examples cover the mean, median, and mode.

Histogram that shows a continuous, symmetric distribution. The area of central tendency is circled.

Histogram that shows a continuous, skewed distribution. The area of central tendency is circled.

Bar chart of ice cream preference to illustrate the central tendency for categorical data.

As the graphs highlight, you can see where most values tend to occur. That’s the concept. Measures of central tendency represent this idea with a value. Coming up, you’ll learn that as the distribution and kind of data changes, so does the best measure of central tendency. Consequently, you need to know the type of data you have, and graph it, before choosing between the mean, median, and mode!

Related posts: Guide to Data Types and How to Graph Them

Whether you’re using the mean, median, or mode, the central tendency is only one characteristic of a distribution. Another aspect is the variability around that central value. While measures of variability are the topic of a different article (link below), this property describes how far away the data points tend to fall from the center. The graph below shows how distributions with the same central tendency (mean = 100) can actually be quite different. The panel on the left displays a distribution that is tightly clustered around the mean, while the distribution on the right is more spread out. It is crucial to understand that the central tendency summarizes only one aspect of a distribution and that it provides an incomplete picture by itself.

Graph that shows two distributions with more and less variability.
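To make that point concrete, here is a minimal sketch in Python. The two lists are hypothetical values that I chose so both have a mean of 100. They are not the data behind the graph. The `statistics` module is part of Python's standard library.

```python
from statistics import mean, pstdev

# Hypothetical data: both lists are built to have a mean of exactly 100.
tight = [98, 99, 100, 100, 101, 102]
spread = [70, 85, 100, 100, 115, 130]

for name, values in [("tight", tight), ("spread", spread)]:
    # pstdev is the population standard deviation, one measure of variability.
    print(name, "mean =", mean(values), "std dev =", round(pstdev(values), 2))
```

Both lists report the same mean, but the standard deviation of the second list is far larger. The mean alone cannot tell these two datasets apart.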

Related post: Measures of Variability: Range, Interquartile Range, Variance, and Standard Deviation

Mean

The mean is the arithmetic average, and it is probably the measure of central tendency that you are most familiar with. Calculating the mean is very simple. You just add up all of the values and divide by the number of observations in your dataset.

\bar{x} = \frac{x_{1} + x_{2} + \cdots + x_{n}}{n}
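The formula translates almost word for word into code. This is a small sketch, not a replacement for your statistical software. The function name `arithmetic_mean` and the sample values are my own choices for illustration.

```python
def arithmetic_mean(values):
    """Add up all of the values and divide by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


sample = [2, 4, 4, 5, 10]  # hypothetical observations
print(arithmetic_mean(sample))  # 25 / 5 = 5.0
```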

The calculation of the mean incorporates all values in the data. If you change any value, the mean changes. However, the mean doesn’t always locate the center of the data accurately. Observe the histograms below where I display the mean in the distributions.

Histogram of a symmetric distribution that shows the mean as an accurate measure of central tendency.

In a symmetric distribution, the mean locates the center accurately.

Histogram of a skewed distribution that shows how outliers influence the mean as a measure of central tendency.

However, in a skewed distribution, the mean can miss the mark. In the histogram above, it is starting to fall outside the central area. This problem occurs because outliers have a substantial impact on the mean as a measure of central tendency. Extreme values in an extended tail pull the mean away from the center. As the distribution becomes more skewed, the mean is drawn further away from the center. Consequently, it’s best to use the mean as a measure of the central tendency when you have a symmetric distribution. More about this issue when we look at the mean vs median!

In statistics, we generally use the arithmetic mean, which is the type I discuss in this post. However, there are other types of means, such as the geometric mean. Read my post about the geometric mean to learn when it is a better measure. Use a weighted mean when you need to place differing importance on the values.

When to use the mean: Symmetric distribution, Continuous data

Related posts: Using Histograms to Understand Your Data and What is the Mean?

Median

The median is the middle value. It is the value that splits the dataset in half, making it a natural measure of central tendency.

To find the median, order your data from smallest to largest, and then find the data point that has an equal number of values above it and below it. The method for locating the median varies slightly depending on whether your dataset has an even or odd number of values. I’ll show you how to find the median for both cases. In the examples below, I use whole numbers for simplicity, but you can have decimal places.

In the dataset with the odd number of observations, notice how the number 12 has six values above it and six below it. Therefore, 12 is the median of this dataset.

Data set with an odd number of values for finding the median.

When there is an even number of values, you count in to the two innermost values and then take the average. The average of 27 and 29 is 28. Consequently, 28 is the median of this dataset.

Data set with an even number of observations for finding the median.
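The odd and even cases can be sketched in a few lines of Python. The function name `find_median` and both lists are hypothetical. I picked the values so that the medians match the ones described above (12 and 28), but they are not the datasets shown in the images.

```python
def find_median(values):
    """Return the middle value of the sorted data."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value splits the data in half.
        return ordered[mid]
    # Even count: average the two innermost values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd = [3, 5, 7, 9, 10, 11, 12, 13, 15, 18, 20, 22, 25]  # 13 values
even = [20, 23, 25, 27, 29, 31, 34, 40]                 # 8 values

print(find_median(odd))   # 12, with six values below and six above
print(find_median(even))  # 28.0, the average of 27 and 29
```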

Outliers and skewed data have a smaller effect on the median than on the mean as measures of central tendency. To understand why, imagine we have the Median dataset below and find that the median is 46. However, we discover data entry errors and need to change four values, which are shaded in the Median Fixed dataset. We’ll make them all significantly higher so that we now have a skewed distribution with large outliers.

Data set that shows how outliers have a smaller effect on the median as a measure of central tendency.

As you can see, the median doesn’t change at all. It is still 46. When comparing the mean vs median, the mean depends on all values in the dataset while the median does not. Consequently, when some of the values are more extreme, the effect on the median is smaller. Of course, with other types of changes, the median can change. When you have a skewed distribution, the median is a better measure of central tendency than the mean.
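You can try the same experiment yourself. In the sketch below, the two lists are hypothetical stand-ins for the Median and Median Fixed datasets. I chose values so the median is 46 in both, and I changed the four largest values to much higher numbers.

```python
from statistics import mean, median

# Hypothetical stand-ins for the two datasets; not the values in the image.
original = [30, 35, 38, 41, 44, 46, 49, 52, 55, 58, 61]
fixed = [30, 35, 38, 41, 44, 46, 49, 120, 150, 180, 210]  # four values raised

print("median:", median(original), "->", median(fixed))
print("mean:  ", round(mean(original), 1), "->", round(mean(fixed), 1))
```

The median stays at 46 because the changed values were already above the middle value and remain above it. The mean rises sharply because it uses every value in its calculation.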

Related post: Skewed Distributions

Mean vs Median as Measures of Central Tendency

Now, let’s compare the mean vs median as measures of central tendency on symmetrical and skewed distributions to see how they perform. The histograms below allow us to compare these two statistics directly.

Histogram that shows a continuous, symmetric distribution. The mean and median are approximately equal and accurately locate the center of the distribution.

In a symmetric distribution, the mean and median both find the center accurately. They are approximately equal, and both are valid measures of central tendency.

Histogram that shows a continuous, skewed distribution. The outliers in the distribution tail pull the mean from the center. The median better represents the middle of this dataset.

In a skewed distribution, the outliers in the tail pull the mean away from the center towards the longer tail. For this example, the mean and median differ by over 9000. The median better represents the central tendency for the skewed distribution.

These data are based on the U.S. household income for 2006. Income is the classic example of when to use the median instead of the mean because its distribution tends to be skewed. The median indicates that half of all incomes fall below 27581, and half are above it. For these data, the mean overestimates where most household incomes fall.
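If you want to see this pattern without real income data, you can simulate a right-skewed sample. The sketch below is an illustration under my own assumptions. The lognormal parameters (10 and 0.8), the seed, and the sample size are arbitrary choices, and the simulated values are not the household income data discussed here.

```python
import random
from statistics import mean, median

rng = random.Random(42)  # fixed seed so repeated runs give the same sample

# Simulated right-skewed values; the parameters are arbitrary, not fitted.
values = [rng.lognormvariate(10, 0.8) for _ in range(10_000)]

print("mean:  ", round(mean(values)))
print("median:", round(median(values)))
```

The long right tail pulls the mean above the median, which is the same direction of difference that appears in the income histogram.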

To learn more about incomes and their right-skewed distributions, read my post about Global Income Distributions.

Statisticians say that the median is a robust statistic while the mean is sensitive to outliers and skewed distributions.

When to use the median: Skewed distribution, Continuous data, Ordinal data

Related posts: Median Definition and Uses and What are Robust Statistics?

Mode

The mode is the value that occurs the most frequently in your data set, making it a different type of measure of central tendency than the mean or median.

To find the mode, sort the values in your dataset by numeric values or by categories. Then identify the value that occurs most often.

On a bar chart, the mode is the highest bar. If the data have multiple values that are tied for occurring the most frequently, you have a multimodal distribution. If no value repeats, the data do not have a mode. Learn more about bimodal distributions.

In the dataset below, the value 5 occurs most frequently, which makes it the mode. These data might represent a 5-point Likert scale.

A data set where 5 is the most common value, which makes it the mode.
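Counting how often each value occurs is all it takes to find the mode. Here is a minimal sketch that also handles ties and the case where no value repeats. The function name `find_modes` and the example lists are hypothetical. `Counter` comes from Python's standard library.

```python
from collections import Counter


def find_modes(values):
    """Return every value tied for most frequent, or [] if nothing repeats."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so the data have no mode
    return [value for value, count in counts.items() if count == top]


likert = [5, 4, 5, 3, 5, 2, 4, 5, 1, 5]  # hypothetical 5-point responses
flavors = ["vanilla", "chocolate", "vanilla", "strawberry", "chocolate"]
no_repeats = [1.2, 3.4, 5.6]

print(find_modes(likert))      # [5]
print(find_modes(flavors))     # ['vanilla', 'chocolate'], a tie
print(find_modes(no_repeats))  # []
```

Notice that the function works for the ice cream flavors even though categories have no numeric order. That is why the mode is usable with categorical data.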

Typically, you use the mode with categorical, ordinal, and discrete data. In fact, the mode is the only measure of central tendency that you can use with categorical data—such as the most preferred flavor of ice cream. However, with categorical data, there isn’t a central value because you can’t order the groups. With ordinal and discrete data, the mode can be a value that is not in the center. Again, the mode represents the most common value.

Bar chart of service quality to illustrate the mode as a measure of central tendency.

In the graph of service quality, Very Satisfied is the mode of this distribution because it is the most common value in the data. Notice how it is at the extreme end of the distribution. I’m sure the service providers are pleased with these results!

Learn more about How to Find the Mode.

Related post: Bar Charts: Using, Examples, and Interpreting

Finding the mode as the central tendency for continuous data

In the continuous data below, no values repeat, indicating this dataset has no mode for a measure of central tendency. With continuous data, it is unlikely that two or more values will be exactly equal because there are an infinite number of values between any two values.

Continuous data where no values repeat. This dataset does not have a mode.

When you are working with the raw continuous data, don’t be surprised if there is no mode. However, you can find the mode for continuous data by locating the maximum value on a probability distribution plot. If you can identify a probability distribution that fits your data, find the peak value and use it as the mode.

The probability distribution plot displays a lognormal distribution that has a mode of 16700. This distribution corresponds to the U.S. household income example in the median section.

A lognormal probability distribution plot that allows you to find the mode for continuous data.
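For a lognormal distribution, you don't have to read the peak off a plot because the mode has a closed form. With parameters mu and sigma on the log scale, the mode is exp(mu - sigma^2), the median is exp(mu), and the mean is exp(mu + sigma^2 / 2). The sketch below uses hypothetical parameter values. They are not the parameters fitted to the household income data.

```python
import math


def lognormal_summary(mu, sigma):
    """Closed-form mode, median, and mean of a lognormal distribution."""
    return {
        "mode": math.exp(mu - sigma**2),
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma**2 / 2),
    }


# Hypothetical parameters chosen only for illustration.
summary = lognormal_summary(mu=10.0, sigma=0.7)
for name, value in summary.items():
    print(name, round(value))
```

For any sigma greater than zero, the mode falls below the median, and the median falls below the mean. That ordering is typical of right-skewed distributions.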

When to use the mode: Categorical data, Ordinal data, Count data, Probability Distributions

What is the Best Measure of Central Tendency—the Mean, Median, or Mode?

When you have a symmetrical, unimodal distribution for continuous data, the mean, median, and mode are approximately equal. In this case, analysts tend to use the mean because it includes all of the data in the calculations. However, if you have a skewed distribution, the median is often the best measure of central tendency.

When you have ordinal data, the median or mode is usually the best choice. For categorical data, you must use the mode.

In cases where you are deciding between the mean vs median as the better measure of central tendency, you are also determining which types of statistical hypothesis tests are appropriate for your data—if that is your ultimate goal. I have written an article that discusses when to use parametric (mean) and nonparametric (median) hypothesis tests along with the advantages and disadvantages of each type.

Analysts frequently use measures of central tendency to describe their datasets. Learn how to Analyze Descriptive Statistics in Excel.

If you’re learning about statistics and like the approach I use in my blog, check out my Introduction to Statistics book! It’s available at Amazon and other retailers.

Cover of my Introduction to Statistics: An Intuitive Guide ebook.


Filed Under: Basics Tagged With: conceptual, distributions, graphs

Reader Interactions

Comments

  1. Stanley says

    October 24, 2022 at 7:44 am

    Is it possible to have data with three different values of measures of central tendency?

    Reply
  2. Arjun Singh says

    February 3, 2022 at 12:13 am

    Hello Sir,
    I have been following you since 2019, and I love your way of explaining STAT. in simpler ways,

    Videos are best way to explain, I request you to kindly come up more videos of all basic stat. and advanced stat like ANOVA in R and response surface design, I would expect you to give some examples of Agricultural research.

    Thanks a lot.

    Reply
  3. Michelle Abbott says

    November 24, 2021 at 2:39 am

    hi Jim,
    Thank you for sharing your knowledge. Have read books on statistics, but your site and video have proved the most useful of all, the plain language used is really really helpful, understandable first time!.
    thanks a mil
    Michelle

    Reply
  4. Emikel Toote says

    November 10, 2021 at 1:48 pm

    Hello Jim,

    I wrote a comment about the Law of Large Numbers and Dependent Events on your “Law of Large Numbers” article, but you haven’t responded yet.
    Sincerely,
    Emikel

    Reply
    • Jim Frost says

      November 10, 2021 at 7:28 pm

      Hi Emikel, sorry for the delay! I’ve responded to your original comment in my post about the Law of Large Numbers because it was in the appropriate topic.

      Reply
  5. lia says

    November 8, 2021 at 1:29 am

    Which is considered the best measure of statistical data, is it the measures of central tendency or measures of variability?

    Reply
    • Jim Frost says

      November 8, 2021 at 12:38 pm

      Hi Lia,

      There’s no one best measure. It depends on your data and what you want to do with it. Usually, it’s best to use central tendency and variability together. Report both because each one is incomplete by itself.

      Reply
  6. Rebecca Caywood says

    September 24, 2021 at 12:15 pm

    How do I properly cite your work? I want to use some of the information I have gathered here in my Discussion Board post for my online class and want to properly give you your credit! What date was this published?

    Reply
    • Jim Frost says

      September 26, 2021 at 12:22 am

      Hi Rebecca,

      Here is a link to Purdue University’s Online Writing Lab page that describes how to cite electronic resources.

      For websites, you don’t need the publishing date but rather the date you accessed the page because they can change.

      I hope that helps! And thanks for citing my article! I appreciate that!

      Reply
  7. Akandwanaho Charles says

    September 5, 2021 at 4:28 am

    great

    Reply
  8. Mohd says

    July 27, 2021 at 12:39 pm

    Hi Jim, If we have a sample size of 1000+ data points, can we consider the data normal based on the central limit theorem?

    Reply
    • Jim Frost says

      July 29, 2021 at 11:27 pm

      Hi Mohd,

      The central limit theorem applies to the sampling distribution of the means, not the original data distribution. If you have a sample size of 1000, the original data might be non-normal (you’d have to check the distribution) but the sampling distribution of the means would follow a normal distribution, which allows you to use these data with hypothesis tests that assume normality even when your data don’t follow the normal distribution. For more details, read my post about the central limit theorem.

      Reply
  9. Manhal Alnajar says

    April 24, 2021 at 7:44 am

    HI SIR Please i have a question, if there are fields enclosed in the middle of the histogram and their values are zero, are they added to the denominator when finding the mean and the standard deviation

    Reply
  10. Nikhil says

    April 17, 2021 at 2:28 pm

    Thank you sir it was very helpful for my study.

    Reply
  11. natasha rose says

    April 14, 2021 at 8:58 am

    is it always necessary to group a set of data when finding its mean, median, or mode? why?

    Reply
    • Jim Frost says

      April 15, 2021 at 4:29 pm

      Hi Natasha, I don’t know what you mean by grouping a set of data. Do you mean creating subgroups in the data? If so, no, you don’t always have to identify subgroups within a dataset. However, it can be informative in some cases. It depends on your data. For example, with heights, it is informative to split the data by gender because men’s and women’s heights tend to have different distributions. Understanding the difference helps you understand the subject area. However, your data might not have useful subgroups. You’ll need to use subject-area knowledge to make the determination.

      Reply
  12. Sadashiv Borgaonkar says

    January 20, 2021 at 12:10 am

    Superb information

    Reply
  13. davidwlocke says

    December 28, 2020 at 10:40 pm

    When the distribution is skewed, the mode will be the peak near the short tail. The short tail is opposite the long tail. Those tails would be the tails associated with the data from a single dimension.

    Reply
  14. Mansi says

    November 26, 2020 at 3:15 am

    Can I ask question? How can I get the average if the data is like this: 10/15, 4/5, 4/5, 3/10, 15/20, 16/20,16/20, 12/20, 4/10. Can you help with this?

    Reply
    • Jim Frost says

      November 27, 2020 at 1:24 am

      Hi Mansi,

      There are two ways that come to mind.

      You can convert to decimals and take the average. Or, if you want to keep them as fractions, convert them all to their equivalents with the lowest common denominator (LCD). Then, simply take the average of the numerators and place it over the LCD.

      I hope that helps!

      Reply
  15. sonia says

    November 10, 2020 at 9:27 pm

    Hello, is peak = mode?

    Reply
    • Jim Frost says

      November 10, 2020 at 10:19 pm

      Hi Sonia,

      Yes, on a bar chart or distribution plot, the peak is the mode.

      Reply
  16. Anna Yanycheva says

    November 5, 2020 at 5:01 pm

    Hi Jim! Thank you very much for very useful information. I have a question regarding interpreting of the mode. I have a left-skewed distribution of observations in my research so that the mean is not equal to the mode. For the mean, I have an explanation, i.e. “most people prefer..”. Have you an idea how I can interpret mode as the most frequent answer in the way I did it for the mean? Thank you in advance!

    Reply
  17. Karen says

    October 28, 2020 at 3:13 pm

    Hi Jim,
    I like to listen to books while I multi-task. I was wondering if your eBook has the read aloud capability?

    Reply
    • Jim Frost says

      October 28, 2020 at 10:27 pm

      Hi Karen,

      I don’t have a dedicated audiobook if that’s what you’re asking. I believe screen readers can read them.

      I’ve considered audio books, but I often use so many graphs and other statistical output that I wondered how effective they’d be.

      Reply
  18. nekay says

    October 27, 2020 at 5:06 am

    Hi Mr. Jim. I have a question. Why is it necessary to have more than 1 method in measuring central tendency? Thank you in advance.

    Reply
  19. amanda says

    September 30, 2020 at 3:12 pm

    What information does the central tendency leave out about the distribution?

    Reply
    • Jim Frost says

      September 30, 2020 at 3:50 pm

      Hi Amanda,

      Notably, the central tendency leaves out information about the variability around the center. Read my post about measures of variability for more information.

      Reply
  20. RP Deka says

    September 25, 2020 at 4:08 am

    Thank you Sir
    Very clear and comprehensive

    Reply
  21. Md Afzal khan says

    September 22, 2020 at 2:22 pm

    Very relevant information about central tendency.

    Thanks

    Reply
  22. Emikel says

    July 29, 2020 at 11:20 am

    Hello Jim,

    Could you help me out? I posted a question for you to answer on July 8th. Your help would be greatly appreciated.

    Sincerely,
    Emikel

    Reply
    • Jim Frost says

      July 29, 2020 at 2:53 pm

      Hi Emikel, sorry for the delay! I’ve replied to it!

      Reply
  23. KECHLER POLYCARPE says

    July 14, 2020 at 1:06 am

    Hey Jim how’s your day going? Hope you’re healthy and well.

    You said this… ” Consequently, you need to know the type of data you have, and graph it, before choosing a measure of central tendency!” Is this a rule to follow for all descriptive statistics and inference statistics tests that you must visualize/graph before solving the statistical test? Or is it only when doing descriptive statistics central tendency problems?

    -Thank you

    Reply
    • Jim Frost says

      July 14, 2020 at 2:30 pm

      Hi Kechler,

      Thanks! Doing well here! I hope all is well with you too!

      Graphing is always crucial. In fact, I always say that statistics work the best when you use graphs in conjunction with numerical output. That’s a point that I make throughout my Introduction to Statistics ebook, which would be a helpful read!

      Reply
  24. MahNoor Ashrif says

    July 9, 2020 at 5:02 am

    im really disappointed my comments are not uploading here

    Reply
    • Jim Frost says

      July 9, 2020 at 3:59 pm

      Hi MahNoor, I looked through the comments I have and found that there was one that wasn’t approved. I’ve approved that and will answer it shortly. Sorry, sometimes a few comments slip through the cracks.

      Reply
  25. Emikel says

    July 8, 2020 at 4:26 pm

    Hello Jim,

    I need your help for a private project I’m working on. I’m using inferential statistics for this project. I have five samples, which are of similar sizes. The 3rd sample is the largest sample with 58 items. The 1st sample is the smallest sample with 36 items. The 2nd sample has 52 items, the 4th sample has 56 items, and the 5th sample has 42 items. The five samples’ total amounts, when graphed, like look a normal distribution. All five samples come from the same population. For the measures of central tendency, only the 3rd sample have a distribution that is close to a normal distribution. I determined this by looking at the 3rd sample graph and the mean, median, and mode are almost the same. What is interesting is that for the first sample the median and mode are less than or to the left of the mean. The median and mode continues to increase as I move from one sample to the next sample in order (1st sample, 2nd sample, 3rd sample, etc…). In the 5th sample, the median and mode are greater than or to the right of the mean. So from the 1st sample to the 5th sample, the median and mode moved from the left of the mean to the right of the mean. For the measures of variation, the 1st sample, when compared to the other four samples using the coefficient of variation (Standard deviation divided by the mean), has the highest variation. The coefficient of variation decreases as I move from one sample to the next sample in order (1st sample, 2nd sample, 3rd sample, etc…). So, the 1st sample has the highest coefficient of variation and the 5th sample has the lowest coefficient of variation. What is strange to me is that the 3rd sample has a higher coefficient of variation, therefore more variation, than the 5th sample, even though the 3rd sample has an almost normal distribution. The 5th sample graph/distribution is not even close to a normal distribution. The 5th sample graph/distribution is highly skewed to the left. 
The 1st sample graph/distribution is highly skewed to the right. Did I make an error while preparing these samples? Also, how do I connect the measures of central tendency to the measures of variation (range, interquartile range, standard deviation, and coefficient of variation) for each sample? More importantly, how do I connect all five samples together to make a prediction? Is there some statistical or mathematical equation available for me to use? I clearly see some patterns and trends in the five samples, but I’m having a really difficult time connecting the patterns and trends in the five samples together to make a prediction. Any help you provide would be greatly appreciated.

    Sincerely,
    Emikel

    Reply
    • Jim Frost says

      July 29, 2020 at 2:53 pm

      Hi Emikel,

      Sorry about the delay in replying! Without knowing the specifics, there’s no way I can tell for sure whether an error occurred while preparing these samples. You say these are drawn from the same population. Do you have reason to believe that using your sampling method that the samples should represent the population? Were these samples collected at different points in time? If so, do you have any reason to believe the population itself is changing over time?

      It’s not unusual that successive random samples drawn from the same population will have different properties. In fact, a key idea in inferential statistics is that the specific sample a study draws from a population is only one of an infinite number of samples that it could have obtained. Hypothesis testing incorporates this into its calculations.

      So, you need to determine if what you’re observing falls within the range of normal fluctuations between samples or are they significantly different? If you think the samples should follow a normal distribution, use a normality test to see if some are truly different. Perform one-way ANOVA to see if their means are significantly different. That sort of thing. You can also perform a variances test to see if their standard deviations are different. Again, some differences are entirely expected.

      I’m not sure what the sample preparation method involves. However, if you’re seeing a successive change in each sample, that is concerning. You should investigate that process. Understand how the preparation process could influence the data. There wouldn’t be a statistical test that tells you how errors in the preparation method could affect the results.

      So, try a mix of the statistical tests that I recommend and investigations of the preparation method. And, bear in mind that some differences between random samples drawn from the same population are entirely expected. You need to determine whether the differences you observe go beyond what is expected by random chance.

      I hope that helps!

      Reply
  26. Harrem Khalid says

    July 7, 2020 at 3:03 pm

    https://www.pewresearch.org/methods/u-s-survey-research/questionnaire-design/

    Like this survey they use both open ended and close ended response. Can you please guide how this open ended response can affect central tendency ! Its my assignment question actually! And i am unable to understand it….

    Reply
    • Jim Frost says

      July 7, 2020 at 3:58 pm

      Hi Harrem,

      That document seems to describe what you’d need to know. They even show an example where having an open-ended vs. closed-ended question affects the results. I’d read that document more closely. It looks like a great document to me.

      In terms of the central tendency, it seems to me that with an open-ended question the biggest risk is that not all respondents will provide a value in their responses (i.e., missing data). Missing data will, at the very least, increase the margins of error around the sample estimates because of the smaller sample sizes. However, if the missing values don’t occur randomly across all respondents, they can actually bias the estimates.

      Reply
  27. Harrem Khalid says

    July 7, 2020 at 2:46 pm

    Hey Jim. You did not answer my question!
    Can you please guide me.
    How an open ended response can affect measure of central tendency ? How it can be calculated in such cases?

    Reply
    • Jim Frost says

      July 7, 2020 at 2:52 pm

      Hi Harrem,

      I’m not sure that I understand your question. Are you asking whether if people write a response rather than just entering a value, how to calculate the central tendency? It might not be possible!

      If you want to get a precise answer in a precise format (such as a value), it’s best practice to ask a very specific question. If the question is open ended, you might not get the information you need to calculate what you want.

      Reply
  28. Hajra says

    July 7, 2020 at 5:34 am

    Hi jim

    Thank you. This article helps me a lot.
    Benish, zeshan, naila etc your Question belonging to B.ed exam can no where be in exact words. You need to understand the article. I am also through the same exam and did it very well. GOOD LUCK

    Reply
  29. Michelle says

    July 2, 2020 at 2:14 pm

    Hi Jim,
    I designed a likert 5 scale questionnaire for my research. My topic was “investigating effect of feedback on students in online and physical classes” . Basically study is comparative in nature. After getting the responses I did the frequency analysis by using SPSS software to analyze how many students agreed to my statements. But now my instructor is asking for the mean analysis. I am confused because every questions has a different likert ( for 1 question it is strongly agree, for other it is disagree) I need to justify my analysis. If I get average mean of one likert value (strongly agree etc) it will invalidate my results.
    Can you please guide how can I do the mean analysis, I am from social sciences background with minimum knowledge of statistics.
    Thank you.

    Reply
  30. NailaRizwan says

    July 2, 2020 at 6:07 am

    hi jim
    measure of central tendency cannot give complete picture of data for interpretation.what kind of information is necessary to make sense of measure of central tendency?mean life expectancy of a citizen of pakistan is 58 years.what does it means.explain statistical knowledge

    Reply
    • Jim Frost says

      July 3, 2020 at 4:18 pm

      Hi, means don’t capture the variability around the mean. I show several examples of that in this post. By looking at only the mean, you don’t know how far away from the mean any given observation is likely to fall. For life expectancy, this indicates that, on average, you’d expect someone to live to 58 years old. However, how closely do people fall to this mean?

      You can also refine the mean with additional information. For example, that value might be for all Pakistanis. However, if you knew a person was male or female, those subpopulations probably have different means. Additionally, people with various health conditions will have different life expectancies. Also, you’d want to know how old a person is because that affects how much longer they’re expected to live.

      So, at the very least, you’d want to know the variability around the life expectancy. You’d also want to know additional information about a person to calculate their life expectancy.
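
      To make the point about variability concrete, here is a minimal Python sketch. The two lists of lifespans are hypothetical numbers invented for illustration, not real data for any country. Both lists have a mean of 58, yet the observations sit at very different distances from it.

```python
import statistics

# Hypothetical lifespans in years, invented purely for illustration.
# Both groups share the same mean but differ greatly in variability.
tight_group = [56, 57, 58, 59, 60]
wide_group = [18, 40, 58, 76, 98]

for name, ages in [("tight", tight_group), ("wide", wide_group)]:
    mean = statistics.mean(ages)
    sd = statistics.stdev(ages)  # sample standard deviation
    print(f"{name}: mean = {mean}, SD = {sd:.1f}")
```

      The mean alone cannot distinguish these two groups. The standard deviation (or another measure of variability) is what tells them apart.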

      Reply
  31. Aduni says

    June 30, 2020 at 12:05 pm

    If all three measures define central tendency, why do we need three of them?

    Reply
    • Jim Frost says

      June 30, 2020 at 1:21 pm

      Hi Aduni,

      In this article, I talk about the strengths and weaknesses of each measure of central tendency. I describe the distributions and data types where each measure is either particularly good or bad. I won’t retype what I wrote throughout this article. So, just look for each measure’s strengths and weaknesses, and when to use it, in this article. Your answers are there!

      Reply
  32. zeeshan naeem says

    June 28, 2020 at 3:14 pm

    hi jim
    A measure of central tendency cannot give a complete picture of the data for interpretation. What kind of information is necessary to make sense of a measure of central tendency? The mean life expectancy of a citizen of Pakistan is 58 years. What does that mean? Please explain using statistical knowledge.

    Reply
  33. Arshad says

    June 28, 2020 at 8:11 am

    What kind of information is necessary to make sense of a measure of central tendency? Please answer this question.

    Reply
  34. benish says

    June 27, 2020 at 10:55 am

    Can anyone help me to solve this question.
    What kind of information is necessary to make sense of measure of central tendency ?

    Reply
    • Jim Frost says

      June 27, 2020 at 3:20 pm

      Hi Benish,

      It depends on what you mean by “make sense.” There’s the mathematical definition of each measure that I describe in this article. In terms of distributions, certain measures are better for different types of distributions. I also cover those considerations in this article in detail. Read through it more carefully. Then, if you have any more specific questions, please post them here.

      Reply
  35. KECHLER POLYCARPE says

    June 23, 2020 at 6:40 pm

    Hey, Jim its Kechler thank you for your advice.

    1.When do you use distribution in Measures of central tendency?

    2.What’s the difference between Measures of Central Tendency and distribution/ empirical rule ?

    3.Is Distribution in measures of central tendency only good when doing probability or forecasting for a business?

    4. If I want one on one skype counseling sessions how much do you charge?

    Reply
    • Jim Frost says

      June 27, 2020 at 4:26 pm

      Hi Kechler,

      Thanks for writing. The distribution of your data plays a role in determining which measure is best. I cover that for each measure of central tendency. So, read through and look for that.

      The empirical rule applies to how far data falls from the mean when your data follow the normal distribution. I cover that in my article about the Normal Distribution.

      Measures of central tendency are useful any time you want to summarize the central location of a dataset using a single value.

      Sorry, but I currently don’t have any spare time for counseling sessions. I have way too much stuff going on right now! I hope you understand.

      Reply
  36. aleeha irfan says

    June 19, 2020 at 3:19 am

    what is the role of central tendency in biostatistics?

    Reply
  37. Shehab Walid says

    June 5, 2020 at 9:12 pm

    where is the reference? and thanks for this useful information

    Reply
    • Jim Frost says

      June 5, 2020 at 9:33 pm

      What reference? If you mean that you want to cite my article, learn how at Purdue University’s webpage about citing electronic resources. Scroll down to the “A Page on a Web Site” section.

      Reply
  38. Mohd says

    May 8, 2020 at 9:40 pm

    Thank you so much Jim.

    Reply
  39. JF Labrie says

    May 8, 2020 at 3:29 am

    Hi Jim. This is so clear and intuitive! I’ll reference your website and your book to my students in my commodity finance class.
    Would you have such an intuitive explanation to compare weighted average and weighted median? I had to explain weighted median to my students lately. Took me a while to make it clear in my head before being able to build some clear slides about it.
    It was my first time hearing about it when I read in the CME Group website: “Partition prices are defined as the size-weighted median price for all trades executed during the partition.”

    Cheers to your great work!

    JF

    Reply
  40. Mohd says

    May 6, 2020 at 2:09 pm

    Hi Jim,

    In your book you have mentioned for “skewed distribution, the median is better measure of central tendency. It makes sense to pair it with interquartile range or other percentile based range”.

    Can you explain how I can pair it with the interquartile range, using the data below as an example?

    Min – 1000
    Max – 1216
    Q1 – 1008
    Q2 – 1024 – Median
    Q3 – 1050
    IQR = Q3-Q1 = 42

    Regards,
    Mohd.

    Reply
    • Jim Frost says

      May 7, 2020 at 4:21 pm

      Hi Mohd,

      Thank you so much for supporting my ebook. I really appreciate it!

      What I meant is that with a normal distribution, you know that approximately 95% of the values will fall within 2 standard deviations of the mean (mean +/- 2*SD). So, just knowing the mean and SD is very helpful in that regard. However, that doesn’t necessarily work with non-normal distributions. To get the equivalent information, you’d report the median along with the 2.5th and 97.5th percentiles. 95% of the population should fall between those percentiles (97.5 – 2.5 = 95).

      For your data, you could supply the median (1024), but you would need to calculate those two other percentiles. Together, they give you information equivalent to knowing the mean and SD for a normal distribution. To calculate those percentiles, you’ll need to determine which distribution your data follow.
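
      As a rough illustration, the sketch below reports a median together with the 2.5th and 97.5th percentiles. Note the swap in technique: rather than fitting a distribution as described above, it estimates the percentiles directly from the sample, which is only reasonable when you have plenty of observations. The raw data behind the summary in the question are not available, so the sample here is a hypothetical right-skewed one generated for the example.

```python
import random
import statistics

# Hypothetical right-skewed sample, generated only for illustration.
rng = random.Random(42)
values = [1000 + rng.lognormvariate(3, 1) for _ in range(1000)]

median = statistics.median(values)

# quantiles(n=40) returns 39 cut points spaced 2.5 percentage points apart,
# so the first is the 2.5th percentile and the last is the 97.5th.
cuts = statistics.quantiles(values, n=40, method="inclusive")
p2_5, p97_5 = cuts[0], cuts[-1]

# Fraction of the sample that falls between the two percentiles.
share_inside = sum(p2_5 <= v <= p97_5 for v in values) / len(values)

print(f"median = {median:.1f}")
print(f"middle 95% of the sample: {p2_5:.1f} to {p97_5:.1f}")
print(f"share of values inside that range: {share_inside:.3f}")
```

      In a skewed sample like this one, the two percentiles are not equally far from the median, which is exactly the information that mean +/- 2*SD would hide.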

      Reply
  41. Asma says

    April 18, 2020 at 6:52 pm

    Hi Jim,
    thank you for your article! It’s really helpful for me. I would like to ask you for some help with the project I’m working on.
    I have a group of multidimensional data, for example:
    d1={12,85,23,70,6}
    d2={4,60,8,45,20}
    d3={19,20,10,14,30}
    d4={4,16,32,65,11}
    I would like to represent this data by a single vector that really represents the data.
    I think that the mean is not very representative of a population, so is there a method that is more representative in this case?

    Thank you for answering.

    Reply
  42. Esther Camacho says

    April 6, 2020 at 9:24 pm

    Very Helpful. high school student.

    Reply
  43. Aliraza says

    March 30, 2020 at 10:08 am

    Thank You Sir

    Reply
  44. Sagar Baravkar says

    March 28, 2020 at 7:18 am

    Hi Jim Sir,

    While searching for why we cannot calculate the median for two different classes, I found your blog, which is very useful. Can you give me some information about it? I will definitely follow your other topics as well. 👍☺️

    Reply
    • Jim Frost says

      March 29, 2020 at 3:00 am

      Hi, I’m not sure why you think that you couldn’t calculate medians for different classes?

      Reply
  45. Itzel says

    March 20, 2020 at 5:39 pm

    I rarely ever comment on blog posts but I really wanted to tell you that this has been by far the clearest explanation I’ve found on the net. Thank you so much, Jim! I just bought your book on regression 🙂

    All the best!

    Reply
    • Jim Frost says

      March 21, 2020 at 2:46 am

      Hi Itzel,

      Thank you so much for taking the time to comment. Your kind words mean a lot to me! It makes my day!

      Also, thanks so much for supporting my ebooks. I really appreciate it!

      Reply
  46. Rubel parvej says

    November 22, 2019 at 11:28 pm

    Thanks a lot… SIR
    for your kind information….
    I’m from Bangladesh.

    Reply
  47. sandeep pendela says

    November 14, 2019 at 1:03 pm

    If all three measures define central tendency, why do we need three of them?

    Reply
    • Jim Frost says

      November 15, 2019 at 11:24 pm

      Hi Sandeep,

      Read the blog post because it explains why. Spoiler: some measures work only with particular types of data, and others work better with skewed data. It’s all in the post!

      Reply
  48. Misbah Memon says

    October 29, 2019 at 11:54 am

    when 2 variables have mean, median and modes that differ substantially from each other. What can you infer from this?

    Reply
    • Jim Frost says

      October 29, 2019 at 2:26 pm

      Hi Misbah,

      It means those variables center on different values. However, you can’t really say more without additional information.

      Suppose the variables are height and weight. Of course, they’ll have different means, medians, and modes because they are not measuring the same thing.

      However, if they do measure the same property for similar items, such as the heights of men and women, then you might be able to conclude that those subpopulations have different properties.

      It really depends on the nature of those two variables. Statistical analyses and the conclusions that you draw are very sensitive to context and subject area.

      Reply
  49. Phil says

    October 17, 2019 at 5:13 pm

    It was so clear and understandable. I appreciate. Was really helpful. Thanks.

    Reply
  50. Ooko John says

    October 7, 2019 at 8:46 pm

    Jim,

    This is a wonderful article. I would only suggest that you consider paraphrasing this portion by possibly qualifying it: ” … Unlike the mean, the median value doesn’t depend on all the values in the dataset. ..”.

    Technically, the median depends on all the values in the dataset. This is why your explanation of its calculation reads, “To find the median, order your data from smallest to largest, and then find the data point that has an equal amount of values above it and below it”.

    Regards.

    Reply
    • Jim Frost says

      October 8, 2019 at 11:30 am

      Hi Ooko,

      Thanks for writing. I understand what you’re saying. However, there’s a large difference between how the median and the mean use values. For the median, while you do sort from smallest to largest, the values above and below the median are literally just placeholders. For example, you can take any value that is above the median and change it to any other value that is above the median, and the median won’t change at all. You can do the same with values below the median. Conversely, with the mean, a change to any value affects the result. Consequently, the median value does not depend on all the values in the dataset. You can literally change them and not affect the median. I show an example of that in this post.
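
      The placeholder idea can be checked with a few lines of Python. This is a minimal sketch with hypothetical values chosen only for illustration: one value above the median is replaced by a much larger value that is still above the median.

```python
import statistics

# Hypothetical values, chosen only for illustration.
original = [10, 20, 30, 40, 50]
# Replace the largest value with a far larger one (still above the median).
modified = [10, 20, 30, 40, 5000]

median_before = statistics.median(original)
median_after = statistics.median(modified)
mean_before = statistics.mean(original)
mean_after = statistics.mean(modified)

print(f"median: {median_before} -> {median_after}")
print(f"mean:   {mean_before} -> {mean_after}")
```

      The median stays at 30 in both lists, while the mean jumps from 30 to 1020.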

      Reply
  51. Tamadur says

    September 26, 2019 at 5:44 am

    Hi
    This is my first time reading an article by you.
    It is a clear, wonderful article, and you delivered it smoothly. I could see your professionalism and love of statistics.

    Reply
  52. Devanathan says

    September 8, 2019 at 7:35 pm

    I kept foraging the web for explanations on these topics and i didnt find any article as simple yet so informative and understandable.

    Reply
    • Jim Frost says

      September 8, 2019 at 7:39 pm

      Thank you for your kind words. They make my day! I’m also happy to hear that it was helpful!

      Reply
  53. Habtamu says

    August 17, 2019 at 9:11 am

    I really appreciate that! You made very clear to me!

    Reply
  54. ARIKNICE says

    July 30, 2019 at 12:37 pm

    Abena that so funny of you. i pray your lecturer wont read your comment.

    Reply
  55. surbhi Kakar says

    May 13, 2019 at 11:02 am

    Hi Jim. I would like to thank you for the wonderful blog you have written. Most blogs and tutorials cover the basic stuff, and the important stuff (like when to use which measure) is scattered all over. This blog helped me to collate everything in one place. Thank you very much!

    Reply
  56. Rasool says

    May 11, 2019 at 5:55 pm

    Thank you so much, very good explanation.

    Reply
  57. andile says

    May 3, 2019 at 4:30 pm

    What are the things to include when presenting about descriptive quantitative data analysis

    Reply
  58. Joshua Okala says

    April 30, 2019 at 8:22 am

    Thank you very much

    Reply
  59. Javed Mansuri says

    April 14, 2019 at 7:48 am

    Thank you very much.

    Reply
  60. Uendel Rocha says

    March 29, 2019 at 8:58 am

    Good morning, Jim,

    This is my second day reading your post. Congratulations! I am learning a lot. I bought your book on linear regression for my data science work.

    Thank you very much.

    Reply
  61. Shams says

    March 27, 2019 at 12:34 am

    Awesome, @Jim Frost! The histogram blog post was the exact answer I was searching for. Thanks again for putting it together!

    Reply
  62. Abena says

    March 24, 2019 at 9:54 pm

    Jim, thank you so much! My lecturer can never explain the difference between mean and median.

    Reply
  63. Hazel says

    March 24, 2019 at 12:53 am

    Dear sir,
    Thank you so much for your blog. It was so easy to understand. In addition, can you explain the properties of good measures of central tendency?

    Reply
  64. Long Nguyen says

    March 23, 2019 at 10:24 pm

    Many thanks for the helpful article. You have given a clear explanation of the central tendency measures, and a guide where best to use each of them.

    Reply
  65. Shams says

    March 12, 2019 at 6:22 pm

    Great resource! What about central tendency for data with a double-hump distribution, for example, two datasets where the lower hump has the higher frequency in the first and the upper hump has the higher frequency in the second? How does the median or mean help describe the central tendency? Is there any other method for such a scenario?

    Reply
    • Jim Frost says

      March 21, 2019 at 11:30 am

      Hi Shams,

      Thank you! I’m glad my blog has been helpful!

      The technical name for a double hump distribution is a bimodal distribution. More generally, a distribution with more than one peak is a multimodal distribution.

      When you find a multimodal distribution, consider whether underlying subpopulations are producing it. For example, the heights of men and women have different means, so the combined distribution of heights is almost bimodal. In these cases, you might want to graph the separate distributions for each subgroup and identify each group’s central tendency.

      In other cases, subgroups don’t explain the multiple peaks. It’s just the natural shape of the distribution. In those cases, graphs become extra important because no measure of central tendency will convey the true nature of the distribution.

      For more information about multimodal distributions, please see my blog post about histograms.
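
      Here is a small Python sketch of the subgroup idea. The heights are hypothetical numbers invented for illustration, and the group names are placeholders. The point is that the mean of the pooled data lands in the gap between the two clusters, while each subgroup’s own mean describes that subgroup well.

```python
import statistics

# Hypothetical heights in cm, invented for illustration only.
group_a = [160, 162, 163, 165, 166, 168]
group_b = [175, 177, 178, 180, 181, 183]
pooled = group_a + group_b

pooled_mean = statistics.mean(pooled)
mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)

# Observations within 3 cm of the pooled mean (there are none in this example).
near_center = [h for h in pooled if abs(h - pooled_mean) <= 3]

print(f"pooled mean = {pooled_mean}, values near it: {near_center}")
print(f"group A mean = {mean_a}, group B mean = {mean_b}")
```

      With these numbers, the pooled mean is 171.5, yet no observation falls within 3 cm of it. The separate means of 164 and 179 are far more informative.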

      Reply
  66. uchechi says

    January 22, 2019 at 6:20 am

    thanks very helpful indeed

    Reply
  67. Sherree says

    January 16, 2019 at 4:07 pm

    Thank you so much for this article. I am in the second week of my first attempt at taking a Statistics class! This was so very helpful.

    Reply
  68. Viral S says

    December 7, 2018 at 12:11 am

    Thanks Jim!
    Quite simple and informative.
    Looking ahead to read more of your articles.

    Reply
  69. Ben says

    November 14, 2018 at 1:16 pm

    Very informative…

    Reply
  70. Carol says

    November 13, 2018 at 8:08 pm

    Hi Jim,
    thanks for your blog. I found this searching for information, and so far it’s been the easiest to read and understand! I’m attempting to do a course on using and interpreting data in schools and I get so confused with it all! When someone asks about the relationship between the mean and the median, what are they asking for?

    Reply
    • Jim Frost says

      November 14, 2018 at 10:48 am

      Hi Carol,

      Thanks! I’m glad to hear it’s been helpful! I strive to make my blog as easy to understand as possible.

      That is a bit of a vague question, but I hope the context in which it’s being asked helps.

      What they might be asking for is a description of how the mean and median tend to be approximately equal for symmetric distributions. As the distribution becomes more skewed, the difference between the mean and median increases, with the mean being pulled towards the long tail.

      Maybe that’s what they’re asking for? I hope this helps!
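
      A short Python sketch can show this relationship. The values are hypothetical and chosen only for illustration: the same six small values are kept, and the largest value is stretched further into the right tail each time.

```python
import statistics

# Hypothetical data: six fixed values plus one value that forms the right tail.
base = [1, 2, 3, 4, 5, 6]
tail_values = [7, 20, 70, 700]  # 7 gives a symmetric dataset; larger means more skew

gaps = []
for tail in tail_values:
    data = base + [tail]
    mean = statistics.mean(data)
    median = statistics.median(data)
    gaps.append(mean - median)
    print(f"tail = {tail:>3}: mean = {mean:.2f}, median = {median}, gap = {mean - median:.2f}")
```

      The median stays at 4 throughout, while the mean moves further toward the long tail as the skew grows. With the symmetric dataset, the two measures are equal.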

      Reply
  71. Khursheed Ahmad Ganaie says

    November 5, 2018 at 9:16 am

    Hello sir,
    I am your biggest fan.
    Please write posts on probability distributions.
    I hope I will get this soon.

    Reply
    • Jim Frost says

      November 5, 2018 at 10:04 am

      Hi Khursheed, I’ve already written that post. You can find it here: Understanding Probability Distributions.

      Reply
  72. learning262 says

    October 21, 2018 at 7:46 am

    Jim, it has been a while since my last stats class, and I needed a refresher on the basics. Unfortunately, a MOOC that cost hundreds of dollars could not help. Your articles helped me greatly, and I love the intuitive approach. I am going to go through all your articles. Please do keep writing more!

    Reply
  73. Sakshi Sharma says

    October 12, 2018 at 3:20 pm

    A very nice and to the point data!!

    Reply
    • Jim Frost says

      October 12, 2018 at 3:23 pm

      Thank you, Sakshi!!

      Reply
  74. christian says

    October 4, 2018 at 7:26 am

    Why is it called a measure of central tendency?

    Reply
    • Jim Frost says

      October 4, 2018 at 9:19 am

      Hi Christian,

      In many distributions, there are values that are more likely and less likely to occur. A measure of central tendency identifies where values are more likely to occur–or where they *tend* to occur. Hence, “tendency.”

      Central is more applicable to the mean and median. Both of these measures identify a central point in the distribution. This central point is where the values are more likely to occur.

      As we saw in the post with categorical data, there is no central value. Consequently, central doesn’t really apply for the mode. But, we still use the terminology.

      Reply
  75. Manas says

    September 23, 2018 at 7:45 am

    So nicely described..Its worth.

    Reply
    • Jim Frost says

      September 24, 2018 at 10:35 am

      Thank you, Manas!

      Reply
  76. photonsquared says

    February 26, 2018 at 10:05 am

    Jim, how do you handle data spread when not using the mean?

    Reply
    • Jim Frost says

      February 26, 2018 at 10:14 am

      Hi, you must be psychic! I’m writing a post about different measures of variability right now! If you’re not using the mean because your data are skewed, I find that using the median for the central tendency and the interquartile range (IQR) for the variability goes together nicely. The median splits the data in half, and the IQR tells you where the middle half of the data fall. The wider the IQR, the greater the spread of the data. You can also use percentiles to determine the spread for other proportions. For example, 95% of the data fall between the 2.5th and 97.5th percentiles.
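
      As a minimal sketch of pairing the median with the IQR, the Python below uses a small hypothetical right-skewed dataset. The quartile method ("inclusive") is an assumption made for the example. Statistical packages use several different quartile conventions, so other software may report slightly different values for the same data.

```python
import statistics

# Hypothetical right-skewed data, chosen only for illustration.
data = [2, 3, 4, 5, 6, 8, 11, 15, 40]

# quantiles(n=4) returns the three quartile cut points: Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(f"median = {q2}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

      Here the median is 6 and the middle half of the data falls between 4 and 11, an IQR of 7. The extreme value of 40 has no effect on either number.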

      Reply
  77. Chuck Wynn says

    February 14, 2018 at 1:39 pm

    Thanks for that response Jim. I have one more quick question. I would think that the mode for continuous data would be important when it comes to distributions that have two (bimodal) or more (multi-modal) peaks. In these cases, where one has more than one center of tendency, it would seem to me that the mode measure of central tendency becomes the more important piece of information than either the mean or median. Would this be accurate? Or is the answer, “It depends”? 🙂

    Reply
    • Jim Frost says

      February 21, 2018 at 12:01 pm

      Hi Chuck! Apologies for the delay in getting back to you. I’ve been on vacation!

      I agree with what you say about multimodal continuous distributions. In fact, if you have a multimodal distribution, it’s often crucial that you make that determination. Suppose that you use a histogram to display the distribution of body heights. You notice that there are two peaks. There are at least three important issues here.

      1) If you are trying to identify the best probability distribution for your data, you won’t succeed!
      2) You also know that there is something else of interest for you to learn about your data. For our example, the two peaks might indicate separate central tendencies for males and females. You can then better understand your data and how to analyze it.
      3) As you mention, the mean and median are less meaningful for the single multimodal distribution. You’ll probably want to identify the subpopulations (if they exist) and change your analysis.

      Graphing is always important for understanding your data. In this case, you do want to know whether your distribution is multimodal because it affects how you interpret the measure of central tendency and could very well change how you analyze your data. It can actually point you to understanding something new about your data. In the example about the heights, we learned that males and females each have their own distribution. Gender is a relevant variable in our analysis. That’s a fairly obvious example. However, in other cases, it might lead you to something that you didn’t already consider. It’s a bit like being a detective and looking for clues!

      Thanks for the great question and good insight!

      Reply
  78. Chuck Wynn says

    February 12, 2018 at 4:03 pm

    Hi Jim,

    Yet another helpful article! I did have two questions:

    1) It seems like a good tie-in to this article would be one that describes box plots and how to understand the information that they provide. Is there an article that you’ve written on box plots that could be linked to this?

    2) Unless I’m mistaken, the central tendency of a distribution and variability around that central value tie into the concepts of accuracy and precision. Any chance that you could speak to those concepts in a future article?

    Reply
    • Jim Frost says

      February 12, 2018 at 4:27 pm

      Hi Chuck,

      Thank you very much! Those are both great ideas too.

      I definitely plan to write a more comprehensive post about how the various aspects of distributions work together–which would be a natural place to show box plots. I haven’t written that yet but it is on my list of things to write about this spring.

      As for accuracy and precision, we definitely have very specific definitions for those terms in statistics. While in everyday English they are often considered synonyms, in statistics they’re very different. And, you’re correct, they do tie into those two concepts. These terms often come up in measurement system analysis.

      If you measure parts repeatedly and the average or central tendency of the measurements is unbiased (on target on average), you have an accurate measurement system. However, if the measurements are biased (systematically too high or too low), your measurement system is inaccurate.

      If you measure the same part multiple times and the variability between measurements is low, your measurement system is precise. However, if the measurements vary quite a bit, your system is imprecise.

      You can have any combination of accuracy and precision. Accurate and precise. Accurate but not precise. Not accurate but precise. Neither accurate nor precise.
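
      The four combinations can be sketched in Python. Everything here is hypothetical: the true value of the part, the four sets of readings, and the labels are all invented for illustration. Bias (mean reading minus true value) stands in for accuracy, and the standard deviation of the readings stands in for precision.

```python
import statistics

TRUE_VALUE = 10.0  # hypothetical true dimension of the measured part

# Hypothetical repeated readings of the same part from four measurement systems.
systems = {
    "accurate, precise": [9.9, 10.0, 10.1, 10.0],
    "accurate, not precise": [8.0, 12.0, 9.0, 11.0],
    "not accurate, precise": [11.9, 12.0, 12.1, 12.0],
    "neither": [10.0, 14.0, 11.0, 13.0],
}

summary = {}
for name, readings in systems.items():
    bias = statistics.mean(readings) - TRUE_VALUE  # accuracy: on target on average?
    spread = statistics.stdev(readings)            # precision: how much readings vary
    summary[name] = (bias, spread)
    print(f"{name:22s} bias = {bias:+.2f}, SD = {spread:.2f}")
```

      A bias near zero indicates an accurate system, and a small standard deviation indicates a precise one. The two properties vary independently across the four systems.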

      Reply
  79. John says

    February 12, 2018 at 1:24 pm

    Very informative

    Reply
    • Jim Frost says

      February 12, 2018 at 2:16 pm

      Thank you, John!

      Reply
  80. Khursheed Ahmad Ganaie says

    February 12, 2018 at 11:47 am

    Thnks a lot …..

    Reply

    Copyright © 2023 · Jim Frost · Privacy Policy