Median Definition and Uses

In statistics, the median is the value that splits an ordered list of data values in half. Half the values are below it and half are above—it’s right in the middle of the dataset. The median is the same as the second quartile or the 50th percentile. It is one of several measures of central tendency.
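To see that the median matches the second quartile, here is a minimal Python sketch. The dataset is made up for illustration, and the sketch assumes the standard library's `statistics` module with its default quantile method.

```python
import statistics

# Hypothetical data, chosen only for illustration.
data = [2, 4, 4, 7, 9, 10, 13]

median_value = statistics.median(data)

# quantiles(..., n=4) returns the three quartile cut points;
# index 1 is the second quartile (the 50th percentile).
second_quartile = statistics.quantiles(data, n=4)[1]

print(median_value, second_quartile)
```

Both numbers should agree because the 50th percentile and the median describe the same position in the sorted data.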

How to Find the Median

Finding the median is easy. Simply sort your data from smallest to largest. Then find the value that has the same number of data points above it and below it.

Suppose the heights of five trees are 5, 5, 6, 7, and 8. The median tree height is 6 because it is the middle value. There are two values above it and two below it.
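If you want to check the tree example with code, one possible sketch in Python follows. It relies on the standard library's `statistics.median`; the variable names are only illustrative.

```python
import statistics

# Tree heights from the example above.
tree_heights = [5, 5, 6, 7, 8]

# statistics.median sorts the data internally and returns the middle value.
median_height = statistics.median(tree_heights)
print(median_height)
```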

Finding the middle value differs somewhat depending on whether your data have an even or odd number of values. I’ll show you how to find it for both scenarios. In the following examples, I use integers for simplicity, but the data can have decimal places.

Odd number of data points

When a dataset has an odd number of observations, a middle value exists. This dataset has 13 values. Notice how the number 12 has six data points above it and six below. Consequently, 12 is the median.

Even number of data points

When your data have an even number of values, there is no single data point precisely in the middle. Instead, there is a central pair. In this case, count to the two central values and average them.

27 and 29 are the innermost values in this dataset because six values are above and below this pair. The average of these two values is 28. Consequently, 28 is the median.
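The odd and even cases can be sketched in a few lines of Python. The function name `find_median` and both datasets are hypothetical stand-ins: they only mirror the counts and middle values described above (13 values with a median of 12, and 14 values with a central pair of 27 and 29), not the actual data from the figures.

```python
def find_median(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("cannot take the median of empty data")
    ordered = sorted(values)      # step 1: sort smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the central pair.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical 13-value dataset: six values below 12 and six above.
odd_data = [3, 5, 8, 9, 10, 11, 12, 14, 15, 18, 20, 21, 23]

# Hypothetical 14-value dataset: six values below 27 and six above 29.
even_data = [10, 12, 15, 18, 21, 24, 27, 29, 31, 35, 38, 40, 44, 47]

odd_median = find_median(odd_data)
even_median = find_median(even_data)
print(odd_median, even_median)
```

Sorting first matters: the function works on unsorted input because it sorts a copy before locating the middle.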

When to Use the Median?

The median is less sensitive to skewed data and outliers than the mean. Extreme values pull the mean away from the center of the distribution, making it potentially misleading. In that case, the mean might not be near the most common values in the distribution.

For instance, the mean is not a good statistic for summarizing annual income because that is a right-skewed distribution. A few highly affluent people can increase the mean dramatically, giving a misleading view of yearly incomes. For this type of data, the median gives a better picture of a typical income.

To understand why outliers and skewed data affect the median less, consider the dataset below. Its median is 46. However, imagine we discover data entry errors and correct four values, which I shaded in the fixed dataset. I’ll make them all significantly larger, making it a skewed distribution with severe outliers.

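As you can see, the median did not change. It’s still 46. Unlike the mean, the median doesn’t depend on the magnitude of every value in the dataset, only on which value sits in the middle after sorting. Therefore, when some values become more extreme without crossing the middle, the median stays where it is.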
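Here is a small Python sketch of the same idea. The numbers are hypothetical (the original datasets appear only in the figures), chosen so the median is 46 and four values are then replaced with much larger ones.

```python
import statistics

# Hypothetical dataset with a median of 46.
original = [40, 42, 43, 45, 46, 47, 49, 50, 52]

# The four largest values are "corrected" to far more extreme ones.
corrected = [40, 42, 43, 45, 46, 470, 490, 500, 520]

median_before = statistics.median(original)
median_after = statistics.median(corrected)
mean_before = statistics.mean(original)
mean_after = statistics.mean(corrected)

print("median:", median_before, "->", median_after)
print("mean:  ", mean_before, "->", mean_after)
```

The median is the same in both lists because the middle position still holds 46, while the mean moves a long way toward the extreme values.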

Use the median when you have skewed data that are continuous and when you have ordinal data. Learn more about Ordinal Data: Definition, Examples & Analysis.

Related posts: Finding Outliers and What is the Mean in Statistics?

Use my Mean, Median, and Mode Calculator to find these three measures of central tendency for your dataset along with a histogram of it!

Comparing the Mean and Median

Let’s compare the mean and median using symmetrical and skewed distributions to see how they perform.

In this symmetric distribution, both statistics locate the center correctly. They are approximately equal.

In this skewed distribution, the extreme values in the tail pull the mean from the center. It’s outside the area that contains the most typical values. Conversely, the median is near the most common values, which is appropriate when measuring central tendency.
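To put rough numbers on this comparison, the sketch below computes the gap between the mean and median for two made-up samples, one symmetric and one right-skewed. The helper name `mean_median_gap` and both datasets are assumptions for illustration, not the data behind the charts.

```python
import statistics


def mean_median_gap(values):
    """Return mean minus median; a positive gap suggests a right tail."""
    return statistics.mean(values) - statistics.median(values)


# Hypothetical symmetric sample: values mirror each other around 4.
symmetric = [2, 3, 3, 4, 4, 4, 5, 5, 6]

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 10, 15, 25]

symmetric_gap = mean_median_gap(symmetric)
skewed_gap = mean_median_gap(right_skewed)

print(symmetric_gap)
print(skewed_gap)
```

In the symmetric sample the two statistics coincide, while in the skewed sample the mean sits above the median because the large values in the tail pull it upward.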

This dataset describes U.S. household incomes for 2006 and illustrates why the mean is not appropriate for incomes. The two statistics differ by over $9,000. The mean overestimates typical household incomes.

For these data, the median indicates that half of all incomes are above $27,581, and half are below.

Statisticians consider the median to be a robust statistic, while the mean is sensitive to skewed distributions and outliers. To learn more, read my post, What are Robust Statistics?

To learn more about the other measures of central tendency, how they compare, and when to use each one, read my post: Measures of Central Tendency: Mean, Median, and Mode.

Comments

Ghanimah says

July 10, 2024 at 6:20 am

Hi Jim,
How about when the ordinal scale has an even number of levels? My scale is (0, 1, 25, 50, 75, 100), meaning the median is 37.5, but what does that mean? Especially since my options are discrete and that’s not an option.

Rafea says

June 30, 2022 at 6:05 pm

Consider these statements about the median:
1. It is a good measure of central tendency if the difference between the minimum and maximum values is not large.
2. It is meaningful only if the average is taken of a population that belongs to the same class.
Which statement is correct?

- Jim Frost says
  
  June 30, 2022 at 11:46 pm
  
  Hi Rafea,
  
  Neither of those statements really makes sense about the median.
  
Vijay Sambhaji Pawar says

December 6, 2021 at 12:23 pm

Hi Jim,

How to calculate median for ordinal data?

- Jim Frost says
  
  December 7, 2021 at 12:11 am
  
  Hi Vijay,
  
  You use the same method as for continuous data! There is one wrinkle, however.
  
  Technically, the mean is not valid with ordinal data. And when you calculate the median for any type of data, including ordinal, and you have an even number of data points, you need to take the mean of the two innermost values. However, you can’t use the mean with ordinal data. Consequently, some statisticians will say that you should not use the median for ordinal datasets with an even number of observations. However, I’d say it’s probably ok.
  
Jeremy Fronda says

September 1, 2021 at 12:30 am

Yes, sorry for my mistake. Thank you for answering.

Jeremy Fronda says

August 31, 2021 at 6:49 pm

Good day, how about multimodal data? How can I determine the median in multimodal data? For example:
1,2,2,2,3,4,5 or
1,2,3,4,4,4,5

- Jim Frost says
  
  August 31, 2021 at 7:51 pm
  
  Hi Jeremy,
  
  The method for calculating the median does not change for unimodal vs. multimodal data. So, just use the same method I describe in this post.
  
  I should note that neither of your example datasets is multimodal. They both have one mode. The first has a mode of 2 (its median also equals 2), and the second has a mode of 4 and a median of 4.
  
  You can read about the mode in my post about measures of central tendency.
  
Dessie says

August 31, 2021 at 9:10 am

Excellent explanation, thanks.
