
Statistics By Jim

Making statistics intuitive


5 Number Summary: Definition, Finding & Using

By Jim Frost

What is the 5 Number Summary?

The 5 number summary is an exploratory data analysis tool that provides insight into the distribution of values for one variable. Collectively, this set of statistics describes where data values occur, their central tendency, variability, and the general shape of their distribution.

The five number summary provides this information using various descriptive statistics. These statistics are all order statistics—each one describes where a particular value falls in the distribution. The five statistics in this summary are the following, from highest to lowest data values:

  • Highest value in the dataset.
  • Third quartile (Q3)—greater than 75% of the values in the dataset
  • Median or second quartile (Q2)—splits the dataset in half.
  • First quartile (Q1)—greater than 25% of the values.
  • Lowest value in the dataset.
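As a rough illustration of the list above, here is a minimal Python sketch that computes the five statistics with the standard library. The function name `five_number_summary` and the sample values are hypothetical and are not from the body fat dataset. Note that software packages use different conventions for calculating quartiles, so Q1 and Q3 can differ slightly from one tool to another. This sketch assumes the `"inclusive"` method of `statistics.quantiles`.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, Q2 (the median), and Q3.
    # method="inclusive" is one of several quartile conventions (an assumption
    # of this sketch); other conventions can give slightly different Q1 and Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

# Hypothetical sample, not the body fat data discussed in this post.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(sample)
print(summary)  # (1, 3.0, 5.0, 7.0, 9)
```

The function returns the values from lowest to highest, which is the reverse of the order in the list above.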

Before we interpret an example, let’s briefly understand why the 5 number summary contains these particular statistics.

Related posts: Descriptive vs. Inferential Statistics and Descriptive Statistics in Excel

Why Are These Statistics in the 5 Number Summary?

Why does the 5 number summary contain these statistics instead of more familiar ones, such as the mean and standard deviation? These five statistics convey the same kinds of information, central tendency and variability, while having advantages over the more familiar statistics.

Keep in mind that the purpose of the five number summary is to provide a preliminary sense of your data during the exploratory phase of analysis. At this point, you probably don’t know much about the dataset, its distribution, or whether it contains outliers. Statisticians picked these five statistics because they are less sensitive to skewed distributions and outliers. The statistics in the 5 number summary are more robust than the mean and standard deviation.

In other words, you can trust the five number summary with a wider variety of distributions and before you’ve had a chance to identify and remove outliers. That’s extremely helpful for analyses you perform when you’re just starting to understand your data.
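To make the robustness point concrete, the following sketch compares the mean and standard deviation with the median and IQR before and after one value is replaced by an extreme outlier. The data and the helper name `describe` are invented for illustration, and the quartiles again assume the `"inclusive"` convention.

```python
import statistics

def describe(values):
    """Return the mean, standard deviation, median, and IQR for a sample."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical data: the second list swaps the largest value for an outlier.
clean = [10, 11, 12, 13, 14, 15, 16, 17, 18]
with_outlier = clean[:-1] + [180]

before = describe(clean)
after = describe(with_outlier)

for name in ("mean", "stdev", "median", "iqr"):
    print(f"{name:>6}: {before[name]:.2f} -> {after[name]:.2f}")
```

In this made-up example, the single outlier pulls the mean from 14 to 32 and inflates the standard deviation, while the median (14) and the IQR (4) do not change.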

Additionally, as I mentioned, these are all order statistics, which means that they are valid with continuous and ordinal data—giving you greater flexibility.

In short, the 5 number summary contains a good, solid set of robust statistics you can use with a variety of distribution shapes and data types before identifying characteristics that could adversely affect other statistics.

To learn more about the concept of robust statistics, what makes them robust, and examples, read my post, What are Robust Statistics?

Interpreting the Five Number Summary

Let’s look at what these statistics tell you by working through an example dataset. For this example, we’ll look at body fat percentages in middle school girls. You might not be familiar with this subject area, but the 5 number summary can quickly help you get your bearings. You can download this CSV dataset: body_fat.

Many of the statistics in the five-number summary are quartiles, which are special percentiles. Learn more about Percentiles.

Here is the 5 number summary for these data.

Statistical output that displays the five number summary for the body fat data:

  • Maximum: 46.8
  • Q3: 33.63
  • Median: 27.35
  • Q1: 23.05
  • Minimum: 16.8

Median

The median is the second quartile and, like the mean, it is a measure of central tendency. It finds the center of your distribution: the value that splits the ordered data in half. Notably, the median is more robust to skewed distributions and outliers than the mean. Read more about The Median.

For the body fat percentage data, the median is 27.35%. Half the values in the sample are above this value, while half are below it.

Related post: Measures of Central Tendency

Minimum and Maximum Values

The minimum and maximum values in the 5 number summary indicate where all the data values occur.

For these data, these values tell us that all body fat percentages in this sample are between 16.8% and 46.8%. Additionally, if you take the maximum and subtract the minimum, you get the statistical range of the data, 46.8 – 16.8 = 30. The range is a measure of variability, where larger values indicate that the data are more spread out.

Related post: Range of the Data

Interquartile Range (Q3 – Q1)

The Interquartile Range (IQR) is the distance between the third and first quartile and it is an integral part of the 5 number summary. This range indicates where the middle 50% of the data fall. Conversely, you also know that 50% falls outside this range, 25% above and 25% below the IQR. Like the range, the IQR is also a measure of variability. However, the IQR is a more robust statistic than either the range or standard deviation. Again, larger values represent greater sample variability. Read more about The Interquartile Range.

In our example dataset, the interquartile range extends from 23.05% to 33.63%, so the IQR is 33.63 – 23.05 = 10.58. Half the data values are in this range. Furthermore, 25% are less than 23.05, and 25% are greater than 33.63.
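Both measures of variability discussed so far, the range and the IQR, come straight from the five summary values. Here is a small sketch of the arithmetic using the values reported in this post. The variable names are my own, and the results are rounded only to hide floating-point noise.

```python
# Summary values reported in this post for the body fat data (percent).
minimum, q1, median, q3, maximum = 16.8, 23.05, 27.35, 33.63, 46.8

# Range: the distance spanned by all of the data.
data_range = round(maximum - minimum, 2)

# Interquartile range: the distance spanned by the middle 50% of the data.
iqr = round(q3 - q1, 2)

print(f"Range: {data_range}")  # Range: 30.0
print(f"IQR:   {iqr}")         # IQR:   10.58
```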

Related post: Measures of Variability

Shape of the Distribution

The five number summary can give you a general sense of whether the distribution is symmetrical or skewed. To make this determination, compare the median to Q1 and Q3. When the median is:

  • Approximately halfway between Q1 and Q3, your data are symmetrical.
  • Closer to Q1, your data are right-skewed.
  • Closer to Q3, your data are left-skewed.

The median body fat percentage of 27.35% is closer to Q1 (23.05, a distance of 4.30) than to Q3 (33.63, a distance of 6.28). Therefore, the summary suggests that the distribution of values is right-skewed.
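The comparison above can be sketched as a small function. The function name `skew_direction` and the `tolerance` cutoff for "approximately halfway" are assumptions made for this illustration; this post does not define a specific threshold. Treat the result as a rough first impression, not a formal test of skewness.

```python
def skew_direction(q1, median, q3, tolerance=0.1):
    """Give a rough read on distribution shape from the three quartiles.

    tolerance is a hypothetical cutoff: how far off-center the median may
    sit, as a fraction of the IQR, and still count as roughly symmetrical.
    """
    iqr = q3 - q1
    if iqr == 0:
        return "undetermined"
    lower = median - q1   # distance from Q1 up to the median
    upper = q3 - median   # distance from the median up to Q3
    # Zero when the median is exactly halfway; positive when it is nearer Q1.
    imbalance = (upper - lower) / iqr
    if abs(imbalance) <= tolerance:
        return "approximately symmetrical"
    return "right-skewed" if imbalance > 0 else "left-skewed"

# Quartiles reported in this post for the body fat data.
print(skew_direction(23.05, 27.35, 33.63))  # right-skewed
```

Dividing by the IQR puts the imbalance on a scale from -1 to 1, so the same cutoff works regardless of the units of your data.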

To gain a clearer picture of the distribution of these data, you should graph them with a histogram or stem-and-leaf plot. Additionally, I’ve determined the specific distribution these data follow in my post about identifying the distribution of your data.

Related post: Skewed Distributions

Boxplots Graphically Display the 5 Number Summary

Conveniently, boxplots display the 5 number summary in graphical form. The image below displays the boxplot for the body fat example. Notice how the five parts of the boxplot correspond to the summary values!

Boxplot that graphically displays the five number summary.

Related post: Boxplots

Reference

Hoaglin, David C.; Mosteller, Frederick; Tukey, John W., eds. (21 December 1982). Understanding Robust and Exploratory Data Analysis. Wiley Series in Probability and Statistics (1st ed.). Wiley.


Reader Interactions

Comments

  1. Milan Timilsina says

    March 1, 2022 at 6:51 am

    How to find lower value in 5 number summary in continuous data

    Reply
    • Jim Frost says

      March 1, 2022 at 4:14 pm

      Hi Milan,

      The lowest number in the five-number summary is the minimum value in your dataset. If your software doesn’t tell you the minimum value, you can just sort the data in ascending order and easily pick out the lowest value in the first row.

      Reply
  2. Jereesh K Elias says

    December 15, 2021 at 12:03 am

    Thanks for the inputs that you are sharing. It is helping me to understand statistics from the very basics. The explanations are easy to understand.
    One doubt regarding the boxplot shown. Isn’t it Q1 to be written instead of Q2?

    Reply
    • Jim Frost says

      December 15, 2021 at 12:16 am

      Hi Jereesh,

      You’re very welcome. I’m thrilled to hear that my website has been helpful!

      And you’re absolutely correct. That was an error on the boxplot, which I have fixed. Thanks!

      Reply
  3. Jess says

    December 9, 2021 at 9:47 pm

    This helps a lot! I have often struggled with understanding IQR and this makes it much clearer, especially having the example to work through.

    Reply
    • Jim Frost says

      December 12, 2021 at 11:38 pm

      Hi Jess,

      I’m so glad to hear that it was helpful!!

      Reply
