Statistics By Jim

Making statistics intuitive

Skewed Distribution: Definition & Examples

By Jim Frost

What is a Skewed Distribution?

A skewed distribution occurs when one tail is longer than the other. Skewness measures the asymmetry of a distribution. Unlike the familiar normal distribution with its bell-shaped curve, these distributions are asymmetric. The two halves of the distribution are not mirror images because the data are not distributed equally on both sides of the distribution’s peak.

People are sometimes less comfortable with asymmetrical distributions, but they are a fact of life in some subject areas. They have logical reasons for occurring, such as when natural limits skew the results away from the boundary. We’ll get to that shortly.

In this post, learn about left and right skewed distributions, how to tell the differences in histograms and boxplots, the implications of these distributions, why they occur, and how to analyze them.

How to Tell if a Distribution is Left Skewed or Right Skewed

Let’s start by contrasting characteristics of the symmetrical normal distribution with skewed distributions.

Symmetric

Normal distribution
The normal distribution has a central peak where most observations occur, and the probability of events tapers off equally in both the positive and negative directions on the X-axis. Both halves contain equal numbers of observations. Unusual values are equally likely in both tails.

However, that’s not the case with asymmetrical distributions where probabilities decrease more slowly in one direction relative to the other. In other words, extreme values that fall further away from the peak are more likely to occur in one tail than the other. That’s why you’ll hear about left and right skewed distributions, also known as negatively and positively skewed distributions.

Right Skewed (Positively Skewed)

Right skewed distribution. Also known as positively skewed.
Right skewed distributions occur when the long tail is on the right side of the distribution. Analysts also refer to them as positively skewed. This condition occurs because probabilities taper off more slowly for higher values. Consequently, you’ll find extreme values far from the peak on the high end more frequently than on the low.

Left-Skewed (Negatively Skewed)

Left skewed distribution. Also known as negatively skewed.

Left skewed distributions occur when the long tail is on the left side of the distribution. Statisticians also refer to them as negatively skewed. This condition occurs because probabilities taper off more slowly for lower values. Therefore, you’ll find extreme values far from the peak on the low side more frequently than the high side.

The crucial point to keep in mind is that the direction of the long tail defines the skew because it indicates where you’ll find the majority of exceptional values.
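The link between tail direction and the sign of the skew can be checked numerically. The sketch below uses the moment-based skewness coefficient (the average cubed z-score), which is one common definition among several. The function names, the tolerance of 0.1, and the small data sets are illustrative assumptions, not values from this post.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: the average cubed z-score (population form)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


def skew_direction(values, tolerance=0.1):
    """Label the skew by the sign of the coefficient.

    The tolerance is an arbitrary cutoff chosen for this sketch.
    """
    g = sample_skewness(values)
    if g > tolerance:
        return "right skewed (positive)"
    if g < -tolerance:
        return "left skewed (negative)"
    return "roughly symmetric"


# Made-up data: a long tail on the high side, its mirror image, and a symmetric set.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12, 20]
left_tailed = [-x for x in right_tailed]
symmetric = [1, 2, 3, 4, 5, 6, 7]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("symmetric", symmetric)]:
    print(f"{name}: skewness = {sample_skewness(data):+.2f} -> {skew_direction(data)}")
```

Mirroring the data flips the sign of the coefficient, which matches the idea that the direction of the long tail defines the skew.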

Related post: Normal Distribution

What Skewed Distributions Look Like in Graphs

Identifying asymmetric distributions is straightforward in graphs. It’s just a matter of finding the longer tail. Let’s see how to do that in histograms and boxplots.

Histograms

The two histograms below display asymmetric distributions. Histograms make it easy to see the longer tails. You can also see these positively and negatively skewed characteristics in a stem-and-leaf plot, which displays data in a similar way.

This histogram displays a right-skewed distribution of body fat data.

Histogram that displays a left-skewed distribution.
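To see a long tail without graphing software, a histogram can be approximated in plain text. This is a minimal sketch: the `text_histogram` helper and the body fat percentages are hypothetical and are not the data behind the histograms in this post.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Return one line per bin: the bin's lower edge and a bar of '#' marks."""
    # Map each value to the lower edge of its bin, then count values per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    edge = min(counts)
    while edge <= max(counts):
        # Empty bins are kept so gaps in the tail remain visible.
        lines.append(f"{edge:>4} | {'#' * counts.get(edge, 0)}")
        edge += bin_width
    return lines


# Hypothetical body fat percentages with a few unusually high values.
body_fat = [18, 19, 21, 22, 22, 23, 24, 24, 25, 26,
            26, 27, 28, 29, 31, 33, 36, 41, 47]

for line in text_histogram(body_fat, bin_width=5):
    print(line)
```

The bars pile up in the 20s and then trail off one value at a time toward the high end, which is the right-skewed shape described above.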

Boxplots

In boxplots, you’ll need to look more closely than in histograms, but you can still identify the asymmetry. I use the same data in the boxplots as I do for the histograms so you can compare them.

You have a symmetrical distribution when the box centers around the median line and the upper and lower whiskers have approximately equal lengths.

When the median is closer to the box’s lower values and the upper whisker is longer, it’s a right skewed distribution. Notice how the longer tail extends into the higher values, making it positively skewed.

Boxplot displays right-skewed distribution.

When the median is closer to the box’s higher values and the lower whisker is longer, it’s a left skewed distribution. Notice that the longer tail extends towards the lower values, making it negatively skewed.

Boxplot of a left-skewed distribution.
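The boxplot rules above can be written as a comparison of four lengths. The sketch below makes two simplifying assumptions: the whiskers run all the way to the minimum and maximum (many boxplots stop them at 1.5 times the interquartile range and plot outliers separately), and a skew is reported only when the box and the whiskers agree. The helper names and the data are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Lengths of the two whiskers and the two halves of the box."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "lower_whisker": q1 - min(values),
        "lower_box": median - q1,
        "upper_box": q3 - median,
        "upper_whisker": max(values) - q3,
    }


def boxplot_skew(values):
    """Apply the rule of thumb: median position in the box plus whisker lengths."""
    s = boxplot_summary(values)
    if s["upper_box"] > s["lower_box"] and s["upper_whisker"] > s["lower_whisker"]:
        return "right skewed"
    if s["lower_box"] > s["upper_box"] and s["lower_whisker"] > s["upper_whisker"]:
        return "left skewed"
    return "no clear skew"


# Hypothetical body fat percentages and their mirror image.
body_fat = [18, 19, 21, 22, 22, 23, 24, 24, 25, 26,
            26, 27, 28, 29, 31, 33, 36, 41, 47]
mirrored = [-x for x in body_fat]

print(boxplot_summary(body_fat))
print(boxplot_skew(body_fat))
print(boxplot_skew(mirrored))
```

Requiring both signals to agree is a conservative choice. Real data can send mixed signals, and then a histogram is usually the easier graph to read.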

Related posts: Using Histograms to Understand Your Data and Boxplots vs. Individual Value Plots for Comparing Groups

Skewed Distributions and the Mean, Median, and Mode

The mean, median, and mode are all equal in the normal distribution and other symmetric distributions that have a single peak.

Distribution plot displays a symmetric distribution where the mean, median, and mode are all equal.

However, when you have an asymmetric distribution, it affects the relationship between these measures of central tendency. The mean is sensitive to extreme values. Consequently, the longer tail in an asymmetrical distribution pulls the mean away from the most common values.

The graphs below show how these measures compare in different distributions.

Right skewed: The mean is greater than the median. The mean overestimates the most common values in a positively skewed distribution.

Distribution plot that displays measures of central tendency for right-skewed data.

Left skewed: The mean is less than the median. The mean underestimates the most common values in a negatively skewed distribution.

Distribution plot that displays the measures of central tendency for left-skewed data.

Because the mean overestimates or underestimates the most frequently occurring values in asymmetric distributions, analysts often use the median in these cases. The median is a more robust statistic in the presence of extreme values.
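A small worked example shows how the long tail pulls the mean away from the median and mode. The two data sets below are made up for illustration; the second is the mirror image of the first, so its tail points the other way.

```python
import statistics


def central_tendency(values):
    """Return the mean, median, and mode of a data set."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


# Made-up data: most values sit near 4, with one extreme value at 21.
right_skewed = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9, 21]
# Mirror image: most values sit near 96, with one extreme value at 79.
left_skewed = [100 - x for x in right_skewed]

print(central_tendency(right_skewed))  # mean 6.0, median 4, mode 4
print(central_tendency(left_skewed))   # mean 94.0, median 96, mode 96
```

In the right-skewed set, the single value of 21 lifts the mean to 6 while the median and mode stay at 4. In the mirrored set, the mean drops to 94 while the median and mode stay at 96.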

Related post: What are Robust Statistics?

Examples of Right-Skewed Distributions

Right skewed distributions are the more common form. These distributions tend to occur when there is a lower limit, and most values are relatively close to the lower bound. Values can’t be less than this bound but can fall far from the peak on the high end, causing them to skew positively.

For example, right skewed distributions can occur in the following cases:

  • Time to failure cannot be less than zero, but there is no upper bound.
  • Wait and response times cannot be less than zero, but there are no upper limits.
  • Sales data cannot be less than zero but can have unusually large values.
  • Humans have a minimum viable weight but can have large extreme values.
  • Income cannot be less than zero, but there are some extremely high incomes.

Income and wealth are classic examples of right skewed distributions. Most people earn a modest amount, but some millionaires and billionaires extend the right tail into very high values. Meanwhile, the left tail cannot be less than zero. This situation creates a positive skew. Consequently, reports frequently refer to median incomes because the mean overestimates the most common values.

Histogram of right-skewed income data.

These data are based on the U.S. household income for 2006. Notice how the mean is greater than the median.

To learn more about incomes and their right skewed distributions, read my post about Global Income Distributions.
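The effect of a lower bound can be simulated. The sketch below draws wait times from an exponential distribution, which cannot go below zero and has no upper limit. The choice of distribution, the 5-minute average, and the seed are all assumptions made for illustration; real wait times may follow a different model.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable
MEAN_WAIT = 5.0          # hypothetical average wait, in minutes

# expovariate takes the rate (1 / mean) and never returns a negative value.
waits = [rng.expovariate(1 / MEAN_WAIT) for _ in range(10_000)]

mean_wait = statistics.fmean(waits)
median_wait = statistics.median(waits)

print(f"minimum = {min(waits):.2f}")
print(f"median  = {median_wait:.2f}")
print(f"mean    = {mean_wait:.2f}")
print(f"maximum = {max(waits):.2f}")
```

The minimum stays at or above zero while the maximum lands far above the median, and the mean comes out higher than the median, as expected for a right-skewed distribution.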

Examples of Left-Skewed Distributions

Left skewed distributions occur less frequently than their right-skewed counterparts, but they exist. Frequently, they occur when there is an upper limit that values cannot exceed, and most scores are near that limit. Values can’t exceed the cap, but they can extend relatively far from the peak on the lower side, causing a negative skew.

For example, left skewed distributions can occur in the following cases:

  • Purity cannot exceed 100%, but there is room on the low side for extreme values.
  • Test scores cannot exceed 100%, but there is room on the low side for unusually low scores.
  • Ages at death tend to cluster around 70-80. It’s possible to live a little longer, but extreme values are more likely to appear on the lower end.
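An upper limit produces the mirror image of the lower-bound case. The sketch below simulates test scores by subtracting a random shortfall from a cap of 100. The shortfall model (an exponential distribution with an average of 8 points), the seed, and the variable names are assumptions chosen for illustration, not a claim about how real test scores behave.

```python
import random
import statistics

rng = random.Random(3)   # fixed seed so the sketch is repeatable
MAX_SCORE = 100.0
TYPICAL_SHORTFALL = 8.0  # hypothetical average number of points lost

# Each score is the cap minus a non-negative shortfall, floored at zero.
scores = [
    max(0.0, MAX_SCORE - rng.expovariate(1 / TYPICAL_SHORTFALL))
    for _ in range(10_000)
]

mean_score = statistics.fmean(scores)
median_score = statistics.median(scores)

print(f"maximum = {max(scores):.2f}")
print(f"median  = {median_score:.2f}")
print(f"mean    = {mean_score:.2f}")
print(f"minimum = {min(scores):.2f}")
```

No score exceeds the cap, the low scores stretch far below the median, and the mean falls below the median, which is the pattern of a negative skew.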

Skewed Probability Distributions and Hypothesis Tests

When data are asymmetrical, they cannot follow a normal distribution. You might need to use a distribution test to identify the distribution of your data. The following probability distributions are skewed or, depending on their parameter values, can be skewed:

  • Gamma
  • Exponential
  • Weibull
  • Lognormal
  • Beta

See the posts on each of these distributions to learn more about why they are asymmetrical and the properties they can model.

Many hypothesis tests assume your data follow the normal distribution. However, many are valid with non-normal distributions when your sample size is large enough. You can thank the central limit theorem!
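The central limit theorem's role here can be illustrated with a simulation. The sketch below compares the skewness of raw draws from a strongly right-skewed exponential distribution with the skewness of means of samples taken from it. The sample size of 40, the number of samples, and the seed are arbitrary choices for this sketch, and how large a sample is "large enough" depends on how skewed the data are.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: the average cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


rng = random.Random(7)  # fixed seed so the sketch is repeatable
SAMPLE_SIZE = 40        # arbitrary sample size for illustration

# Individual observations from a right-skewed exponential distribution.
raw = [rng.expovariate(1.0) for _ in range(5_000)]

# Means of many independent samples from the same distribution.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(2_000)
]

print(f"skewness of raw observations: {skewness(raw):.2f}")
print(f"skewness of sample means:     {skewness(sample_means):.2f}")
```

The sample means are much closer to symmetric than the individual observations, which is why tests about the mean can remain valid for skewed data when the sample is large enough.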

Even so, when you have an asymmetrical distribution, the median might be a better measure of central tendency than the mean. To learn about hypothesis tests for the mean and median and when to use each type, read my post, Parametric vs. Nonparametric Tests.

Reader Interactions

Comments

  1. Mohamed says

    October 9, 2022 at 7:31 pm

    Thank you.
    Your way is simple and great.

  2. Mohd Shehzoor Hussain says

    February 17, 2022 at 2:28 am

    Hi Jim, why is it recommended to use percentiles for skewed data? Can we use any other statistical methodology for skewed data?

    Copyright © 2023 · Jim Frost · Privacy Policy