Statistics By Jim

Making statistics intuitive

Cumulative Distribution Function (CDF): Uses, Graphs & vs PDF

By Jim Frost

What is a Cumulative Distribution Function?

A cumulative distribution function (CDF) describes the probabilities of a random variable having values less than or equal to x. It is a cumulative function because it sums the total likelihood up to that point. Its output always ranges between 0 and 1.

CDFs have the following definition:

CDF(x) = P(X ≤ x)

Where X is the random variable, and x is a specific value. The CDF gives us the probability that the random variable X is less than or equal to x. These functions are non-decreasing. As x increases, the likelihood can either increase or stay constant, but it can’t decrease.

Both probability density functions (PDFs) and cumulative distribution functions provide likelihoods for random variables. However, PDFs calculate probability densities for x, while CDFs give the chances for ≤ x. Learn about Probability Density Functions.

Cumulative distribution functions exist for both continuous and discrete variables. Continuous CDFs find solutions by integrating the probability density function, while discrete CDFs sum the probabilities for all discrete values that are less than or equal to each value. Statisticians refer to the discrete functions that supply those individual probabilities as Probability Mass Functions (PMFs).
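To make the discrete and continuous cases concrete, here is a minimal Python sketch. The binomial example (10 coin flips), the helper names, and the choice of a trapezoid rule with a lower limit 10 standard deviations below the mean are my own assumptions for illustration, not a standard recipe. It only uses Python's standard library.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(x, n, p):
    """Discrete case: P(X <= x) is the sum of the PMF over 0, 1, ..., x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


def normal_cdf_by_integration(x, dist, steps=20_000, lower_sds=10):
    """Continuous case: approximate P(X <= x) by integrating the PDF.

    Uses the trapezoid rule. The lower limit is a stand-in for negative
    infinity, so the result is an approximation.
    """
    a = dist.mean - lower_sds * dist.stdev
    if x <= a:
        return 0.0
    h = (x - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(x))
    total += sum(dist.pdf(a + i * h) for i in range(1, steps))
    return total * h


# Discrete: probability of at most 3 heads in 10 fair coin flips.
discrete = binomial_cdf(3, n=10, p=0.5)

# Continuous: standard normal, compared against the library's own CDF.
standard_normal = NormalDist(0, 1)
continuous = normal_cdf_by_integration(1.0, standard_normal)

print(f"Binomial P(X <= 3)        = {discrete:.6f}")
print(f"Normal P(X <= 1), integral = {continuous:.6f}")
print(f"Normal P(X <= 1), library  = {standard_normal.cdf(1.0):.6f}")
```

The two normal results should agree to several decimal places, which shows that the integral of the density and the CDF are the same quantity.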

Read on to learn why you’d use a cumulative distribution function, how to graph one, and how a CDF differs from a PDF.

Learn more about Cumulative Frequencies: Finding & Interpreting.

Using Cumulative Distribution Functions

Cumulative distribution functions are excellent for providing probabilities that the next observation will be less than or equal to the value you specify. This ability can help you make decisions that incorporate uncertainty.

Additionally, these cumulative probabilities are equivalent to percentiles. A cumulative probability of 0.80 is the same as the 80th percentile. So, CDFs are great for finding percentiles. Learn more about Percentiles: Interpretations and Calculations.

For example, consider the height of an adult male in the United States. We can use the cumulative distribution function to find the probability that a person is less than or equal to 6 feet tall.

For CDFs, we need to specify the type of distribution (e.g., normal, Weibull, binomial, etc.) and its parameters, just like we do for PDFs.

Adult males in the U.S. have heights that follow a normal distribution with a mean of 69.2 inches and a standard deviation of 2.66 inches. Consequently, we’ll need to use a normal CDF with these parameters to answer our question. Because we’re working in inches, I’ll enter 72 inches for 6 feet.

The typical CDF statistical output from your software or online calculator will look like the following:

Statistical output for the cumulative distribution function example.

The probability that an adult male will be 6 feet tall or shorter is 0.853745. Equivalently, you can say that a 6’ tall adult male is at the 85.4th percentile.
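If you'd like to reproduce this result yourself, the sketch below shows one way to do it with Python's standard library `statistics.NormalDist` class. The variable names are my own, and the 80th percentile lookup at the end is an extra illustration of the link between cumulative probabilities and percentiles, not part of the output above.

```python
from statistics import NormalDist

# Parameters from this post: U.S. adult male heights in inches.
male_heights = NormalDist(mu=69.2, sigma=2.66)

# CDF: probability that a man is 72 inches (6 feet) tall or shorter.
p_six_feet = male_heights.cdf(72)
print(f"P(height <= 72 in) = {p_six_feet:.6f}")
print(f"Percentile for 72 in = {100 * p_six_feet:.1f}")

# Inverse CDF: go the other way, from a percentile to a height.
height_80th = male_heights.inv_cdf(0.80)
print(f"80th percentile height = {height_80th:.2f} in")
```

The inverse CDF answers the reverse question: instead of entering a height to get a cumulative probability, you enter the probability and get the height back.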

Related post: Normal Distribution

Comparing Distributions

Cumulative distribution functions are fantastic for comparing two distributions. By comparing the CDFs of two random variables, we can see if one is more likely to be less than or equal to a specific value than the other. That helps us make decisions about whether one is more likely to have a particular property.

Imagine we’re a clothing manufacturer and want to compare the prevalence of 6’ tall men to women.

Next, we’ll use the normal CDF to find the probability that an adult woman will be 6’ tall or less. Women’s heights follow a normal distribution with a mean of 64.3 inches and a standard deviation of 2.58 inches.

Statistical output for the normal CDF example.

The statistical output for the normal CDF indicates that women have a probability of 0.99858 for being ≤ 6’. That’s equivalent to the 99.9th percentile.

About 85.4% of men and 99.9% of women are 6’ or shorter. By dividing the complementary probabilities (1 – p), we find that men more than 6 feet tall are about 103 times more common than women more than 6 feet tall. As a clothing manufacturer, knowing that is helpful. A woman more than 6 feet tall is a rarity!
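The comparison comes down to a few lines of arithmetic. Here is a sketch using Python's standard library and the height parameters given in this post. The variable names are mine.

```python
from statistics import NormalDist

# Height distributions from this post, in inches.
male = NormalDist(mu=69.2, sigma=2.66)
female = NormalDist(mu=64.3, sigma=2.58)

six_feet = 72

# The CDF gives P(height <= 72). Subtract from 1 for P(height > 72).
p_male_taller = 1 - male.cdf(six_feet)
p_female_taller = 1 - female.cdf(six_feet)

# How many times more common are men over 6 feet than women over 6 feet?
ratio = p_male_taller / p_female_taller

print(f"P(man > 6 ft)   = {p_male_taller:.5f}")
print(f"P(woman > 6 ft) = {p_female_taller:.5f}")
print(f"Ratio           = {ratio:.0f}")
```

Because the women's probability is so small, the ratio is sensitive to rounding. Using unrounded probabilities, as the code does, is safer than dividing the rounded values from the output tables.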

Graphing Normal CDFs

I always think graphs bring statistical concepts to life. So, let’s graph a cumulative distribution function to see it. We’ll return to the normal CDF for men’s heights.

On a cumulative distribution function plot, the horizontal axis displays the x values, while the vertical axis displays cumulative probabilities or percentiles. The curve represents corresponding pairs of x values and cumulative probabilities. For normal CDFs, the function sums from negative infinity up to the value of x, which is (-∞, x] in interval notation. Continuous variables produce a smooth curve, like below, while discrete variables produce a stepped function.

Cumulative distribution function graph.

On the CDF graph for men’s heights, I’ve added a reference line at 6’ (i.e., 72”) to show the corresponding probability of 0.854, matching the earlier answer with rounding. Using these graphs, you can easily find probabilities and percentiles for other values. For instance, 70 inches (5′ 10″) is at roughly the 62nd percentile.
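A CDF graph is just a set of (x, cumulative probability) pairs joined into a curve. The sketch below tabulates those pairs for men's heights with Python's standard library instead of drawing them. The 2-inch spacing and the variable names are arbitrary choices of mine.

```python
from statistics import NormalDist

male_heights = NormalDist(mu=69.2, sigma=2.66)

# Each (height, cumulative probability) pair is one point on the CDF curve.
heights = range(60, 81, 2)
curve = [(x, male_heights.cdf(x)) for x in heights]

for x, p in curve:
    print(f"{x} in -> cumulative probability {p:.3f}")

# Reading a single value off the curve, as you would from the graph.
p_70 = male_heights.cdf(70)
print(f"70 in is at about the {100 * p_70:.0f}th percentile")
```

The printed probabilities never decrease as the height increases, which is the non-decreasing property in action. Passing the same pairs to any plotting tool would reproduce the smooth S-shaped curve.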

For comparison, the women’s chart is below. While the graph ends at 72 inches, the distribution actually extends to infinity in both directions.

Normal CDF graph of female heights.

A height of 6 feet is in the tail of the distribution.

CDF vs PDF

A cumulative distribution function (CDF) and a probability density function (PDF) are two statistical tools describing a random variable’s distribution. Both functions display the same probability information but in a different manner. In simple terms, the PDF represents the shape of the distribution, while the CDF represents the accumulation of probabilities as the value of the random variable increases. Learn more about Probability Distribution: Definition & Calculations.

PDFs can find cumulative probabilities by calculating the area under the curve for a range up to a particular value. The PDF below shows the probability for the shaded area representing male heights up to 6’ (72”).

PDF graph of male heights.

It finds the same probability as the CDF, showing how they present the same underlying information in a different format.

Now, imagine that you start with the shaded area at the left side of the PDF and systematically extend it to the right while recording the cumulative probabilities. That produces the CDF!
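That sweeping process can be imitated numerically. The sketch below slices the area under the men's height PDF into thin trapezoids, adds them up from left to right, and records the running total at each step. The function name, the starting point of 8 standard deviations below the mean, and the 5,000 slices are assumptions I chose for illustration.

```python
from statistics import NormalDist

male_heights = NormalDist(mu=69.2, sigma=2.66)


def sweep_pdf(dist, start, stop, n_slices):
    """Move left to right under the PDF, recording the accumulated area.

    Returns a list of (x, area so far) pairs. Each slice is approximated
    by a trapezoid, so the totals are approximate.
    """
    h = (stop - start) / n_slices
    running = 0.0
    recorded = []
    for i in range(n_slices):
        left = start + i * h
        right = left + h
        running += 0.5 * (dist.pdf(left) + dist.pdf(right)) * h
        recorded.append((right, running))
    return recorded


# Start far enough left that essentially no probability is missed.
start = male_heights.mean - 8 * male_heights.stdev
recorded = sweep_pdf(male_heights, start, 72, n_slices=5000)

x_last, area_to_72 = recorded[-1]
print(f"Swept area up to 72 in = {area_to_72:.6f}")
print(f"Library CDF at 72 in   = {male_heights.cdf(72):.6f}")
```

The list of recorded pairs is a numerical version of the CDF curve, and the final running total should closely match the CDF value at 72 inches.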

The PDF gives the probability density, which indicates the relative likelihood of the random variable falling close to a value. In comparison, the cumulative distribution function accumulates the area under the PDF leading up to each value.

In this manner, the probability density on a PDF is the rate of change for the CDF. Consequently, the ranges where the PDF curve has relatively high probability densities correspond to areas on the CDF curve with steeper slopes. Lower PDF densities correspond to shallower CDF slopes. As the PDF’s curve approaches its peak at the mean, the CDF’s slope increases to its maximum steepness. After the PDF’s peak, the CDF slope flattens.
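You can check the rate-of-change relationship numerically. In the sketch below, I estimate the slope of the men's height CDF with a central difference and compare it to the PDF at the same points. The helper name, the step size `h`, and the four heights are my own choices for illustration.

```python
from statistics import NormalDist

male_heights = NormalDist(mu=69.2, sigma=2.66)


def cdf_slope(dist, x, h=1e-4):
    """Central-difference estimate of the CDF's slope at x."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)


# Two points in the tails, the mean, and the 6-foot mark.
points = (64, 69.2, 72, 76)
slopes = {x: cdf_slope(male_heights, x) for x in points}

for x in points:
    density = male_heights.pdf(x)
    print(f"x = {x}: CDF slope = {slopes[x]:.5f}, PDF density = {density:.5f}")
```

The slope and the density should agree closely at every point, and both are largest at the mean of 69.2 inches, where the normal PDF peaks.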

Learn more about Empirical Cumulative Distribution Function Plots. These graphs help you compare an observed cumulative distribution to a fitted distribution.
