Statistics By Jim

Making statistics intuitive

Binomial Distribution: Uses, Calculator & Formula

By Jim Frost

What is a Binomial Distribution?

The binomial distribution is a discrete probability distribution that calculates the likelihood an event will occur a specific number of times in a set number of opportunities. Use this distribution when you have a binomial random variable. These variables count how often an event occurs within a fixed number of trials, where each trial has only two possible, mutually exclusive outcomes.

For example, the binomial probability distribution can answer the following questions. What is the probability of getting:

  • Six heads when you toss a coin ten times?
  • 12 women in a sample size of 20?
  • Three defective items in a batch of 100?
  • Two flu infections over 20 years?

In this post, learn how to use the binomial distribution and its cumulative form, when you can use it, its formula, and how to calculate binomial probabilities by hand. I also include a binomial calculator that you can use with what you learn. I’ll walk you through the formulas for calculating the mean, variance, and probabilities for the binomial probability distribution.

Related post: Understanding Probability Distributions

Binomial Probabilities

The binomial distribution models the probabilities for exactly X events occurring in N trials when the probability of an event is known for a binomial random variable. Let’s get into some examples because that brings it to life!

I’ll start by using statistical software to calculate the binomial probabilities and create distribution plots. This process will help you understand what you can learn from it. Then we’ll move on to the binomial distribution formula.

Suppose you’re playing a game where rolling sixes on a die is really good. You want to know the probability of rolling exactly three sixes in ten die rolls. In this example, the number of events is 3 (X), the number of trials is 10 (N), and the probability (p) is 1/6 = 0.1667.

My software tells me that the likelihood is:

Numerical results for the binomial distribution example.

The binomial probability distribution calculates a likelihood of 0.155095 for rolling precisely three sixes in ten rolls.
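If you don’t have statistical software handy, a few lines of Python can reproduce this kind of result. The following is only a minimal sketch, not the software used for the output above. The helper name `binomial_pmf` is hypothetical, and the sketch assumes the rounded event probability of 0.1667 from the example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x events in n trials with event probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Die example: exactly three sixes in ten rolls, using the rounded p = 0.1667.
prob_three_sixes = binomial_pmf(3, 10, 0.1667)
print(f"{prob_three_sixes:.6f}")
```

Using the exact probability 1/6 instead of the rounded 0.1667 shifts the answer slightly in the later decimal places.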

That’s interesting but perhaps not so helpful by itself. We’re also interested in the chances for rolling other numbers of sixes. Seeing the distribution of probabilities for different numbers of sixes is much more helpful.

Binomial Distribution Graph

The binomial distribution graph is useful because it displays the probability of differing numbers of successes (Xs) out of the total number of trials (N). In the graph below, the distribution plot shows the likelihood of rolling no sixes, 1 six, 2 sixes, 3 sixes, . . ., and up to 10 sixes in the ten die rolls. Using this approach, the binomial distribution graph covers the complete range of possible successes up to the total number of trials.

I like these graphs because they emphasize how we’re working with a distribution, and it’s easy to see which values happen more frequently.

In the chart, each bar represents the probability of rolling a specific number of sixes out of ten die rolls. The graph does not show the chances for seven and higher because the likelihoods of that many sixes in just ten rolls are too low to display on the chart.

Distribution plot for a random binomial variable.

The binomial distribution graph indicates the probability of rolling no sixes is about 16%. The highest chance is rolling one six (32%), although rolling two sixes occurs almost as frequently. Probabilities drop off quickly starting with three sixes. Additionally, the bar for three sixes matches our earlier result of 0.155095.
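To see the numbers behind a chart like this, you can tabulate the probability for every possible count of sixes. The sketch below is an illustration under the same assumptions as the example (n = 10 and the rounded p = 0.1667). The text “bars” are a rough stand-in for the plot, and the variable names are my own choices for this sketch.

```python
from math import comb

n, p = 10, 0.1667  # ten die rolls, rounded probability of rolling a six

# Probability of each possible count of sixes, from 0 through 10.
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    bar = "#" * round(prob * 100)  # roughly one '#' per percentage point
    print(f"{x:2d} sixes: {prob:.4f} {bar}")

# The count with the highest probability (the tallest bar).
most_likely = max(range(n + 1), key=lambda x: pmf[x])
print("most likely number of sixes:", most_likely)
```

The probabilities for seven or more sixes are so small that their bars are empty, which mirrors why the chart does not display them.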

Binomial Cumulative Distribution Function

The binomial probability distribution is excellent for understanding the likelihood of obtaining an exact number of events (X) within a certain number of trials (N). However, many times you’re not interested in just one specific value for a binomial random variable. For example, in the die rolling example above, you might know from experience that rolling three or more sixes within ten rolls means you’re doing well. So, you actually want to learn the probability of rolling at least three sixes.

Let me introduce you to the binomial cumulative distribution function.

Technically, the binomial cumulative distribution function calculates the likelihood of obtaining X or fewer events in N trials. If you need a ≥ probability, use the complement, also known as the upper-tail probability: the chance of at least X events equals 1 minus the cumulative probability of X − 1 or fewer events. These days, statistical software will generally let you specify the direction of the cumulative function for the binomial distribution from the start. I’ll use the binomial distribution graph again to show you how it works.

For our example, we want to know the chances of rolling ≥ 3 sixes in 10 rolls. Below, the shaded region shows the upper-tail cumulative probability of rolling at least three sixes in ten die rolls.

Cumulative binomial distribution graph.

The likelihood for rolling three or more sixes in ten rolls is 0.2249, not quite 1 in 4.
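One way to check a “three or more” probability yourself is to sum the probabilities for 0, 1, and 2 sixes and subtract that total from 1. The sketch below assumes the rounded p = 0.1667 from the example. The function names `binomial_pmf` and `binomial_cdf` are hypothetical helpers, not part of any particular statistical package.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x events in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): sum the probabilities for 0, 1, ..., x events."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.1667
at_most_two = binomial_cdf(2, n, p)   # P(X <= 2)
at_least_three = 1 - at_most_two      # P(X >= 3) is the complement
print(f"P(X <= 2) = {at_most_two:.4f}")
print(f"P(X >= 3) = {at_least_three:.4f}")
```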

For a real-world example, see how I’ve used the binomial distribution to model the number of flu infections (X) for the vaccinated vs. unvaccinated over 20 years (N).

For more information about how to use binary data, read my posts, Maximize the Value of Your Binary Data, the Negative Binomial Distribution, the Geometric Distribution, and the Hypergeometric Distribution.

Binomial Random Variable Assumptions and Notation

The binomial distribution models the probabilities for a binomial random variable having exactly X successes occurring in N trials. Your variable must satisfy the following requirements to be a binomial random variable. The binomial distribution is appropriate only for data that fulfill these assumptions.

  • There must be only two possible outcomes per trial. For example, defective or not defective, sale or no sale, pass or fail, etc.
  • The trials are independent. One trial’s outcome does not affect the subsequent trial. For instance, one coin toss doesn’t affect the result of the following coin toss.
  • The probability remains constant over time. In some areas, this assumption is true due to the physical characteristics of the process, such as coin tosses and die rolls. However, the probability won’t necessarily remain constant in other contexts. For example, the likelihood that a manufacturing process creates defective parts can change over time. If you suspect the probability might change, use the P chart (a control chart) to check this assumption.

Bernoulli Trials

Typically, you’ll use the binomial distribution when you have a series of Bernoulli trials, which together make up a binomial experiment. Each Bernoulli trial is a single trial with two possible outcomes that satisfactorily follows the assumptions above. In these trials, analysts label one of the possible outcomes as a success and the other outcome a failure.

A binomial experiment contains a set number of Bernoulli trials where the probability of a success is constant. The experiment counts the number of successes (X) out of the total number of trials (N).

You can think of the binomial probability distribution as modeling the number of successes (X) in a sample size of N.
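A simulation can make this idea concrete. The sketch below is an illustration only: it treats each die roll as one success/failure trial with success probability 1/6, repeats the ten-roll experiment many times, and counts the successes in each. The function name, the seed, and the number of repetitions are arbitrary choices for this sketch.

```python
import random

def run_binomial_experiment(n, p, rng):
    """Run n independent success/failure trials and count the successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(1)  # arbitrary seed so the run is repeatable
n, p = 10, 1 / 6        # ten die rolls; success = rolling a six

counts = [run_binomial_experiment(n, p, rng) for _ in range(20_000)]

average_successes = sum(counts) / len(counts)
share_with_three = counts.count(3) / len(counts)
print(f"average sixes per ten rolls: {average_successes:.3f}")
print(f"share of experiments with exactly three sixes: {share_with_three:.3f}")
```

With enough repetitions, the simulated share of experiments with exactly three sixes should land close to the binomial probability calculated earlier, though it will vary from run to run with different seeds.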

Parameters and Notation

The binomial distribution has two parameters, n and p.

  • n: the number of trials.
  • p: the event or success probability.

You denote a binomial distribution as b(n,p).

Alternatively, you can write X∼b(n,p), which means that your binomial random variable X follows a binomial probability distribution with n trials and an event probability of p.

The previous examples assess probabilities corresponding with rolling sixes in a series of 10 die rolls. In this scenario, success is rolling a six, while a failure is rolling anything other than a six. The probability of rolling a six is 1/6 = 0.1667.

If rolling sixes is our random variable X, and we roll the die ten times, we can use the following notation for the binomial distribution:

X∼b(10,0.1667)

Binomial Distribution Calculator

Use this binomial distribution calculator to calculate the binomial probabilities and cumulative probabilities. Note that it uses “events” to indicate the number of trials (n).

Let’s use this calculator to recreate the preceding die examples. In the calculator, enter Number of events (n) = 10, Probability of success per event (p) = 16.67%, choose exactly r successes, and Number of successes (r) = 3. The calculator displays a binomial probability of 15.51%, matching our results above for this specific number of sixes.

Next, change exactly r successes to r or more successes. The calculator displays 22.487%, matching the results for our example of rolling at least three sixes (the ≥ cumulative probability).

Now, try one yourself. Imagine you’re drawing a random sample of 20 from a population where 10% are statisticians. You’re hoping that your study will have 3 or fewer statisticians because they’ll gang up and ask too many pesky questions about your study design. What is the likelihood of obtaining ≤ 3 statisticians?

See the correct answer at the end of this post. Next, onto the formula for those who want to calculate the probabilities manually.

Binomial Distribution Formula

Typically, you’ll use statistical software or online calculators to calculate the probabilities for the binomial distribution. However, I’ll show you the binomial distribution formula to calculate them manually. The following formulas show you how to calculate the mean, variance, and probabilities for binomial distributions. Additionally, I’ll walk you through the formulas with worked examples.

Mean of Binomial Distribution

Let’s start with the formula for the mean of the binomial distribution.

n * p

Multiply the number of trials by the success probability. This value represents the average or expected number of successes.

For example, we roll the die ten times, and the probability of rolling a six is 0.1667.

10 * 0.1667

The mean for this binomial distribution is 1.667. On average, we’d expect to roll that many sixes in ten rolls. Of course, the actual counts of successes will always be either zero or a positive integer.
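As a quick check, the n * p shortcut should agree with the general definition of an expected value, which weights each possible count of successes by its probability and sums the results. The sketch below assumes the rounded p = 0.1667 from the example; the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.1667

# Probability of each possible number of sixes, 0 through 10.
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

mean_shortcut = n * p                                       # the formula above
mean_from_definition = sum(x * prob for x, prob in enumerate(pmf))

print(f"n * p            = {mean_shortcut:.4f}")
print(f"sum of x * P(x)  = {mean_from_definition:.4f}")
```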

Variance of Binomial Distribution

The formula for the variance of the binomial distribution is the following:

σ² = npq

As before, n and p are the number of trials and success probability, respectively. The failure probability is q, which equals 1 − p.

Notice that, for a given number of trials, the variance of the binomial distribution is at its maximum when the probabilities for success and failure are both 0.5. As those probabilities move away from 0.5 in opposite directions, the variance decreases. The variance also increases as the number of trials increases.

For our die example we have n = 10 rolls, a success probability of p = 0.1667, and a failure probability of q = 0.8333.

10 * 0.1667 * 0.8333 = 1.3891

The variance for this binomial distribution is 1.3891.

The variance of the binomial distribution represents the variability of the number of successes around the mean of the binomial distribution. Variances use squared units. Learn more about Variances. The standard deviation is the square root of the variance of the binomial distribution.
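The short sketch below recomputes the variance for the die example, takes its square root to get the standard deviation, and then tries several success probabilities to illustrate that the variance peaks at 0.5 for a fixed number of trials. The function name `binomial_variance` and the list of probabilities are illustrative choices.

```python
from math import sqrt

def binomial_variance(n, p):
    """Variance of a binomial distribution: n * p * q, where q = 1 - p."""
    return n * p * (1 - p)

variance = binomial_variance(10, 0.1667)
std_dev = sqrt(variance)
print(f"variance = {variance:.4f}, standard deviation = {std_dev:.4f}")

# With n fixed at 10, the variance is largest when p = 0.5.
for prob in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {prob:.1f}: variance = {binomial_variance(10, prob):.2f}")
```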

Binomial Probability Formula

The binomial distribution formula is the following:

P(X) = nCx * p^X * (1 − p)^(n − X)

where:

  • n is the number of trials.
  • X is the number of successes
  • p is the probability of a success.

Use this formula to calculate the binomial probability for X successes occurring in n trials.

nCx is the number of ways to obtain samples with the specified number of successes occurring within the set number of trials where the order of outcomes does not matter. Specifically, it’s the number of combinations without repetition. For more information, read my post about Finding Combinations.

The binomial distribution formula takes the number of combinations, multiplies that by the probability of success raised to the power of the number of successes, and multiplies that by the probability of failure raised to the power of the number of failures.

Let’s work through an example calculation to bring the formula to life!

Worked Example of Finding a Binomial Probability

We’ll use the binomial distribution formula to calculate the chances of rolling exactly three sixes in ten die rolls for this example. Here are the values to enter into the formula:

  • n = 10
  • X = 3
  • p = 0.1667

For the number of combinations, we have:

10C3 = 10! / (3! * (10 − 3)!) = 10! / (3! * 7!) = (10 * 9 * 8) / (3 * 2 * 1) = 120

Now, let’s enter our values into the binomial distribution formula.

P(X = 3) = 120 * 0.1667^3 * 0.8333^7 ≈ 120 * 0.004632 * 0.279004 ≈ 0.1551

This calculation by hand confirms the previous statistical software results within rounding error.
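If you’d like to double-check the hand calculation, the sketch below walks through the same steps one at a time: the number of combinations, the success term, the failure term, and their product. It assumes the rounded p = 0.1667 used in this example, and the variable names are illustrative.

```python
from math import factorial

n, x, p = 10, 3, 0.1667

# Step 1: number of combinations, n! / (x! * (n - x)!)
combinations = factorial(n) // (factorial(x) * factorial(n - x))

# Step 2: success probability raised to the number of successes
success_part = p**x

# Step 3: failure probability raised to the number of failures
failure_part = (1 - p)**(n - x)

# Step 4: multiply the three pieces together
probability = combinations * success_part * failure_part

print("combinations:", combinations)
print(f"p^x = {success_part:.6f}, q^(n-x) = {failure_part:.6f}")
print(f"probability = {probability:.4f}")
```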

If you need to calculate a cumulative probability for a binomial random variable, calculate the likelihood for each individual outcome and then sum them for all outcomes of interest.

For example, if you want to calculate the probability of ≥ 3 sixes in 10 rolls, calculate the likelihoods for three sixes, four sixes, etc., on up to ten sixes. Then sum that set of binomial probabilities.

In the calculator example, there is an 86.7% chance of having ≤ 3 statisticians in your sample of 20 people.
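You can verify that answer by summing the binomial probabilities for 0, 1, 2, and 3 statisticians. The sketch below is one way to do it, with n = 20 and p = 0.10 taken from the exercise; the variable names are illustrative.

```python
from math import comb

n, p = 20, 0.10  # sample of 20, where 10% of the population are statisticians

# P(X <= 3): add the probabilities for 0, 1, 2, and 3 statisticians.
at_most_three = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(4))
print(f"P(X <= 3) = {at_most_three:.3f}")
```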

Finally, the binomial and beta distributions are closely related.

Filed Under: Probability Tagged With: distributions, graphs

Reader Interactions

Comments

  1. Paul G says

    August 25, 2022 at 9:55 pm

    Is there some way to combine binomial distributions? Here’s an example. Ann, Bob, and Carol are shooting threes on a basketball court. Ann takes 50 shots and has a 30% success rate. Bob takes 30 shots and has a 20% success rate. Carol takes 20 shots and has a 10% success rate. I can use the cumulative binomial distribution to calculate the chance that Ann makes 10 or more shots or that Bob makes 10 or more shots. How do I calculate the probability that the three of them combine to make 20 or more shots?

  2. comefindme4b5e3b5254 says

    August 25, 2022 at 6:45 pm

    Would binomial distributions be suitable for determining the probability of a prisoner re-offending once released from prison? Thank you.

    Copyright © 2023 · Jim Frost · Privacy Policy