What Is Marginal Probability?
Marginal probability is the chance that an event will happen without considering other variables. Statisticians write this as P(A), denoting the probability of event A. You can think of it as an unconditional probability. It tells you how likely something is to happen on its own, regardless of what the other variables do.
For example, consider a bakery that sells bread and other products. Suppose you only care about how often customers buy bread — not whether they also buy coffee, pastries, or other products. The marginal probability of buying bread will give you this answer!
Marginal probabilities usually come from a joint probability distribution. This type of distribution shows the chances of two things happening together. When you look at just one part of the joint event, you find the marginal probability — the likelihood of that event by itself. If you know the joint distribution of two things (like bread and coffee sales), you can get the marginal distribution for bread by adding up all the combinations that include bread sales regardless of coffee.
The term “marginal probability” comes from the margins of contingency tables, where the totals for each variable appear. These marginal totals represent the likelihood of each outcome on its own, summed across the other variable.
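As a rough illustration of where those margins come from, the sketch below builds a small contingency table for the bakery example and computes its row and column totals. The counts are made up for illustration, and the variable names are hypothetical.

```python
# Hypothetical counts for 200 bakery customers (illustrative numbers only).
# First key element: bread purchase; second key element: coffee purchase.
counts = {
    ("bread", "coffee"): 60,
    ("bread", "no coffee"): 50,
    ("no bread", "coffee"): 30,
    ("no bread", "no coffee"): 60,
}

total = sum(counts.values())

# Row totals sit in the right-hand margin of the table;
# column totals sit in the bottom margin.
row_totals = {}
col_totals = {}
for (bread, coffee), n in counts.items():
    row_totals[bread] = row_totals.get(bread, 0) + n
    col_totals[coffee] = col_totals.get(coffee, 0) + n

# Dividing each marginal total by the grand total gives a marginal probability.
p_bread = {k: v / total for k, v in row_totals.items()}
p_coffee = {k: v / total for k, v in col_totals.items()}

print(p_bread)
print(p_coffee)
```

With these assumed counts, 110 of the 200 customers bought bread, so the marginal probability of buying bread is 110 / 200 = 0.55, whatever they did about coffee.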
Marginal probabilities help us understand one part of the bigger picture. Analysts use them in many fields, like economics, medicine, psychology, and marketing. Researchers and businesses rely on them to make decisions, predict outcomes, and understand data without getting lost in details.
In this post, you will learn what marginal probability is and how to calculate these probabilities with worked examples.
Related post: Probability Definition and Fundamentals
Marginal Probability Formula
The correct formula for calculating marginal probability depends on whether you’re working with discrete or continuous random variables. These formulas pull out the likelihood of one variable independently from a joint distribution that includes two or more variables. Below, we’ll look at how the formulas work for each data type with two variables.
Learn more about Discrete vs. Continuous Data: Differences & Examples and Joint Probability: Definition, Formula & Examples.
Discrete Variables
For variables you can count (like number of items sold), you calculate the marginal probability for a specific value by adding the joint probabilities across all combinations where the variable of interest equals the value you’re interested in.
The marginal probability formula for discrete variables is the following:
$$P(X = x) = \sum_{y} P(X = x, Y = y)$$
In the formula, X and Y are the names of the random variables, while x and y are the specific values those variables can take. On the left-hand side of the equation, we’re calculating the probability for a particular value of X.
On the right-hand side of the equation, the y beneath the summation sign, called the summation index, tells you to plug in the joint probabilities for all y values paired with X = x and add up the results.
Taken together, finding the marginal probability of X equaling a specific value involves adding the joint probabilities for all X-Y combinations where X equals your value of interest. In short, you’re ignoring Y and focusing on X.
Don’t worry, this process is simple, and we’ll work through it in an example below!
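The discrete formula can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `marginal_probability` and the joint probabilities are hypothetical, chosen only so that Y takes more than two values.

```python
def marginal_probability(joint, x):
    """Return P(X = x) by summing P(X = x, Y = y) over every y.

    `joint` maps (x, y) pairs to joint probabilities.
    """
    return sum(p for (x_val, _y_val), p in joint.items() if x_val == x)


# Hypothetical joint distribution (illustrative values only):
# X = loaves bought (0 or 1), Y = coffees bought (0, 1, or 2).
joint = {
    (0, 0): 0.25, (0, 1): 0.15, (0, 2): 0.05,
    (1, 0): 0.20, (1, 1): 0.25, (1, 2): 0.10,
}

# P(X = 1) = 0.20 + 0.25 + 0.10
p_one_loaf = marginal_probability(joint, 1)
print(round(p_one_loaf, 2))
```

The filter `x_val == x` plays the role of fixing X = x, and the sum runs over whatever y values appear in the table, mirroring the summation index in the formula.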
Continuous Variables
For variables measured on a continuous scale (like height and weight), you calculate the marginal probability density for a specific value by integrating (adding up tiny slices) over the joint densities across all values of the other variable.
The marginal probability formula for continuous variables is the following:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dy$$
In the formula, X and Y are the names of the random variables, while x and y are the specific values they can take. On the left-hand side, we’re calculating the probability density for a particular value of X.
On the right-hand side, the dy in the integral tells you to integrate the joint density with respect to y. You hold X fixed at x and accumulate the contributions across the full range of Y.
Taken together, to find the marginal probability density of X at a specific value, you integrate the joint density over all X-Y combinations where X equals your value of interest. Again, you’re ignoring Y and focusing on X.
The continuous formula may look more complex, but the basic idea resembles the discrete case. With discrete variables, we sum the joint probabilities across all y values paired with X = x. For continuous variables, we can’t just add individual probabilities because there are infinitely many possible y values.
Instead, we use integration, which works like a continuous version of summing. Integration adds up all the tiny contributions across the range of Y, effectively “collapsing” Y just like we do when summing in the discrete case. So, even though the continuous formula looks more advanced, it’s an extension of the same idea: we’re focusing on one variable by collapsing across the other.
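To make the "summing tiny slices" idea concrete, here is a small sketch that approximates the integral with a sum of thin slices. It assumes a toy joint density, f(x, y) = x + y on the unit square, picked only because its marginal is easy to verify by hand. The function names are hypothetical.

```python
def joint_density(x, y):
    """Toy joint density f(x, y) = x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_density_x(x, n_slices=1000):
    """Approximate f_X(x) with a midpoint Riemann sum over y in [0, 1]."""
    dy = 1.0 / n_slices
    return sum(joint_density(x, (i + 0.5) * dy) * dy for i in range(n_slices))


# For this toy density, calculus gives f_X(x) = x + 1/2 on [0, 1].
x = 0.3
approx = marginal_density_x(x)
exact = x + 0.5
print(approx, exact)
```

Each term in the sum is a joint density value times a small width dy, which is the discrete sum from earlier with very narrow slices of Y. As the slices get thinner, the sum approaches the integral.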
How to Calculate Marginal Probability
Step 1: Identify the Joint Probability Distribution
Start with the full picture. Find the joint probabilities for the two variables you’re studying. This information can come from a contingency table or a mathematical formula.
Step 2: Sum or Integrate Over the Other Variable
- For discrete variables: Add the joint probabilities across values of the other variable.
- For continuous variables: Integrate the joint probability density function over the range of the other variable.
Step 3: Interpret the Marginal Probability
The result indicates the likelihood of an event on its own, regardless of the value of the other variable.
Marginal Probability Worked Examples
Before we jump into examples, it’s helpful to remember the difference between discrete and continuous probabilities. With discrete variables, we calculate the likelihood of specific values—for example, the chance a customer buys apples.
With continuous variables, we work with ranges of values because the chance of hitting an exact value is zero. Consequently, we use the marginal probability distribution for a continuous variable to find the likelihood that a variable falls within a specific range, like between two heights.
Discrete probabilities follow a probability mass function (PMF), while continuous probabilities follow a probability density function (PDF). Both types of functions describe how the chances are distributed across all possible values.
Learn more about Discrete and Continuous Probability Distributions: Definition & Calculations.
Discrete Variables
Suppose a store tracks whether customers buy apples (A) and bananas (B). You have this joint probability table:
| Apples (A) | Bananas (B) | P(A, B) |
| --- | --- | --- |
| Yes | Yes | 0.2 |
| Yes | No | 0.3 |
| No | Yes | 0.1 |
| No | No | 0.4 |
In this example, the third column, P(A, B), shows the joint probability of each combination of apple and banana sales. For instance, P(A = Yes, B = Yes) = 0.2 indicates there’s a 20% chance a customer buys both apples and bananas.
We want to calculate the marginal probability that a customer buys apples. That is the total likelihood of buying apples regardless of whether they buy bananas.
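The discrete marginal probability formula for this example is the following:
$$P(A = \text{Yes}) = \sum_{b} P(A = \text{Yes}, B = b) = P(A = \text{Yes}, B = \text{Yes}) + P(A = \text{Yes}, B = \text{No})$$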
We need to find all the combinations that include A = Yes and sum them. We skip the combinations where A = No because we only want the probabilities where A equals the value of interest (Yes). So, we add the following:
0.2 (apples + bananas) + 0.3 (apples only) = 0.5
So, the marginal probability of buying apples is 0.5.
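The same calculation can be checked with a short script. This sketch simply encodes the joint probability table from this example as a dictionary; the variable names are illustrative.

```python
# Joint probabilities from the apples and bananas table, keyed by (apples, bananas).
joint = {
    ("Yes", "Yes"): 0.2,
    ("Yes", "No"): 0.3,
    ("No", "Yes"): 0.1,
    ("No", "No"): 0.4,
}

# Marginal for apples: sum over both banana outcomes where apples == "Yes".
p_apples_yes = sum(p for (a, b), p in joint.items() if a == "Yes")

# The same idea applied to bananas, summing over both apple outcomes.
p_bananas_yes = sum(p for (a, b), p in joint.items() if b == "Yes")

print(round(p_apples_yes, 2))   # 0.5
print(round(p_bananas_yes, 2))  # 0.3
```

Swapping which position in the key you filter on is all it takes to marginalize over the other variable. From the same table, the marginal probability of buying bananas is 0.2 + 0.1 = 0.3.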
Continuous Variables
Imagine you’re studying the height and weight of a group of people. You have a joint probability density function for height (H) and weight (W). This distribution tells you how densely the group is concentrated around each combination of height and weight.
To get the marginal probability density for height, you integrate over the joint densities across all weights paired with the specific height you’re interested in:
$$f_H(h) = \int_{-\infty}^{\infty} f_{H,W}(h, w)\, dw$$
This formula means you’re adding up all the tiny pieces (densities) across every weight where height equals your value of interest. This process produces a height probability density that disregards weight.
In practice, a computer usually handles this calculation. But the key idea is simple: you are “integrating out” or “collapsing across” the weight variable to isolate the probability density function for height.
Because height is continuous, you use the marginal density function to find the likelihood that a person’s height falls within a specific range, such as between 5.5 and 6 feet. You get that likelihood by integrating the marginal density over the range.
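Here is one way a computer might carry out this calculation numerically. The sketch assumes that height and weight follow a bivariate normal distribution. The means, standard deviations, and correlation are hypothetical values chosen for illustration, not real measurements, and the function names are also made up.

```python
import math

# Hypothetical parameters (assumptions for illustration, not real data).
MU_H, SD_H = 5.6, 0.3      # height in feet
MU_W, SD_W = 160.0, 25.0   # weight in pounds
RHO = 0.5                  # assumed height-weight correlation


def joint_density(h, w):
    """Bivariate normal density f(h, w) with the parameters above."""
    zh = (h - MU_H) / SD_H
    zw = (w - MU_W) / SD_W
    norm = 2.0 * math.pi * SD_H * SD_W * math.sqrt(1.0 - RHO**2)
    expo = -(zh**2 - 2.0 * RHO * zh * zw + zw**2) / (2.0 * (1.0 - RHO**2))
    return math.exp(expo) / norm


def marginal_height_density(h, n=400):
    """Approximate f_H(h) with a midpoint sum over weights within 8 SDs of the mean."""
    lo, hi = MU_W - 8.0 * SD_W, MU_W + 8.0 * SD_W
    dw = (hi - lo) / n
    return sum(joint_density(h, lo + (i + 0.5) * dw) * dw for i in range(n))


def prob_height_between(a, b, n=200):
    """Approximate P(a <= H <= b) by integrating the marginal density over [a, b]."""
    dh = (b - a) / n
    return sum(marginal_height_density(a + (i + 0.5) * dh) * dh for i in range(n))


p = prob_height_between(5.5, 6.0)
print(round(p, 3))
```

Under the bivariate normal assumption, the marginal distribution of height is itself normal with mean `MU_H` and standard deviation `SD_H`. That gives a handy check on the numerical result: it should closely match the probability from a normal distribution for height alone.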
