What is Joint Probability?
Joint probability is the likelihood that two or more events will coincide. Knowing how to calculate joint probabilities allows you to solve problems such as the following. What is the probability of:
- Getting two heads in two coin tosses?
- Consecutively drawing two aces from a deck of cards?
- The next customer being a woman who buys a Mac computer?
- A bike rental customer getting both a flat front tire and a flat rear tire?
Statisticians use the notation P(A ∩ B) to indicate the joint probability of events “A” and “B” occurring together. For example, P(F ∩ Mac) denotes the likelihood of a female buying a Mac. Equivalent variants of this notation include P(A and B) and P(A, B).
The notation includes the symbol “∩,” which signifies an intersection. This intersection specifies how two or more events, such as A and B, coincide. Consequently, joint probability is also known as the intersection of events.
A Venn diagram is a useful visual tool for understanding intersections because it shows the overlap between sets or events.
There are several ways to find joint probabilities. The following sections discuss three standard methods: joint probability tables, the formula for independent events, and the formula for dependent events.
Related post: Probability Definition and Fundamentals
Joint Probability Table
When dealing with multiple events, creating a table to organize the likelihoods can be helpful. A joint probability table lists the chances of event combinations at each row and column intersection.
Remember how ∩ represents an intersection? That makes sense in a table!
For example, suppose a survey asks people about their favorite color and animal. The researchers organize the results in the table below:
Color | Cat | Dog | Other |
Red | 0.10 | 0.05 | 0.03 |
Green | 0.08 | 0.12 | 0.05 |
Blue | 0.03 | 0.06 | 0.02 |
We want to find the likelihood that someone chooses red as their favorite color and a dog as their favorite animal. We can locate the intersection of the “Red” row and the “Dog” column, which is 0.05.
Therefore, the joint probability is:
P(Red ∩ Dog) = 0.05
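The table lookup can also be sketched in code. The Python snippet below is a minimal illustration that stores the survey table as a nested dictionary and reads off the row and column intersection. The names `joint_table` and `joint_probability` are hypothetical, chosen only for this example.

```python
# Joint probability table from the survey example,
# keyed first by favorite color (row), then by favorite animal (column).
joint_table = {
    "Red":   {"Cat": 0.10, "Dog": 0.05, "Other": 0.03},
    "Green": {"Cat": 0.08, "Dog": 0.12, "Other": 0.05},
    "Blue":  {"Cat": 0.03, "Dog": 0.06, "Other": 0.02},
}


def joint_probability(table, color, animal):
    """Return P(color ∩ animal) by reading the row/column intersection."""
    return table[color][animal]


p_red_dog = joint_probability(joint_table, "Red", "Dog")
print(f"P(Red ∩ Dog) = {p_red_dog}")
```

The same function works for any other cell, such as `joint_probability(joint_table, "Green", "Dog")`.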
If you have a contingency table that displays frequencies rather than likelihoods, you can use it to calculate joint probabilities. For instance, the previous table might have started as a regular contingency table. Learn how in my post, Using Contingency Tables to Calculate Probabilities.
Now let’s see how to calculate joint probabilities when you know the event likelihoods.
Joint Probability Formula for Independent Events
When two events are independent, the occurrence of one event does not affect the chances of the other event occurring. In this case, we can find the joint probability by multiplying the likelihood of one event by the likelihood of another.
The joint probability formula for independent events is the following:
P(A ∩ B) = P(A) * P(B)
For example, suppose we have a coin that we flip twice. We want to find the chances of getting heads on both the first and second flips. Because each flip is independent, the probability of the first heads is 1/2, and the likelihood of heads on the second flip is also 1/2. Therefore, the joint probability is the following:
P(H1 ∩ H2) = P(heads on first flip) x P(heads on second flip)
= 1/2 x 1/2
= 1/4
Similarly, the joint probability of rolling a six on each of two rolls of a six-sided die is the following:
P(6₁ ∩ 6₂) = P(6 on first roll) x P(6 on second roll)
= 1/6 x 1/6 = 1/36
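Both calculations can be reproduced with a short Python sketch. It uses the standard library's `Fraction` type so the results stay as exact fractions. The helper name `joint_independent` is a hypothetical choice for this illustration, and the function is only valid when the events really are independent.

```python
from fractions import Fraction


def joint_independent(*probabilities):
    """Multiply the probabilities of independent events together."""
    result = Fraction(1)
    for p in probabilities:
        result *= p
    return result


# Two heads in two flips of a fair coin.
two_heads = joint_independent(Fraction(1, 2), Fraction(1, 2))

# A six on each of two rolls of a fair six-sided die.
two_sixes = joint_independent(Fraction(1, 6), Fraction(1, 6))

print(two_heads)  # 1/4
print(two_sixes)  # 1/36
```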
Related post: Independent Events
Formula for Dependent Events
We can use the general multiplication rule to calculate joint probabilities for dependent events. This rule allows us to factor in how the occurrence of one event affects the likelihood of the other event. Learn more about the Multiplication Rule.
The joint probability formula for dependent events is the following:
P(A ∩ B) = P(A) * P(B|A)
Here, P(A) represents the chances of event A occurring, while P(B|A) represents the conditional probability of event B occurring, given that event A has already happened. By multiplying these two likelihoods, we can calculate the joint probability of both events coinciding.
To solve this type of problem, you must know how the first event affects the likelihood of the second event.
Related post: Conditional Probability: Definition, Formula & Examples
Example
Suppose we need to calculate the likelihood of drawing two aces consecutively from a standard deck of 52 cards when we don’t replace the cards. Initially, the deck contains four aces, so the likelihood of drawing an ace on the first draw is 4/52 or 1/13. If we draw an ace (event A1), only three aces and 51 cards remain in the deck. Consequently, the conditional probability of drawing another ace (event A2) is now 3/51.
Using the general multiplication rule, we can find the joint probability of drawing two aces in a row:
P(A1 ∩ A2) = P(A1) * P(A2|A1)
P(A1) = 4/52 = 1/13
P(A2|A1) = 3/51
P(A1 ∩ A2) = (1/13) * (3/51) = 3/663 = 1/221
So, the joint probability of drawing two aces in a row is 1/221 or 0.0045.
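The two-aces calculation can be checked with a brief Python sketch. It applies the general multiplication rule using exact fractions, then confirms the answer by counting every ordered pair of distinct cards. The function name `joint_dependent` and the choice to label the four aces as cards 0 through 3 are assumptions made only for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def joint_dependent(p_a, p_b_given_a):
    """General multiplication rule: P(A ∩ B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a


p_first_ace = Fraction(4, 52)               # P(A1): four aces among 52 cards
p_second_ace_given_first = Fraction(3, 51)  # P(A2|A1): three aces among 51 cards
p_two_aces = joint_dependent(p_first_ace, p_second_ace_given_first)

# Cross-check by enumeration: cards are numbered 0..51 and cards 0..3 are the aces.
ordered_draws = list(permutations(range(52), 2))  # draws without replacement
ace_pairs = sum(1 for first, second in ordered_draws if first < 4 and second < 4)
p_by_counting = Fraction(ace_pairs, len(ordered_draws))

print(p_two_aces)                    # 1/221
print(round(float(p_two_aces), 4))   # 0.0045
print(p_by_counting == p_two_aces)   # True
```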
In conclusion, joint probabilities are a powerful tool in statistics. They can help model complex systems and support more informed decisions. Choosing the correct method to calculate them depends on the specific problem at hand.
Woody says
Hi Jim
I am a bit confused about the representation of the probabilities of two independent events. If two events are independent, isn’t the intersection of their sample spaces supposed to be an empty set? Then the corresponding probability should always be 0? P(A|B) = P(A) is much more understandable …
Jim Frost says
When we say that two events, A and B, are independent, it doesn’t imply that their intersection is empty. In fact, if the intersection were empty (meaning A and B cannot happen at the same time), they would be mutually exclusive, not independent.
Independence between two events means that the occurrence of one event does not affect the probability of occurrence of the other event. Mathematically, we express this as:
P(A ∩ B) = P(A) × P(B)
The formula you mentioned:
P(A|B) = P(A)
is a way to express the independence of two events. It means that the probability of A occurring, given that B has occurred, is just the probability of A occurring by itself, which is further evidence that B’s occurrence has no effect on A.
Tuan Vu says
Hi,
I have a problem when comparing the probability calculated with a Venn diagram and with the joint formula for a survey. In the survey, 100 students were asked about their favorite color and pet. Among them, 15 students like the color red, 25 students like dogs, and 5 students like both red and dogs. Using the joint formula, I get the probability of picking a student who likes both the color red and dogs: P(A and B) = P(A) x P(B) = 0.15 x 0.25 = 0.0375. Meanwhile, using the Venn diagram, I get P(A and B) = 5/100 = 0.05. Please help me understand why the results are different. Thank you very much.
Best regards,
Tuan
Tom Kendall says
Hi Jim,
Not sure that this is the proper venue for this question. On December 8, 2023, two jackpot-winning tickets in the Mega Millions lottery game were sold at the same retail location, specifically a Chevron gas station in Los Angeles. Would you discuss the statistics involved in such an occurrence? As background, the odds of one ticket winning are one in 302,575,350. While having two winning tickets in a draw is a rather remote possibility, having them both sold at the same retail location is beyond incredible.
Harmeet says
Hi, when did you write this article? I need to reference it in my assignment. Could you please help me with the year?
Jim Frost says
Hi!
When citing online resources, you typically use an “Accessed” date rather than a publication date because online content can change over time. For more information, read Purdue University’s Citing Electronic Resources.
Dr Peter Altman says
Hi – I have a question related to the Birthday Paradox. In a room of 200 people, what are the chances that 2 people will share the same specific birthday and year, for example, 10 October 1941? Many thanks.
Jim Frost says
Hello, that’s a rather complicated question. I can tell you that the probability of at least two people having a given birthday (month and day, e.g., October 10) in a group of 200 is 0.42. We’ll use that value later.
However, adding in the year is difficult. I’d need to know the distribution of birth years in the overall population, and you’d have to assume that the people in the room were randomly selected. However, a group of 200 often has a non-random mix of birth years for some reason, such as a reunion, anniversary party, or class. In those cases, you’d expect truncated distributions centered around different values.
I would think that, using the birth year distribution for the general population, 1941 would be relatively rare to begin with. So having two people from that year is rare, and having them also share the month and day would be even rarer. However, at an event such as a class reunion, most people might have been born in that year!
Let’s go with the general population distribution for the US. I did some quick research and back-of-the-envelope calculations. Apparently, about 2.7 million people were born in the US in 1941. About half are alive today. So, given the current population of the US is 332 million, 1.35m / 332m = 0.004. Hence, approximately 0.4% of the current population was born in 1941. So, to calculate the probability, we just multiply 0.004 * 0.004 * 0.42 = 0.00000672. That’s your probability assuming you drew randomly from the US population. That’s minuscule!
However, if you’re not working with a situation where you’d expect the general population distribution or you’re outside the US, the results can be markedly different. For instance, if you’re working with a situation where you’d expect more older people (e.g., a class reunion or wedding anniversary for an older couple), then the probability might be notably higher. Also, some states have older populations. Maine and Florida have the highest percentages of those over 65. If you’re in one of those states, the probability will be higher.
So, I can’t give you a precise answer but you have an idea of the factors at play!
Jennie says
Why would you multiply 0.004 twice?
Jim Frost says
Hi Jennie,
That’s because 0.004 is the probability that one person drawn randomly from the U.S. population was born in 1941. The original question asks about two people being born that year, so we need to multiply the two values.