An odds ratio (OR) quantifies the relationship between a variable and the likelihood of an event occurring. A common use for odds ratios is identifying risk factors by assessing the relationship between exposure to a risk factor and a medical outcome. For example, is there an association between exposure to a chemical and a disease?
To use an odds ratio, you must have a binary outcome. And you’ll need either a grouping variable or a continuous variable that you want to relate to your event of interest. Then, use an odds ratio to assess the relationship between your variable and the likelihood that an event occurs.
When you have a grouping variable, an odds ratio answers the question, is an event more or less likely to occur in one condition or another? It measures the odds of an outcome occurring in one context relative to a baseline or control condition. For example, your grouping variable can be a subject’s exposure to a risk factor—yes or no—to see how that relates to disease status.
With a continuous variable, an odds ratio can determine whether the odds of an event occurring change as the continuous variable changes.
In this post, learn about odds and odds ratios, how to calculate and interpret them, different ways to arrange them for several types of studies, and how to interpret their confidence intervals and p-values.
What Are Odds in Statistics?
Before you can understand an odds ratio, you must know what the odds of an event represent. In common usage, people tend to use odds and probability interchangeably. However, in statistics, odds have an exact definition. They are calculated from probabilities, but they are not themselves a probability.
Odds relate to a binary outcome where the outcome either occurs or does not occur. For example, study subjects were either infected or not infected. A person graduates or does not graduate from college. You win a game, or you lose.
Odds definition: The probability of the event occurring divided by the probability of the event not occurring.
Odds = P(event occurs) / P(event does not occur) = p / (1 - p)
As you can see from the formula, odds tell you how likely an event is to occur relative to it not happening. For example, imagine playing a die-rolling game where a six is very good. Your odds of rolling a six are the following:
Odds of rolling a six = (1/6) / (5/6) = 1/5 = 0.20
Your odds of rolling a six are 0.20, or 1 to 5. Because both probabilities share the same denominator (the six possible die outcomes), you can replace 1/6 and 5/6 in the formula with 1 and 5 to derive the same answer (1/5 = 0.20). I’ll use that format in the examples throughout this post.
Imagine you’re playing a game. If your odds of winning are 2 (or 2 wins to 1 loss), that indicates you are twice as likely to win as to lose. On the other hand, if your odds of winning are 0.5 (or 1 win to 2 losses), you’re half as likely to win as to lose.
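Here is a minimal Python sketch of the conversion between probability and odds described above. The function names are my own illustrative choices and do not come from any particular library.

```python
def odds_from_probability(p):
    """Return the odds of an event given its probability p (0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must be at least 0 and less than 1")
    return p / (1 - p)


def probability_from_odds(odds):
    """Return the probability of an event given its odds (odds >= 0)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


# Rolling a six: probability 1/6, so odds of 1 to 5 (about 0.20)
six_odds = odds_from_probability(1 / 6)

# Winning two games for every loss: probability 2/3, so odds of about 2
win_odds = odds_from_probability(2 / 3)

# Odds of 0.5 (1 win to 2 losses) correspond to a probability of about 1/3
half_odds_probability = probability_from_odds(0.5)
```

Note that odds can exceed 1 while a probability never can, which is one quick way to tell the two apart.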
As you can see, the odds of an event occurring are themselves a ratio. Therefore, an OR is a ratio of two ratios.
Related post: Probability Fundamentals
Odds Ratios for Two Conditions
Odds ratios with groups quantify the strength of the relationship between two conditions. They indicate how likely an outcome is to occur in one context relative to another.
The formula below shows an odds ratio for conditions A and B.
Odds ratio = Odds of the event in condition A / Odds of the event in condition B
The odds in the denominator (condition B) are the baseline or control group. Consequently, the ratio tells you how much higher or lower the odds of the event are in the numerator condition (A) relative to the denominator condition (B).
If you have a treatment and control group, the treatment will be in the numerator while the control group is in the denominator. This arrangement will tell you how your treatment group fares compared to the controls.
For example, a study assesses infections in a treatment and control group. Infections are the events for the binary outcome. By calculating the following OR, analysts can determine how the odds of infection in the treatment group compare to the control group.
Odds ratio = Odds of infection in the treatment group / Odds of infection in the control group
If the treatment is effective, the odds of infections in the treatment group will be lower than the control group, which produces an odds ratio of less than one.
Let’s move on to more interpretation details!
Related post: Control Groups in Experiments
How to Interpret Odds Ratios
Due to the nature of odds ratios, the value of one becomes critical because it indicates both conditions have equal odds. Consequently, analysts always compare their OR results to one when interpreting the results. As the odds ratio moves away from one in either direction, the association between the condition and outcome becomes stronger.
Odds Ratio = 1: The ratio equals one when the numerator and denominator are equal. This equivalence occurs when the odds of the event occurring in one condition equal the odds of it happening in the other condition. There is no association between condition and event occurrence.
Odds Ratio > 1: The numerator is greater than the denominator. Hence, the event’s odds are higher for the group/condition in the numerator. This is often a risk factor.
Odds Ratio < 1: The numerator is less than the denominator. Hence, the event’s odds are lower for the group/condition in the numerator. This can be a protective factor.
In the hypothetical infection experiment, the researchers hope that the odds ratio is less than one because that indicates the treatment group has lower odds of becoming infected than the control group.
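The three interpretation rules above can be sketched as a small Python helper. The function name and the wording of the returned labels are my own, and the tolerance for treating a value as equal to one is an arbitrary assumption. This only describes a point estimate; it says nothing about statistical significance.

```python
import math


def describe_odds_ratio(odds_ratio, tolerance=1e-9):
    """Describe the direction of association implied by an odds ratio."""
    if odds_ratio < 0:
        raise ValueError("an odds ratio cannot be negative")
    if math.isclose(odds_ratio, 1.0, abs_tol=tolerance):
        return "equal odds in both conditions (no association)"
    if odds_ratio > 1:
        return "higher odds in the numerator condition (possible risk factor)"
    return "lower odds in the numerator condition (possible protective factor)"


print(describe_odds_ratio(0.4))
```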
Caution: Odds ratios measure association and do not necessarily represent causal relationships!
How to Calculate an Odds Ratio
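The equation below expands the earlier formula for calculating an odds ratio with two conditions (A and B). Again, it’s the ratio of two odds. Hence, the numerator and denominator are also ratios.
Odds ratio = Odds of the event in A / Odds of the event in B = (Events in A / Non-events in A) / (Events in B / Non-events in B)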
In the infection example above, we assessed the relationship between treatment and the odds of being infected. Our two conditions were the treatment (condition A) and the control group (B). On the right-hand side, we’d enter the numbers of infections (events) and non-infections (non-events) from our sample for both groups.
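As a rough sketch, the two-condition calculation looks like this in Python. The function names are mine, and the infection counts are hypothetical numbers invented purely for illustration; they do not come from any study.

```python
def odds(events, non_events):
    """Odds of an event: the number of events divided by non-events."""
    if non_events == 0:
        raise ValueError("odds are undefined when there are no non-events")
    return events / non_events


def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds ratio of condition A (numerator) relative to condition B (baseline)."""
    baseline_odds = odds(events_b, non_events_b)
    if baseline_odds == 0:
        raise ValueError("odds ratio is undefined when the baseline odds are zero")
    return odds(events_a, non_events_a) / baseline_odds


# Hypothetical counts, for illustration only:
#   treatment group (A): 10 infected, 90 not infected
#   control group (B):   25 infected, 75 not infected
example_or = odds_ratio(10, 90, 25, 75)
print(round(example_or, 2))  # 0.33
```

With these made-up counts, the odds ratio is below one, which is the pattern you would expect if the treatment reduces the odds of infection.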
Example Calculations for Two Groups
Let’s use data from an actual study to calculate an odds ratio. The North Carolina Division of Public Health needed to identify risk factors associated with an E. coli outbreak. We’ll calculate the OR for one risk factor, but they assessed multiple possibilities in their study.
In this study, the event is an exposure to a risk factor for E. coli infection. Our two conditions are those who are sick versus not sick. It’s an example of a case-control study, which analysts use to identify candidate risk factors using odds ratios.
| | Got Sick (Cases) | Did Not Get Sick (Controls) |
| Visited Petting Zoo | 36 | 64 |
| Did Not Visit | 9 | 123 |
By plugging these numbers into the odds ratio formula, we can calculate the odds ratio to assess the relationship between visiting a petting zoo and becoming infected by E. coli. In case-control studies, all infected cases go in the numerator while the uninfected controls go in the denominator. The next section explains why.
Odds ratio = (36 / 9) / (64 / 123) ≈ 4 / 0.52 ≈ 7.7
The OR indicates that the odds of having visited the petting zoo were about 7.7 times as high among those who became infected with E. coli (cases) as among those without symptoms (controls). That’s a big red flag for the petting zoo being the E. coli source!
This study also assessed whether awareness of the disease risk from contacting livestock was a protective factor. For this factor, the study found an odds ratio of 0.1.
This value indicates that the odds of being aware of the disease risk from contacting livestock were only one-tenth as high among those who became infected with E. coli as among those who were not infected. Knowledge is power! Presumably, those who were aware of the risk took precautions!
Related post: Case-Control Studies
Different Odds Ratio Arrangements
You might have noticed differences between the treatment and control group experiment and the case-control study’s odds ratio arrangements. Different types of studies require specific odds ratios.
For the experiment, we put the treatment group odds in the numerator and the control group odds in the denominator. Both odds in the ratio relate to infections and divide the number of infections by the number of uninfected. This arrangement allows you to see how the odds of disease in the treatment group compare to the control group.
However, in case-control studies, you put only the cases (sick) in the numerator and the controls (healthy) in the denominator. Both odds in the ratio relate to exposure rather than illness. You take the number of exposures and divide it by the non-exposures for both the case and control groups. Case-control studies use this arrangement because they start with the disease outcome as the basis for sample selection, and then the researchers need to identify risk factors.
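One reason the case-control arrangement works is that the odds ratio is algebraically symmetric: the ratio of exposure odds (cases versus controls) equals the ratio of disease odds (exposed versus unexposed), and both equal the cross-product of the 2x2 table. The sketch below checks this with the petting zoo counts from this post. The variable names are my own. Keep in mind that the disease odds within each exposure group are not meaningful on their own in a case-control sample, because the researchers chose how many cases and controls to include; only the ratio carries over.

```python
import math

# Counts from the petting zoo table
cases_exposed = 36        # got sick, visited the petting zoo
cases_unexposed = 9       # got sick, did not visit
controls_exposed = 64     # did not get sick, visited the petting zoo
controls_unexposed = 123  # did not get sick, did not visit

# Case-control arrangement: odds of exposure among cases vs. controls
exposure_or = (cases_exposed / cases_unexposed) / (controls_exposed / controls_unexposed)

# Experiment-style arrangement: odds of illness among exposed vs. unexposed
disease_or = (cases_exposed / controls_exposed) / (cases_unexposed / controls_unexposed)

# Cross-product shortcut for a 2x2 table
cross_product_or = (cases_exposed * controls_unexposed) / (controls_exposed * cases_unexposed)

print(round(exposure_or, 1))  # 7.7
print(math.isclose(exposure_or, disease_or))  # True
```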
Odds Ratios for Continuous Variables
When you perform binary logistic regression using the logit transformation, you can obtain odds ratios for continuous variables. Those calculations are more complex and go beyond the scope of this post. However, I will show you how to interpret them.
Unlike the groups in the previous examples, a continuous variable can increase or decrease in value. Fortunately, the interpretation of an odds ratio for a continuous variable is similar and still centers around the value of one. When an odds ratio is:
- Greater than 1: As the continuous variable increases, the event is more likely to occur.
- Less than 1: As the variable increases, the event is less likely to occur.
- Equal to 1: As the variable increases, the likelihood of the event does not change.
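For readers curious where these odds ratios come from: in a logistic regression with the logit link, the odds ratio for a one-unit increase in a predictor is e raised to that predictor's coefficient. The sketch below illustrates only that relationship. The function name and the coefficient values are hypothetical and are not taken from any model in this post.

```python
import math


def odds_ratio_from_coefficient(coefficient, step=1.0):
    """Odds ratio for a `step`-sized increase in a logistic regression predictor."""
    return math.exp(coefficient * step)


# Hypothetical logit coefficients, for illustration only
positive_or = odds_ratio_from_coefficient(0.5)   # greater than 1: odds rise
negative_or = odds_ratio_from_coefficient(-0.5)  # less than 1: odds fall
flat_or = odds_ratio_from_coefficient(0.0)       # exactly 1: odds unchanged
```

A positive coefficient always maps to an odds ratio above one, a negative coefficient to one below it, and a coefficient of zero to exactly one, which mirrors the three cases in the list above.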
Example for Continuous Variables
In another post, I performed binary logistic regression and obtained odds ratios for two continuous independent variables. Let’s interpret them!
In that post, I assess whether measures of conservativeness and establishmentarianism predict membership in the Freedom Caucus within the U.S. House of Representatives in 2014.
Here’s how you interpret these scores:
- Conservativeness: Higher scores represent more conservative viewpoints.
- Establishmentarianism: Higher scores represent viewpoints that favor the political establishment.
For this post, I’ll focus on the odds ratios for this binary logistic model. For more details, read the full post: Statistical Analysis of the Republican Establishment Split.
The odds ratio for conservativeness indicates that for every 0.1 increase (the unit of change) in the conservativeness score, the odds that a House member belongs to the Freedom Caucus are multiplied by ~2.7.
Conversely, the odds ratio for establishmentarianism indicates that for every 0.1 increase in the establishmentarianism score, the odds that a House member belongs to the Freedom Caucus are only ~73% as high.
Taking both results together, House members who are more conservative and less favorable towards the establishment make up the Freedom Caucus.
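Because a logistic model is linear in the log odds, an odds ratio for one unit of change compounds multiplicatively over several units. Here is a small sketch using the approximate odds ratios quoted above (~2.7 and ~0.73 per 0.1 increase). The function name is my own, and the results inherit the rounding of those quoted values.

```python
def scaled_odds_ratio(odds_ratio_per_step, n_steps):
    """Odds ratio implied by n_steps consecutive increases of one step each."""
    return odds_ratio_per_step ** n_steps


conservativeness_or = 2.7    # approximate, per 0.1 increase (from the text)
establishment_or = 0.73      # approximate, per 0.1 increase (from the text)

# A 0.2 increase is two steps of 0.1
print(round(scaled_odds_ratio(conservativeness_or, 2), 2))  # 7.29
print(round(scaled_odds_ratio(establishment_or, 2), 2))     # 0.53
```

These figures describe changes in odds, not in probability. Translating them into probabilities requires knowing the baseline odds, which the odds ratio alone does not provide.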
Confidence Intervals and P-values for Odds Ratios
So far, we’ve only looked at the point estimates for odds ratios. Those are the sample estimates that are a single value. However, sample estimates always have a margin of error thanks to sampling error. Confidence intervals and hypothesis tests (p-values) can account for that margin of error when you’re using samples to draw conclusions about populations (i.e., inferential statistics). Sample statistics are always wrong to some extent!
As with any hypothesis test, there is a null and alternative hypothesis. In the context of ORs, the value of one represents no effect. Hence, these hypotheses focus on that value.
- Null Hypothesis: The odds ratio equals 1 (no relationship).
- Alternative Hypothesis: The odds ratio does not equal 1 (relationship exists).
If the p-value for your odds ratio is less than your significance level (e.g., 0.05), reject the null hypothesis. The difference between your sample’s odds ratio and one is statistically significant. Your data provide sufficient evidence to conclude that a relationship between the variable and the event’s probability exists in the population.
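To make this concrete, here is a sketch of one common way to get a p-value for an odds ratio from a 2x2 table: a Wald z-test on the natural log of the odds ratio. This is an assumption on my part for illustration. Statistical software may instead use a chi-square, likelihood-ratio, or Fisher's exact test, and the function name is hypothetical. The counts are from the petting zoo table in this post.

```python
import math


def wald_test_for_odds_ratio(a, b, c, d):
    """Two-sided Wald test of H0: odds ratio = 1 for a 2x2 table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls (all must be > 0).
    """
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z_stat, p_value = wald_test_for_odds_ratio(36, 64, 9, 123)
print(p_value < 0.05)  # True
```

Under this approximation, the petting zoo odds ratio differs significantly from one at the 0.05 level.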
Alternatively, you can use the confidence interval for an odds ratio to draw the same conclusions. If your CI excludes 1, your results are significant. However, if your CI includes 1, you can’t rule out 1 as a plausible value. Consequently, your results are not statistically significant.
The confidence intervals for the two Freedom Caucus odds ratios both exclude 1. Hence, they are statistically significant.
Additionally, the width of the confidence interval indicates the precision of the estimate. Narrower intervals represent more precise estimates.
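As an illustration, the sketch below computes an approximate confidence interval for the petting zoo odds ratio using the log (Woolf) method: take the natural log of the OR, add and subtract a critical value times the standard error, then exponentiate. This is one common approximation and is not necessarily the method the original study used. The function name is my own.

```python
import math
from statistics import NormalDist


def odds_ratio_confidence_interval(a, b, c, d, confidence=0.95):
    """Approximate CI for the odds ratio of a 2x2 table (log method).

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls (all must be > 0).
    """
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Critical value of the standard normal, about 1.96 for 95% confidence
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(log_or - critical * standard_error)
    upper = math.exp(log_or + critical * standard_error)
    return lower, upper


lower, upper = odds_ratio_confidence_interval(36, 64, 9, 123)
print(lower > 1)  # True: the interval excludes 1
```

The interval is not symmetric around the point estimate because it is built on the log scale. A smaller sample would inflate the standard error and widen the interval, reflecting a less precise estimate.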
Related posts: Descriptive vs. Inferential Statistics and Hypothesis Testing Overview