What is Survivorship Bias?
Survivorship bias, or survivor bias, occurs when you tend to assess successful outcomes and disregard failures. This sampling bias paints a rosier picture of reality than is warranted by skewing the mean results upward.
Survivorship bias is a sneaky problem that tends to slip into analyses unnoticed. For starters, it feels natural to emphasize success, whether it’s businesses, entrepreneurs, or survivors of a medical condition. We focus on and share these stories more than the failures. Additionally, successful cases are usually more visible and easier to contact than unsuccessful cases. Reaching long-running businesses, successful mutual funds, and living patients is simpler.
However, focusing on the high-performing successes and disregarding other cases introduces survivorship bias. After all, you’re leaving out a significant part of the picture by not assessing the failures! Incomplete data can distort the results and lead you astray.
Survivorship bias also plays on our tendency to confuse correlation with causation. In this manner, it is like being swayed by anecdotal evidence. You see successful examples with particular attributes (correlation) and incorrectly assume that those attributes cause the success. You do not see the other cases with similar characteristics that didn’t perform well.
In short, survivorship bias produces an inaccurate sample, causing you to jump to incorrect conclusions.
Let’s dig into some survivorship bias examples, see how it works, and learn how to avoid it.
Survivorship Bias and Planes
A famous and early example of survivorship bias involves planes returning from missions during World War Two. The military wanted to put armor on the aircraft to protect vulnerable spots. However, they couldn’t place armor everywhere because it would be too heavy.
They looked at the bullet holes on the planes that returned, the survivors in this example. The military’s first inclination was to reinforce locations with the most hits. That seems to make sense. However, Abraham Wald, a mathematician, realized that survivorship bias was at work here.
The surviving planes got hit in the observed locations and still returned. Consequently, strengthening these locations isn’t a top priority. Instead, it’s critical to infer the missing data about where the non-returning planes were hit. Wald realized they needed to reinforce the locations where the returning planes were not hit. Clearly, the aircraft that got hit in those areas did not return!
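To see Wald's reasoning in action, here is a minimal simulation sketch. The plane sections, the lethality values, and the number of hits per plane are all made-up assumptions for illustration, not historical data. Hits land evenly across sections, but only the returning planes are available for inspection.

```python
import random

def simulate_missions(n_planes=10_000, seed=0):
    """Tally hits on all planes versus hits visible on planes that returned."""
    rng = random.Random(seed)
    # Hypothetical chance that a single hit in each section downs the plane.
    lethality = {"engine": 0.6, "cockpit": 0.5, "fuselage": 0.1, "wings": 0.1}
    sections = list(lethality)
    all_hits = dict.fromkeys(sections, 0)
    observed_hits = dict.fromkeys(sections, 0)
    for _ in range(n_planes):
        # Each plane takes 1 to 4 hits, spread evenly over the sections.
        hits = [rng.choice(sections) for _ in range(rng.randint(1, 4))]
        # The plane returns only if it survives every hit it took.
        survived = all(rng.random() > lethality[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                observed_hits[s] += 1
    return all_hits, observed_hits

all_hits, observed_hits = simulate_missions()
for section in all_hits:
    print(f"{section:9s} all hits: {all_hits[section]:6d}  "
          f"seen on returning planes: {observed_hits[section]:6d}")
```

Under these assumptions, every section takes about the same number of hits overall, yet the returning planes show far fewer engine and cockpit hits. The sparse areas on the survivors mark the vulnerable spots.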
Survivorship Bias Examples
There are plenty of survivorship bias examples that cover a range of areas. We saw the military example with the planes above. Let’s look at some more examples! For each of these examples of survivorship bias, notice how crucial information is downplayed or even excluded entirely, like the non-returning planes.
Consequently, you’re missing a big part of the picture when assessing these matters.
Businesses and Mutual Funds
Business and financial analysts frequently assess the financial health of firms and investment return information for mutual funds. In both cases, survivorship bias plays a role. Successful, long-running businesses and mutual funds are more prominent and easier to contact than their defunct counterparts. Psychologically, we want to see how successful cases managed to do well.
By assessing only surviving businesses and mutual funds, analysts will record positively biased financial and investment information. They’re missing an entire segment of the population: defunct companies and mutual funds. They also won’t know whether other firms and funds with characteristics similar to the successful ones failed.
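The upward skew is easy to reproduce with a small simulation. The sketch below is illustrative only: the return distribution, the 2% closure threshold, and the function name simulate_funds are assumptions I chose for the example, not figures from real fund data.

```python
import random
import statistics

def simulate_funds(n_funds=5_000, years=10, seed=1):
    """Compare the mean return of all funds with the mean of surviving funds."""
    rng = random.Random(seed)
    all_returns = []
    survivor_returns = []
    for _ in range(n_funds):
        # Assumed: annual returns are normal with mean 5% and SD 15%.
        annual = [rng.gauss(0.05, 0.15) for _ in range(years)]
        mean_return = statistics.fmean(annual)
        all_returns.append(mean_return)
        # Assumed closure rule: funds averaging under 2% per year shut down.
        if mean_return >= 0.02:
            survivor_returns.append(mean_return)
    return (statistics.fmean(all_returns),
            statistics.fmean(survivor_returns),
            len(survivor_returns))

all_mean, survivor_mean, n_survivors = simulate_funds()
print(f"Mean annual return, all funds:      {all_mean:.3f}")
print(f"Mean annual return, survivors only: {survivor_mean:.3f}")
print(f"Surviving funds: {n_survivors}")
```

Every fund draws from the same distribution here, so no fund is genuinely better than another. The survivors still post a higher average return simply because the poor performers were removed from the sample.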
Famous College Dropout Entrepreneurs
Think about the famous college dropouts who became highly successful, such as Mark Zuckerberg, Steve Jobs, and Bill Gates. These successful examples might make you think a college degree isn’t beneficial. However, that’s survivorship bias at work!
These famous individuals are at the forefront of media reports. You hear more about them because they are unusual. You’re not considering the millions of other college dropouts who aren’t rich and famous. You need to assess their outcomes as well.
Products and Buildings
Survivorship bias affects our perception of products and buildings that survive for a long time. You might look at an old building, car, or other item and think, “they don’t make them like they used to.”
However, you’re basing that on relatively few survivors! To get an accurate picture, you need to assess the items that didn’t survive until the present. Obviously, that’s much harder. The lower-quality goods and buildings are the missing data that are long gone, skewing our picture of the overall quality of older items.
Severe Disease in Medical Studies
Survivorship bias has even occurred in medical studies about severe diseases. Younger, healthier, and more fit patients tend to survive a disease’s initial diagnosis more frequently. Hence, they are more likely to join medical studies. Conversely, older, weaker patients are less likely to survive long enough to participate in studies.
Consequently, these studies overestimate successful disease outcomes because they are less likely to include those who die shortly after diagnosis.
Medical survivorship bias can occur on a more informal basis outside of studies. You’ll occasionally hear an older person say they smoked all their life and are still alive. From that, you might conclude that smoking is not so dangerous.
You know the drill now. What are the missing data?
The smokers who die from it are not around to talk about it! That can also apply to people who don’t wear seatbelts or helmets.
Peer-Reviewed Journals
Survivorship bias even plays a role in peer-reviewed journals. These journals tend to publish only studies with statistically significant results. However, using the standard significance level of 0.05, 5% of all studies that don’t have a real effect will have statistically significant results by chance. Journals are likely to publish articles about these “false positive” studies. Consequently, when there are enough studies on a topic, some are bound to produce statistically significant results by chance alone, and those are the ones that become articles.
What’s missing here? The larger number of non-significant studies on the topic.
Journal readers will think there’s a significant effect for the research question, but they don’t know about all the non-significant studies. They’re missing the results from the studies that didn’t survive the journal review process. Those non-significant studies are the non-returning planes!
Learn more about the Significance Level.
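Here is a rough sketch of how that filtering plays out for studies where there is no real effect. It assumes a simple two-sided z-test with a known standard deviation of 1, and it assumes that only significant studies get published. Both are simplifications for illustration.

```python
import random
from statistics import NormalDist, fmean

def simulate_null_studies(n_studies=2_000, n_per_study=30, alpha=0.05, seed=2):
    """Count significant ('published') studies when the null is true in all of them."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    published = 0
    for _ in range(n_studies):
        # The true effect is zero: each sample comes from a normal(0, 1) population.
        sample = [rng.gauss(0, 1) for _ in range(n_per_study)]
        # z statistic for the sample mean, using the known sigma of 1.
        z = fmean(sample) * n_per_study ** 0.5
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            published += 1  # assumed: only significant results reach a journal
    return published, n_studies - published

published, unpublished = simulate_null_studies()
print(f"Significant studies (published):       {published}")
print(f"Non-significant studies (never seen):  {unpublished}")
```

Roughly 5% of these studies should come out significant even though no effect exists. A reader who sees only that published slice has no way to know how many non-significant studies sit behind it.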
How to Avoid Survivorship Bias
Survivorship bias occurs because there’s a selection process that makes it harder to collect data from the less successful members of a population.
In the examples above, the members that are hard to observe include the following:
- Planes that didn’t survive their mission.
- Businesses and mutual funds that fail.
- College dropouts that you don’t hear about.
- Potential medical study subjects who die before participating.
- Studies that produce non-significant results.
You end up considering only the population members that make it through the selection process. Samples containing only successful examples don’t represent the entire population and create a distorted view.
To minimize the impact of survivorship bias, critique your process to determine if it is occurring. Consider what might be missing.
If survivorship bias affects your data, find ways to draw a representative sample from the population, not just a successful subset. That process might entail more expense and effort, but you’ll get better results.
Learn about other types of Sampling Bias.
Jennifer says
Great article and examples. The working definition of statistical significance is a bit off. Each study permits a 5% chance the results were obtained by chance. That’s not the same as saying 5% of all studies have results obtained by chance.
Jim Frost says
Hi Jennifer,
I don’t discuss statistical significance in detail in the post, but perhaps you’re referring to one of the articles I link to?
At any rate, you’ve gotten it backwards. 😉
The 5% error rate applies to a group (or class) of studies and NOT an individual study. It’s incorrect to apply the error rate to an individual study.
Why?
When you’re looking at the results from an individual study, it’s either correct or incorrect. Hence, the probability of the results being correct is either 100% or 0%. However, when you look at the results for a group of studies that all have a significance level of 0.05 and a true null hypothesis, that group will have a mixture of correct and incorrect results that will average out to a 5% error rate.
Additionally, I see that wording about the results “being obtained by chance” frequently. I’m not a big fan of that wording. Technically, it’s not incorrect but only because it is so vague. It does involve chance. But all inferential statistics results involve chance. After all, we’re dealing with random samples and the procedures factor in random error. Basically, chance is at the core of these methods. So, it’s not incorrect but doesn’t really explain the role of chance in the outcomes, which is extremely specific for the significance level.
Here’s how it works in this context:
For a group of studies where the null hypothesis is correct, the significance level is the probability of false positives. In other words, in a group of studies where the null is true, chance produced 5% of the random samples with properties that caused us to reject the null when we should have failed to reject the null (because it is true). Again, that applies to a group of studies, not an individual study, and only studies where the null is true.
I hope that clarifies these issues for you! They’re a bit tricky! 🙂
Steve says
We used control charts to determine acceptance criteria for QC samples at the labs where I worked. The problem was that QC samples that did not meet criteria were not reported so the next round of calculations on the control charts only included results that passed. The result was that the control limits kept getting tighter and tighter and tighter…
Wes McFee says
Throughout grad school, you have been my recurring dose of much-needed learning reinforcement on topics I’d forgotten or those that could benefit from a refresher, and even a fresh spin on a topic I know well. Now that grad school is coming to an end, I plan on using your writings as a source of ongoing learning and concept refresh. Maybe it sounds corny, but I’m grateful that you do what you do. Kudos and a hearty thank you.
P.S. – The WWII planes example is an old favorite of mine; thanks for dusting off this important lesson.
Jim Frost says
Hi Wes,
Thanks so much for writing. Your comment made my day! I’m thrilled to hear that my website has been helpful. And congratulations on finishing graduate school! 🙂
Andrew says
Zuckerberg, Jobs, and Gates all dropped out of college. They all finished high school.
Jim Frost says
Hi Andrew,
Thanks for catching that! I’ve fixed the text.