Heterogeneity is defined as a dissimilarity between elements that comprise a whole. When heterogeneity is present, there is diversity in the characteristic under study. The parts of the whole are different, not the same. It is an essential concept in science and statistics. Heterogeneous is the opposite of homogeneous.

In chemistry, a heterogeneous mixture has a composition that varies. For example, oil and vinegar, sand and water, and salt and pepper are all heterogeneous mixtures. Multiple samples of these mixtures will contain different proportions of each component.

In statistics, heterogeneity is a vital concept that appears in various contexts, and its definition varies accordingly. Heterogeneity can indicate differences within individual samples, between samples, and between experimental results in a meta-analysis. It also applies to an assumption violation regarding errors in linear models. This post focuses on these statistical definitions of heterogeneity and shows you how to identify and test it statistically.

## Heterogeneity in Individual Samples

When you take a sample from a population, you can assess its heterogeneity. Do the individual items in a sample tend to be relatively similar (homogeneous) or dissimilar? Do your data contain variability? If so, how much?

You can use a measure of dispersion to assess heterogeneity. For example, higher standard deviation values indicate the sample is more diverse. Conversely, lower values indicate the items tend to be similar. When there is perfect homogeneity, all the objects in the sample are the same, and the standard deviation equals zero.
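To make that concrete, here is a minimal sketch using Python's built-in `statistics` module. The three samples are hypothetical values I made up for illustration; they are not the data behind any graph in this post.

```python
import statistics

# Hypothetical measurements for illustration only.
group_a = [48, 49, 50, 50, 51, 52]    # values cluster tightly around 50
group_c = [30, 40, 50, 50, 60, 70]    # same mean, but values spread out
all_same = [50, 50, 50, 50, 50, 50]   # perfectly homogeneous sample

for name, sample in [("A", group_a), ("C", group_c), ("identical", all_same)]:
    # statistics.stdev returns the sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(sample)
    print(f"{name}: mean = {statistics.mean(sample)}, SD = {sd:.2f}")
```

All three samples share the same mean, so the standard deviation is the only thing that separates them. The larger value for the spread-out group reflects its greater heterogeneity, and the identical values produce a standard deviation of zero.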

You can also plot your data to evaluate heterogeneity. In the histogram below, Group C is more diverse than Group A because the items in Group C spread out further. This broader spread represents greater heterogeneity.

**Related posts**: Standard Deviation, Measures of Variability, and Using Histograms to Evaluate Your Data

## Heterogeneity Between Samples

You can also consider whether the properties of different samples, or groups in your data, are heterogeneous. When you collect multiple samples, do they tend to be similar or different? In this context, you need to be careful to define the properties that you are assessing. Some properties of the different samples can be heterogeneous, while others are homogeneous. In this section, I show you how to assess heterogeneity between samples for continuous and categorical data.

### Continuous data

With continuous data, you can assess heterogeneity in both the sample means and the variability. Using boxplots, you can display these characteristics and determine whether they differ between groups.

In the boxplot below, the groups have roughly homogeneous means and standard deviations.

The samples below have heterogeneous means but homogeneous variability.

In the graph below, the groups have homogeneous means but heterogeneous variability.

While these graphs visually depict heterogeneity, you can test these properties using statistical hypothesis tests.

For instance, ANOVA compares the means of multiple samples. It tests the heterogeneity of group means. However, the ANOVA F-test assumes that the variability of the groups is equal. In other words, ANOVA can detect heterogeneous group means, but it needs the variability within the groups to be homogeneous.

To determine whether the group means are statistically heterogeneous, use hypothesis tests such as t-tests and one-way ANOVA. To evaluate whether variability differs by group, use an equal-variances test, such as Levene's test or Bartlett's test.
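As a rough illustration of what one-way ANOVA calculates, the sketch below computes the F-statistic by hand with the standard library. The three samples are hypothetical, the function name `one_way_anova_f` is my own, and the critical value in the comment comes from a standard F table. In practice, you would let statistical software produce the p-value.

```python
import statistics

# Hypothetical samples for illustration only: different means, equal variances.
groups = {
    "A": [10, 12, 11, 13, 14],
    "B": [15, 17, 16, 18, 19],
    "C": [20, 22, 21, 23, 24],
}

def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - statistics.mean(s)) ** 2 for x in s) for s in samples)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_anova_f(list(groups.values()))
variances = {name: statistics.variance(s) for name, s in groups.items()}

print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
print("Sample variances:", variances)
# With 2 and 12 degrees of freedom, the 5% critical value is about 3.89,
# so an F-statistic above that suggests heterogeneous group means.
```

I built these samples so that the group variances are identical while the means differ, which matches the situation where the F-test's equal-variances assumption holds.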

**Related post**: Boxplots vs. Individual Value Plots for Comparing Groups

### Categorical data

For categorical data, you can assess the heterogeneity of the categories. We’ll consider M&M candies for these examples, which have six colors: brown, yellow, green, red, orange, and blue.

Again, note the difference between heterogeneity within a sample versus between samples.

A single M&M sample will be homogeneous if it contains only one color. The sample grows increasingly heterogeneous as the number of colors increases.

However, for multiple samples, homogeneity occurs when the number and proportions of colors are the same between them. Heterogeneous batches will have different color ratios.

The pie charts below display pairs of homogeneous and heterogeneous samples of M&M colors.

You can test this statistically for categorical data using the chi-square test for homogeneity. When your p-value is low, reject the null hypothesis (homogeneity) and conclude that the samples are heterogeneous. The category proportions differ enough between the samples to be statistically significant.

The calculations for the chi-square test of homogeneity are the same as the test for independence. The difference between them lies in the hypotheses, testing logic, and sampling methods.
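Here is a minimal sketch of that calculation for two bags of M&Ms. The color counts are hypothetical, the function name `chi_square_homogeneity` is my own, and the critical value in the comment comes from a standard chi-square table.

```python
# Hypothetical color counts for two bags of M&Ms (illustrative, not real data).
colors = ["brown", "yellow", "green", "red", "orange", "blue"]
bag_1 = [10, 10, 10, 10, 10, 10]
bag_2 = [30, 5, 5, 5, 5, 10]

def chi_square_homogeneity(table):
    """Return (chi-square statistic, degrees of freedom).

    Rows of `table` are samples and columns are categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if every sample shared the same color proportions.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2, df = chi_square_homogeneity([bag_1, bag_2])
print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")
# For 5 degrees of freedom, the 5% critical value is about 11.07.
# A statistic above it leads you to reject homogeneity.
```

Passing the same bag twice returns a statistic of zero because the observed counts match the expected counts exactly, which is the perfectly homogeneous case.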

**Related post**: Chi-square Test of Independence

## Heterogeneity Between Scientific Studies

When you consider a series of scientific studies that all attempt to answer the same research question, you can assess the heterogeneity of their results. Meta-analysis does more than simply report the mean effect size for a set of studies. This type of analysis also considers the variability of effect sizes from the individual studies around the overall mean effect—which is where heterogeneity comes in!

Ideally, the study results are all similar (i.e., homogeneous). When that’s true, they’re all painting the same picture, giving you confidence about the real effect. However, if the results are heterogeneous, you’ll need to proceed carefully and understand the differences between the findings. You’ll also want to evaluate the degree of heterogeneity. Do the studies differ greatly, or only slightly?

I’ll show you a graphical and numeric way to evaluate heterogeneity in a meta-analysis.

### Forest plots

A forest plot, also known as a blobbogram, is a specialized plot designed to display the results of different studies in a meta-analysis. These plots depict effect sizes on the horizontal axis and include a reference line for no effect. For each experiment, it displays a point estimate for the effect and a confidence interval (CI). You can use a forest plot to evaluate heterogeneity in a meta-analysis.

The forest plot below displays 13 studies and their estimates of the effectiveness of a Bacillus Calmette-Guérin (BCG) vaccine in preventing tuberculosis (TB).

Overall, the studies favor the treatment group that received the vaccine over the control group, which did not receive it. However, there are differences between the studies. Studies have CIs of different widths. Some CIs include the null value of zero (no effect), while others do not. One study’s point estimate even favors the control group! Several other estimates fall right on the no-effect line.

While the graph displays heterogeneity in the meta-analysis, we need to quantify it. This necessity brings us to the I² statistic, which I’ve circled on the forest plot.

**Related post**: Control Groups in Experiments

### I² Statistic

The I² statistic quantifies the degree of heterogeneity among the studies in a meta-analysis. This statistic is a percentage that ranges from 0 to 100%. It estimates the proportion of the total variation in the observed effect sizes that reflects real differences between the studies rather than sampling error.

Statisticians commonly use the following benchmark values to assess the degree of heterogeneity:

- 25%: Small
- 50%: Moderate
- 75%: Large

On the forest plot above, the value is 92.22%. These studies have considerable heterogeneity. They are not telling a consistent story! We must proceed with caution when assessing the overall effectiveness of the BCG vaccine.
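If you are curious how such a percentage can arise, the sketch below uses one common definition of I², which is based on Cochran's Q with inverse-variance weights. The effect sizes and variances are hypothetical, not the BCG data. Meta-analysis software can use other estimators, so please don't expect this approach to reproduce the 92.22% on the forest plot.

```python
def i_squared(effects, variances):
    """Return (Q, I-squared as a percentage) using inverse-variance weights."""
    weights = [1 / v for v in variances]
    # Weighted mean effect across the studies.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is the share of Q beyond what sampling error alone predicts (df).
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical effect sizes and sampling variances; NOT the BCG studies.
effects = [-1.0, -0.2, -1.5, 0.1, -0.6]
variances = [0.04, 0.04, 0.04, 0.04, 0.04]

q, i2 = i_squared(effects, variances)
print(f"Q = {q:.2f}, I-squared = {i2:.1f}%")
```

When every study reports the same effect, Q equals zero and the function returns 0%. The calculation truncates negative values at zero, which is why I² never falls below 0%.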

## Heterogeneous Errors in Linear Models

Linear models assume that the errors have a homogeneous (constant) variance. When you plot the residuals, you want to see dispersion that remains consistent throughout the entire range of fitted values. Unfortunately, that’s not always the case. Statisticians refer to residuals with a heterogeneous spread as heteroscedasticity, which violates the assumption. The residual plot below shows this condition.

Notice how the spread of the residuals increases as you move to higher fitted values. Fortunately, there are several ways to address this condition.
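For an informal numeric check of the same pattern, you can compare the spread of the residuals at low and high fitted values. The sketch below does that for hypothetical data that I constructed so the error size grows with x. It is only a rough diagnostic under those assumptions. Formal tests, such as the Breusch-Pagan test, are available in statistical software.

```python
import statistics

# Hypothetical data built so that the error size grows with x (illustration only).
x = list(range(1, 21))
y = [2 + 3 * xi + 0.5 * xi * (1 if i % 2 else -1) for i, xi in enumerate(x)]

# Ordinary least squares for a single predictor.
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Sort by fitted value, then compare residual spread in the lower and upper halves.
pairs = sorted(zip(fitted, residuals))
half = len(pairs) // 2
low_sd = statistics.stdev([r for _, r in pairs[:half]])
high_sd = statistics.stdev([r for _, r in pairs[half:]])

print(f"Residual SD, low fitted values:  {low_sd:.2f}")
print(f"Residual SD, high fitted values: {high_sd:.2f}")
```

A noticeably larger standard deviation in the upper half corresponds to the fan shape in a residual plot. With homogeneous errors, the two values should be similar.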

**Related post**: Heteroscedasticity in Regression Analysis

Habib says

Good explanations of all the terms used. Appreciated your efforts dear Jim.

I suggest and recommend it for individuals interested in learning statistics in a simple and easy way.

Rodrigo Campos says

Clear and useful presentation about the issue! Good for teaching. Thanks Jim

Jim Frost says

Thanks, Rodrigo!

Bal Ram Bhui says

I enjoyed reading it. It makes the concept clear so that I can better understand the background around the t-test, ANOVA, and the assumptions of linear regression.

Anoop says

I would just add that the heterogeneity shown in the meta-analysis is in magnitude and not in direction. Almost all studies show benefits, so it is less of a concern, I think.

Jim Frost says

Hi Anoop,

I’d agree that these studies do show a benefit overall, as I mention in the article itself. However, if you wanted to estimate the effect, there’d be a relatively wide spread of possibilities. So, you’ll gain some benefits, but it would be impossible to say precisely how much. Indeed, it might be difficult to determine whether the benefits are practically meaningful as opposed to just statistically significant.