What is the Mann Whitney U Test?
The Mann Whitney U test is a nonparametric hypothesis test that compares two independent groups. Statisticians also refer to it as the Wilcoxon rank sum test.
If you’re involved in data analysis or scientific research, you’re likely familiar with the t-test. But did you know there’s another method for comparing two independent samples? That method is the Mann-Whitney U Test.
It is a nonparametric test named after two statisticians, H.B. Mann and D.R. Whitney. Because it is nonparametric, it makes fewer assumptions about your data than its parametric counterparts.
Many analysts use the Mann Whitney U test to determine whether the difference between the medians of two groups is statistically significant. However, it’s important to note that it only tells us about the median in certain situations. Interpreting the test results can be tricky. More on this later!
Learn more about Parametric vs. Nonparametric Tests and Hypothesis Testing Overview.
What Does the Mann Whitney U Test Tell You?
If you search the Internet, you’ll find the following two common interpretations for a statistically significant Mann Whitney U test:
- The difference between the medians is significant.
- The groups come from populations with different distributions.
Unfortunately, neither of these interpretations is necessarily correct, although each can be true some of the time.
In the strictest technical sense, the Mann Whitney U test indicates whether one population tends to produce higher values than the other population. This correct interpretation relates to the two I list above but doesn’t directly translate to them.
Let’s quickly examine how this test works to understand why this is true.
The test pools the data from both samples and ranks all the values from low to high. Then it sums the ranks separately for each group. If the results are statistically significant, one group tends to have higher ranks than the other.
This analysis doesn’t involve medians or other distributional properties—just the ranks.
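The ranking step can be sketched in a few lines of Python. Treat this as a minimal illustration rather than the procedure any particular statistics package uses: the function names and the sample values are made up for the example, and tied values receive the average of the ranks they span. The U statistic for each group is its rank sum minus n(n + 1)/2, where n is that group’s sample size.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sums_and_u(group_a, group_b):
    """Pool both groups, rank the pooled values, and return (R_a, R_b, U_a, U_b)."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a, n_b = len(group_a), len(group_b)
    r_a = sum(ranks[:n_a])
    r_b = sum(ranks[n_a:])
    u_a = r_a - n_a * (n_a + 1) / 2
    u_b = r_b - n_b * (n_b + 1) / 2
    return r_a, r_b, u_a, u_b


# Hypothetical values, used only to show the mechanics.
group_a = [1.1, 2.3, 2.9]
group_b = [3.4, 4.0, 5.2, 2.5]
print(rank_sums_and_u(group_a, group_b))
```

Notice that the medians never enter the calculation. The two U values always add up to the product of the sample sizes, so a U near zero for one group means that group holds almost all the low ranks.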
Special Case for Same Shapes
However, when the shapes of the two distributions are similar, the Mann Whitney U test does tell us about the median. That’s not a property of the test itself but a matter of logic. If two distributions have the same shape, but one is shifted higher, its median must be higher. But we can only draw that conclusion about the medians when the distributions have the same shapes.
In essence, the Mann Whitney U test rolls up both the location and shape parameters into a single evaluation of whether one distribution tends to produce higher values, preventing you from drawing conclusions about the location specifically (e.g., the medians). However, when you hold the shape constant, you can make inferences about the location.
These two distributions have the same shape, but the red one is shifted right to higher values. Wherever the median falls on the blue distribution, it’ll be in the corresponding position in the red distribution. In this case, the test can assess the medians.
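For contrast, here is a small sketch of what can happen when the shapes differ. The numbers are invented so that both samples share a median of 50, and the helper `prob_a_exceeds_b` is a hypothetical name for the share of cross-group pairs in which the first sample’s value is larger. That share is the quantity the rank sums reflect.

```python
from statistics import median


def prob_a_exceeds_b(group_a, group_b):
    """Share of (a, b) pairs where a > b, counting ties as one half."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Made-up values chosen so that both samples have a median of 50.
symmetric = [0, 10, 42, 58, 90, 100]
right_skewed = [44, 46, 48, 52, 200, 300]

print(median(symmetric), median(right_skewed))
print(prob_a_exceeds_b(symmetric, right_skewed))
```

Both medians equal 50, yet the symmetric sample is larger in only 12 of the 36 pairs. The example says nothing about statistical significance. It only shows that the quantity the test evaluates can sit well away from one half while the medians are identical, which is why the same-shape condition matters.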
The Mann-Whitney U test has a set of assumptions like any other statistical test.
- Independent Groups: Each group has a distinct set of subjects or items.
- Independent Observations: Each observation should be independent of others. Essentially, what happens to one shouldn’t affect the others.
- Continuous or Ordinal Data: It can handle continuous or ordinal data because it works with ranks. However, it can’t use unordered (nominal) categorical data because those values can’t be ranked.
- Same Distribution Shape: This assumption applies only when you want to draw inferences about the medians. If this assumption holds, the test can provide insights about the medians.
When to Use this Test?
Consider using the Mann Whitney U test when your data follow a nonnormal distribution, and you have a small sample size. Learn more about the Normal Distribution.
Alternatively, use it if understanding the median is more pertinent to your subject area than the mean and the distribution shapes are the same.
If you have more than 15 observations in each group, you might want to use the t-test even when you have nonnormal data. Thanks to the central limit theorem, the sampling distributions of the means converge on normality as the sample size grows, making the t-test an appropriate choice.
Independent samples t-tests have several advantages over the Mann Whitney U test, including the following:
- More statistical power to detect differences.
- Can handle distributions with different shapes (Use Welch’s t-test).
- Avoids the interpretation issues discussed above.
In short, use this nonparametric test when you’re specifically interested in the medians, or you can’t use the t-test because you have a small, nonnormal sample.
Mann Whitney U Test Example
Suppose you’re a paint supplier, and you’re testing the median number of months that two paints last on test surfaces.
Let’s perform the analysis! Download the CSV dataset to try it yourself: MannWhitney.
The best way to report values for a Mann Whitney U test is to include both the p-value and confidence interval.
For this test, the p-value is 0.0019, which is statistically significant. The 95.5% confidence interval for the median difference is [-3.000, -0.901]. Because the confidence interval excludes zero, it further illustrates that the results are statistically significant.
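If you want to see roughly where numbers like these come from, the sketch below computes a two-sided p-value and a point estimate of the shift between groups. It rests on several assumptions that are mine, not taken from the analysis above: the data are invented (they are not the paint dataset), the p-value uses the large-sample normal approximation without tie or continuity corrections, and the shift estimate is the Hodges-Lehmann estimator, which is the median of all pairwise differences. Statistical software may use exact methods or corrections, so expect its results to differ.

```python
import math
from statistics import median


def u_statistic(group_a, group_b):
    """Count the (a, b) pairs where a > b, with ties counting one half."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def normal_approx_p_value(group_a, group_b):
    """Two-sided p-value from the normal approximation (no corrections)."""
    n_a, n_b = len(group_a), len(group_b)
    u = u_statistic(group_a, group_b)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def hodges_lehmann_shift(group_a, group_b):
    """Median of all pairwise differences a - b."""
    return median(a - b for a in group_a for b in group_b)


# Invented durability values in months, for illustration only.
paint_a = [10, 12, 13, 15, 16, 18]
paint_b = [14, 17, 19, 20, 22, 23]

print(u_statistic(paint_a, paint_b))
print(normal_approx_p_value(paint_a, paint_b))
print(hodges_lehmann_shift(paint_a, paint_b))
```

With samples this small, the normal approximation is rough, so treat the printed p-value as a ballpark figure only.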
Be aware that these interpretations about the median are valid only if the two distributions have the same shape.
For another Mann Whitney U test example, read my post where I use it to analyze data from the MythBusters Battle of the Sexes TV episode.