Nonparametric Tests vs. Parametric Tests

By Jim Frost 134 Comments

Nonparametric tests don’t require that your data follow the normal distribution. They’re also known as distribution-free tests and can provide benefits in certain situations. Typically, people who perform statistical hypothesis tests are more comfortable with parametric tests than nonparametric tests.

You’ve probably heard it’s best to use nonparametric tests if your data are not normally distributed—or something along these lines. That seems like an easy way to choose, but there’s more to the decision than that.

In this post, I’ll compare the advantages and disadvantages to help you decide between using the following types of statistical hypothesis tests:

Parametric analyses to assess group means
Nonparametric analyses to assess group medians (sometimes)

In particular, I’d like you to focus on one key reason to perform a nonparametric test that doesn’t get the attention it deserves! If you need a primer on the basics, read my hypothesis testing overview.

Related Pairs of Parametric and Nonparametric Tests

Nonparametric tests are a shadow world of parametric tests. In the table below, I show linked pairs of statistical hypothesis tests.

Parametric tests of means	Nonparametric tests of medians
1-sample t-test, Paired t-test	Sign test, Wilcoxon signed rank test
2-sample t-test	Mann-Whitney U test
One-Way ANOVA	Kruskal-Wallis test, Mood’s median test
Factorial DOE with a factor and a blocking variable	Friedman test

Additionally, Spearman’s correlation is a nonparametric alternative to Pearson’s correlation. Use Spearman’s correlation for nonlinear, monotonic relationships and for ordinal data. For more information, read my post Spearman’s Correlation Explained!
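
To see why Spearman’s correlation suits nonlinear, monotonic relationships, here is a minimal Python sketch. The data and the helper names (`ranks`, `pearson`, `spearman`) are made up for illustration. It treats Spearman’s correlation as Pearson’s correlation computed on the ranks, which is the standard definition when ties get average ranks.

```python
import math

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson's correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Illustrative data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [math.exp(v) for v in x]

print(f"Pearson:  {pearson(x, y):.3f}")
print(f"Spearman: {spearman(x, y):.3f}")
```

Because the relationship is perfectly monotonic, the ranks line up exactly and Spearman’s correlation reaches its maximum, while Pearson’s correlation is pulled down by the curvature.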

For this topic, it’s crucial you understand the concept of robust statistical analyses. Learn more in my post, What are Robust Statistics?

Advantages of Parametric Tests

Advantage 1: Parametric tests can provide trustworthy results with distributions that are skewed and nonnormal

Many people aren’t aware of this fact, but parametric analyses can produce reliable results even when your continuous data are nonnormally distributed. You just have to be sure that your sample size meets the requirements for each analysis in the table below. Simulation studies have identified these requirements. Read here for more information about these studies.

Parametric analyses	Sample size requirements for nonnormal data
1-sample t-test	Greater than 20
2-sample t-test	Each group should have more than 15 observations
One-Way ANOVA	For 2-9 groups, each group should have more than 15 observations. For 10-12 groups, each group should have more than 20 observations.

You can use these parametric tests with nonnormally distributed data thanks to the central limit theorem. For more information about it, read my post: Central Limit Theorem Explained.
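
If you want to watch the central limit theorem at work, the following Python sketch may help. It is only an illustration and not the simulation studies mentioned above: the exponential population, the seed, and the number of draws are all assumptions I picked for the example. It draws many samples of a given size from a right-skewed population and measures how skewed the resulting sample means are.

```python
import random
import statistics

def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def sample_means(n, draws, rng):
    """Means of `draws` random samples of size n from a right-skewed population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]

rng = random.Random(1)  # fixed seed so the run is repeatable
for n in (2, 5, 20, 80):
    means = sample_means(n, draws=5000, rng=rng)
    print(f"n = {n:>2}: skewness of the sample means = {skewness(means):.2f}")
```

As the sample size grows, the skewness of the sample means should shrink toward zero, which is why the t-test and ANOVA can tolerate nonnormal data once the samples are large enough.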

Advantage 2: Parametric tests can provide trustworthy results when the groups have different amounts of variability

It’s true that nonparametric tests don’t require data that are normally distributed. However, nonparametric tests have the disadvantage of an additional requirement that can be very hard to satisfy. The groups in a nonparametric analysis typically must all have the same variability (dispersion). Nonparametric analyses might not provide accurate results when variability differs between groups.

Conversely, parametric analyses, like the 2-sample t-test or one-way ANOVA, allow you to analyze groups with unequal variances. In most statistical software, it’s as easy as checking the correct box! You don’t have to worry about groups having different amounts of variability when you use a parametric analysis.
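
For the 2-sample case, the unequal-variances option is usually Welch’s t-test. Here is a minimal Python sketch of the statistic and its degrees of freedom. The data and the function name `welch_t` are made up for illustration, and the sketch stops short of a p-value because that requires the t-distribution, which the standard library doesn’t provide.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each group's variance is divided by its own sample size; nothing is pooled.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    mean_diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    t = mean_diff / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Made-up data: group_b is far more variable than group_a.
group_a = [20.1, 19.8, 20.4, 20.0, 19.9, 20.3, 20.2, 19.7]
group_b = [15.0, 27.5, 22.1, 30.3, 12.4, 25.8, 18.9, 29.0]

t, df = welch_t(group_a, group_b)
print(f"Welch's t = {t:.3f}, df = {df:.1f}")
```

When one group is much more variable, the degrees of freedom drop toward the size of that group minus one, which is how the test accounts for the extra uncertainty.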

Related post: Measures of Variability

Advantage 3: Parametric tests have greater statistical power

In most cases, parametric tests have more power. If an effect actually exists, a parametric analysis is more likely to detect it.

Related post: Statistical Power and Sample Size

Advantages of Nonparametric Tests

Advantage 1: Nonparametric tests assess the median which can be better for some study areas

Now we’re coming to my preferred reason for when to use a nonparametric test. The one that practitioners don’t discuss frequently enough!

For some datasets, nonparametric analyses provide an advantage because they assess the median rather than the mean. The mean is not always the better measure of central tendency for a sample. Even though you can perform a valid parametric analysis on skewed data, that doesn’t necessarily equate to being the better method. Let me explain using the distribution of salaries.

Salaries tend to be a right-skewed distribution. The majority of wages cluster around the median, which is the point where half are above and half are below. However, there is a long tail that stretches into the higher salary ranges. This long tail pulls the mean far away from the central median value. The two distributions shown below are typical for salaries.

Figure: Two right-skewed distributions that have roughly equal medians but different means.

In these distributions, if several very high-income individuals join the sample, the mean increases by a significant amount despite the fact that incomes for most people don’t change. They still cluster around the median.

In this situation, parametric and nonparametric tests can give you different results, and they both can be correct! For the two distributions, if you draw a large random sample from each population, the difference between the means is statistically significant. Despite this, the difference between the medians is not statistically significant. Here’s how this works.

For skewed distributions, changes in the tail affect the mean substantially. Parametric tests can detect this mean change. Conversely, the median is relatively unaffected, and a nonparametric analysis can legitimately indicate that the median has not changed significantly.
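
A tiny Python sketch makes the point concrete. The salary figures are hypothetical numbers chosen for illustration.

```python
import statistics

# Hypothetical salaries in thousands of dollars; most cluster around 50.
salaries = [38, 42, 45, 48, 50, 50, 52, 55, 58, 62]

# The same group after three very high earners join the sample.
with_high_earners = salaries + [400, 650, 900]

for label, data in (("before", salaries), ("after", with_high_earners)):
    print(f"{label:>6}: mean = {statistics.fmean(data):.1f}, "
          f"median = {statistics.median(data):.1f}")
```

The three new values more than triple the mean, while the median moves only from 50 to 52.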

You need to decide whether the mean or median is best for your study and which type of difference is more important to detect.

Advantage 2: Nonparametric tests are valid when your sample size is small and your data are potentially nonnormal

Use a nonparametric test when your sample size isn’t large enough to satisfy the requirements in the table above and you’re not sure that your data follow the normal distribution. With small sample sizes, be aware that normality tests can have insufficient power to produce useful results.

This situation is difficult. Nonparametric analyses tend to have lower power at the outset, and a small sample size only exacerbates that problem.
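
As one example of a test that works with a small sample, here is a minimal Python sketch of an exact two-sided sign test for a 1-sample median. The data, the hypothesized median of 50, and the function name `sign_test_p_value` are assumptions for illustration. The test only counts how many values fall above and below the hypothesized median, so it needs no distributional shape at all.

```python
from math import comb

def sign_test_p_value(data, hypothesized_median):
    """Two-sided exact sign test for a 1-sample median."""
    above = sum(1 for v in data if v > hypothesized_median)
    below = sum(1 for v in data if v < hypothesized_median)
    n = above + below  # values equal to the hypothesized median are dropped
    if n == 0:
        return 1.0
    k = min(above, below)
    # Under the null hypothesis, each value is above or below with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Made-up sample of 10 observations, tested against a median of 50.
sample = [52, 57, 61, 49, 66, 58, 71, 54, 63, 59]
print(f"p-value = {sign_test_p_value(sample, 50):.4f}")
```

Because the sign test throws away the magnitudes and keeps only the signs, it tends to have low power, which is the tradeoff described above.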

Advantage 3: Nonparametric tests can analyze ordinal data, ranked data, and outliers

Parametric tests can analyze only continuous data and the findings can be overly affected by outliers. Conversely, nonparametric tests can also analyze ordinal and ranked data, and not be tripped up by outliers. Learn more about Ordinal Data: Definition, Examples & Analysis.

Sometimes you can legitimately remove outliers from your dataset if they represent unusual conditions. However, sometimes outliers are a genuine part of the distribution for a study area, and you should not remove them.

You should verify the assumptions for nonparametric analyses because the various tests can analyze different types of data and have differing abilities to handle outliers.

If you’re using a Likert scale and you want to compare two groups, read my post about which analysis you should use to analyze Likert data.

Advantages and Disadvantages of Parametric and Nonparametric Tests

Many people believe that choosing between parametric and nonparametric tests depends on whether your data follow the normal distribution. If you have a small dataset, the distribution can be a deciding factor. However, in many cases, this issue is not critical because of the following:

Parametric analyses can analyze nonnormal distributions for many datasets.
Nonparametric analyses have other firm assumptions that can be harder to meet.

The answer is often contingent upon whether the mean or median is a better measure of central tendency for the distribution of your data.

If the mean is a better measure and you have a sufficiently large sample size, a parametric test usually is the better, more powerful choice.
If the median is a better measure, consider a nonparametric test regardless of your sample size.

Lastly, if your sample size is tiny, you might be forced to use a nonparametric test. It would make me ecstatic if you collect a larger sample for your next study! As the table shows, the sample size requirements aren’t too large. If you have a small sample and need to use a less powerful nonparametric analysis, it doubly lowers the chance of detecting an effect.

Generally, I recommend using a bootstrap test as the best way to evaluate medians because you don’t need to satisfy that pesky same shapes assumption! For two medians, use my free online 2-sample Median Bootstrap Test Calculator!
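
For readers who want to see the general idea in code, here is a minimal Python sketch of a percentile bootstrap confidence interval for the difference between two medians. It is not necessarily how the calculator above works. The data, the function name `bootstrap_median_diff_ci`, the number of resamples, and the seed are all assumptions for illustration.

```python
import random
import statistics

def bootstrap_median_diff_ci(sample_a, sample_b, resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for median(a) - median(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(resamples):
        # Resample each group independently, with replacement, at its own size.
        resampled_a = rng.choices(sample_a, k=len(sample_a))
        resampled_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(statistics.median(resampled_a) - statistics.median(resampled_b))
    diffs.sort()
    alpha = 1 - level
    lower = diffs[round((alpha / 2) * resamples)]
    upper = diffs[round((1 - alpha / 2) * resamples) - 1]
    return lower, upper

# Made-up right-skewed samples.
group_a = [31, 35, 36, 38, 40, 41, 43, 47, 52, 60, 75, 120]
group_b = [44, 48, 50, 51, 53, 55, 58, 61, 66, 79, 95, 160]

low, high = bootstrap_median_diff_ci(group_a, group_b)
print(f"95% bootstrap CI for the difference in medians: ({low}, {high})")
```

If the interval excludes zero, that suggests the medians differ. Each group is resampled on its own, so the two groups are free to have different shapes and spreads.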

If you’re learning about hypothesis testing and like the approach I use in my blog, check out my Hypothesis Testing book! You can find it at Amazon and other retailers.

Comments

maria says

December 6, 2024 at 8:10 am

Thank you so much, that explanation really helped! What types of graphs would be best to test the hypothesis: Retnlg is a key protein involved in the maintenance of LT-HSC number and function. Therefore, a total knockout of Retnlg protein in a murine model will cause a reduction in the absolute LT-HSC number in the bone marrow?

I have data on LT-HSC %, bone marrow count, absolute LT-HSC count (calculated as bone marrow count × % LT-HSC), and blood counts such as hemoglobin, platelet, and neutrophil levels.

maria says

December 4, 2024 at 6:02 pm

Our hypothesis states that knocking out the gene reduces the absolute number of LT-HSCs. To test this hypothesis, we were given data from 16 knockout mice and 16 normal (wild-type) mice, with their absolute LT-HSC counts. Additionally, each group includes 8 males and 8 females. While the hypothesis doesn’t specifically address gender, I thought it might be interesting to compare both gender and genotype, so I used a two-way ANOVA.

However, I realized there was an error in my earlier question: variance was found to be equal, but the data distribution was not normal. Given this, which test do you think would be most appropriate in this situation?

- Jim Frost says
  
  December 4, 2024 at 7:12 pm
  
  Hi Maria,
  
  Thanks for the updated information.
  
  Because your sample size per group (16) exceeds the guideline of more than 15 observations per group for the 2-sample t-test, you can safely waive the normality assumption. These results are fairly trustworthy. Because your nonparametric test agrees with the 2-sample t-test, it suggests those results are fairly robust. Unfortunately, that’s all non-significant. But right now those are your strongest results. Both those tests are appropriate for your situation.
  
  I’m considering the combination of significant and non-significant results you mentioned in your first comment. It’s possible that you’re not finding significant results in the 2-sample tests because genders are lumped together. If combining genders increases the variability in the 2-sample tests, that could explain your lack of significance with them. Conversely, by accounting for gender’s variability in the two-way ANOVA model, that could explain why it obtained significance. Possibly, but that’s not a definite. You’ll need to consider whether that makes theoretical sense. Does gender contribute to the variability of your outcome?
  
  Unfortunately, your sample sizes are too small to waive the normality assumption for two-way ANOVA. So, I wouldn’t trust those results.
  
  I suggest using a rank-based ANOVA or a permutation test. These methods provide a robust alternative for testing genotype and gender effects simultaneously without assuming normality.
  
  I hope that helps!
  
maria says

December 4, 2024 at 3:57 pm

Hi, I have data for two independent groups, with 16 samples in each group. The assumptions of normality and equal variance were not met. When I performed a two-way ANOVA, I found a significant difference between the groups. However, when I conducted a t-test or Wilcoxon rank-sum test, there was no significant difference.

I’m confused about which statistical test is appropriate in this situation. Can you help clarify?

- Jim Frost says
  
  December 4, 2024 at 4:54 pm
  
  Hi Maria,
  
  It’s not surprising that those three tests came to different conclusions because they assess different models and properties. A two-way ANOVA has two categorical factors, which is an entirely different model than a 2-sample t-test, which has one categorical factor. The Wilcoxon rank-sum test doesn’t assess the means at all. Instead, it assesses ranks and sometimes can evaluate medians.
  
  The appropriate test depends on the nature of your data and your goals. You mention having two independent groups. So I’m not sure how you could use two-way ANOVA because you’d have four groups based on two categorical factors. It’s hard to give more concrete advice without additional information and clarification.
  
Mabel says

October 16, 2024 at 4:48 am

Hi Jim,

I would like to know if there is a method to detect outliers in a small dataset of about 10-15 values, when we do not know if the data is normally distributed or skewed.

- Jim Frost says
  
  October 17, 2024 at 10:31 pm
  
  Hi Mabel,
  
  Detecting outliers in small datasets can be challenging, especially when the distribution of the data is unknown.
  
  I have two recommendations for this situation.
  
  First and foremost, I’d recommend creating a boxplot of your data. These plots use a nonparametric method for finding outliers. That means you don’t need to know the distribution. Make the graph and look for outliers beyond the whiskers. Click the link if you need to know more about boxplots. I particularly like this method for finding outliers in general too.
  
  Secondly, you can just graph your data on a scatterplot and look for any data points that fall far from the others.
  
  Using any other method when you have so few observations and are not sure about the distribution could be unreliable.
  
  I hope that helps!
  
Scarlet Figueroa says

September 13, 2024 at 3:51 pm

Hello, I am doing a secondary data analysis of a large sample with 2 independent variables and one dependent variable. The sample size is over 1000 and the data is right skewed. Do I need to log transform the data & perform a non-parametric test?

- Jim Frost says
  
  September 13, 2024 at 4:27 pm
  
  Because you’re talking about independent and dependent variables, you’re presumably using regression analysis. For regression, you need to assess the normality of the residuals and not the IVs or DVs. Assess those after fitting the model. In one of my analyses, I had a right-skewed DV but after fitting the model with a squared term, the residuals looked fine. That’s not to say that you should necessarily use the same approach but it made sense for my data. Transforming data should be the last resort. Try obtaining normal residuals first, and if that doesn’t work, you can consider other options then.
  
Thejas says

September 2, 2024 at 4:13 pm

Hi Jim,

Love this article and wish I read this before I did a project on experimentation and wanted to conclude the results obtained.

Just one question….let’s say the variable of interest in treatment and control group is continuous and non-normal. Would it be fair to bin (group) them and convert them into categorical variable (with the same labels for treatment and control) and run a chi-square test to check if there is a significant difference between the values of the two groups? Would the sample size matter if we try to convert the numeric variable into categorical variable? I read that for smaller sample size, Fishers test is used and for larger sample size, chi-square test. I apologise if my hypothesis for chi-square was written incorrectly.

Love all your posts, so informative. You’re a true helper.

Thanks!

- Jim Frost says
  
  September 11, 2024 at 2:10 pm
  
  Hi Thejas,
  
  Generally speaking, you lose a lot of information when you convert continuous variables into categorical variables. I’d definitely be hesitant to do that. Tests for categorical data typically require larger sample sizes for equivalent power.
  
Anh Vuong says

September 2, 2024 at 4:09 am

Thank you for such productive and meaningful information. Can I ask for a favor please? I’m a pharmacy student. In my case, if I have skewed or non-normal distributed variables, is there any way to transform them into normal ones? Or I just need to input all data in the statistical package, guaranteeing to meet the criteria you mentioned in the table?

- Jim Frost says
  
  September 11, 2024 at 2:14 pm
  
  Yes, you can transform them to follow a normal distribution. Be aware that the results apply to the transformed data. You can back transform to get the values in the original units. However, you can also use the untransformed data thanks to the central limit theorem if your sample sizes meet the guidelines and the data are not extremely skewed. If they are so skewed that you can’t rely on the central limit theorem, then the mean probably isn’t a good measure of central tendency anyway and you probably should consider a nonparametric test or bootstrapping to compare the medians. You’ll need to use your subject-area knowledge to make that determination.
  
Anita Vasavada says

June 10, 2024 at 5:24 pm

Dear Jim,

Thanks so much for this website. I have data for 8 subjects, with 4 groups of data. I would consider this a one-way repeated measures ANOVA. However, the data are non-normal, and in some cases the variances are unequal. From my understanding, I think I should use a Friedman’s test (non-parametric version of repeated-measures ANOVA). I came to this conclusion because Welch’s ANOVA can be used for unequal variances but requires normal distribution (at least for small sample sizes). Kruskal-Wallis did not seem to be appropriate, as it assumes similar distributions. Do you agree, or do you have a different suggestion? If I use Friedman’s test, do you have any thoughts about post-hoc tests, if necessary?

Thank you,
Anita

- Jim Frost says
  
  June 12, 2024 at 9:01 pm
  
  Hi Anita,
  
  You’re very welcome for the website! It makes my day hearing that it was helpful!
  
  Yes, your scenario does sound like it requires Friedman’s Test. I agree with you on that.
  
  As for post-hoc tests, consider using the Nemenyi test after Friedman’s test.
  
  Best wishes on your analysis! 🙂
  Jim
  
Nemani Satish says

May 18, 2024 at 3:18 am

Hi Jim,

Thank you for your insightful response. I appreciate your detailed explanations and guidance. I followed your guidance and found that the data do not have the same shaped distribution, making it not justifiable to use a nonparametric test. I used arithmetic mean return data, but later I learned that it’s better to use log returns. After using log returns, I did not get contradictory results. Both parametric and nonparametric tests produce similar results.
My sample size is large: I selected 10 years of daily closing prices for 360 companies.

Nemani Satish says

May 15, 2024 at 3:09 am

How should I interpret contradictory results from parametric and non-parametric tests when my data is not normally distributed?

I’m conducting a profitability analysis using ROA and ROE portfolios. I have applied both parametric tests (t-tests and ANOVA) and non-parametric tests (Kruskal-Wallis H-test) to compare the returns across different portfolios. Here are my findings:

Parametric Tests (t-tests and ANOVA):
The results indicate no statistically significant difference in mean returns across the portfolios.
Non-Parametric Test (Kruskal-Wallis H-test):
The results show a significant difference in the distribution of returns across the portfolios (p-values < 0.05).
Given that my data does not meet the normality assumption required for parametric tests, I am inclined to rely on the non-parametric test results. However, I'm concerned about presenting seemingly contradictory results in my thesis.

My Questions:
Which test results should I prioritize given the non-normal distribution of my data?
How can I justify the choice of non-parametric tests over parametric tests in my analysis?
Are there any authoritative references or guidelines that discuss the appropriateness of using non-parametric tests in such situations?
Any advice or references would be greatly appreciated!

- Jim Frost says
  
  May 15, 2024 at 2:59 pm
  
  Hi Nemani,
  
  There are several complicating issues relating to your case. I discuss them in this article, so I’d encourage you to reread the points I indicate.
  
  What’s your sample size? If you have sufficiently large sample, normality is not an issue, particularly when your data are not extremely skewed. Read the Advantage 1 of Parametric Tests section about how parametric tests can handle nonnormal data. If your sample size is lower than the guidelines I list, you’d probably want to use a nonparametric test.
  
  Have you graphed the data to see what they look like for each group? Do they have the same shaped distribution? A nonparametric test will be valid for your data but it only tests the median under the specific condition that all groups in your data have the same shaped distribution (which can be nonnormal). That’s a hard and fast assumption that you can’t get around with a large sample size. Read the Advantage 2 of Parametric Tests section for more about this issue. Also, my Kruskal-Wallis article discusses it in a bit more detail and explains what you can conclude if the groups don’t have the same shaped distribution.
  
  It’s interesting that you’ve received different results. I would lean towards the nonparametric test but with the caveat that to interpret the results correctly you need to look at the distribution shapes for your groups and see if they’re the same or different. The results are valid either way but you need to know how to interpret the significant results. Nonparametric tests are designed so they don’t require a specific distribution, such as the normal distribution. So you won’t have any problem justifying their usage.
  
Lisa says

February 8, 2024 at 6:54 am

Hi Jim,

I am trying to compare two groups with 19 observations each. Some of the dependent variables are normally distributed, some are not. I am thinking of using a t-test for those that are normally distributed and the Mann-Whitney U test for those that are not normally distributed. What do you think of mixing statistical analyses within the same study but for different hypotheses? I was also trying to transform the data, but I have z-scores (so negative values), so logarithmic and square root transformations won’t work.
What do you think?
Any help is highly appreciated.

Greets,
Lisa

- Jim Frost says
  
  February 8, 2024 at 7:28 pm
  
  Hi Lisa,
  
  Your approach seems generally reasonable. Mixing methods isn’t a problem. Just be sure to explain why. Ideally, you explain and describe the analysis phase process before you analyze any data so you can avoid cherry picking.
  
  A couple of caveats.
  
  You can use nonparametric tests with nonnormal data. However, these tests evaluate the median only if the various groups follow the same distribution (which can be nonnormal). I explain the reasoning for that limitation in more detail in my posts specifically about the Mann-Whitney U and Kruskal Wallis tests. If the distributions are not the same, you can’t draw conclusions about the median specifically but you can say that one distribution tends to have higher values.
  
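  If you have the raw data, those might not have negative values, allowing you to use those transformations. Alternatively, try a transformation that can handle negative values, such as the Yeo-Johnson transformation. The standard Box-Cox transformation requires positive values, so you would need to add a shift constant before using it.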
  
  I hope that helps! Best of luck with your project!
  
Rainer Düsing says

October 25, 2023 at 9:49 am

Dear Jim,

many thanks for your article(s), but I would slightly disagree with you regarding the median test in non-parametric analyses. For example, the MW-U test does not, without further assumptions, test for differences of the medians of the two groups! As you can see here in the example (https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faq-why-is-the-mann-whitney-significant-when-the-medians-are-equal/), the medians may be truly 0 and 0, but the U test indicates a significance. But what is significant now? It cannot be the difference of the medians.
So, it rather tests a stochastic superiority of one group over the other. Under the very strict additional assumptions that both distributions have the same form, the same spread and only differ by a location shift, we could conclude that the medians are different (but then this is also true for the mean).
Since non-parametric tests do not test the same hypotheses as roughly comparable parametric tests, I wonder how good it is to see them as alternatives. They do not answer the same questions.

All the best from Germany,
Rainer

- Jim Frost says
  
  October 25, 2023 at 3:23 pm
  
  Hi Rainer,
  
  We actually agree on that completely. Notice in this post I specifically indicate that nonparametric tests assess the medians only sometimes–in very specific cases. I also highlight the strict assumption about groups having the same distribution shapes as a negative for nonparametric tests. Parametric tests like the t-test and ANOVA have versions (Welch’s) that can use group distributions with different shapes/spreads. That’s one reason for preferring parametric tests over nonparametric. I call all that out in this post (so probably read a bit more carefully).
  
  Click the links for the specific analyses and I discuss the issues you mention in greater detail. I don’t go into depth about stochastic superiority and related issues in this post because it’s more of an overview comparing nonparametric to parametric. But I cover all that in the more specific posts about each analysis.
  
  I appreciate the comment. You’re right on. But please read my posts a bit more carefully to realize that I already address and agree with those points. Thanks!
  
Sara says

October 14, 2023 at 7:16 pm

Hello Jim,

Thank you very much for your effort, it is an invaluable website.

I have a question:
In the table you provided the sample needed to consider the normality for
One-Way ANOVA
For 2-9 groups, each group should have more than 15 observations.
For 10-12 groups, each group should have more than 20 observations.

However, I am conducting mixed between- within subject ANOVA with three groups and 12 repeated measures. How many students should be included in each group to consider my data are normally distributed, according to central limit theorem.

I would appreciate a reply

- Jim Frost says
  
  October 17, 2023 at 4:24 pm
  
  Hi Sara,
  
  Unfortunately, I’m not really sure. The numbers I use that you quote are based on simulation studies for those specific conditions. I’m not aware of studies that have looked at more complex models. And your model is more complex in that it is a mixed model and has a number of repeated measures. Another consideration is having a sufficient number of subjects to avoid overfitting your model. If each repeated measure is a new condition that you’ll be estimating, you’d need say at least 150 subjects for that part of it alone (i.e., avoiding overfitting). I’d also guess that would be sufficient for handling the normality issue as well. However, I’d take that as an absolute minimum number and shoot for more if possible.
  
  But take that with a grain of salt because I’m not going by any published studies that have simulated those conditions.
  
  Given the complexity of your study, you might want to consult with a statistician at your institution who can devote the time necessary that your study deserves.
  
  Best of luck with your study! I’d be very interested in hearing from you about what you ultimately go with. 🙂
  
Dipesh Patel says

July 6, 2023 at 5:31 pm

Hi Jim!
Thank you for creating this great website. I have never found it so easy to understand such a complex topic as statistics.
My query is: could you please write more about how to interpret the Friedman test and all the related important terminology, whether it is the right test, when to use it, etc., just like you have discussed every other topic?
I shall be grateful to you. Honestly, even my uni stats team were not able to explain it to me as easily as you have taught the concepts of statistics.

- Jim Frost says
  
  July 7, 2023 at 3:02 am
  
  Hi Dipesh!
  
  Thanks so much for writing. I’m so glad to hear that my website has been helpful!
  
  I will definitely write about the Friedman test. I put it on my list!
  
  Be sure to join my email list if you haven’t already so you’ll know when it’s published. It might be several weeks.
  
Natalya says

January 8, 2023 at 7:00 pm

Dear Jim, Many thanks for your great work!

JD says

October 12, 2022 at 10:43 am

Hello, my research has pretest, posttest and a delayed posttest. I have 2 groups (control and treatment) of 10 participants each. Based on your page, you mentioned that it’s possible to carry out parametric tests if there is more than 20 participants:
1. Does that mean if I have 20 participants that is not enough to carry out a parametric test even though my data is not normally distributed? So, I can carry out paired sample t-test or should I just use Wilcoxon Signed Rank Test?
2. How do I go about testing for delayed posttest? The reason I am doing a delayed posttest was to ensure the reliability of the posttest results.

Rana says

September 26, 2022 at 3:47 am

Thanks a lot for your valuable website and information. If I used 10 animals as a sample size and I have a high partial eta squared, can I apply parametric ANOVA?
Thanks again.

- Jim Frost says
  
  September 26, 2022 at 8:31 pm
  
  Hi Rana,
  
  If you have a sample size of 10 per group and you are sure they follow a normal distribution, you can use parametric ANOVA. Your sample size is small, which means you must satisfy the normality assumption.
  
Rohith Venkatakrishnan says

July 16, 2022 at 7:06 pm

Hey Jim,

Thanks for your article. I am confronted with a similar situation where I have 4 conditions (20 subjects per condition, one of which is a control group). I see that this meets the 15 subjects requirement for 2-9 groups but what I want to know is, when would you consider the data to be extremely skewed and unfit for parametric analysis?

Any thresholds to determine “extreme skew”?

- Jim Frost says
  
  July 20, 2022 at 1:03 am
  
  Hi Rohith,
  
  That’s kind of a trick question because there is no clear-cut dividing line. In most cases, you’ll be fine given the number of subjects per group. If you really want to check, you can do a resampling method to see what kind of distribution it produces. Does the resulting distribution look fairly normal? To see what I’m talking about, read my post about the central limit theorem. I show examples of sampling distributions that do and do not converge on the normal distribution for different distributions and sample sizes. You can try it with your data to see what it looks like. There’s no statistical test but if your sampling distribution looks fairly normal, you’re safe.
  
Filip says

June 13, 2022 at 3:46 pm

Hello Jim, thank you for this article. I have a problem, the images don’t load.

- Jim Frost says
  
  June 13, 2022 at 4:04 pm
  
  Hi Filip, I just checked and I’m seeing the image in this post with no problem (there’s only one image).
  
Tom says

March 18, 2022 at 8:34 pm

Hello Jim,

I recently discovered your site and it is extremely helpful. Thanks! I have been struggling to figure out how to report data. Say I am analyzing the response to a medication in 3 groups of patients, and looking at response vs. blood concentrations of the drug. I am trying to come up with a reference range that says: these patients will respond to symptoms when in the following blood concentrations (e.g., 5-25 mcg/mL). My total n = ~900 patients. One group has ~80 patients (responders), one group has ~600 cases (partial responders), and one group has ~150 (non-responders). The data are not normally distributed based on several normality tests. In order to establish the reference range, I need to capture the central 95% of patient blood concentrations when looking at the responders group (ideally just those that fully respond, but also those that fully respond + partially respond). If mean +/- 2SD is used, then I end up at a negative blood concentration, which obviously isn’t possible. However, if I use the median, the box-and-whisker plot seems to capture a good range and indicates outliers. Is this latter way the correct way to go?

I hope this makes sense

Thanks!
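
One common distribution-free way to capture the central 95% of skewed values is to take the 2.5th and 97.5th percentiles directly. The sketch below contrasts that with mean +/- 2 SD. The simulated concentrations, the group size of 80, and the helper name `central_interval` are hypothetical and only illustrate the idea.

```python
import numpy as np

def central_interval(values, coverage=0.95):
    """Return the percentile-based interval holding the central `coverage` share."""
    tail = (1.0 - coverage) / 2.0 * 100.0
    lower, upper = np.percentile(values, [tail, 100.0 - tail])
    return lower, upper

# Hypothetical right-skewed concentrations (mcg/mL) for ~80 responders.
rng = np.random.default_rng(1)
conc = rng.lognormal(mean=2.3, sigma=0.8, size=80)

mean, sd = conc.mean(), conc.std(ddof=1)
print("mean +/- 2 SD:     ", (mean - 2 * sd, mean + 2 * sd))
print("2.5th-97.5th pct.: ", central_interval(conc))
```

Because percentiles are taken from the observed values, the lower limit cannot fall below the smallest observation, whereas mean - 2 SD can dip below zero when the data are strongly right-skewed.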

Sanjeda Tamanna says

March 14, 2022 at 11:17 am

Thank you so much for your reply.

Sanjeda Tamanna says

March 5, 2022 at 10:09 am

Hello Jim,
Thank you so much for writing wonderful articles. Your articles helped me a lot to understand statistics. They are making my data analysis easier. I would be very grateful if you could provide me with some suggestions regarding my sample size in ANOVA and parametric analysis. I have 4 groups with sizes of 50, 16, 54, and 70, respectively. I checked their distributions. They don’t follow normal distributions. I did ordinary one-way ANOVA or Welch’s ANOVA depending on the difference in their SD values. Among these 4 groups, the first one is the control group, the 2nd one is the experimental group comprising two types of patients, and the remaining two groups are the two types of patients that make up the 2nd group. Am I doing the right form of analysis?

- Jim Frost says
  
  March 6, 2022 at 3:04 pm
  
  Hi Sanjeda,
  
  If you look at the table in this post, you’ll see that when you use one-way ANOVA and have 2-9 groups, you typically don’t need to worry about normality when each group has at least 15 observations. You have four groups, so this applies to you. And all of your groups have at least 15 observations. Although one group is very close. I think you’re safe using one-way ANOVA unless your data are extremely skewed.
  
  However, because you have unequal sample sizes across your groups, the equal variances assumption is particularly relevant. If your variances are not equal, definitely use Welch’s ANOVA. If they’re roughly equal, the regular one-way ANOVA should be fine.
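
To make the choice between the two ANOVA variants concrete, here is a minimal Python sketch that checks the variances with Levene's test and then runs both the classic one-way ANOVA and a hand-written Welch's ANOVA. The function `welch_anova` is written out from the standard Welch formula rather than taken from a library, and the simulated groups (sizes 50, 16, 54, 70 with different standard deviations) are hypothetical.

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's one-way ANOVA, which does not assume equal variances.

    Returns (F statistic, numerator df, denominator df, p-value).
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n = np.array([g.size for g in groups])
    means = np.array([g.mean() for g in groups])
    variances = np.array([g.var(ddof=1) for g in groups])

    w = n / variances                      # precision weights
    grand_mean = np.sum(w * means) / w.sum()
    lam = np.sum((1 - w / w.sum()) ** 2 / (n - 1))

    numerator = np.sum(w * (means - grand_mean) ** 2) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    f_stat = numerator / denominator
    df1, df2 = k - 1, (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2, stats.f.sf(f_stat, df1, df2)

# Hypothetical data with the group sizes from the question: 50, 16, 54, 70.
rng = np.random.default_rng(7)
sizes, sds = [50, 16, 54, 70], [1.0, 3.0, 1.5, 2.0]
data = [rng.normal(10.0, sd, size=m) for m, sd in zip(sizes, sds)]

print("Levene p-value:       ", stats.levene(*data).pvalue)
print("Classic ANOVA p-value:", stats.f_oneway(*data).pvalue)
print("Welch ANOVA p-value:  ", welch_anova(*data)[3])
```

With only two groups, this Welch statistic reduces to the square of Welch's t statistic, which is a convenient sanity check on the formula.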
  
  If you have significant results, you should perform a post hoc analysis to see which groups are different. Because you have a control and treatment group, I recommend Dunnett’s method. Click the link to learn about post hoc tests, where I include an example that uses Dunnett’s.
  
  I don’t know what you mean when you say that some groups are made up of two types of patients. In a one-way ANOVA, all subjects should be a random sample from the same population. The primary difference between groups should be the grouping variable in your ANOVA, which is experimental group in your case.
  
Baris says

December 19, 2021 at 2:20 pm

Hi Jim, I found the solution. I’m going to do an ordinal logistic regression analysis! I just wanted to let you know so you have more time to answer other questions. Thank you!

- Jim Frost says
  
  December 21, 2021 at 12:58 am
  
  Hi Baris,
  
  Sorry for the delay in replying but I’m glad you found your answer. One thing I wasn’t sure about from your original question was about your IVs and DV. Ordinal logistic regression is a good choice when your DV is ordinal, like Likert scale data. However, I’m not sure what variables you’ll use as IVs? If they’re also ordinal, then you’ll need to enter them either as continuous or categorical. Ordinal has characteristics of both, but you’ll have to choose one or the other for each IV. Although, you don’t have to make the same decision for all IVs. The correct choice depends on the nature and amount of your data along with the goals of your study.
  
Baris says

December 17, 2021 at 8:38 pm

Hi Jim,

Below you can find the survey question that tried to measure the impact of cognitive biases induced by marketing messages on consumer decision making to purchase in e-commerce.

“When you shop online, which one of these sales aspects impacts your decision-making to purchase?”
——————
Stock Availability: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly
Reviews of people: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly
Countdown timer: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly
Nr. of likes: (1) Not at all (2) Rarely impacted (3) Sometimes impacted (4) Usually impacted (5) Highly

(Likert scale)

Based on the existing literature and other online sources, the following marketing messages are used to induce cognitive biases of consumers in e-commerce. Each marketing message in a way manifests a cognitive bias.
Stock availability –> Scarcity Bias
Reviews –> Bandwagon effect
Countdown timer –> Loss aversion
Nr. of likes on the product –> Bandwagon effect

390 people responded to the survey and my hypotheses are as follows:
Ho: Stock availability has no impact on consumer decision making to purchase
H1: Stock availability impacts the consumer decision making to purchase
Ho: Reviews of people have no impact on consumer decision making to purchase
H1: Reviews of people impact the consumer decision making to purchase
Ho: Countdown timer has no impact on consumer decision making to purchase
H1: Countdown timer impacts the consumer decision making to purchase
Ho: Nr. of likes on a product have no impact on consumer decision making to purchase
H1: Nr. of likes on a product impact the consumer decision making to purchase

Given this information:
1. What kind of hypothesis testing should I use?

Ps: sorry for my long comment, I tried to be as clear as possible

Thank you in advance!

Vijay S Pawar says

December 5, 2021 at 6:59 am

Hi Jim,
You mentioned above a sample size requirement for nonnormal data. How did you arrive at that requirement: from practical experience or from a reference?

- Jim Frost says
  
  December 5, 2021 at 10:51 pm
  
  Hi Vijay,
  
  Those sample size requirements come from a simulation study conducted by some smart people I used to work with. You’ll find a link to it in the Advantage #1 section under Advantages of Parametric Tests. Click the link to read the study.
  
Gabriel says

July 24, 2021 at 12:04 pm

Dear Jim,

Thank you so much for your prompt reply. Indeed your answers have reassured me! Thank you! In short, one should not outright reject the application of parametric approaches under a non-normal distribution of data. They are still accurate and valid under that condition. The keyword here is “robustness” of the parametric approach even when it is used to analyse highly skewed data. Robustness here of course relies on several factors such as sample size, the confidence interval set, or p-value. Am I right?

By the way, I feel reluctant to use Spearman rank correlation although my data (both continuous) are not normally distributed. Many articles and experts say we should use Spearman in this case, but I feel unsure because Spearman, by its name and the intention of the analysis, should be used on rank/ordinal data, like Likert scale data. However, as I mentioned, many scholars, not just several, recommend applying Spearman to continuous or interval data (not ranked). I am confused as I have read some articles (both old and new) suggesting that Spearman is strictly meant for analysing ranked data. Therefore, my question is: should we be really concerned about the data type when deciding whether to use Spearman correlation?

Thanks in advance

- Jim Frost says
  
  July 25, 2021 at 12:40 am
  
  Hi Gabriel,
  
  Robustness indicates that the test performs satisfactorily even with non-normal data. Specifically, all hypothesis tests have a Type I error rate. That’s basically a false positive. Imagine the null hypothesis is true. You perform a hypothesis test, get a low p-value, reject the null hypothesis, and conclude that the effect/relationship exists in the population. In our thought experiment, we know that the test result is incorrect but in the real world you never know that for sure.
  
  But, we know how often Type I errors occur. When a test performs correctly, the Type I error rate equals the significance level you use (e.g., 0.05). When a test is robust to departures from normality, the Type I error rate equals the significance level even with non-normal data. When a test is NOT robust, then non-normal data will cause the Type I error rate to NOT equal the significance level. The simulation studies have found that when you satisfy the sample size guidelines, the listed tests are robust to departures from normality.
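
The kind of simulation described above can be sketched in a few lines of Python. In this illustration, both groups are drawn from the same skewed (exponential) population, so the null hypothesis is true and every rejection is a false positive. The choice of distribution, the sample sizes, and the function name `type1_error_rate` are assumptions for illustration and are not taken from the simulation study mentioned in the post.

```python
import numpy as np
from scipy import stats

def type1_error_rate(n_per_group, n_sims=5000, alpha=0.05, seed=0):
    """Share of simulated 2-sample t-tests that reject a true null hypothesis."""
    rng = np.random.default_rng(seed)
    # Both groups come from the same skewed (exponential) population,
    # so every rejection is a false positive.
    a = rng.exponential(scale=1.0, size=(n_sims, n_per_group))
    b = rng.exponential(scale=1.0, size=(n_sims, n_per_group))
    p_values = stats.ttest_ind(a, b, axis=1, equal_var=False).pvalue
    return np.mean(p_values < alpha)

for n in (5, 15, 50):
    print(f"n = {n:>2} per group: Type I error rate = {type1_error_rate(n):.3f}")
```

A test that is robust at a given sample size should produce a rate close to the 0.05 significance level. Rates that drift well above or below it indicate that the test is not performing as advertised for that distribution and sample size.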
  
  If Spearman’s correlation is appropriate for data and Pearson’s is not, you really need to use Spearman’s. It’s NOT just for rank and ordinal data. It’s also for nonlinear relationships that are monotonic. Read my post about Spearman’s correlation to understand what that means and the types of relationships for which you should use it. You’ll need to graph your data to make that determination. There are definitely cases where you have continuous data and Spearman’s will be the appropriate type of correlation to use. That may or may not be the case for your data but you need to make that determination. Again, read my post about Spearman’s.
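
As a small illustration of continuous data where Spearman's correlation fits better, the Python sketch below generates a monotonic but strongly curved relationship and computes both coefficients. The exponential relationship and the noise level are hypothetical choices made only to show the contrast.

```python
import numpy as np
from scipy import stats

# Hypothetical continuous data with a monotonic but curved relationship.
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 5.0, size=100)
y = np.exp(x) + rng.normal(0.0, 1.0, size=100)

pearson_r, _ = stats.pearsonr(x, y)
spearman_rho, _ = stats.spearmanr(x, y)
print(f"Pearson r    = {pearson_r:.3f}")
print(f"Spearman rho = {spearman_rho:.3f}")
```

Pearson's coefficient measures how well a straight line describes the data, so the curvature pulls it down. Spearman's works on ranks and only asks whether y tends to rise as x rises. A scatterplot of `x` against `y` remains the best way to decide which description fits.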
  
Gabriel says

July 22, 2021 at 2:21 pm

Hi Jim, Thank you so much for the clarification on the use of parametric approaches for non-normally distributed data, provided that other requirements are met, such as a reasonably large sample size. I noticed that you have provided some rules of thumb in terms of sample size for t-tests and ANOVA under the non-normal distribution of means, but may I know what about Pearson’s correlation? What would be an adequate sample size under the non-normal distribution?

If by the above rule of thumb, the parametric approach is valid (e.g., the sample size is 150 or 200), do we still need to perform a normality test (skewness and kurtosis)? Or can we assume that it should be fine? And if the latter contradicts the former, will the latter prevail over the former?

FYI, I am not a statistician; however, I came across an article by Professor Geoff Norman debunking various myths about statistics, such as the claim by many so-called experts that once data are non-normal or categorical/ordinal, or the sample size is too small, you have no choice but to use nonparametric approaches.

Thank you very much. Look forward to hearing from you.

- Jim Frost says
  
  July 23, 2021 at 10:50 pm
  
  Hi Gabriel,
  
  Thanks for the great question! Yes, you’re quite right, there are similar guidelines for Pearson’s correlation.
  
  In general, the sample size for correlation should be greater than 25. There’s no formal rule for this number, but you need a certain number of observations to identify patterns such as correlation.
  
  In terms of normality, it’s not necessarily an issue for the correlation coefficient itself but it is for the p-value. However, in some cases, the nature of the relationship will require you to use a different type of correlation, such as Spearman correlation. Fortunately, Pearson and Spearman correlation are robust to non-normal data when you have more than 25 paired observations. One caveat: the confidence intervals for Pearson’s correlation coefficient remain sensitive to non-normality regardless of the sample size. The p-values for Spearman’s correlation are robust to non-normal data because it’s a nonparametric method that uses ranks.
  
  Your sample sizes of 150 or 200 are so much larger than the guideline value that you don’t need to worry about normality.
  
  As for the article by Professor Norman, which I have not read, it’s inaccurate to say that you can’t use parametric methods with non-normal data. Thanks to the central limit theorem, you can use parametric methods with non-normal data when your sample size is large enough. The sample size guidelines I present are based on simulation studies that compare simulated test results to known correct results for various distributions and sample sizes. These studies find that when you satisfy these sample size guidelines, the tests work correctly even with non-normal data. However, if you have non-normal data and a small sample size, then you might need to use a nonparametric test, which I discuss in this article.
  
  I hope that helps!
  
Simon Tanios says

July 6, 2021 at 9:24 am

Thanks so much!!

Zeb says

March 7, 2021 at 11:37 am

Thanks for the perfect explanation, sir. I have a question regarding my data analysis. I have conducted a study and I want to compare the present situation with the previous one. All participants (male and female) have already experienced the pre and post situations. The comparison uses the options (Not Available), (Worst Condition), (Average Condition), (Better Condition); for example, comparing the “Drinking water facility” on the above scale. Any suggestions on how to analyze this, or what kind of statistical test can be used for this kind of data?
I will strongly appreciate your valuable inputs.
Thanks
Zeb

Adrian says

March 5, 2021 at 11:25 am

Dear Jim,
Let me add a few notes from my 10 years of practice in clinical research biostatistics.
1) In simulations, I only rarely obtained the assumed type-1 error and assumed power when using parametric tests on highly skewed (like log-normal) and multi-modal data with different dispersion across groups at such small sample sizes. I often work with data so specific that even N=300 doesn’t give reliable results. This is unacceptable in the industry I work in. It is interesting, however, to see how the outcomes vary depending on our experience.

This was especially visible in ANCOVA on change-from-baseline adjusted for the baseline (the standard analysis recommended by guidelines for RCTs) in more complex designs with multiple repetitions over time (fit either via GLS or a mixed model). In those cases, I either switch to (weighted) GEE estimation or choose quantile regression with random effects and run a set of LR tests over it to get an assumption-free ANOVA over the underlying model.

Moreover, it makes no statistical sense at all to compare means in skewed data. These are wrong measures of central tendency most of the time. Why? Because the arithmetic mean is by definition an additive measure, which has nothing to do with multiplicative processes or processes that can be described with the log-normal distribution. The two are incompatible. For exactly this reason it makes no sense to bootstrap the difference in means or to run a permutation test over the means: although technically possible, it still makes no statistical sense to use means to describe such data.

Sure, one could log-transform the data, but transforming isn’t the best option here, because it changes too many things: it changes the hypothesis, biases the back-transformed CIs (Jensen’s inequality), affects the underlying model errors, affects the mean-variance relationship, and more. Instead, we need to use a generalized model, which properly deals with the conditional expected value (rather than raw data) linked to the predictor, or employ quantile regression followed by LR tests to get the main and interaction effects.

2) Almost none (except maybe 3-5) of the non-parametric tests (out of about 320 I know) formally requires equal variance (or, more generally, equal dispersion). And none assesses medians in general (sadly, this is repeated even in many textbooks, luckily not all, and awareness is growing quickly). That interpretation holds IF and ONLY IF the distributions are equal (IID): same shape, same dispersion AND both symmetric. In practice, that almost never happens. Mann-Whitney and Kruskal-Wallis are about stochastic equivalence, assessed via the pseudomedian. They fail entirely as tests of equality of medians just by the definition of the pseudomedian and its properties. Lots of literature is available on this, and simulations confirm it. It’s very easy to have numerically equal medians and have the test report significant results due to differences in shape or dispersion. And that’s OK, because it was designed as a test of stochastic equivalence and not of medians. Sure, if we want to restrict ourselves to equal dispersions AND symmetry of the distributions (required by the definition of the pseudomedian), then we can treat them as asymptotic tests of medians, but then this is a perfect situation for the CLT and, actually, the standard t test (the median approaches the arithmetic mean here).

3) By the way, there are also modern tests, like the ATS (ANOVA-Type Statistics), WTS (Wald-Type Statistics), permuted WTS and ART ANOVA (Aligned-Rank Transform), which are much more flexible (handle up to 3-5, depending on implementation, main effects + interaction + repeated observations) and powerful. They use so-called relative effects.
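
The point in item 2 about equal medians can be checked with a short simulation. The Python sketch below draws two groups whose population medians are both exactly 0 but whose shapes mirror each other (one right-skewed, one left-skewed), then counts how often the Mann-Whitney U test rejects. The shifted exponential distributions, the sample size of 200 per group, and the function names are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

LN2 = np.log(2.0)  # median of the standard exponential distribution

def draw_groups(rng, n):
    """Two samples whose population medians are both exactly 0."""
    right_skewed = rng.exponential(1.0, size=n) - LN2
    left_skewed = LN2 - rng.exponential(1.0, size=n)
    return right_skewed, left_skewed

def rejection_rate(n=200, n_sims=500, alpha=0.05, seed=0):
    """Share of Mann-Whitney U tests that reject despite equal medians."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        a, b = draw_groups(rng, n)
        p = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue
        rejections += p < alpha
    return rejections / n_sims

print(f"Rejection rate with equal medians: {rejection_rate():.2f}")
```

If the test were purely a test of medians, the rejection rate here would sit near the 0.05 significance level. A much higher rate reflects that a value from the first group is more likely than not to exceed a value from the second, even though the medians match.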

Dr Rakesh Ranjan Pathak says

February 7, 2021 at 9:17 pm

Sir, while comparing parametric and non-parametric methods we miss the two real questions:
1) What if we use non-parametric tests in parametric conditions?
2) What if we use parametric tests in non-parametric conditions?
Please elaborate on the resulting errors as the real-life deterrent. Thanks

- Jim Frost says
  
  February 7, 2021 at 10:39 pm
  
  Hi, I touch on those issues in this post. Specifically:
  
  1) Typically, non-parametric tests have less power than their parametric counterparts. For power reasons, you’ll want to use a parametric test when it’s valid. Using a nonparametric test in these conditions increases the Type II error rate (false negatives).
  
  2) If you use a parametric test when a nonparametric test is appropriate, you’ll obtain inaccurate results. The Type I error rate won’t necessarily equal the significance level you define for the test. I’m not sure if there is a consistent direction of change in that error rate. I suspect that the Type I error rate can be higher or lower than the significance level depending on the nature of the violation.
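
The power comparison in point 1 can be estimated by simulation. The Python sketch below draws normally distributed groups with a real difference between them and counts how often each test detects it. The effect size of one standard deviation, the sample size of 20 per group, and the function name `power_comparison` are hypothetical settings for illustration.

```python
import numpy as np
from scipy import stats

def power_comparison(n=20, shift=1.0, n_sims=2000, alpha=0.05, seed=0):
    """Estimate power of the t-test and Mann-Whitney U on normal data."""
    rng = np.random.default_rng(seed)
    t_hits = mw_hits = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, size=n)
        b = rng.normal(shift, 1.0, size=n)   # a real effect exists
        t_hits += stats.ttest_ind(a, b).pvalue < alpha
        mw_hits += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha
    return t_hits / n_sims, mw_hits / n_sims

t_power, mw_power = power_comparison()
print(f"t-test power:       {t_power:.3f}")
print(f"Mann-Whitney power: {mw_power:.3f}")
```

When the data really are normal, the gap between the two tests is usually modest. Changing the distributions inside the loop shows how the comparison depends on the shape of the data.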
  
  I hope that helps.
  
Maria says

December 17, 2020 at 2:19 pm

Great article, thank you
But may I ask when it’s better to choose the mean or the median as the best measure of central tendency for my data? Is there any guide?

- Jim Frost says
  
  December 17, 2020 at 10:43 pm
  
  Hi Maria,
  
  Thanks for writing! In my post about measures of central tendencies, I write about which measure is best for different situations, including choosing between the mean and median. I’d recommend reading it. In a nutshell, the mean is better when your data are symmetric, or at least not extremely skewed, while the median is better when your data are fairly skewed. In my other post, I show why that’s the case.
  
Rafi Mohammed says

September 15, 2020 at 1:16 pm

Hi Jim. Very informative article. I would like to know one more thing.
Can we use parametric tests to analyse ordinal data? If so, in what circumstances? Please advise.

- Jim Frost says
  
  September 17, 2020 at 2:47 am
  
  Hi Rafi,
  
  That question has been behind many debates in statistics! In some cases, yes! In this post, I have a link near the end for an article I wrote about analyzing Likert scale data. The Likert scale is an ordinal scale. And for those data, you can use the parametric 2-sample t-test. That’s based on a thorough simulation study. However, I would not say that means you can always use parametric tests for all scenarios where you have ordinal data. There are probably requirements for sample sizes and number of ordinal levels. At any rate, read that one post about analyzing Likert data to get an idea of some of the issues and how it works out for 2-sample t-tests.
  
  I hope that helps!
  
Kenny L says

September 1, 2020 at 10:32 pm

Hi Jim,
Thanks for the very informative article. It is great to see all the hypothesis tests in one article, and I appreciate the detail and depth of the explanation.
One thing I have been stuck on is making the best choice between parametric and nonparametric tests when there are many varying features, under whose influence the distribution becomes highly uneven, making it hard to compare and harder to draw inferences.
But this is the actual case in practical applications when you want to do A/B testing. Real-life A/B testing involves dealing with distributions that vary widely due to a high number of features (columns or variables).
For A/B testing with varying distributions in the 2 experiments and multiple features involved, would you recommend parametric or nonparametric statistical hypothesis tests?
(I have tried parametric statistical hypothesis tests, but it was hard to reach statistical significance because multiple features are involved. If I remove or ignore most of the variables, I may end up with statistical significance, but that may not be the intended purpose of A/B testing.)
Can you shed some light, please?

MahNoor Ashrif says

July 8, 2020 at 3:38 am

Hi Jim!
A researcher conducted research finding that the majority of people who died during the pandemic had bought a new phone during the last year. What type of research is this? If his assumption is correct, which statistical test would be appropriate to analyse the data?
Please answer this question in detail. I will be really thankful to you.

- Jim Frost says
  
  July 9, 2020 at 4:32 pm
  
  Hi MahNoor, apparently this is a question from a test because someone else recently asked the identical question. I’m not going to do your test for you. However, I will point you towards a 2-sample proportions test, which will allow you to determine whether there is a difference in the proportion of fatalities between those who bought a new phone and those who didn’t.
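
For readers unfamiliar with the 2-sample proportions test, here is a minimal Python sketch of the pooled two-sided z-test. The counts are made up purely to show the mechanics, and the function name `two_proportion_ztest` is a hypothetical helper rather than a library call.

```python
import math
from scipy import stats

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for equal proportions using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * stats.norm.sf(abs(z))

# Hypothetical counts: events / group size in each of two groups.
z, p_value = two_proportion_ztest(x1=30, n1=200, x2=18, n2=220)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The squared z statistic from this test equals the chi-square statistic for the corresponding 2x2 table without continuity correction, so either approach leads to the same p-value.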
  
Ben Craggnon says

June 5, 2020 at 11:41 am

Amazing thanks!

Ben Craggnon says

June 4, 2020 at 8:39 am

Hi Jim,

Thanks so much for explaining this all!
I want to compare the ages of two groups I have (one is only 17 people and one is 51 people). Because the first group is <20 people do I need a Mann-Whitney U test or can I just use a t test here?

Many thanks!
Ben

- Jim Frost says
  
  June 4, 2020 at 1:38 pm
  
  Hi Ben,
  
  Do you have any theoretical reasons or empirical data that suggests the population for the smaller group follows a nonnormal distribution? If you can reasonably assume that it follows a normal distribution, you can probably use a t-test. However, if you have any doubts about that, best to go with Mann-Whitney.
  
aruna says

May 23, 2020 at 3:04 pm

Hi Jim, thank you for the wonderful article. Can you tell me the special features of factorial design? It would be very helpful.

ELZED LIEW says

May 18, 2020 at 2:04 pm

Thanks heaps for this excellent overview.
However, I am a bit confused by ‘The groups in a nonparametric analysis typically must all have the same variability (dispersion).’
As far as I can remember, ANOVA, as a parametric test, assumes equal variances of the samples that will be tested.
Do you think I should stick to ANOVA if the samples are normally distributed but have unequal variances?

- Jim Frost says
  
  May 18, 2020 at 2:14 pm
  
  Hi Elzed,
  
  If you have unequal variances, you can use Welch’s ANOVA. Click the link to read my post about it!
  
Lisa says

April 15, 2020 at 9:16 am

Thanks a bunch Jim !

Lisa says

April 14, 2020 at 3:23 pm

Hi Jim,

Thanks for this article! I would like to kindly seek your advice.

I’m currently looking to filter out variables that are highly correlated so that I can remove one or the other for an analysis. I was thinking of using the nonparametric Spearman’s rank correlation. Would that be correct? The data are continuous, in equal groups, with each group having >20 observations.

- Jim Frost says
  
  April 14, 2020 at 10:34 pm
  
  Hi Lisa,
  
  You can use that or even just the regular Pearson’s correlation. If you’re performing regression analysis and worried about multicollinearity, you can fit the model with the variables and then check the VIFs.
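
Both checks mentioned above can be sketched briefly in Python: a Pearson correlation matrix for spotting highly correlated pairs, and VIFs computed as the diagonal of the inverse of that correlation matrix. The simulated predictors and the helper name `vifs_from_predictors` are assumptions for illustration.

```python
import numpy as np

def vifs_from_predictors(X):
    """Variance inflation factors: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Hypothetical predictors: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.2, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

print("Pearson correlation matrix:\n", np.round(np.corrcoef(X, rowvar=False), 2))
print("VIFs:", np.round(vifs_from_predictors(X), 1))
```

Pairwise correlations only reveal problems between two variables at a time, while a VIF reflects how well each predictor is explained by all the others together, which is why checking VIFs after fitting the model can catch cases a correlation matrix misses.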
  
Heather says

April 9, 2020 at 12:29 pm

Hi Jim.
Thank you for your article; it was very helpful. I was wondering if you could help me. I’m currently doing my thesis and am carrying out a few statistical tests. One is an independent samples t-test with 1 categorical independent variable (PP group 1, N=57; PP group 2, N=45) and one continuous dependent variable. However, my data violate the assumptions of normality and homogeneity of variance, and have a few outliers. In this case, should I bootstrap my t-test or use the alternative nonparametric test (Mann-Whitney U)? How would I make this decision? What would the criteria be for using bootstrapping over the alternative nonparametric test?

Thanks in advance for any insight you can offer! 🙂

- Jim Frost says
  
  April 10, 2020 at 7:46 pm
  
  Hi Heather,
  
  In your case, I would strongly consider using the t-test. In fact there are specific reasons for not using a nonparametric test in your case.
  
  Specifically, you have a large enough sample size in each group so that the central limit theorem kicks in (see the table in the post for sample size requirements). Even though the data in your groups are non-normal, the sampling distributions should follow a normal distribution, which gives you valid results. Additionally, t-tests can handle unequal variances. Just be sure that your statistical software uses the version of the t-test that does NOT assume equal variances.
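
As an example of selecting the unequal-variances version in software, the Python sketch below uses SciPy's `ttest_ind`, where `equal_var=False` requests Welch's t-test. The simulated gamma-distributed groups are hypothetical and only borrow the group sizes (57 and 45) from the question.

```python
import numpy as np
from scipy import stats

# Hypothetical skewed groups with the sizes from the question (57 and 45)
# and clearly different spreads.
rng = np.random.default_rng(11)
group1 = rng.gamma(shape=2.0, scale=1.0, size=57)
group2 = rng.gamma(shape=2.0, scale=2.5, size=45)

student = stats.ttest_ind(group1, group2, equal_var=True)    # pooled variance
welch = stats.ttest_ind(group1, group2, equal_var=False)     # Welch's t-test
print(f"Student's t: t = {student.statistic:.3f}, p = {student.pvalue:.4f}")
print(f"Welch's t:   t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
```

Note that SciPy defaults to `equal_var=True`, so the Welch version has to be requested explicitly. Other packages have their own defaults, which is worth checking in the documentation.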
  
  While nonparametric tests don’t assume that your data follow a particular distribution, they do assume that the spread of the data in each group is the same. Because your data have different variances, it violates that assumption for nonparametric tests.
  
  I’d use the t-test! You could also use bootstrapping, but a t-test should work fine.
  
Benny Zuse Rousso says

February 11, 2020 at 2:43 am

Hi Jim, very good post (along with many others in your blog). Could you please provide any formal reference for the table of minimum sample sizes?
Thanks a lot!
Ben

vivian says

February 1, 2020 at 5:30 am

Thanks a lot for the valuable information, but may I ask what you mean by a tiny sample size? Is it fewer than 30?

Mukhles says

January 29, 2020 at 6:25 pm

Thank you.

Mukhles says

January 28, 2020 at 10:32 pm

Hello Jim, when did you publish this article? I would like to cite it for my school work

- Jim Frost says
  
  January 28, 2020 at 10:54 pm
  
  Hi Mukhles,
  
  I’m glad this article was helpful for you! When you cite web pages, you actually use the date you accessed the article. See this link at Purdue University for Electronic Citations. Look in the section for “A Page on a Website.” Thanks!
  
Akshat Garg says

January 14, 2020 at 2:00 am

Hi jim would u Please answer one of my doubt, i m badly stuck in

- Jim Frost says
  
  January 14, 2020 at 11:00 am
  
  Hi Akshat,
  
  Please find the blog post that is closest to the topic of your question. There is a search box in the right hand column part way down that can help you. Ask your question in the comments of the appropriate post and I’ll answer it!
  
Subhabrata Chakraborti says

December 20, 2019 at 11:22 am

Just wanted to add that the book “Nonparametric Statistical Inference, fifth edition” by Gibbons and Chakraborti (2010; CRC Press) has discussions about the power of some nonparametric tests, including Minitab Macro codes to simulate power. The updated edition (work in-progress) will discuss R codes. Hope this helps.

Julia Kirchner says

December 10, 2019 at 4:49 pm

Hi Jim! Great article, it really helped me for my study.
Only problem now is that I need scientific papers for the statements made in your text, to refer to them in my study.
Specifically, I was wondering if you could provide me with the paper you used to draw this conclusion: “parametric tests have more power. If an effect actually exists, a parametric analysis is more likely to detect it”

Thanks a lot!

- Jim Frost says
  
  December 10, 2019 at 5:19 pm
  
  Hi Julia,
  
  Thanks for your kind words. I’m glad it was helpful!
  
  It’s generally recognized that nonparametric tests have somewhat lower power compared to a similar parametric test. In other words, to have the same power as a similar parametric test, you’d need a somewhat larger sample size for the nonparametric test. That’s the tendency.
  
  However, calculating the power for a nonparametric test and understanding the difference in power between specific parametric and nonparametric tests is difficult. The problem arises because the specific difference in power depends on the precise distribution of your data. That makes it impossible to state a constant power difference by test. In other words, the power difference doesn’t just depend on the tests themselves but also the properties of your data.
  
  For more information about these considerations, look at the following texts:
  Walsh, J.E. (1962). Handbook of Nonparametric Statistics, New York: D. Van Nostrand.
  Conover, W.J. (1980). Practical Nonparametric Statistics, New York: Wiley & Sons.
  
Andrea says

November 3, 2019 at 2:05 am

Jim, do you have anything which describes how to estimate the power of a nonparametric test?

- Jim Frost says
  
  November 4, 2019 at 9:32 am
  
  Hi Andrea,
  
  Calculating power for nonparametric tests can be a bit complicated. For one thing, while nonparametric tests don’t require particular distributions, you need to know the distribution to be able to calculate statistical power for these tests. I don’t think many statistical packages have built in analyses for this type of power analysis. I’ve also heard of people using bootstrap methods or Monte Carlo simulations to come up with an answer. For these methods, you’ll still need either representative data or knowledge about the distribution.
  
  Apparently, the pwr.boot function in R uses the bootstrap method to calculate power for nonparametric tests. Unfortunately, I have not used it myself but it could be something to try. The problem is that you should not use data from a hypothesis test to calculate the power for that hypothesis test. If the test was statistically significant, power will be high. If the test was not significant, the power is low. You don’t know the real power. So, I’m not sure about the rationale for using this command, but it is one approach.
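
The Monte Carlo approach mentioned above can be sketched in Python without relying on any particular power package. The key input is an assumed population distribution, chosen before looking at the test data. Here the groups are assumed to be lognormal and to differ by a multiplicative factor. Those assumptions, the default parameter values, and the function name `mann_whitney_power` are all hypothetical.

```python
import numpy as np
from scipy import stats

def mann_whitney_power(n, ratio=1.5, sigma=0.8, n_sims=1000, alpha=0.05, seed=0):
    """Monte Carlo power of the Mann-Whitney U test for lognormal groups.

    `ratio` is the assumed multiplicative shift between the two populations.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        a = rng.lognormal(mean=0.0, sigma=sigma, size=n)
        b = rng.lognormal(mean=np.log(ratio), sigma=sigma, size=n)
        hits += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha
    return hits / n_sims

for n in (10, 20, 40, 80):
    print(f"n = {n:>2} per group: estimated power = {mann_whitney_power(n):.2f}")
```

Running the function over a range of sample sizes gives a rough power curve for planning purposes. The estimates are only as good as the assumed distribution and effect size.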
  
John says

September 23, 2019 at 11:20 pm

Hi. I wanted to leave a comment . . .

- Jim Frost says
  
  September 23, 2019 at 11:55 pm
  
  Hi John,
  
  Thanks for the heads-up. I tried sending you an email but it bounced.
  
Jovana says

April 5, 2019 at 7:34 am

Hi Jim,
Thank you for this nice explanation. I must consult with you regarding the situation I have with my data. I have 10 data sets (10 different metals), each data set consisting of 20 values (5 values in 4 seasons). These are the measurements of the metal concentrations in fish liver and I want to assess if there are seasonal variations. I tested the normality of the distributions and got a normal distribution for 7 metals and a non-normal distribution for 3. I have tested the homogeneity of variance (Levene’s test) and got the result that 6 of the metals have homogeneous variance, while the other 4 metals (3 of which have a non-normal distribution) do not have homogeneous variance. Finally, my question is, should I use a parametric test (one-way ANOVA) for all 10 data sets, since the majority of samples have a normal distribution and homogeneous variance? Should I use a nonparametric test (Kruskal-Wallis H) since my data sets are not large (20 values)? Or should I test normally distributed data with a parametric test, and non-normally distributed data with a nonparametric test?
Thank you in advance,
Kind regards,
Jovana

Pam says

April 3, 2019 at 3:16 am

Hi again Jim,
This time my query is about missing data when the sample size is low. How do we deal with missing dependent variables in a continuous data set observed at different time intervals?
Is multiple imputation a good option when data (sample) are missing at some time points and some were not detected due to method limitations? Some suggest replacing undetected data with the lowest possible value, such as 1/2 of the limit of detection, instead of using zero. Can undetected data be treated as missing data?

I have looked up some multiple imputation methods in SPSS but I’m not sure how acceptable they are and how to report them if acceptable.

Please enlighten with your expertise.
Thank you in advance!

- Jim Frost says
  
  April 5, 2019 at 4:44 pm
  
  Hi Pam,
  
  Generally speaking, the less data you have the more difficult it is to estimate missing data. The missing values also play a larger role because they’re part of a smaller set. I don’t have personal experience using SPSS’ missing data imputation. I’ve read about it and it sounds good, but I’m not sure about limitations.
  
  I’m not really sure about the detection limits issue. For one thing, I’d imagine that it depends on whether the lowest detectable value is still large enough to be important to your study. In other words, if it is so low that you’re not missing anything important, it might not be a problem. Perhaps the lowest detectable value is so low that in practical terms it’s not different from zero. But, that might not be the case. Additionally, I’d imagine it also depends on how much of your data fall in that region. If you’re obtaining lots of missing values or zeroes because many of the observations fall within that range, it becomes more problematic. Consequently, how to address it becomes very context sensitive and I wouldn’t be able to give you a good answer. I’d consult with subject-area specialists and see how similar studies have handled it. Sorry I couldn’t give you a more specific answer.
  
Pam says

March 27, 2019 at 4:20 pm

Great! Thanks Jim. This is really helpful.
Cheers!

Brittney says

March 26, 2019 at 6:03 pm

Thank you so much for this article! I wasn’t planning on using statistics in my research, but my research took a turn and my committee wanted to see testable hypotheses…for paleontology! Ugh. But, this article and your website is incredibly useful in dusting off the stats in my brain!

- Jim Frost says
  
  March 27, 2019 at 10:32 am
  
  Your kind words mean so much to me. Thank you, Brittney!
  
Pam says

March 23, 2019 at 12:38 am

Hi Jim,
Thank you for making statistics a lot easier to understand. I now understand that parametric tests can be performed on non-normal data if the sample size is big enough, as indicated.

I have a few questions about when and when not to log transform skewed data.
When do the data have to be log transformed to perform statistical analysis? Can parametric tests be done on log transformed data, and how do we report the results after log transformation?

Do you have a blog post regarding this? Please provide your expert insights on these when possible.

Thank you

- Jim Frost says
  
  March 24, 2019 at 10:49 pm
  
  Hi Pam,
  
  Yes, you can log transform data and use parametric analyses, although it does change a key aspect of the test. You can present the results as saying that the difference between the log transformed means is statistically significant. Then, back transform those values to the natural units and present those as well. Also, note that using log transformed data changes the nature of that test so that it is comparing geometric means rather than the usual arithmetic means. Be sure that is acceptable. Also, check that the transformed data follow the normal distribution.
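
  To make the geometric-mean point concrete, here is a small sketch using only the Python standard library. The two groups are made-up numbers and `log_mean` is a hypothetical helper. It shows that back transforming the mean of the logged values gives the geometric mean, and that back transforming the difference between log means gives a ratio of geometric means rather than a difference in the original units.

```python
import math
from statistics import geometric_mean, mean

# Hypothetical right-skewed measurements for two groups.
group_a = [1.2, 1.9, 2.4, 3.1, 8.5]
group_b = [2.0, 3.3, 4.8, 6.1, 15.0]


def log_mean(values):
    """Arithmetic mean of the natural-log transformed values."""
    return mean(math.log(v) for v in values)


# Back transforming a log mean yields the geometric mean.
geo_a = math.exp(log_mean(group_a))
geo_b = math.exp(log_mean(group_b))

# Back transforming the difference in log means yields a ratio.
log_diff = log_mean(group_b) - log_mean(group_a)
ratio = math.exp(log_diff)

print("Arithmetic means:", mean(group_a), mean(group_b))
print("Geometric means:", geo_a, geo_b)
print("Ratio of geometric means (B / A):", ratio)
```

  A parametric test on the logged values is therefore assessing whether this ratio differs from 1.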
  
  However, you generally don’t need to do this if you have a large enough sample size per group–as I point out in this post. Consider using transformations only when the data are severely skewed and/or you have a smaller sample size. Unfortunately, I don’t have a blog post on this process. However, unless you have a strong need to transform your data, I would not use that approach.
  
  I hope this helps!
  
Mrinali says

February 28, 2019 at 5:14 pm

Very helpful article. Nice explanation

Lynn says

January 20, 2019 at 11:34 pm

Jim, your site in general and this page have helped me understand statistics so much better as a novice. Regarding the Wilcoxon, although super helpful in understanding the basics, I’m still unsure about how I can relate this to my study. It’s been loosely suggested to me by a peer that I use the Wilcoxon test, but I’m not sure how to confirm this.
I have 13 participants. They each watched Video 1 and answered 16 corresponding questions (8 for construct A and 8 for construct B). They then watched Video 2 and answered the same 16 questions (8 for construct A and 8 for construct B). The questions were 3, 5, and 7 Likert scale questions.
I want to find the differences in ratings between Videos 1 and 2 for construct A, the differences in ratings between Videos 1 and 2 for construct B, and the highest rated Video in total (combining both constructs). Any advice? Thanks

Asmat says

November 21, 2018 at 10:23 am

It is really helpful article. I learned a lot. Thanks for posting.

- Jim Frost says
  
  November 21, 2018 at 10:57 am
  
  You’re very welcome. I’m glad it was helpful!
  
Asmat says

November 21, 2018 at 9:32 am

Thanks Jim. Which post-hoc test would you suggest in this case. I really appreciate it. Thanks.

- Jim Frost says
  
  November 21, 2018 at 10:11 am
  
  The post-hoc test I’m most familiar with is the Games-Howell test, which is similar to Tukey’s test. I’m sure there are others, but I’m not familiar with them. For more information and an example of Welch’s with this post-hoc test, read my post on Welch’s ANOVA.
  
Asmat says

November 21, 2018 at 4:48 am

Hi Jim,

I am dealing with 6 groups of a data set with different sample sizes. The minimum sample size of one group is 56 and the maximum is 350, and the other groups’ sample sizes are in between these two points. My data are not normal, and through Levene’s test I found that the variances are not equal. I think a comparison of means is somewhat more meaningful than a comparison of medians. Could you please guide me in selecting between Welch’s ANOVA and the Kruskal-Wallis test?

Thanks

- Jim Frost says
  
  November 21, 2018 at 9:27 am
  
  Hi Asmat,
  
  Given your large sample sizes, unequal variances, and the fact that you want to compare means, I’d use Welch’s ANOVA.
  
  Best of luck with your analysis!
  
Ferhat says

August 11, 2018 at 10:54 am

Hi from Turkey
I have followed your post for 6 months. Every article is better than the last. Thank you for have loved the statistic.

- Jim Frost says
  
  August 11, 2018 at 4:14 pm
  
  Hi Ferhat, thank you so much! That means a lot to me!
  
John says

August 6, 2018 at 8:01 am

Hi Jim,

This is really an insightful article. I have a question though regarding my study. Can I still use a parametric test even if the distribution is not normal and the variances aren’t homogeneous? I checked those assumptions via Shapiro-Wilk test and Levene’s F-test and the results suggested that both assumptions were violated. Other online articles mentioned that if this is the case, I should use a non-parametric test but I also read somewhere that oneway ANOVA would do. By the way, I have 3 groups with equal number of observations, i.e., 21 for each group.

Thanks for your time.

- Jim Frost says
  
  August 6, 2018 at 10:58 am
  
  Hi John,
  
  If your sample size per group meets the requirements that I present in Advantage #1 for parametric tests, then nonnormal data are not a problem. These tests are robust to departures from normality as long as you have a sufficient number of observations per group.
  
  As for unequal variances, you often have stricter requirements when you use nonparametric tests. This fact isn’t discussed much, but nonparametric tests typically require the same spread across groups. For t-tests and ANOVA, you have options that allow you to use them when variances are not equal. For example, for ANOVA you can use Welch’s ANOVA. For details on that method, read my post about Welch’s ANOVA.
  
  Based on your sample size per group, you should be able to use ANOVA regardless of whether the data are normally distributed. If you suspect that the variances are not equal, you can use Welch’s ANOVA.
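
  For readers who want to see what Welch’s ANOVA computes, the following is a minimal sketch of the standard Welch formulas using only the Python standard library. The function name `welch_anova` and the three groups are hypothetical. It returns the F statistic and its degrees of freedom. Getting the p-value requires the F distribution, which the standard library does not provide, so that step is left as a comment.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (unequal variances)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n_i / s_i^2, so noisier groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    between = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))

    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical groups whose spread clearly differs.
low = [4.1, 5.0, 4.7, 5.3, 4.9]
mid = [6.2, 7.9, 5.1, 8.4, 6.8]
high = [9.5, 14.2, 6.1, 12.8, 10.4]

f_stat, df1, df2 = welch_anova([low, mid, high])
print("Welch F:", f_stat, "df1:", df1, "df2:", df2)
# With SciPy installed, the p-value would be scipy.stats.f.sf(f_stat, df1, df2).
```

  With only two groups, this F statistic equals the square of the Welch t statistic, which is a handy sanity check.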
  
  I hope this helps.
  
  - John says
    
    August 8, 2018 at 4:01 am
    
    Thanks a lot for your prompt response, Jim. Really appreciate it. I’ll check on Welch’s ANOVA, then. Again, many thanks!
    
jain says

August 1, 2018 at 4:32 am

My data of 350 observations don’t follow a normal distribution. Which one should I use, the median or the mean? How should it be reported? Should I report the mean, SD, CV, etc.?

- Jim Frost says
  
  August 1, 2018 at 3:18 pm
  
  Hi Jain,
  
  The answer to this question depends on which measure best represents the middle of your distribution and what is important to the subject area. In general, the more skewed your distribution, the more you should consider using the median. Graph your data to help answer this question. Also, I’ve written a post about the different measures of central tendency that you should read!
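
  As a quick illustration of why skew matters, the snippet below compares the two measures on a made-up right-skewed sample. The values are hypothetical and chosen only so that a single large observation pulls the mean away from where most of the data sit.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: five typical values and one extreme value.
values = [20, 22, 25, 27, 30, 250]

print("mean:", mean(values))
print("median:", median(values))
```

  Here the median stays near the bulk of the data, while the mean is dragged toward the extreme value.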
  
  I hope this helps!
  
Muhammad Nazir says

July 8, 2018 at 3:43 am

Thanks Respected Sir
I got your point. You are great.

- Jim Frost says
  
  July 8, 2018 at 10:39 pm
  
  You’re welcome. I’m glad I could help!
  
Muhammad Nazir says

July 8, 2018 at 3:23 am

There is no significant difference in the pre-intervention scores of the groups (p-value > 0.05), but when we look at the mean scores of the groups there are minor differences among them. In this case, can I use ANCOVA?

- Jim Frost says
  
  July 8, 2018 at 3:38 am
  
  ANCOVA allows you to include a covariate (a continuous variable that might be correlated with the dependent variable) in the analysis along with your categorical variables (factors). Telling me about the means of the groups is not applicable to whether you should use ANCOVA specifically. Do you have a continuous independent variable to include in the analysis?
  
  I’m not sure why you’re analyzing the pre-intervention scores. However, it is entirely normal to see differences between the group means when the p-value is greater than 0.05. That issue does not relate to whether you should use ANCOVA or not.
  
  If you have only the 5 groups and there are no other variables in your analysis, then no, you can’t use ANCOVA because you don’t have a covariate. It seems like you should use one-way ANOVA. You can subtract the pretest scores from the post-test scores so you’re analyzing the differences by group. This process will tell you how the changes in the experimental groups compare to the change in the control group.
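
  The change-score approach can be sketched as follows, using only the Python standard library. The scores, the two group names, and the helpers `change_scores` and `one_way_f` are all hypothetical, and the example uses two tiny groups rather than five groups of ten to keep it short. It computes the classic one-way ANOVA F statistic on the post-test minus pretest differences.

```python
from statistics import mean


def change_scores(pre, post):
    """Post-test minus pretest for each subject, in matching order."""
    return [after - before for before, after in zip(pre, post)]


def one_way_f(groups):
    """Return (F, df_between, df_within) for a classic one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Hypothetical pretest and post-test scores, listed per subject within each group.
pre = {"control": [10, 12, 11], "treatment": [10, 11, 12]}
post = {"control": [10, 13, 13], "treatment": [13, 15, 17]}

changes = {group: change_scores(pre[group], post[group]) for group in pre}
f_stat, df_between, df_within = one_way_f(list(changes.values()))

print("Change scores:", changes)
print("F:", f_stat, "df:", df_between, df_within)
# With SciPy installed, the p-value would be scipy.stats.f.sf(f_stat, df_between, df_within).
```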
  
Muhammad Nazir says

July 6, 2018 at 1:48 pm

Respected Sir, please answer my last two questions too.

- Jim Frost says
  
  July 6, 2018 at 1:59 pm
  
  I will, Muhammad. Please keep in mind that the website is something I do in my spare time. I try to answer all questions but sometimes it will take a day or two depending on what else I have going on.
  
Muhammad Nazir says

July 6, 2018 at 1:41 pm

Thanks Great Sir

Muhammad Nazir says

July 6, 2018 at 1:25 pm

Dear Jim Frost thanks for your kind reply,
Please also guide and answer my two questions more:
1. NO significant difference was found among the Covariates with p>0.05 before intervention. But there is minor difference in their mean score. In this case, Can I use ANCOVA for analysis with covariates having significant score with p>0.05?
Is it okay that using ANCOVA will remove the initial differences found in mean score of covariate though there was No significant difference found in terms of p>0.05 before intervention?
2. In my experimental study sample size is 50. There are 5 groups (4 experimental and 1 control group). I am using randomized pretest-posttest control group design but some people say this research design is not appropriate. Please guide is this research okay or not? if not then please tell the appropriate design?
I am giving different interventions to 4 experimental groups, No intervention to control group. Please reply immediately.

- Jim Frost says
  
  July 8, 2018 at 3:08 am
  
  Hi Muhammad,
  
  I’m a bit confused by your first question. Covariates are continuous variables so there are not any significant differences. Covariates don’t assess the differences between the means of the levels of a categorical variable. Instead, you use the p-value to determine whether there is a significant relationship between the covariate and dependent variable in the same manner as for linear regression. Usually, if it is not significant, you don’t include it in the model. However, if theory strongly suggests that it should be in the model, it is ok to include it even when the p-value is greater than 0.05.
  
  I don’t see why a pretest-posttest would not be OK. But, I don’t have much information to go by. Why did they say it was not appropriate?
  
Muhammad Nazir says

July 6, 2018 at 1:07 pm

Actually, I have only 10 subjects in each group, which is not greater than 15. That’s why I asked.

- Jim Frost says
  
  July 6, 2018 at 1:31 pm
  
  Hi Muhammad,
  
  That size limit is only important when your data don’t follow a normal distribution. You said that your data do follow the normal distribution. So, it shouldn’t be a problem!
  
Muhammad Nazir says

July 3, 2018 at 9:51 pm

I have 5 groups in experimental study (4 experimental and 01 control). Sample size 50 with 10 subjects in each group. All groups have normal distribution. Can I use parametric test, please reply immediate.

- Jim Frost says
  
  July 5, 2018 at 3:14 pm
  
  Hi Muhammad, given what you state, I see no reason why you couldn’t use a parametric test.
  
sam says

June 17, 2018 at 1:57 pm

Hi Jim, thanks for the overview! Do you happen to have a source/reference I can refer to when using the claims you make as argumentation in my paper?

- Jim Frost says
  
  June 18, 2018 at 11:35 am
  
  Hi Sam, I include a link in this post to a white paper about the sample size claims. You’ll find your answers there!
  
Mohammad Hasan says

May 28, 2018 at 7:44 am

Wonderful article…love all your articles…

- Jim Frost says
  
  May 30, 2018 at 10:52 am
  
  Thank you, Mohammad! That means a lot to me!
  
david okurut says

April 24, 2018 at 5:48 am

I have benefited from your information. May God bless You.

- Jim Frost says
  
  April 24, 2018 at 9:44 am
  
  Thank you, David! It makes me happy to hear that this has been helpful for you!
  
Anitha Suseelan.s. says

February 12, 2018 at 2:33 am

Very nice explanation
Of central tendencies

- Jim Frost says
  
  February 12, 2018 at 9:51 am
  
  Thank you, Anitha!
  
Mosbah says

April 25, 2017 at 4:23 am

How can I cite this article?

- Jim Frost says
  
  April 26, 2017 at 2:08 am
  
  Hi, there are several standard formats for electronic sources, such as MLA, APA, and Chicago style. You’ll need to check with your institution to determine which one you should use.
  
BIRUK AYALEW Wondem says

April 24, 2017 at 7:33 am

very nice

Lucas says

April 24, 2017 at 1:03 am

Great article. This is one of those statistical tests that took a while to understand. But you explained it very nicely!

- Jim Frost says
  
  April 24, 2017 at 1:12 am
  
  Thank you so much Lucas!
  
- Jim Frost says
  
  March 16, 2021 at 12:34 am
  
  Thanks, Lucas!
  