Choosing whether to perform a one-tailed or a two-tailed hypothesis test is one of the methodology decisions you might need to make for your statistical analysis. This choice can have critical implications for the types of effects it can detect, the statistical power of the test, and potential errors.

In this post, you’ll learn about the differences between one-tailed and two-tailed hypothesis tests and their advantages and disadvantages. I include examples of both types of statistical tests. In my next post, I cover the decision between one and two-tailed tests in more detail.

## What Are Tails in a Hypothesis Test?

First, we need to cover some background material to understand the tails in a test. Typically, hypothesis tests take all of the sample data and convert it to a single value, which is known as a test statistic. You’re probably already familiar with some test statistics. For example, t-tests calculate t-values. F-tests, such as ANOVA, generate F-values. The chi-square test of independence and some distribution tests produce chi-square values. All of these values are test statistics.

These test statistics follow a sampling distribution. Probability distribution plots display the probabilities of obtaining test statistic values when the null hypothesis is correct. On a probability distribution plot, the portion of the shaded area under the curve represents the probability that a value will fall within that range.

The graph below displays a sampling distribution for t-values. The two shaded regions cover the two-tails of the distribution.

Keep in mind that this t-distribution assumes that the null hypothesis is correct for the population. Consequently, the peak (most likely value) of the distribution occurs at t=0, which represents the null hypothesis in a t-test. Typically, the null hypothesis states that there is no effect. As t-values move further away from zero, it represents larger effect sizes. When the null hypothesis is true for the population, obtaining samples that exhibit a large apparent effect becomes less likely, which is why the probabilities taper off for t-values further from zero.

**Related posts**: How t-Tests Work and Understanding Probability Distributions

## Critical Regions in a Hypothesis Test

In hypothesis tests, critical regions are ranges of the distributions where the values represent statistically significant results. Analysts define the size and location of the critical regions by specifying both the significance level (alpha) and whether the test is one-tailed or two-tailed.

Consider the following two facts:

- The significance level is the probability of rejecting a null hypothesis that is correct.
- The sampling distribution for a test statistic assumes that the null hypothesis is correct.

Consequently, to represent the critical regions on the distribution for a test statistic, you merely shade the appropriate percentage of the distribution. For the common significance level of 0.05, you shade 5% of the distribution.

**Related post**: Significance Levels and P-values

## Two-Tailed Hypothesis Tests

Two-tailed hypothesis tests are also known as nondirectional and two-sided tests because you can test for effects in both directions. When you perform a two-tailed test, you split the significance level percentage between both tails of the distribution. In the example below, I use an alpha of 5% and the distribution has two shaded regions of 2.5% (2 * 2.5% = 5%).

When a test statistic falls in either critical region, your sample data are sufficiently incompatible with the null hypothesis that you can reject it for the population.

In a two-tailed test, the generic null and alternative hypotheses are the following:

**Null**: The effect equals zero.**Alternative**: The effect does not equal zero.

The specifics of the hypotheses depend on the type of test you perform because you might be assessing means, proportions, or rates.

### Example of a two-tailed 1-sample t-test

Suppose we perform a two-sided 1-sample t-test where we compare the mean strength (4.1) of parts from a supplier to a target value (5). We use a two-tailed test because we care whether the mean is greater than or less than the target value.

To interpret the results, simply compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into one of the critical regions, but which one? Just look at the estimated effect. In the output below, the t-value is negative, so we know that the test statistic fell in the critical region in the left tail of the distribution, indicating the mean is less than the target value. Now we know this difference is statistically significant.

We can conclude that the population mean for part strength is less than the target value. However, the test had the capacity to detect a positive difference as well. You can also assess the confidence interval. With a two-tailed hypothesis test, you’ll obtain a two-sided confidence interval. The confidence interval tells us that the population mean is likely to fall between 3.372 and 4.828. This range excludes the target value (5), which is another indicator of significance.

### Advantages of two-tailed hypothesis tests

You can detect both positive and negative effects. Two-tailed tests are standard in scientific research where discovering any type of effect is usually of interest to researchers.

## One-Tailed Hypothesis Tests

One-tailed hypothesis tests are also known as directional and one-sided tests because you can test for effects in only one direction. When you perform a one-tailed test, the entire significance level percentage goes into the extreme end of one tail of the distribution.

In the examples below, I use an alpha of 5%. Each distribution has one shaded region of 5%. When you perform a one-tailed test, you must determine whether the critical region is in the left tail or the right tail. The test can detect an effect only in the direction that has the critical region. It has absolutely no capacity to detect an effect in the other direction.

In a one-tailed test, you have two options for the null and alternative hypotheses, which corresponds to where you place the critical region.

You can choose either of the following sets of generic hypotheses:

**Null**: The effect is less than or equal to zero.**Alternative**: The effect is greater than zero.

Or:

**Null**: The effect is greater than or equal to zero.**Alternative**: The effect is less than zero.

Again, the specifics of the hypotheses depend on the type of test you perform.

Notice how for both possible null hypotheses the tests can’t distinguish between zero and an effect in a particular direction. For example, in the example directly above, the null combines “the effect is greater than or equal to zero” into a single category. That test can’t differentiate between zero and greater than zero.

### Example of a one-tailed 1-sample t-test

Suppose we perform a one-tailed 1-sample t-test. We’ll use a similar scenario as before where we compare the mean strength of parts from a supplier (102) to a target value (100). Imagine that we are considering a new parts supplier. We will use them only if the mean strength of their parts is greater than our target value. There is no need for us to differentiate between whether their parts are equally strong or less strong than the target value—either way we’d just stick with our current supplier.

Consequently, we’ll choose the alternative hypothesis that states the mean difference is greater than zero (Population mean – Target value > 0). The null hypothesis states that the difference between the population mean and target value is less than or equal to zero.

To interpret the results, compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into the critical region. For this study, the statistically significant result supports the notion that the population mean is greater than the target value of 100.

Confidence intervals for a one-tailed test are similarly one-sided. You’ll obtain either an upper bound or a lower bound. In this case, we get a lower bound, which indicates that the population mean is likely to be greater than or equal to 100.631. There is no upper limit to this range.

A lower-bound matches our goal of determining whether the new parts are stronger than our target value. The fact that the lower bound (100.631) is higher than the target value (100) indicates that these results are statistically significant.

This test is unable to detect a negative difference even when the sample mean represents a very negative effect.

### Advantages and disadvantages of one-tailed hypothesis tests

One-tailed tests have more statistical power to detect an effect in one direction than a two-tailed test with the same design and significance level. One-tailed tests occur most frequently for studies where one of the following is true:

- Effects can exist in only one direction.
- Effects can exist in both directions but the researchers only care about an effect in one direction. There is no drawback to failing to detect an effect in the other direction. (Not recommended.)

The disadvantage of one-tailed tests is that they have no statistical power to detect an effect in the other direction.

As part of your pre-study planning process, determine whether you’ll use the one- or two-tailed version of a hypothesis test. To learn more about this planning process, read 5 Steps for Conducting Scientific Studies with Statistical Analyses.

This post explains the differences between one-tailed and two-tailed statistical hypothesis tests. How these forms of hypothesis tests function is clear and based on mathematics. However, there is some debate about when you can use one-tailed tests. My next post explores this decision in much more depth and explains the different schools of thought and my opinion on the matter—When Can I Use One-Tailed Hypothesis Tests.

If you’re learning about hypothesis testing and like the approach I use in my blog, check out my eBook!

Dr. PRAVIN VADHEL says

Hi Jim,

I didn’t get an idea that when to use two tail test and one tail test. Will you please explain?

Jim Frost says

Hi Pravin,

I have a complete article dedicated to that: When Can I Use One-Tailed Tests.

Basically, start with the assumption that you’ll use a two-tailed test but then consider scenarios where a one-tailed test can be appropriate. I talk about all of that in the article.

If you have questions after reading that, please don’t hesitate to ask!

Anh Tran says

Thank you so so much for this webpage.

I have two scenarios that I need some clarification. I will really appreciate it if you can take a look:

So I have several of materials that I know when they are tested after production. My hypothesis is that the earlier they are tested after production, the higher the mean value I should expect. At the same time, the later they are tested after production, the lower the mean value. Since this is more like a “greater or lesser” situation, I should use one tail. Is that the correct approach?

On the other hand, I have several mix of materials that I don’t know when they are tested after production. I only know the mean values of the test. And I only want to know whether one mean value is truly higher or lower than the other, I guess I want to know if they are only significantly different. Should I use two tail for this? If they are not significantly different, I can judge based on the mean values of test alone. And if they are significantly different, then I will need to do other type of analysis. Also, when I get my P-value for two tail, should I compare it to 0.025 or 0.05 if my confidence level is 0.05?

Thank you so much again.

Jim Frost says

Hi Anh,

For your first, if you absolutely know that the mean must be lower the later the material is tested, that it cannot be higher, that would be a situation where you can use a one-tailed test. However, if that’s not a certainty, you’re just guessing, use a two-tail test. If you’re measuring different items at the different times, use the independent 2-sample t-test. However, if you’re measuring the same items at two time points, use the paired t-test. If it’s appropriate, using the paired t-test will give you more statistical power because it accounts for the variability between items. For more information, see my post about when it’s ok to use a one-tailed test.

For the mix of materials, use a two-tailed test because the effect truly can go either direction.

Always compare the p-value to your full significance level regardless of whether it’s a one or two-tailed test. Don’t divide the significance level in half.

Najma says

Is it possible that we reach to opposite conclusions if we use a critical value method and p value method

Secondly if we perform one tail test and use p vale method to conclude our Ho, then do we need to convert sig value of 2 tail into sig value of one tail. That can be done just by dividing it with 2

Jim Frost says

Hi Najma,

The p-value method and critical value method will always agree as long as you’re not changing anything about how the methodology.

If you’re using statistical software, you don’t need to make any adjustments. The software will do that for you.

However, if you calculating it by hand, you’ll need to take your significance level and then look in the table for your test statistic for a one-tailed test. For example, you’ll want to look up 5% for a one-tailed test rather than a two-tailed test. That’s not as simple as dividing by two. In this article, I show examples of one-tailed and two-tailed tests for the same degrees of freedom. The t critical value for the two-tailed test is +/- 2.086 while for the one-sided test it is 1.725. It is true that probability associated with those critical values doubles for the one-tailed test (2.5% -> 5%), but the critical value itself is not half (2.086 -> 1.725). Study the first several graphs in this article to see why that is true.

For the p-value, you can take a two-tailed p-value and divide by 2 to determine the one-sided p-value. However, if you’re using statistical software, it does that for you.

Mo Has says

Hello Jim, if you have the time I’d be grateful if you could shed some clarity on this scenario:

“A researcher believes that aromatherapy can relieve stress but wants to determine whether it can also enhance focus. To test this, the researcher selected a random sample of students to take an exam in which the average score in the general population is 77. Prior to the exam, these students studied individually in a small library room where a lavender scent was present. If students in this group scored significantly above the average score in general population [is this one-tailed or two-tailed hypothesis?], then this was taken as evidence that the lavender scent enhanced focus.”

Thank you for your time if you do decide to respond.

Jim Frost says

Hi Mo,

It’s unclear from the information provided whether the researchers used a one-tailed or two-tailed test. It could be either. A two-tailed test can detect effects in both directions, so it could definitely detect an average group score above the population score. However, you could also detect that effect using a one-tailed test if it was set up correctly. So, there’s not enough information in what you provided to know for sure. It could be either.

However, that’s irrelevant to answering the question. The tricky part, as I see it, is that you’re not entirely sure about why the scores are higher. Are they higher because the lavender scent increased concentration or are they higher because the subjects have lower stress from the lavender? Or, maybe it’s not even related to the scent but some other characteristic of the room or testing conditions in which they took the test. You just know the scores are higher but not necessarily why they’re higher.

I’d say that, no, it’s not necessarily evidence that the lavender scent enhanced focus. There are competing explanations for why the scores are higher. Also, it would be best do this as an experiment with a control and treatment group where subjects are randomly assigned to either group. That process helps establish causality rather than just correlation and helps rules out competing explanations for why the scores are higher.

By the way, I spend a lot of time on these issues in my Introduction to Statistics ebook.

Toni says

If a left tail test has an alpha value of 0.05 how will you find the value in the table

Marvalisa Payne says

Hi Jim,

My question is in regards to the results in the table in your example of the one-sample T (Two-Tailed) test. above. What about the P-value? The P-value listed is .018. I assuming that is compared to and alpha of 0.025, correct?

In regression analysis, when I get a test statistic for the predictive variable of -2.099 and a p-value of 0.039. Am I comparing the p-value to an alpha of 0.025 or 0.05? Now if I run a Bootstrap for coefficients analysis, the results say the sig (2-tail) is 0.098. What are the critical values and alpha in this case? I’m trying to reconcile what I am seeing in both tables.

Thanks for your help.

Jim Frost says

Hi Marvalisa,

For one-tailed tests, you don’t need to divide alpha in half. If you can tell your software to perform a one-tailed test, it’ll do all the calculations necessary so you don’t need to adjust anything. So, if you’re using an alpha of 0.05 for a one-tailed test and your p-value is 0.04, it is significant. The procedures adjust the p-values automatically and it all works out. So, whether you’re using a one-tailed or two-tailed test, you always compare the p-value to the alpha with no need to adjust anything. The procedure does that for you!

The exception would be if for some reason your software doesn’t allow you to specify that you want to use a one-tailed test instead of a two-tailed test. Then, you divide the p-value from a two-tailed test in half to get the p-value for a one tailed test. You’d still compare it to your original alpha.

For regression, the same thing applies. If you want to use a one-tailed test for a cofficient, just divide the p-value in half if you can’t tell the software that you want a one-tailed test. The default is two-tailed. If your software has the option for one-tailed tests for any procedure, including regression, it’ll adjust the p-value for you. So, in the normal course of things, you won’t need to adjust anything.

Bill says

Hey Jim, for a one-tailed hypothesis test with a .05 confidence level, should I use a 95% confidence interval or a 90% confidence interval? Thanks

Jim Frost says

Hi Bill,

You should use a one-sided 95% confidence interval. One-sided CIs have either an upper OR lower bound but remains unbounded on the other side.

Stan Alekman says

This is not applicable to the subject but…

When performing tests of equivalence, we look at the confidence interval of the difference between two groups, and we perform two one-sided t-tests for equivalence..

Dominic Sando says

Hi Jim,

Thanks for this illustrative blogpost. I had a question on one of your points though.

By definition of H1 and H0, a two-sided alternate hypothesis is that there is a difference in means between the test and control. Not that anything is ‘better’ or ‘worse’.

Just because we observed a negative result in your example, does not mean we can conclude it’s necessarily worse, but instead just ‘different’.

Therefore while it enables us to spot the fact that there may be differences between test and control, we cannot make claims about directional effects. So I struggle to see why they actually need to be used instead of one-sided tests.

What’s your take on this?

Jim Frost says

Hi Dominic,

If you’ll notice, I carefully avoid stating better or worse because in a general sense you’re right. However, given the context of a

specificexperiment, you can conclude whether a negative value is better or worse. As always in statistics, you have to use your subject-area knowledge to help interpret the results. In some cases, a negative value is a bad result. In other cases, it’s not. Use your subject-area knowledge!I’m not sure why you think that you can’t make claims about directional effects? Of course you can!

As for why you shouldn’t use one-tailed tests for most cases, read my post When Can I Use One-Tailed Tests. That should answer your questions.

Jack says

Your website is absolutely amazing Jim, you seem like the nicest guy for doing this and I like how there’s no ulterior motive, (I wasn’t automatically signed up for emails or anything when leaving this comment). I study economics and found econometrics really difficult at first, but your website explains it so clearly its been a big asset to my studies, keep up the good work!

Jim Frost says

Thank you so much, Jack. Your kind words mean a lot!

Anthony says

Hy Jim I really need your help now pls

One-tailed and two- tailed hypothesis, is it the same or twice, half or unrelated pls

Jim Frost says

Hi Anthony,

I describe how the hypotheses are different in this post. You’ll find your answers.

AD says

Thank you for your blog Jim, I have a Statistics exam soon and your articles let me understand a lot!

Jim Frost says

You’re very welcome! I’m happy to hear that it’s been helpful. Best of luck on your exam!

Surya says

Hi Jim,

When you say target value is 5. Do you mean to say the population mean is 5 and we are trying to validate it with the help of sample mean 4.1 using Hypo tests ?.. If it is so.. How can we measure a population parameter as 5 when it is almost impossible o measure a population parameter. Please clarify

Jim Frost says

Hi Surya,

When you set a target for a one-sample test, it’s based on a value that is important to you. It’s not a population parameter or anything like that. The example in this post uses a case where we need parts that are stronger on average than a value of 5. We derive the value of 5 by using our subject area knowledge about what is required for a situation. Given our product knowledge for the hypothetical example, we know it should be 5 or higher. So, we use that in the hypothesis test and determine whether the population mean is greater than that target value.

When you perform a one-sample test, a target value is optional. If you don’t supply a target value, you simply obtain a confidence interval for the range of values that the parameter is likely to fall within. But, sometimes there is meaningful number that you want to test for specifically.

I hope that clarifies the rational behind the target value!

Henare says

I understand that in Psychology a one tailed hypothesis is preferred. Is that so

Jim Frost says

Hi Henare,

No, there’s no overall preference for one-tailed hypothesis tests in statistics. That would be a study-by-study decision based on the types of possible effects. For more information about this decision, read my post: When Can I Use One-Tailed Tests?

Nyor Teryima Jude says

I’m grateful to you for the explanations on One tail and Two tail hypothesis test. This opens my knowledge horizon beyond what an average statistics textbook can offer. Please include more examples in future posts. Thanks

Stanley Alekman says

Thank you. I will search it as well.

Stan Alekman

Stanley Alekman says

Jim, what is the difference between the central and non-central t-distributions w/respect to hypothesis testing?

Stan Alekman

Jim Frost says

Hi Stan, this is something I will need to look into. I know central t-distribution is the common Student t-distribution, but I don’t have experience using non-central t-distributions. There might well be a blog post in that–after I learn more!

MG says

this is awesome.

Jim Frost says

Thanks!