Use a paired t-test when each subject has a pair of measurements, such as a before and after score. A paired t-test determines whether the mean change for these pairs is significantly different from zero. This test is an inferential statistics procedure because it uses samples to draw conclusions about populations.
Paired t tests are also known as dependent samples t tests. The two samples are dependent because they contain the same subjects. Conversely, an independent samples t test contains different subjects in the two samples.
For example, you gather a random sample of people, give them a pretest, administer a training program, and then perform a posttest. Each subject has a pretest and posttest score, and you want to determine whether there is significant improvement between the tests.
Or, perhaps you have a sample of wood boards, and you paint half of each board with one paint and the other half with the other paint. Then, you measure the paint durability for both types of paint on all the boards. Each board has two paint durability scores, and you want to determine whether the two paints have different durability.
In both cases, you have the same subjects/items in both groups. Each subject has a pair of measurements. A paired t-test determines whether the mean difference of these pairs equals zero (no effect).
In this article, you’ll learn about the hypotheses, assumptions, and how to interpret the results for paired t tests.
Paired T Test Hypotheses
Paired t tests have the following hypotheses:
- Null hypothesis: The mean of the paired differences equals zero in the population.
- Alternative hypothesis: The mean of the paired differences does not equal zero in the population.
If the p-value is less than your significance level (e.g., 0.05), you can reject the null hypothesis. Your sample provides strong enough evidence to conclude that the mean paired difference does not equal zero in the population.
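To make the hypotheses concrete, here is a minimal sketch of the calculation behind the test, using only Python's standard library. The scores are hypothetical, invented purely for illustration. The critical value (2.776) is the two-sided value for a 0.05 significance level and 4 degrees of freedom from a standard t table. Comparing the absolute t-value to that critical value is equivalent to comparing the p-value to 0.05.

```python
import math
import statistics

# Hypothetical pretest/posttest scores for five subjects (illustrative only).
pretest = [90, 95, 100, 105, 110]
posttest = [98, 99, 110, 108, 121]

# One paired difference per subject (Pretest - Posttest).
differences = [pre - post for pre, post in zip(pretest, posttest)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample SD (n - 1 denominator)
std_error = sd_diff / math.sqrt(n)

# t-value for the null hypothesis that the mean paired difference is zero.
t_stat = mean_diff / std_error
df = n - 1

# Two-sided critical value for alpha = 0.05 and df = 4, taken from a t table.
T_CRITICAL = 2.776
reject_null = abs(t_stat) > T_CRITICAL

print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f}, df = {df}")
print("Reject the null" if reject_null else "Fail to reject the null")
```

In practice, statistical software reports the exact p-value for you, so you rarely need to look up a critical value by hand.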
Related post: How to Interpret P Values
Paired T Test Assumptions
For reliable paired t-test results, your data should satisfy the following assumptions.
You have a random sample with independent subjects
Drawing a random sample from the population you are studying helps ensure that your data represent the population. Representative samples are vital when drawing inferences about the population. If your data do not represent the population, your analysis results will not be valid for that population.
When drawing a random sample, each item or person must have the same probability of being selected.
Paired t-tests use the same people or items in both groups. These are dependent samples.
It’s important to distinguish between independent subjects when drawing a random sample and dependent samples when measuring. When choosing the subjects, selecting one must not affect the probability of choosing the others. However, after selecting your subjects, they will all be in both groups. In this manner, you have independent subjects but dependent samples.
If the two groups contain different subjects, use an independent samples t test instead.
Your data must be continuous
T tests require continuous data. Continuous variables can take on any numeric value within a range, and you can meaningfully divide those values into smaller increments, including fractional and decimal values. Typically, you measure continuous variables on a scale. For example, weight, temperature, and height are continuous data.
If you don’t have continuous data, you’ll need to use a different type of hypothesis test. To learn more, read my post, Comparing Hypothesis Tests for Continuous, Binary, and Count Data.
Data should follow a normal distribution or have a sample size of at least 20
All t-tests assume that your data follow the normal distribution. For a paired t test, the normality assumption applies to the distribution of paired differences rather than raw test scores. However, you can waive this assumption when your sample size is large enough thanks to the central limit theorem.
For a paired t test, if you have at least 20 subjects, your test results will be reliable even when your data are skewed. However, when you have a smaller sample size, nonnormal data can cause the test results to be unreliable.
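One way to check this assumption for a small sample is to run a normality test on the paired differences, not on the raw scores. The sketch below assumes SciPy is installed and uses its Shapiro-Wilk test (`scipy.stats.shapiro`). The data are hypothetical, and the 0.05 cutoff for the normality test is a common convention rather than a rule from this article.

```python
from scipy import stats

# Hypothetical paired scores for eight subjects (illustrative only).
pretest = [88, 92, 95, 101, 104, 97, 110, 99]
posttest = [95, 97, 99, 112, 109, 98, 121, 108]

# The normality assumption applies to the paired differences.
differences = [pre - post for pre, post in zip(pretest, posttest)]

MIN_N_FOR_CLT = 20  # sample-size threshold quoted in this article

if len(differences) >= MIN_N_FOR_CLT:
    print("Sample is large enough; skewed differences are less of a concern.")
else:
    # Shapiro-Wilk tests the null hypothesis that the data are normal.
    statistic, p_value = stats.shapiro(differences)
    print(f"Shapiro-Wilk W = {statistic:.3f}, p = {p_value:.3f}")
    if p_value < 0.05:
        print("Differences look nonnormal; treat the t-test with caution.")
    else:
        print("No strong evidence against normality of the differences.")
```

Keep in mind that normality tests have little power with very small samples, so a plot of the differences is a useful companion check.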
Paired T Test Example
For example, imagine we have a training program and administer a pretest and posttest to the same sample of students. Consequently, each student has a pair of test scores. We need to determine whether the average change for the pairs of scores is different from zero.
Here is what the data look like in the datasheet. Note that the analysis does not use the subject’s ID number.
Here’s the deciding characteristic for choosing between an independent samples t test and a paired t test. Does it make sense to assess the difference within a row? In other words, does each row correspond to one person or item?
For our dataset, each row in the dataset contains the same subject in the two measurement columns. Consequently, it makes sense to find the difference between the pairs of values. Each paired difference represents how much a subject’s score changed after the training program. The paired t-test is the correct choice.
Conversely, if each row had contained different subjects, it would not make sense to subtract them. The change between the pretest for one subject and the posttest for another does not provide meaningful information. In that case, we’d need to perform an independent samples t test.
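The sketch below illustrates why the choice matters, using hypothetical scores and only Python's standard library. It computes the t-value twice for the same numbers: once from the within-row differences (paired), and once while ignoring the pairing (an independent samples calculation in its unequal-variances, or Welch, form). The data and variable names are assumptions made for illustration.

```python
import math
import statistics

# Hypothetical scores; each position is the same student in both lists.
pretest = [90, 95, 100, 105, 110]
posttest = [98, 99, 110, 108, 121]
n = len(pretest)

# Paired analysis: work with the within-row differences.
differences = [pre - post for pre, post in zip(pretest, posttest)]
se_paired = statistics.stdev(differences) / math.sqrt(n)
t_paired = statistics.mean(differences) / se_paired

# Independent samples (Welch) analysis: ignores the pairing entirely.
se_independent = math.sqrt(
    statistics.variance(pretest) / n + statistics.variance(posttest) / n
)
t_independent = (
    statistics.mean(pretest) - statistics.mean(posttest)
) / se_independent

print(f"paired t = {t_paired:.2f}")
print(f"independent t = {t_independent:.2f}")
```

With these made-up numbers, the paired t-value is much larger in magnitude because differencing removes the subject-to-subject variation that the independent calculation leaves in the standard error. That advantage is only legitimate when each row truly is one subject.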
Interpreting the Results
Here’s how to read and report the results for a paired t test.
The output indicates that the mean for the Pretest is 97.06, and for the Posttest it is 107.83. The average difference between the pretest and posttest (Pretest - Posttest) is -10.77. If the p-value is less than your significance level, the difference is statistically significant, and you can conclude that the mean paired difference in the population does not equal zero.
Because our p-value (0.002) is less than the standard significance level of 0.05, we can reject the null hypothesis. Our sample data support the notion that the average paired difference does not equal zero. Specifically, the Posttest mean is greater than the Pretest mean.
The sample estimate of the difference (-10.77) is unlikely to equal the population difference. The confidence interval estimates that the actual population difference between the Pretest and Posttest is likely between -16.96 and -4.59.
The negative values reflect the fact that the Pretest has a lower mean than the Posttest (i.e., Pretest - Posttest < 0). The confidence interval excludes the value of zero (no difference between groups), so we can conclude that the population means are different.
If high scores are better, then the Posttest scores are significantly better than the pretest scores.
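As a quick sanity check, you can confirm that the reported numbers hang together. The sketch below uses only the values quoted above. Because a t-based confidence interval is symmetric around the sample mean difference, its midpoint should match the mean difference up to rounding. The variable names are my own.

```python
# Values reported in the output discussed above.
pretest_mean = 97.06
posttest_mean = 107.83
ci_lower, ci_upper = -16.96, -4.59

mean_diff = pretest_mean - posttest_mean     # Pretest - Posttest
ci_midpoint = (ci_lower + ci_upper) / 2      # should sit near mean_diff
margin_of_error = (ci_upper - ci_lower) / 2  # half-width of the interval
excludes_zero = not (ci_lower <= 0 <= ci_upper)

print(f"mean difference = {mean_diff:.2f}")
print(f"CI midpoint = {ci_midpoint:.3f}")
print(f"margin of error = {margin_of_error:.3f}")
print(f"CI excludes zero: {excludes_zero}")
```

The midpoint (about -10.775) agrees with the reported mean difference of -10.77 after rounding, and the interval excludes zero, which is consistent with the p-value falling below 0.05.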
To learn more about performing t-tests and how they work, read the following posts: