In statistics, the degrees of freedom (DF) indicate the number of independent values that can vary in an analysis without breaking any constraints. It is an important idea that appears in many contexts throughout statistics including hypothesis tests, probability distributions, and regression analysis. Learn how this fundamental concept affects the power and precision of your statistical analysis!

In this blog post, I bring this concept to life in an intuitive manner. I’ll start by defining degrees of freedom. However, I’ll quickly move on to practical examples in a variety of contexts because they make this concept easier to understand.

## Definition of Degrees of Freedom

Degrees of freedom are the number of independent values that a statistical analysis can estimate. You can also think of it as the number of values that are free to vary as you estimate parameters. I know, it’s starting to sound a bit murky!

Degrees of freedom encompasses the notion that the amount of independent information you have limits the number of parameters that you can estimate. Typically, the degrees of freedom equal your sample size minus the number of parameters you need to calculate during an analysis. It is usually a positive whole number.

Degrees of freedom is a combination of how much data you have and how many parameters you need to estimate. It indicates how much independent information goes into a parameter estimate. In this vein, it’s easy to see that you want a lot of information to go into parameter estimates to obtain more precise estimates and more powerful hypothesis tests. So, you want many degrees of freedom!

## Independent Information and Constraints on Values

The definitions talk about independent information. You might think this refers to the sample size, but it’s a little more complicated than that. To understand why, we need to talk about the freedom to vary. The best way to illustrate this concept is with an example.

Suppose we collect the random sample of observations shown below. Now, imagine that we know the mean, but we don’t know the value of an observation—the X in the table below.

The mean is 6.9, and it is based on 10 values. So, we know that the values must sum to 69 based on the equation for the mean.

Using simple algebra (64 + X = 69), we know that X must equal 5.
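Here is a minimal Python sketch of that algebra. It uses only the numbers quoted above (10 values, a mean of 6.9, and 64 as the sum of the nine known values). The function name `last_value` is made up for illustration.

```python
from fractions import Fraction


def last_value(mean, n, known_sum):
    """Return the one value that is forced once the mean and the other n - 1 values are fixed."""
    # The mean pins down the total of all n values: total = mean * n.
    total = mean * n
    # Whatever the known values do not account for must be the remaining value.
    return total - known_sum


# Fraction keeps 6.9 * 10 exact instead of relying on floating-point rounding.
x = last_value(Fraction("6.9"), 10, 64)
print(x)  # 5
```

Changing any of the nine known values changes their sum, and the last value has to move to compensate. That forced adjustment is the lost degree of freedom.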

## Estimating Parameters Imposes Constraints on the Data

As you can see, that last number has no freedom to vary. It is not an independent piece of information because it cannot be any other value. Estimating the parameter, the mean in this case, imposes a constraint on the freedom to vary. The last value and the mean are entirely dependent on each other. Consequently, after estimating the mean, we have only 9 independent pieces of information even though our sample size is 10.

That’s the basic idea for degrees of freedom in statistics. In a general sense, DF are the number of observations in a sample that are free to vary while estimating statistical parameters. You can also think of it as the amount of independent data that you can use to estimate a parameter.

## Degrees of Freedom and Probability Distributions

Degrees of freedom also define the probability distributions for the test statistics of various hypothesis tests. For example, hypothesis tests use the t-distribution, F-distribution, and the chi-square distribution to determine statistical significance. Each of these probability distributions is a family of distributions where the degrees of freedom define the shape. Hypothesis tests use these distributions to calculate p-values. So, the DF are directly linked to p-values through these distributions!

Next, let’s look at how these distributions work for several hypothesis tests.

**Related posts**: Understanding Probability Distributions and A Graphical Look at Significance Levels (Alpha) and P values

## Degrees of Freedom for t-Tests and the t-Distribution

T-tests are hypothesis tests for the mean and use the t-distribution to determine statistical significance.

A 1-sample t-test determines whether the difference between the sample mean and the null hypothesis value is statistically significant. Let’s go back to our example of the mean above. We know that when you have a sample and estimate the mean, you have n – 1 degrees of freedom, where n is the sample size. Consequently, for a 1-sample t-test, the degrees of freedom is n – 1.
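As a rough sketch of where n – 1 shows up, the snippet below computes a 1-sample t-value by hand using only Python's standard library. The sample values and the null hypothesis value of 5.0 are hypothetical, and `one_sample_t` is an illustrative name rather than a function from any statistics package.

```python
import math
import statistics


def one_sample_t(sample, null_mean):
    """Return (t_value, degrees_of_freedom) for a 1-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    # statistics.stdev divides by n - 1, the same DF that the test uses.
    s = statistics.stdev(sample)
    standard_error = s / math.sqrt(n)
    t_value = (mean - null_mean) / standard_error
    return t_value, n - 1


# Hypothetical measurements, not data from this post.
sample = [5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.0, 5.2]
t_value, df = one_sample_t(sample, null_mean=5.0)
print(f"t = {t_value:.3f}, df = {df}")
```

With eight observations the test has 7 degrees of freedom, because estimating the sample mean uses one.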

The DF define the shape of the t-distribution that your t-test uses to calculate the p-value. The graph below shows the t-distribution for several different degrees of freedom. Because the degrees of freedom are so closely related to sample size, you can see the effect of sample size. As the degrees of freedom decrease, the t-distribution has thicker tails. This property allows for the greater uncertainty associated with small sample sizes.
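To put numbers on the thicker tails, here is a small sketch that evaluates the t-distribution's density formula directly with the standard library. The DF values (2, 5, 30) and the evaluation point of 3 are arbitrary choices for illustration.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def normal_pdf(z):
    """Density of the standard normal distribution, for comparison."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# Far from the center, fewer DF means more probability density in the tail.
for df in (2, 5, 30):
    print(f"df = {df:>2}: density at t = 3 is {t_pdf(3.0, df):.4f}")
print(f"normal: density at z = 3 is {normal_pdf(3.0):.4f}")
```

The density at 3 shrinks as the DF grow and approaches the normal distribution's value, which is why small samples need larger t-values to reach significance.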

To dig into t-tests, read my post about How t-Tests Work. I show how the different t-tests calculate t-values and use t-distributions to calculate p-values.

The F-test in ANOVA also tests group means. It uses the F-distribution, which is defined by the degrees of freedom. However, you calculate the DF for an F-distribution differently. For more information, read my post about How F-tests Work in ANOVA.

**Related post**: How to Interpret P-values Correctly

## Degrees of Freedom for the Chi-Square Test of Independence

The chi-square test of independence determines whether there is a statistically significant relationship between categorical variables. Just like other hypothesis tests, this test incorporates degrees of freedom. For a table with r rows and c columns, the general rule for calculating degrees of freedom for a chi-square test is (r-1)(c-1).
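The rule is simple enough to sketch as a one-line function. The name `chi_square_independence_df` is an illustrative choice, not part of any library.

```python
def chi_square_independence_df(n_rows, n_cols):
    """Degrees of freedom for a chi-square test of independence: (r - 1)(c - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("the table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


for shape in [(2, 2), (3, 2), (3, 3)]:
    print(shape, chi_square_independence_df(*shape))
# (2, 2) 1
# (3, 2) 2
# (3, 3) 4
```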

However, we can create tables to understand it more intuitively. The degrees of freedom for a chi-square test of independence is the number of cells in the table that can vary before you can calculate all the other cells. In a chi-square table, the cells represent the observed frequency for each combination of categorical variables. The constraints are the totals in the margins.

### Chi-Square 2 X 2 Table

For example, in a 2 X 2 table, after you enter one value in the table, you can calculate the remaining cells.

In the table above, I entered the bold 15, and then I can calculate the remaining three values in parentheses. Therefore, this table has 1 DF.

### Chi-Square 3 X 2 Table

Now, let’s try a 3 X 2 table. The table below illustrates the example that I use in my post about the chi-square test of independence. In that post, I determine whether there is a statistically significant relationship between uniform color and deaths on the original *Star Trek* TV series.

In the table, one categorical variable is shirt color, which can be blue, gold, or red. The other categorical variable is status, which can be dead or alive. After I entered the two bolded values, I can calculate all the remaining cells. Consequently, this table has 2 DF.
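The fill-in process can be sketched in code. The function below takes the margin totals plus the (r-1)(c-1) cells you enter yourself and computes every other cell. All totals and entered values in the examples are made up for illustration and are not the counts from the tables in this post. `fill_table` is a hypothetical helper name.

```python
def fill_table(row_totals, col_totals, free_cells):
    """Complete an r x c table from its margins and the (r-1) x (c-1) free cells."""
    r, c = len(row_totals), len(col_totals)
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row totals and column totals must have the same grand total")
    if len(free_cells) != r - 1 or any(len(row) != c - 1 for row in free_cells):
        raise ValueError("free_cells must have (r - 1) rows and (c - 1) columns")

    table = []
    for i in range(r - 1):
        row = list(free_cells[i])
        # The last cell in each row is forced by that row's total.
        row.append(row_totals[i] - sum(row))
        table.append(row)
    # Every cell in the last row is forced by its column's total.
    table.append([col_totals[j] - sum(row[j] for row in table) for j in range(c)])
    return table


# 2 x 2 table: one entered cell determines the other three.
print(fill_table([20, 30], [25, 25], [[15]]))
# [[15, 5], [10, 20]]

# 3 x 2 table: two entered cells determine the other four.
print(fill_table([10, 12, 8], [9, 21], [[2], [3]]))
# [[2, 8], [3, 9], [4, 4]]
```

The number of values you have to supply in `free_cells` is exactly the degrees of freedom for the table.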

Read my post, Chi-Square Test of Independence and an Example, to see how this test works and how to interpret the results using the *Star Trek* example.

Like the t-distribution, the chi-square distribution is a family of distributions where the degrees of freedom define the shape. Chi-square tests use this distribution to calculate p-values. The graph below displays several chi-square distributions.
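To see how the DF feed into the p-value, the sketch below computes chi-square upper-tail probabilities with the standard library. It relies on closed-form expressions that exist only for 1 DF and for even DF, so it is a limited illustration rather than a general-purpose routine. The test statistic of 6.0 is a hypothetical value.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square distribution.

    Handles only df = 1 and even df, where simple closed forms exist.
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    half = statistic / 2
    if df == 1:
        return math.erfc(math.sqrt(half))
    if df >= 2 and df % 2 == 0:
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("this sketch only handles df = 1 or even df")


statistic = 6.0  # hypothetical chi-square value
for df in (1, 2, 4):
    print(f"df = {df}: p-value = {chi_square_p_value(statistic, df):.4f}")
```

The same chi-square value produces a larger p-value as the DF increase, so you cannot interpret the statistic without knowing the table's degrees of freedom.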

## Degrees of Freedom in Regression Analysis

Degrees of freedom in regression is a bit more complicated, and I’ll keep it on the simple side. In a regression model, each term is an estimated parameter that uses one degree of freedom. In the regression output below, you can see how each term requires a DF. There are 28 observations and the two independent variables use a total of two degrees of freedom. The remaining 26 degrees of freedom are displayed in Error.

The error degrees of freedom are the independent pieces of information that are available for estimating your coefficients. For precise coefficient estimates and powerful hypothesis tests in regression, you must have many error degrees of freedom. This equates to having many observations for each model term.

As you add terms to the model, the error degrees of freedom decrease. You have fewer pieces of information available to estimate the coefficients. This situation reduces the precision of the estimates and the power of the tests. When you have too few remaining degrees of freedom, you can’t trust the regression results. If you use all your degrees of freedom, the p-values can’t be calculated.
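A rough sketch of that bookkeeping follows. It assumes every estimated coefficient costs one degree of freedom, and counting the constant as one of those coefficients is an assumption of this sketch. Software output often reports the total DF as n – 1 because the constant has already been accounted for. The sample size of 30 and the function names are hypothetical.

```python
def error_df(n_obs, n_coefficients):
    """Error degrees of freedom: observations minus estimated coefficients."""
    return n_obs - n_coefficients


def mean_squared_error(sse, n_obs, n_coefficients):
    """Residual variance estimate, or None when no error DF remain."""
    df = error_df(n_obs, n_coefficients)
    if df <= 0:
        # With no error DF left there is nothing to estimate the variance from,
        # so standard errors and p-values cannot be calculated.
        return None
    return sse / df


n_obs = 30  # hypothetical sample size
for n_coefficients in (3, 10, 25, 30):
    print(f"{n_coefficients:>2} coefficients -> {error_df(n_obs, n_coefficients):>2} error DF")
```

Each added term removes one error DF. Once the count reaches zero, the residual variance has no divisor, which is why the p-values disappear.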

For more information about the problems that occur when you use too many degrees of freedom and how many observations you need, read my blog post about overfitting your model.

Even though they might seem murky, degrees of freedom are essential to any statistical analysis! In a nutshell, DF define the amount of information you have relative to the number of properties that you want to estimate. If you don’t have enough information for what you want to do, you’ll have imprecise estimates and low statistical power.

Tarun Sachdeva says

Dear i am confused that degree pf freedom tells us the no of observation or on some sites they said no of independent variables required to estimate the relationship.

in terms of P value as well when p-value is less than 0.05 than it means null hypothesis is false or true and we reject the null hypothesis . if its zero that means there is no relationship between variables .

p value tells the relationship between x and y or between independent variables (y).

GLAD IF YOU HELP ME

Jim Frost says

Hi Tarun,

As I write in this post, degrees of freedom represent the number of independent pieces of data. In regression modeling, you do incorporate the number of IVs into calculating the error degrees of freedom. As I show in this post, the method for calculating degrees of freedom changes based on the context. However, the underlying principle is that it represents the number of independent data values that you are using to estimate the value of a population parameter.

I think you might have a fundamental misunderstanding about p-values. A p-value of zero (which technically is not possible but it can appear that way thanks to rounding) does not indicate there is no relationship. P-values represent the strength of the evidence that your sample provides against the null hypothesis. Please read this post of mine about how to interpret p-values. If after reading that post you have more questions, please post them in the comments section of that post.

I hope that helps!

Moritz says

Dear Jim, thank you very much for the elaboration, it helps a lot! However i do have difficulties making the connection between the degrees of freedom and the actual computing of the parameters. For example for calculating the variance i do understand that the quadratic difference of the last datapoint is determined beforehand by the other data. But we still add it up with all the other quadratic differences.. if we then want to take the mean of those quadratic differences it would still make sense to me to divide by n since we added up n different quadratic differences even though the last one was predetermined by the others… could you help me find out where i make a mistake in my reasoning?

Thank you again for this amazing blog post!

Best regards

Moritz

Carrie P says

Hi Jim,

Thank you for your helpful explanation of degrees of freedom!

I am planning to run a linear mixed effects model containing the following explanatory variables: Treatment (n=4) * Genotype (n=48) + Source Location (4) + (1 | Unit). Unit is a random effect and n= 24 Units.

I assume that I would calculate the number of degrees of freedom in these variables as (3 X 47) + 3 + 23 = 167. I have 576 samples, so the total error degrees of freedom would be 576 – 167 = 409. Please let me know if this is correct, or if I am thinking about it incorrectly.

If my calculation is correct, is 409 a sufficient number of error degrees of freedom? What would be the minimum that would give me enough power (in case I wish to add more explanatory variables)?

Thank you.

Sincerely,

Carrie

Jim Frost says

Hi Carrie,

Calculating the DF for a linear mixed effects model is fairly complicated and even difficult for statistical software to calculate. That’s why you might not obtain p-values for mixed effects models. Some software calculates approximate DF and provides estimated p-values. Consequently, it’s tricky for me to say whether you have a sufficient sample size.

On the one hand, you have a good number of observations. However, on the other hand, you also have categorical (nominal) variables that have many levels, which eat up a ton of DF! My inclination is that, yes, you do have a sufficient number of observations. It being a mixed effects model does complicate it a bit, but you probably have a somewhat small sample size. It might be a bit on the low side and you might have somewhat low power. Of course, the size of the effect is also an issue! I wouldn’t add too many additional variables, particularly if they are categorical variables with a large number of levels.

In short, I think it’s a little low on the sample size but probably not unusable. Usually you’d want ~10 subjects per DF used. Again, DF are a bit difficult to calculate for mixed effects models. However, as the DF increases, you don’t always need to add 10 subjects for each one, it diminishes. If we use the linear model DF, you’ve got ~3.5 subjects per DF. Probably lower than ideal in terms of power but the estimates should largely be unbiased.

Rukmankesh Mehra says

Hi Jim,

I find the discussion about df very informative. In my study, I found different df values in t-tests of the same sample size (with different t-tests vary in values only). Kindly, please help in understanding this.

YIHENEW says

I found this discussions are more use full to me because I am Naive to statistics. But try some.

Dear Dr. Jim do we consider DF both in parametric and non parametric statistic.

Thanks

Jim Frost says

Hi, thanks for the great question. The answer is both yes and no! Hey, that’s statistics for you, right?

It depends on the nonparametric test. For example, the Mann Whitney test doesn’t use degrees of freedom.

However, the Kruskal Wallis uses DF because its test statistic approximates the chi-square distribution.

Kevin says

Hi Jim, thank you for further explaining. I get the idea that unknowns are free to vary until the very last one. The part I’m having trouble with is “imagine you know the mean”.

If we know the mean, then by definition we must know its components — how else could we have arrived at the mean? If I have 2 chimps in my zoo, and chimp A weighs 40 kg while chimp B weighs 60 kg, then on average they must weigh 50 kg. All values are fixed by reality.

But, of course if we’re saying that all chimps in the world weigh a theoretical 50 kg, then if it turns out that chimp A actually weighs 50 kg, then chimp B must theoretically weigh 50 kg (even though we know it really weighs 60 kg). Is this the idea? That the mean is a presupposed universal value based on prior knowledge? In other words, are we taking the global mean based on studies 1, 2, 3, …., 99 and applying it to study 100?

Thought I’d have one more go, but accepting and moving on may be the right move at this juncture!

Kevin

Jim Frost says

Hi Kevin,

The key concept is independent pieces of information. While the explanation might not totally be intuitive, it illustrates the general idea. The approach I take is a fairly standard one, but I’m trying to think of a better way to explain it.

It’s not based on presupposing prior knowledge, although I see how the explanation confuses things in that regard. But, when you’re estimating something, such as a mean, with the first values you uncover, the mean can still be anything. But, once you get to the final value, you can use algebra to calculate it using the mean–hence it is not an independent value. Of course, in the real world, you’re not going to know the mean before you calculate it using all the values. But, the 100% dependency still exists even though you can’t really use this method without prior knowledge. The example tries to show this underlying dependency.

I’m not sure if that helps any. I’ll think about it some more!

Jim

Kevin says

Hi Jim,

I also share Surya’s confusion. If, as you say, sample values are revealed, then doesn’t this imply that the mean must also be revealed? But here we talk about mean as if we have foreknowledge of its value. Is there a different explanation you can provide? Or is this something I should just accept, and move on?

Many thanks,

Kevin

Jim Frost says

Hi Kevin,

I’d never want to say that you should just accept something and move on! But, DF is a particularly slippery concept.

Focus on the key concept behind degrees of freedom. It is a count of the number of independent pieces of information. Focus particularly on the notion of “independent.”

Imagine you know the mean, and you keep revealing the actual observations one by one. At first, you can reveal an observation, and it doesn’t limit the potential values of the remaining observations. You can’t predict them using the available information–they’re truly independent. You get down to the last two unknown observed values. Despite there just being two unknowns, you still have absolutely zero ability to predict their values. However, once you reveal one more, you have a 100% ability to predict the final value. At that point, you are revealing one value yet you are learning two values. Hence, those two observations are not independent. Consequently, in the count of independent observations (aka DF), you need to subtract one (n – 1).

I hope that helps a bit at least!

Hamza shabbir says

what role of degree of freedom statistics in psychological research?

Nathaniel says

Hi Jim,

I’ve been going crazy trying to find the answer to this question – maybe you can help. Why is that the population variance doesn’t also get divided by N-1? It too depends on the population mean in order to be calculated correctly, so therefore the nth member of the population should not have freedom to vary, yet population variance is calculated by dividing by N. Please help, thanks!

Jim Frost says

Hi Nathaniel,

If you have the measurements for an entire population, or you just want the variance for a sample, then you don’t need to divide by N-1 because in those cases you are not using it as an estimate. However, if you are working with a sample and want to use it to estimate the population variance, you do need to divide by N-1.

I hope this helps!

Surya says

Hi Jim,

In the above post, the mean and 9 numbers are known ..so the DF is 9.I understood that.

My question is… How can we know the mean of 10 numbers in the first place if we do not know what is the 10th value.. Please clarify

Jim Frost says

Hi again Surya!

DF is a tricky concept. It is a bit weird, but the idea is that the mean exists and the sample exists. You don’t know the values if you haven’t calculated them, but as the sample values are revealed, the constraints on the remaining values increase. The final observation must be one particular value and is no longer free to vary. There’s a 100% dependence of that last value on the value of the mean. In practice, you won’t go through that process, but it is important to build it into statistical tests because it reflects those underlying constraints.

Wisdom Akurugu says

Most grateful Sir.

Wisdom Akurugu says

Dear Dr Frost,

Thanks so much for this surgical explanation of the degrees of freedom. My understanding from your presentation is that the general rule for determining DF is the (R-1)(C-1) formula.

I have a 3×3 table as below:

AA AB BB Totals

Low

Mid

High

Totals

Here the number of rows=3 and columns=3

Thus, (3-1)(3-1)=(2)(2) = 4 DF.

Sir, am I right?

Kind regards

Wisdom

Jim Frost says

Hi Wisdom,

Yes, you are absolutely correct! Your 3X3 table has 4 degrees of freedom.

Arash says

Dear Jim

Hi

I am sorry to bother you. I have a problem to describe the results below by degrees of freedom f.

Is it possible for helping me.

thanks.

    x <- 1:20
    true.y <- 2*x + 5
    amt.noise <- 30
    y <- true.y + amt.noise*rnorm(length(x))
    cor.test(x,y)
    # Pearson's product-moment correlation
    #
    # data: x and y
    # t = -0.455, df = 18, p-value = 0.65
    # alternative hypothesis: true correlation is not equal to 0
    # 95 percent confidence interval:
    # -0.52434 0.35260
    # sample estimates:
    # cor
    # -0.10655

Rob says

Hi Jim, just want to say thanks for providing a great and easy to understand resource for all of us who struggle to understand statistics! Much appreciated

Jim Frost says

You’re very welcome, Rob. I appreciate the kind words too!

payeng says

Hi Dr Jim,

I really appreciate your reply. It is really a great one.

Some articles do mentioned that we shall use interpolation method to find the t-value if it is not given in the table. But none discuss like what you have explained which can convince us to use the lower DF instead of using the standard rounding rules.

Thanks a lot Dr JIm.

payeng says

Thanks Dr Jim for the reply.

But I have another question. Is the standard rounding rules can be used for F-table as well?

I have read a statistics textbook about finding the F-value which is not given in the table.

The author wrote: “When the degree of freedom values cannot be found in the table, the closed value of the smaller side should be used. For example, if d.f.N =14, this value is between the given table values of 12 and 15, therefore 12 should be used, to be on the safe side.”

May I have your opinion and what does it means safe side?

Thanks again.

Jim Frost says

Hi, so that’s a good point to consider, although it’s not always crucial, but one that I ultimately agree with.

What the author means by “safe side” is to pick the DF that requires stronger evidence to be statistically significant. For any given test statistic distribution (t-values, F-values, etc), if you can’t pick the exact DF from a table that you require, you should pick the DF that requires stronger evidence. For a test statistic, this is equivalent to picking the DF that is associated with a larger absolute value of the statistic–and that means choosing a lower DF.

In other words, you are in a situation where you need to make a choice because you can’t use the exactly correct value. The choice you make should require stronger evidence rather than weaker evidence to be statistically significant. That is “being safe.” It’s equivalent to lowering alpha from 5% to, say, 4.95%. You wouldn’t want to go the other direction and raise it to 5.05%

To be honest, it has been decades since I’ve thought about the practical realities of using tables given the use of statistical software. But, you raise an excellent point. In some cases, such as how I described 39 DF for the t-distribution, the difference is minute. You have to go out three decimal places to see a difference.

Additionally, hypothesis test results that are borderline significant (right around p = 0.05 when alpha = 0.05) are not particularly strong results. To see why, read my post about correctly interpreting p-values. Near the end of that post, I discuss strength of evidence. In a nutshell, I would not consider results with a p-value of 0.049 to be any stronger than 0.051. In either case, both results are fairly weak evidence to build a case on. Changing the DF affects these borderline cases. So, this approach of choosing lower DF requires “stronger” evidence to be significant–but borderline cases still don’t constitute strong evidence when you use the typical significance level of 0.05.

However, I do agree with the approach of choosing the DF that requires stronger evidence to produce statistically significant results. If you have to make a choice, make a choice in the direction of requiring stronger evidence. That approach indicates choosing the lower DF. Thanks for raising this issue! It was good to think through this!

payeng says

Hi Dr Jim,

How do I find values not given in t-distribution?

I am using Statistical Table “Statistical Tables – J.A. Barnes|J. Murdoch – Macmillan International”, let say i wanted to find the t-value for alpha=0.05 with the degree of freedom 39. The table just provided the degrees of freedom for 30 and 40. Which one shall I choose?

What is the general rule for this problem? Either round up the degree of freedom or round down?

Thanks

Jim Frost says

Hi,

In the t-distribution, after you get past about 30 df, the differences between the t-values for different probabilities become minuscule. You often have to go out to three decimal places before you’ll find a difference in the t-values.

Consequently, you won’t be too far off using the standard rounding rules: rounding up for >= 5 and rounding down for < 5. In your case, I'd use 40. You can also use a more precise table of t-values, such as this one that lists 39 df specifically.

I hope this helps!

Edson Chiwenga says

Thanks a lot . your explanation makes the job easier even for us who are not good in math and stats

Jim Frost says

Hi Edson, you’re very welcome. I’m glad it has helped!

Eajaz Ahmad Dar says

Thanks Jim, I have probably found the first person with such clear basics. Hope to learn much more with you.

Jim Frost says

Hi Eajaz, thanks so much for the kind words! You made my day because I strive to find ways to teach statistics using easy to understand language!

Indranil says

Jim thanks for the core area in stat that you always state. I dwnld the hurd-0.9.tar.gz ..is it the right file? if not, could you please suggest which one is right and which app file has to run? thanks.

Jim Frost says

Hi Indranil, I’m not sure which file you’re referring to. Are you referring to the PSPP software? If so, I believe the correct file for Windows is pspp-20170909-daily-64bits-setup.exe. That is a file you can run to install the program.

Ali Munsir says

Many thanks for such valuable knowledge sharing

Jim Frost says

My pleasure, Ali!

Arindam says

Very Very informative. Thank you very much.

Jim Frost says

Hi Arindam, you’re very welcome! I’m glad it was helpful!

salman shah says

And also we are confused in the diference between sample size and degree of freedom……

Jim Frost says

Sample size is the number of data points in your study. Degrees of freedom are often closely related to sample size yet are never quite the same. The relationship between sample size and degrees of freedom depends on the specific test. Hypothesis tests actually use the degrees of freedom in the calculations for statistical significance. Typically, DF define the probability distribution of the test statistic.

salman shah says

Dear sir,plz tell me that what is the diference between statistic and test statistic ?

Jim Frost says

Hi Salman,

A statistic is a piece of information based on data. For example, the crime rate, median income, mean height, etc.

A test statistic is a statistic that summarizes the sample data and is used in hypothesis testing to determine whether the results are statistically significant. The hypothesis test takes all of the sample data, reduces it to a single value, and then calculates probabilities based on that value to determine significance. For more information about how test statistics work, read my posts about t-values and F-values. Both of those are test statistics.

I hope this helps!

Akhilesh kumar Gupta says

The minitab software you are using is free or paid…if it is free please provide me its link… thank you

Jim Frost says

Hi Akhilesh, Minitab is not free. However, if you’re looking for free statistical software, I recommend PSPP, which is freeware (fully functional, no time limits) that is very similar to SPSS. Download PSPP here.

Dhruv govil says

The topic clearity is in very good format. But please explain this through R programming . Do that we can feel confidence while prediction.

Jim Frost says

Hi Dhruv, I’m glad you found the topic helpful! My blog is designed to teach statistical concepts, analyses, interpretation, etc rather than teaching a specific software package. You’ll find that degrees of freedom is inherent to statistics regardless of the software you use. The software package can supply the documentation that describes how to obtain the specific results that you need.

Tavsief says

Wow. Superb. Thank you so much. All I can do.

nadeem malik says

thanks Dr jim so nice concept

Jim Frost says

Thank you!

Muhammad Arif says

in simple words we can say that, the total sample size minus the number of parameters to be estimated in a series is called D.F, am i right dear Jim? which software you have used for graphs?

Jim Frost says

Hi Muhammad! Yes, that’s a good general sense of the term. However, it’s not always exactly correct. For instance, take a look at the chi-square examples. I used Minitab software for the graphs.

Best wishes to you!

Jim

Teng Li says

thank you Jim, I always find your article is of great value to me.

Jim Frost says

You’re very welcome, Teng. I’m very happy to hear that you find them to be helpful!