P-values and coefficients in regression analysis work together to tell you which relationships in your model are statistically significant and the nature of those relationships. The coefficients describe the mathematical relationship between each independent variable and the dependent variable. The p-values for the coefficients indicate whether these relationships are statistically significant.

After fitting a regression model, check the residual plots first to be sure that you have unbiased estimates. After that, it’s time to interpret the statistical output. Linear regression analysis can produce a lot of results, which I’ll help you navigate. In this post, I cover interpreting the p-values and coefficients for the independent variables.

**Related post**: When Should I Use Regression Analysis?

## Interpreting P-Values for Variables in a Regression Model

Regression analysis is a form of inferential statistics. The p-values help determine whether the relationships that you observe in your sample also exist in the larger population. The p-value for each independent variable tests the null hypothesis that the variable has no correlation with the dependent variable. If there is no correlation, there is no association between the changes in the independent variable and the shifts in the dependent variable. In other words, there is insufficient evidence to conclude that there is an effect at the population level.

If the p-value for a variable is less than your significance level, your sample data provide enough evidence to reject the null hypothesis for the entire population. Your data favor the hypothesis that there *is* a non-zero correlation. Changes in the independent variable *are* associated with changes in the response at the population level. This variable is statistically significant and probably a worthwhile addition to your regression model.

On the other hand, a p-value that is greater than the significance level indicates that there is insufficient evidence in your sample to conclude that a non-zero correlation exists.

The regression output example below shows that the South and North predictor variables are statistically significant because their p-values equal 0.000. On the other hand, East is not statistically significant because its p-value (0.092) is greater than the usual significance level of 0.05.

It is standard practice to use the coefficient p-values to decide whether to include variables in the final model. For the results above, we would consider removing East. Keeping variables that are not statistically significant can reduce the model’s precision.
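To make this concrete, here is a minimal sketch in Python (SciPy) using simulated data rather than the South/North/East dataset described above. One predictor genuinely drives the response and one does not, and the coefficient p-values reflect that.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
driver = rng.normal(size=100)
y = 3.0 * driver + rng.normal(size=100)  # predictor with a real relationship
unrelated = rng.normal(size=100)         # predictor with no relationship to y

related_fit = stats.linregress(driver, y)
unrelated_fit = stats.linregress(unrelated, y)

print(f"related predictor:   p = {related_fit.pvalue:.4f}")
print(f"unrelated predictor: p = {unrelated_fit.pvalue:.4f}")
```

The related predictor's p-value is effectively zero, while the unrelated predictor's p-value typically lands well above any common significance level.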

**Related post**: F-test of overall significance in regression

## Interpreting Regression Coefficients for Linear Relationships

The sign of a regression coefficient tells you whether there is a positive or negative correlation between each independent variable and the dependent variable. A positive coefficient indicates that as the value of the independent variable increases, the mean of the dependent variable also tends to increase. A negative coefficient suggests that as the independent variable increases, the dependent variable tends to decrease.

The coefficient value signifies how much the mean of the dependent variable changes given a one-unit shift in the independent variable while holding other variables in the model constant. This property of holding the other variables constant is crucial because it allows you to assess the effect of each variable in isolation from the others.
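Here is a minimal sketch (Python/NumPy, simulated data with known coefficients) showing both the one-unit interpretation and the holding-other-variables-constant property:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# simulated truth: a one-unit shift in x1 changes the mean of y by 2,
# and a one-unit shift in x2 changes it by -3
y = 5 + 2 * x1 - 3 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

# predicted effect of a one-unit shift in x1 while holding x2 constant:
# it is exactly the fitted x1 coefficient
effect = (coefs @ [1, 1.0, 0.0]) - (coefs @ [1, 0.0, 0.0])
print(coefs)   # estimates land close to [5, 2, -3]
print(effect)  # equals coefs[1]
```

The fitted coefficients recover the simulated values, and the difference in predictions for a one-unit shift in x1, with x2 fixed, equals the x1 coefficient exactly.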

The coefficients in your statistical output are estimates of the actual population parameters. To obtain unbiased coefficient estimates that have the minimum variance, and to be able to trust the p-values, your model must satisfy the seven classical assumptions of OLS linear regression.

## Graphical Representation of Regression Coefficients

A simple way to grasp regression coefficients is to picture them as linear slopes. The fitted line plot illustrates this by graphing the relationship between a person’s height (IV) and weight (DV). The numeric output and the graph display information from the same model.

The height coefficient in the regression equation is 106.5. This coefficient represents the mean increase in weight in kilograms for every additional one meter in height. If your height increases by 1 meter, the average weight increases by 106.5 kilograms.

The regression line on the graph visually displays the same information. If you move to the right along the x-axis by one meter, the line increases by 106.5 kilograms. Keep in mind that it is only safe to interpret regression results within the observation space of your data. In this case, the height and weight data were collected from middle-school girls and range from 1.3 m to 1.7 m. Consequently, we can’t shift along the line by a full meter for these data.
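Since the data only cover 1.3 m to 1.7 m, a more realistic reading scales the slope to a shift that stays inside the observed range (a sketch; only the slope comes from the model above, and the intercept is not reported, so we work with changes only):

```python
# only the slope (106.5 kg per meter) comes from the post
slope_kg_per_m = 106.5

# a realistic move of 0.1 m stays within the observed range (1.3 m to 1.7 m)
change_kg = slope_kg_per_m * 0.1
print(round(change_kg, 2))  # 10.65
```

In other words, a 10 cm height difference corresponds to about a 10.65 kg difference in mean weight for these data.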

Let’s suppose that the regression line was flat, which corresponds to a coefficient of zero. For this scenario, the mean weight wouldn’t change no matter how far along the line you move. That’s why a near zero coefficient suggests there is no effect—and you’d see a high (insignificant) p-value to go along with it.

The plot really brings this to life. However, plots can display only results from simple regression—one predictor and the response. For multiple linear regression, the interpretation remains the same.

## Use Polynomial Terms to Model Curvature in Linear Models

The previous linear relationship is relatively straightforward to understand. A linear relationship indicates that the change remains the same throughout the regression line. Now, let’s move on to interpreting the coefficients for a curvilinear relationship, where the effect depends on your location on the curve. Interpreting these coefficients is less intuitive than for linear relationships.

As a refresher, in linear regression, you can use polynomial terms to model curves in your data. It is important to keep in mind that we’re still using linear regression to model curvature rather than nonlinear regression. That’s why I refer to curvilinear relationships in this post rather than nonlinear relationships. Nonlinear has a very specialized meaning in statistics. For more on this distinction, read my post: The Difference between Linear and Nonlinear Regression Models.

This regression example uses a quadratic (squared) term to model curvature in the data set. You can see that the p-values are statistically significant for both the linear and quadratic terms. But, what the heck do the coefficients mean?

## Graphing the Data for Regression with Polynomial Terms

Graphing the data really helps you visualize the curvature and understand the regression model.

The chart shows how the effect of machine setting on mean energy usage depends on where you are on the regression curve. On the x-axis, if you begin with a setting of 12 and increase it by 1, energy consumption should decrease. On the other hand, if you start at 25 and increase the setting by 1, you should see an increase in energy usage. Near 20, you wouldn’t expect much change.

Regression analysis that uses polynomials to model curvature can make interpreting the results trickier. Unlike a linear relationship, the effect of the independent variable changes based on its value. Looking at the coefficients won’t make the picture any clearer. Instead, graph the data to truly understand the relationship. Expert knowledge of the study area can also help you make sense of the results.
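If you’d like to experiment with this yourself, here is a sketch in Python (NumPy) with simulated energy data. The numbers are made up to put the curve’s minimum near a setting of 20, mirroring the chart: the fitted quadratic’s derivative gives the local effect of a small increase in the setting, which is negative at 12, positive at 25, and near zero at 20.

```python
import numpy as np

rng = np.random.default_rng(2)
setting = rng.uniform(10, 30, size=150)
# simulated curvature with a minimum near a setting of 20
energy = 0.5 * (setting - 20) ** 2 + rng.normal(scale=2, size=150)

b2, b1, b0 = np.polyfit(setting, energy, deg=2)

def local_slope(x):
    # derivative of the fitted quadratic: effect of a small increase at x
    return 2 * b2 * x + b1

print(local_slope(12))  # negative: raising the setting lowers energy use
print(local_slope(25))  # positive: raising the setting increases it
print(local_slope(20))  # near zero: little change around the minimum
```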

**Related post**: Curve Fitting using Linear and Nonlinear Regression

## Regression Coefficients and Relationships Between Variables

Regression analysis is all about determining how changes in the independent variables are associated with changes in the dependent variable. Coefficients tell you about these changes and p-values tell you if these coefficients are significantly different from zero.

All of the effects in this post have been main effects, which is the direct relationship between an independent variable and a dependent variable. However, sometimes the relationship between an IV and a DV changes based on another variable. This condition is an interaction effect. Learn more about these effects in my post: Understanding Interaction Effects in Statistics.

In this post, I didn’t cover the constant term. Be sure to read my post about how to interpret the constant!

The statistics I cover in the post tell you how to interpret the regression equation, but they don’t tell you how well your model fits the data. For that, you should also assess R-squared.

If you’re learning regression and like the approach I use in my blog, check out my eBook!

**Note: I wrote a different version of this post that appeared elsewhere. I’ve completely rewritten and updated it for my blog site.**

Olga Pap says

Hello Jim.

Hello All.

I have one question. Specifically, when the dependent variable (e.g. earnings) is expressed on a logarithmic form (and not the independent variables) via mincer equation, does the interpretation of coefficients follow the below rules?

• For a one-unit increase in the independent variable “X” with coefficient b, the change in the dependent variable “Y” (in logarithmic form) should be a factor of e^b?

• And only for very small values of b (|b| < 0.1), keeping in mind that e^b ≈ 1 + b, should a one-unit increase in “X” change “Y” by approximately (100 × b) percent?

Thank you in advance.

Olga Pap says

Is it possible please to answer me on the above question?
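The standard log-linear interpretation behind Olga’s question can be checked numerically (a sketch; b = 0.05 is an arbitrary example value): with log(Y) = a + bX, a one-unit increase in X multiplies Y by e^b, and for small b that is approximately a 100 × b percent increase.

```python
import math

b = 0.05  # example coefficient on X in the model log(Y) = a + b*X

exact_factor = math.exp(b)            # one-unit increase in X multiplies Y by e^b
exact_pct = 100 * (exact_factor - 1)  # exact percentage change in Y
approx_pct = 100 * b                  # rule of thumb, valid when |b| is small

print(exact_pct)   # about 5.13 percent
print(approx_pct)  # 5 percent by the approximation
```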

Jessica says

Thank you! It does, though, when I’m looking at a scatterplot, I’ve seen an R value. This is not to be interpreted as the same thing as the correlation coefficient, r . . . correct? Even though the R value is not R-squared, it is still not the same as r . . . right?

Jim Frost says

Hi Jessica,

Ah, yes, I jumped straight to R-squared because that is used much more frequently. R is the coefficient of multiple correlation, whereas R-squared is the coefficient of multiple determination. The use of the capital letter R for both of these statistics indicates that they are sample estimates. I’ve described R-squared, so on to R!

The calculation for R is (unsurprisingly) just taking the positive square root of R-squared. R represents the correlation between a set of variables and another variable. In the regression context, this could be the correlation between your set of independent variables and the dependent variable. The interpretation of R is not intuitive. Hence, R-squared is used more frequently.

Lower case r is the correlation between two variables and it is commonly used. R involves more than two variables.

I haven’t seen R used much at all. Perhaps it is in some specialized context. But, you probably don’t need to worry about R.

Jessica says

Hi, I know this may seem to be a very simple question, but is there a difference between R and r? Do they stand for the same thing in regression analysis?

Jim Frost says

Hi Jessica,

Yes, r and R-squared are related as they both measure the strength of relationships between variables. r is a correlation coefficient that ranges between -1 to +1. It measures the strength of the linear relationship between two continuous variables. R-squared measures the strength of the relationship between a set of independent variables and the dependent variable. It’s a percentage that ranges from 0 – 100%.

Suppose you have a pair of variables, say X and Y, and the correlation coefficient (r) is 0.7. If you perform a simple regression using these two variables, you will obtain an R-squared of 0.49 (49%). We know this because 0.7^2 = 0.49. However, unlike correlation coefficients (r), you can use R-squared when you have more than two variables.
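A quick sketch in Python (SciPy, simulated data) confirms that for simple regression, squaring the correlation coefficient r gives exactly the model’s R-squared:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(size=300)
y = x + rng.normal(size=300)

r = np.corrcoef(x, y)[0, 1]      # Pearson correlation between x and y
fit = stats.linregress(x, y)
r_squared = fit.rvalue ** 2      # R-squared of the simple regression

print(r ** 2, r_squared)  # identical for simple regression
```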

I write about that aspect in my post about correlation. You can also read more about R-squared.

I hope this helps!


Tesfakiros Semere says

What a clear, simple, and easy-to-understand explanation. You saved me time from reading lots of books. It is really helpful.

Would it be possible to get them all in Pdf just to print and read when I am out of network

THANK YOU SO MUCH Kim.

Jim Frost says

Hi Kim, thanks so much for your kind words! They made my day! While I don’t have PDFs of the blog posts, in several weeks I’ll be releasing an ebook all about regression analysis. If you like the simple and easy to understand approach in my blog posts, you’ll love this book. It should be out in early March 2019!

Digambar salunkhe says

Thank you so much for sharing this blog…It’s really helpful and easy to understand the concept of whole regression model.

Adu Emmanuel Ifedayo says

Thank you.

Neven says

Hi Jim ! Great blog , very clear and very helpfull . The best I have found in this field! Thanks.

Qmars Safikhani says

Hi Jim,

Thanks a lot for sharing your knowledge through this article. I found it very interesting as you explained somehow difficult concepts in an easy way. Well done

Hans says

Hey Jim,

Great Blog! You helped us a lot preparing for our studies at university. We have a question regarding the p-value… Is there an explanation for a p-value being exactly 1.0? Does it mean that there is a 100 percent chance that the independent variable has no effect on the dependent one? Or is there anything else to consider? Thanks a lot for your help and keep that great work going!

Jim Frost says

Hi Hans, thank you so much! It’s great to hear that it’s been helpful for you all. That makes my day!

Yes, you can obtain a p-value of 1.0. To get exactly 1.0, your sample statistic would have to exactly equal the null hypothesis value. For example, suppose you perform a 1-sample t-test where the null hypothesis is that the population mean equals 10. If your sample mean is exactly 10, you obtain a p-value of exactly 1.0. In regression analysis, typically the null for a coefficient is that it equals zero. So, if the estimated coefficient equals zero exactly, you’d again get a p-value of 1.0.

The interpretation of a p-value in general is the probability of obtaining the observed sample statistic or more extreme if you assume the null hypothesis is true. The reason p = 1.0 when the sample statistics equals the null hypothesis value makes sense when you think about it with that interpretation in mind. When the sample stat equals the null value, there is a 100% probability that a sample statistic will equal the null value or be more extreme! That’s true by definition because that case covers the entire range of the sampling distribution (i.e., you’d shade the entire area beneath the sampling distribution curve).

To see these sampling distributions in action for a hypothesis test, read my post about p-values and significance levels.

Of course, the probability of obtaining a sample statistic that exactly equals your null hypothesis value is minuscule. When using statistical software in the field, if you see a p-value = 1, it’s more likely due to rounding.
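You can see this with a tiny example in Python (SciPy). The sample below is constructed so its mean exactly equals the null value of 10:

```python
from scipy import stats

# sample whose mean is exactly the null value of 10
sample = [8, 9, 10, 11, 12]
result = stats.ttest_1samp(sample, popmean=10)

print(result.statistic)  # 0.0: the sample mean equals the null exactly
print(result.pvalue)     # 1.0
```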

Paul says

Hi. I want to find out if simple or multiple regressions can be used to explain effects (as in experimental studies)?

Thank you.

Jim Frost says

Hi Paul,

You bet they can! The coefficients describe the effects and the p-values determine whether the effects are statistically significant.

Rashan says

This is very helpful. Thank you

Surya says

Thanks Jim

Surya says

Hi Jim, I have just subscribed to your posts after reading the wonderful post on residual plots.

Could you please let me know how do we interpret the SE of coefficients , T statistic as well.. Or do you already have an article on them… Please reply.. Thanks..

Jim Frost says

Hi Surya,

Thanks so much! I’m glad that post was helpful!

The standard error (SE) of the coefficient measures the precision of the coefficient estimate. Smaller values represent more precise estimates. Standard errors are the standard deviations of sampling distributions. If you were to perform your study many times, drawing the same sample size, and fitting the same model, you’d obtain a distribution of coefficient estimates. That’s the sampling distribution of a coefficient estimate. The standard error of a coefficient is the standard deviation of that sampling distribution. The SE is used to create confidence intervals for the coefficient estimate, which I find more intuitive to interpret.

The t-statistic in the context of regression analysis is the test statistic that the analysis uses to calculate the p-value. I wrote a post about how it works in the context of t-tests. It’s fairly similar for coefficient estimates. Read that post but replace sample mean with coefficient estimate and you’ll get a good idea. How t-tests work.
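Here’s a sketch in Python (SciPy, simulated data) showing the relationship: dividing the slope by its standard error gives the t-statistic, and the p-value follows from the t-distribution with n - 2 degrees of freedom.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(size=50)
y = 1.5 * x + rng.normal(size=50)

fit = stats.linregress(x, y)
t_stat = fit.slope / fit.stderr                  # t-statistic for the slope
p_manual = 2 * stats.t.sf(abs(t_stat), df=50 - 2)

print(t_stat, p_manual, fit.pvalue)  # manual p matches the reported one
```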

I hope that helps!

[email protected] says

been reading your posts all night, (morning now).. I can’t stop because it’s like a light bulb keeps going off. Been studying this stuff for weeks, now I finally get it thanks to your post. Thank you:)

-Extremely tired data science grad student.

Jim Frost says

Hi, I’m sorry my posts caused you to lose some sleep last night, but I love your analogy about light bulbs going off! I’m really happy to hear that they were helpful. That really makes my day! Best of luck with your studies!

Tracey says

Hi Jim. Thank you so much for this as it helped clear up some things in my mind as I prepare a research paper.

Jim Frost says

Hi Tracey, you’re very welcome. I am happy to hear that it was helpful!

Qiumei Jing says

Thank you for your explanation, Jim. That’s really great!

When I’m doing multiple linear regression, I have a question. The linear regression has three independent variables (A, B, C) and one dependent variable (D). I got a significant p-value in the ANOVA table, but in the Coefficients table, the constant p-value is 0.237, which is not significant, and one predictor (variable A) has a p-value of 0.211, while the other two predictors are highly significant (p = 0.000). In that case, how can I interpret the results? The hypotheses for the two significant predictors (variables B and C) are “there is a relationship between B and D” and “there is a relationship between C and D.” In this case, can I say the two hypotheses were supported? And how can I interpret the one (A) with an insignificant p-value in the Coefficients table? Thank you in advance!

Jim Frost says

Hi Qiumei,

It’s generally not worthwhile interpreting the constant, so I’d skip that. To learn why, click the link for interpreting the constant in this post.

Here’s how you can interpret the significant predictors.

The sample provides sufficient evidence to conclude that changes in both independent variables B and C are correlated with changes in the dependent variable D. Statistical significance indicates that the correlation does not equal zero. In other words, you can reject the null hypothesis that the coefficients equal zero.

For the insignificant variable, the sample provides insufficient evidence to conclude that there is a relationship between it and the dependent variable. In other words, you fail to reject the null hypothesis that its coefficient equals zero.

For more elaboration, reread this post where I talk about this in depth.

Appadu says

Dear Jim

Thank you for your explanations on how to Interpret Regression Coefficients for Linear Relationships and p-value. It is very clear appreciate you time to put this together.

I have one question I was looking at an example on Estimated standardised OLS beta coefficient data. The results show R squared (%) as 26.2 and F-Value 18.14. Please advise how to interpret this 2 figures. Thank you

Jim Frost says

Hi Appadu,

When you standardize the continuous independent variables in your model, the output produces standardized coefficients. Standardization is when you take the original data for each variable, subtract the variable’s mean from each observation and divide by the variable’s standard deviation. The main reason I’m aware of for performing this standardization is to reduce the multicollinearity caused by including polynomials and interaction terms in your model. I write about that in my post about multicollinearity.

In terms of interpreting the standardized coefficient, it represents the mean change in the dependent variable given a one standard deviation change in the independent variable. Another reason statisticians use it is as a possible measure for identifying which variable is the most important.
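A sketch in Python (SciPy, simulated data): standardizing the predictor and refitting gives a slope equal to the original slope times the predictor’s standard deviation, which is exactly the standardized-coefficient relationship.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(loc=50, scale=7, size=120)
y = 0.8 * x + rng.normal(size=120)

raw = stats.linregress(x, y)
x_std = (x - x.mean()) / x.std(ddof=1)  # standardize the predictor
std_fit = stats.linregress(x_std, y)

# the standardized slope equals the raw slope times the predictor's SD
print(std_fit.slope, raw.slope * x.std(ddof=1))
```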

As for interpreting R-squared and the F-test of overall significance, those don’t change from the usual interpretations. Click on the links to read my blog post about interpreting each statistic.

I hope this helps!

Hrishikesh Geed says

Thanks for the explaination Jim !!.

I have one doubt, how do you calculate the p-value corresponding to each coefficient?

How do you decide the standard deviation, and the sample mean for calculating the z value for each coefficient?

Thanks

Hrishi

eric says

Thank you very much for the explanation Jim!

If the p-value is under the significant level, this would indicate that there is enough evidence to reject the null hypothesis. The null hypothesis being here that there is no correlation between 2 variables (in a single linear regression).

Here is my first question: how do we decide how to set the significant level? Is it purely arbitrary?

My second question is: since the coefficient of correlation varies between -1 and 1, it is tempting to conclude that there is a significant correlation (positive or negative) between 2 variables if the coefficient of correlation is close to -1 or 1, and that there is no correlation when the coefficient of correlation is close to 0. However I think this assumption is false but can’t get the intuition to understand why.

Could you help me about those questions?

Many thanks for your time and your attention

Best regards

Eric

Hanan Shteingart says

the following claim is not true if the features are correlated, what’s known as multicollinearity: “The sign of a regression coefficient tells you whether there is a positive or negative correlation between each independent variable and the dependent variable”. In fact, a feature could have a positive correlation with the target yet a negative coefficient, and vice versa.

Jim Frost says

Hi Hanan,

You raise a good point. The interpretation that I present, including the portion that you quote, is accurate when your model doesn’t contain a severe problem. However, if your model does contain a severe problem, it can produce unreliable results, which includes the possibility that the coefficients don’t accurately describe the relationship between the independent variables and the dependent variable. The problem isn’t with how to interpret coefficients, but rather with a condition in the model that causes it to produce coefficients that you can’t trust.

As you point out, multicollinearity can produce unreliable, erratic coefficients. In some cases, the sign of the coefficient can even be incorrect. However, the sign switch doesn’t necessarily have to happen when your model has multicollinearity. I write more about multicollinearity, including switched signs, in this post: Multicollinearity in Regression Analysis: Problems, Detection, and Solutions.

By the way, there are a number of other potential problems that can cause your model to produce results that you can’t trust. Multicollinearity just scratches the surface. These problems include an incorrectly specified model, overfitting the model, heteroscedasticity, and data mining, among others. I spend quite a bit of time talking about these problems, how they can invalidate your results, and what you can do to address them.
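Hanan’s point is easy to demonstrate with simulated data (a sketch in Python/NumPy): x2 correlates positively with y on its own, yet its coefficient in the joint model is negative.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # x2 almost duplicates x1
y = 2 * x1 - x2 + rng.normal(scale=0.1, size=n)

# marginal correlation of x2 with y is strongly positive...
marginal_r = np.corrcoef(x2, y)[0, 1]
print(marginal_r)

# ...but its coefficient in the joint model is negative (true value is -1)
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs[2])
```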

I hope this helps!

MN says

Thank you very much for the wonderful elaboration. Amazing!!

Jim Frost says

You’re very welcome, MN! I’m glad it’s helpful!

Rajasekar says

I am currently working on a multiple regression model, where I have 4 x variables and none of my variables are statistically significant. I know when this happens I fail to reject the null hypothesis, but I would like to know what might be wrong. Do I need to add more x variables in this case? Also, R Square = 0.109842937 and Adjusted R Square = 0.034084889.

Ayush says

This is really one of the best websites I have come across for DATA SCIENCE… Great effort put up by Sir Jim…

Jim Frost says

Thank you, Ayush!

Rali says

Hi Mr. Jim

Thanks for the helpful blog

all the best

Jim Frost says

Hi Rali, you’re very welcome! I’m glad it was helpful!

ADIL HUSSAIN RESHI says

Really fabulous ..it cleared all my doubts about p- value

Jim Frost says

Hi Adil, Thanks! I’m so glad to hear that it was helpful!

Javed Iqbal says

Thanks Jim for the nice explanation. This regression seems to violate one of the model assumption namely the homoskedasticity. Log transformation should work here.

Jim Frost says

Hi Javed, thanks for your comment. The residuals for this model are homoscedastic, or very close to it. Their variances are fairly equal across the entire range. The variance might appear to be lower in the very low end of the range, but there are also fewer observations in that region, which can make the dispersion appear to be smaller. At any rate, it is close enough. To see how a true case of heteroscedasticity appears, along with multiple methods for correcting it, read my post about heteroscedasticity. By the way, I explain in that post why I always recommend trying other methods of addressing this problem before using a transformation.

Toby says

Great blog with detailed explanation! It helps clear my doubts for p-value.

Thank you Jim! and Happy new year! 😀

Jim Frost says

Thank you, Toby! And, I’m very happy you found the blog to be helpful! Happy new year to you too!!