P-values and coefficients in regression analysis work together to tell you which relationships in your model are statistically significant and the nature of those relationships. The coefficients describe the mathematical relationship between each independent variable and the dependent variable. The p-values for the coefficients indicate whether these relationships are statistically significant.

After fitting a regression model, check the residual plots first to be sure that you have unbiased estimates. After that, it’s time to interpret the statistical output. Linear regression analysis can produce a lot of results, which I’ll help you navigate. In this post, I cover interpreting the p-values and coefficients for the independent variables.

**Related post**: When Should I Use Regression Analysis?

## Interpreting P-Values for Variables in a Regression Model

Regression analysis is a form of inferential statistics. The p-values help determine whether the relationships that you observe in your sample also exist in the larger population. The p-value for each independent variable tests the null hypothesis that the variable has no correlation with the dependent variable. If there is no correlation, there is no association between the changes in the independent variable and the shifts in the dependent variable. In other words, there is insufficient evidence to conclude that an effect exists at the population level.

If the p-value for a variable is less than your significance level, your sample data provide enough evidence to reject the null hypothesis for the entire population. Your data favor the hypothesis that there *is* a non-zero correlation. Changes in the independent variable *are* associated with changes in the response at the population level. This variable is statistically significant and probably a worthwhile addition to your regression model.

On the other hand, a p-value that is greater than the significance level indicates that there is insufficient evidence in your sample to conclude that a non-zero correlation exists.

The regression output example below shows that the South and North predictor variables are statistically significant because their p-values equal 0.000. On the other hand, East is not statistically significant because its p-value (0.092) is greater than the usual significance level of 0.05.

It is standard practice to use the coefficient p-values to decide whether to include variables in the final model. For the results above, we would consider removing East. Keeping variables that are not statistically significant can reduce the model’s precision.

**Related post**: F-test of overall significance in regression

## Interpreting Regression Coefficients for Linear Relationships

The sign of a regression coefficient tells you whether there is a positive or negative correlation between each independent variable and the dependent variable. A positive coefficient indicates that as the value of the independent variable increases, the mean of the dependent variable also tends to increase. A negative coefficient suggests that as the independent variable increases, the dependent variable tends to decrease.

The coefficient value signifies how much the mean of the dependent variable changes given a one-unit shift in the independent variable while holding other variables in the model constant. This property of holding the other variables constant is crucial because it allows you to assess the effect of each variable in isolation from the others.

The coefficients in your statistical output are estimates of the actual population parameters. To obtain unbiased coefficient estimates that have the minimum variance, and to be able to trust the p-values, your model must satisfy the seven classical assumptions of OLS linear regression.

## Graphical Representation of Regression Coefficients

A simple way to grasp regression coefficients is to picture them as linear slopes. The fitted line plot illustrates this by graphing the relationship between a person’s height (IV) and weight (DV). The numeric output and the graph display information from the same model.

The height coefficient in the regression equation is 106.5. This coefficient represents the mean increase of weight in kilograms for every additional one meter in height. If your height increases by 1 meter, the average weight increases by 106.5 kilograms.

The regression line on the graph visually displays the same information. If you move to the right along the x-axis by one meter, the line increases by 106.5 kilograms. Keep in mind that it is only safe to interpret regression results within the observation space of your data. In this case, the height and weight data were collected from middle-school girls and range from 1.3 m to 1.7 m. Consequently, we can’t shift along the line by a full meter for these data.
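The slope arithmetic is simple enough to check directly. Here’s a quick sketch using the 106.5 coefficient from the example (the 0.1 m shift is my choice, picked to stay inside the 1.3 m to 1.7 m observation range):

```python
coef_height = 106.5  # kg per meter: the slope from the example
delta_height = 0.1   # a 10 cm increase, within the observed data range

# Predicted change in mean weight for that shift in height
delta_weight = coef_height * delta_height
print(delta_weight)  # about 10.65 kg
```

The same multiplication works for any shift along the x-axis, as long as you stay within the range of the observed data.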

Let’s suppose that the regression line was flat, which corresponds to a coefficient of zero. For this scenario, the mean weight wouldn’t change no matter how far along the line you move. That’s why a near-zero coefficient suggests there is no effect, and you’d see a high (insignificant) p-value to go along with it.

The plot really brings this to life. However, plots can display only results from simple regression—one predictor and the response. For multiple linear regression, the interpretation remains the same.

## Use Polynomial Terms to Model Curvature in Linear Models

The previous linear relationship is relatively straightforward to understand. A linear relationship indicates that the rate of change remains the same along the entire regression line. Now, let’s move on to interpreting the coefficients for a curvilinear relationship, where the effect depends on your location on the curve. The interpretation of the coefficients for a curvilinear relationship is less intuitive than for linear relationships.

As a refresher, in linear regression, you can use polynomial terms to model curves in your data. It is important to keep in mind that we’re still using linear regression to model curvature rather than nonlinear regression. That’s why I refer to curvilinear relationships in this post rather than nonlinear relationships. Nonlinear has a very specialized meaning in statistics. To read about this distinction, read my post: The Difference between Linear and Nonlinear Regression Models.

This regression example uses a quadratic (squared) term to model curvature in the data set. You can see that the p-values are statistically significant for both the linear and quadratic terms. But, what the heck do the coefficients mean?

## Graphing the Data for Regression with Polynomial Terms

Graphing the data really helps you visualize the curvature and understand the regression model.

The chart shows how the effect of machine setting on mean energy usage depends on where you are on the regression curve. On the x-axis, if you begin with a setting of 12 and increase it by 1, energy consumption should decrease. On the other hand, if you start at 25 and increase the setting by 1, you should experience increased energy usage. Near 20, you wouldn’t expect much change.

Regression analysis that uses polynomials to model curvature can make interpreting the results trickier. Unlike a linear relationship, the effect of the independent variable changes based on its value. Looking at the coefficients won’t make the picture any clearer. Instead, graph the data to truly understand the relationship. Expert knowledge of the study area can also help you make sense of the results.

**Related post**: Curve Fitting using Linear and Nonlinear Regression

## Regression Coefficients and Relationships Between Variables

Regression analysis is all about determining how changes in the independent variables are associated with changes in the dependent variable. Coefficients tell you about these changes and p-values tell you if these coefficients are significantly different from zero.

All of the effects in this post have been main effects, which is the direct relationship between an independent variable and a dependent variable. However, sometimes the relationship between an IV and a DV changes based on another variable. This condition is an interaction effect. Learn more about these effects in my post: Understanding Interaction Effects in Statistics.

In this post, I didn’t cover the constant term. Be sure to read my post about how to interpret the constant!

The statistics I cover in the post tell you how to interpret the regression equation, but they don’t tell you how well your model fits the data. For that, you should also assess R-squared.

If you’re learning regression and like the approach I use in my blog, check out my eBook!

**Note: I wrote a different version of this post that appeared elsewhere. I’ve completely rewritten and updated it for my blog site.**

Rosie says

Dear Jim,

Thank you very sincerely for your time and your kind explanation.

I like your book, and I introduced it to my friends, too.

Have a nice day!

Rosie says

Hello Jim,

Sorry that I have 2 more questions for you.

1) As far as I know, with a sample size of a few hundred, it’s normal to have a few outliers. However, when I tried removing outliers, I got 1 more predictor significant. Thus, could you please advise whether I should remove outliers in this case?

2) I got the Mahal. Distance’s maximum value equals 52.361, which is far higher than the critical value (11.07) for df = 5 (as I have 5 predictor variables) taken from the Chi-squared distribution table at the 0.05 alpha level. This indicates there are outliers which may place undue influence on the model.

– Is my above understanding correct?

– I tried removing the outliers by running “Select cases” with the condition “MAH1 < 11.07” and ran the regression again. But then I still see the Mahal. Distance’s maximum value equals around 15. Although it is already much lower, it is still higher than the critical value of 11.07. So can I stop with this lower value of Mahal. Distance and go ahead with interpreting the regression results, or do I still need to do something else regarding removing the outliers?

Thank you so much for your kind explanation so far. I really appreciate it.

Rosie

Jim Frost says

Hi Rosie,

When you have a sample of that size, it’s typical for outlier tests to find a few outliers. However, that doesn’t mean those values are actually outliers. If you use these tests, you should consider the values as candidates that you need to investigate. Don’t assume that just because a test identifies values as being outliers that they are actually outliers. You don’t want to automatically remove outliers based on statistical tests only. Additionally, rerunning outlier tests after removing outliers can be problematic in some cases. Instead, you’ll need to investigate each outlier candidate and determine whether you should remove them based on what you find out and subject area knowledge. If you do remove an outlier, you need to be able to explain why for each data point.

It’s not surprising that removing outliers made a predictor become significant. By removing unusual values you’re reducing the variability in your data, which tends to increase statistical power. However, that doesn’t indicate that removing the values is the correct approach. Again, you’ll need to make that determination on a case-by-case basis.

I’ve just recently written two posts about outliers that you’ll probably find helpful. These posts aren’t written from the regression point of view but the general approaches are still applicable. Read Five Ways to Identify Outliers and Determining Whether to Remove Outliers.

Additionally, outliers are more complicated in regression because there are a variety of ways that an observation can be unusual. I cover this in detail from the regression perspective specifically in my ebook, Regression Analysis: An Intuitive Guide. If you haven’t bought it already, you should consider getting it.
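As a side note on the chi-squared cutoff you computed: if you have Python with scipy handy, you can verify the 11.07 critical value directly (note that it corresponds to the 0.95 quantile, i.e., an alpha of 0.05):

```python
from scipy.stats import chi2

# 0.95 quantile of a chi-squared distribution with df = 5 predictors;
# Mahalanobis distances above this are flagged as outlier candidates
critical = chi2.ppf(0.95, df=5)
print(round(critical, 2))  # 11.07
```

The same call with a different `df` gives you the cutoff for models with a different number of predictors.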

I hope this helps!

Rosie says

Dear Jim,

Thank you for your response.

A nice day to you.

Rosie says

Dear Jim,

I would like to consult you on the conflicting results that the Pearson correlation and multiple regression tests produce.

For example, my hypothesis is:

H1: There is a positive relationship between subjective norms and purchase intention for eco-products.

If my Pearson correlation test shows that there is a positive relationship between these 2 variables, but my regression test shows that subjective norms and purchase intention are not significant (I have several independent variables in the multiple regression analysis and “subjective norms” is one of them. In my regression test, “purchase intention” is the outcome variable).

So is it correct if I make the conclusion for my hypothesis H1 based on the result of the Pearson correlation test; and for the multiple regression result, I just say and discuss that “subjective norms” is not an effective predictor of “purchase intention”?

(As the Pearson test and regression test show conflicting results, I wonder which test the conclusion for the hypotheses should be based on.)

Thank you so much.

Jim Frost says

Hi Rosie,

This discrepancy sounds like a form of omitted variable bias. You have to remember that these two analyses are testing different models. Pairwise correlation only assesses two variables at a time while your multiple regression model has at least two independent variables and the dependent variable. The regression model tells you the significance of each IV after accounting for the variance that the other IVs explain. When a model excludes an important variable, it potentially biases the relationships for the variables in the model. Hence, omitted variable bias. For more information, read my post about omitted variable bias. That post tells you more about it along with conditions under which it can occur.

In your case, the Pearson correlation is essentially a model with one IV and the DV whereas your multiple regression model contains multiple IVs. The difference is the number of IVs. While I can’t say whether either model is correct, I’d lean towards your multiple regression model because it controls for additional variables. Of course, you’ll have to be sure that the model and its results make theoretical sense and that the residual plots look good.

I hope that helps!

KB says

Hello Jim,

Really, really helpful blog. I’m still getting my head around multiple regression statistics, so it’s nice to find someone who simplifies and is clear.

I have a question. I have an ANOVA F value of 0.06. Both my variables have negative beta coefficients, with the first P = 0.02 and the second P = 0.07. I understand this means the variables’ relationship with the dependent is inverse, but is it normal to have a good F value and one variable deemed not statistically significant?

Grateful for any guidance

KB

Jim Frost says

Hi KB,

It sounds like you’re referring to the Overall F-test of Significance. Click that link to read a post I’ve written about it that discusses the type of situation you’re experiencing. Read that post, and if you have more questions, don’t hesitate to post them there!

Rosie says

Dear Jim,

Thank you very sincerely for your quick response and clear explanation!

This is the most helpful site I’ve ever found!

Rosie says

Dear Jim,

Thank you so much for your post!

Could you please kindly help me with the following question:

The p-value of my ANOVA test is smaller than 0.05, revealing a statistical finding that there is a linear relationship between the dependent variable and independent variables. However, the p-values of all independent variables in the “Coefficients” table show that among five independent variables, only 2 have a statistically significant impact on the outcome variable. Is that possible? (Because I think that if the ANOVA test shows a statistical finding that there is a linear relationship between the dependent variable and independent variables, all independent variables should also be statistically significant.)

(By the way, the R-squared I got = 0.316, showing that 31.6% of the variance in the dependent variable is explained by the independent variables. Is this % too low?)

With great thanks again!

Jim Frost says

Hi Rosie,

I’m assuming the p-value you’re referring to is for the F-test of overall significance. Click that link for a post I’ve written about that test specifically. In a nutshell, when that test is significant, it indicates that your model predicts the dependent variable significantly better than just using the mean of the dependent variable itself. In other words, your model explains the variability of the values around the dependent variable’s mean better than just using the mean. While your model has some explanatory power, it doesn’t guarantee that all of the independent variables in your model are individually significant. It assesses the collective effect of all the independent variables. For example, if your overall F-test is significant and you then add another independent variable to the model that has no relationship with the dependent variable, your overall F-test is still likely to be significant.

So, yes, it’s quite possible to have a significant F-test for the entire model but have some independent variables that are not significant.

As for the R-squared, I’ve written several posts for that. You should read one about how high does R-squared need to be. You’ll find it varies depending on your subject area and the purpose of your model. Also read my post about low R-squared values and how they can provide important information.

Best of luck with your analysis!

Nadal Merquez says

Hi Jim Frost,

Thanks for your post and the amazing books! They have been really helpful. However, I’d like to ask you two pressing questions regarding the use of p-values.

1. I do not see how the assumption of normality of the error term is needed in order to make use of p-values. For the derivation of the asymptotic normality of the estimators, the normality of the error term is not needed. Could you elaborate on why the normality of the error term is needed in order to make use of the p-value?

2. I noticed from computing robust regression methods in R that the p-value is usually not given. Do you know what complicates the derivation of the p-value in the case of robust regression models? How would one know if coefficients are significant in the case of robust regressions?

I’d love to hear from you!

Nadia

Jim Frost says

Hi Nadia,

I’m glad my posts and my books have been helpful! I really appreciate you supporting my books! On to your questions!

1. The distribution of the error term is intrinsically tied to the sampling distribution of the coefficient estimates. One of the properties of the normal distribution is that any linear function of normally distributed variables is itself normally distributed. Given this property, it’s not difficult to prove mathematically that the assumption of normality of the error terms implies that the sampling distributions of the coefficient estimates are also normally distributed. Therefore, if the error distribution is nonnormal, so are the sampling distributions. In that case, the hypothesis tests based on them are not valid.
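If you’d like to see that connection empirically, here’s a quick simulation sketch (the numbers are invented): with normally distributed errors, the slope estimates form a distribution centered on the true slope.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)  # fixed design reused across simulated samples

# Draw many samples with normal errors and collect the slope estimates
slopes = np.array([
    np.polyfit(x, 2 + 3 * x + rng.normal(scale=1, size=50), 1)[0]
    for _ in range(2000)
])

print(slopes.mean())  # centers on the true slope of 3
print(slopes.std())   # the empirical standard error of the slope
```

A histogram of `slopes` would look bell-shaped; repeating the exercise with a skewed error distribution and a small sample shows how the sampling distribution inherits that nonnormality.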

2. Unfortunately, I don’t have much experience using robust regression. As I understand it, robust regression first performs OLS, analyzes the residuals, and then reweights the observations based on the residuals. The fact that the residuals are random means that the weights themselves are random. Weighted regression assumes that the weights are fixed. Hence, the problem. I gather there is a procedure to work around that to produce hypothesis tests and CIs. However, there are criticisms that the procedure requires analysts to specify a scaling factor and tuning constant, which can cause large changes in the results. That’s the extent of my knowledge on that!

I hope that helps!

Robiul says

I have got my R-squared = .997 and adjusted R-squared = .995. Is that bad, or how can I reduce the value?

Jim Frost says

Hi Robiul,

There’s no general rule whether that’s good or bad. You’ll need to use subject-area knowledge as well as knowledge about your model fitting process to make that determination. It could be good if your study area has low noise measurements and it involves something that is inherently very predictable (such as modeling physical laws). But, it could represent something like overfitting your model, which indicates that the R-squared is too high and your coefficients are likely invalid.

I’ve written a post about why your R-squared might be too high. That post will help you answer this question.

nah says

Thank you Jim

you help us a lot.

Juston Shen says

Hi Jim,

Page 284 of the Regression Analysis book mentions effect size, statistical significance, and practical significance. Could you let us know the difference between statistical and practical significance? How many types of effect size are there in regression analysis?

Jim Frost says

Hi Juston,

First, thanks so much for supporting my ebook. I really appreciate that!

There are really two primary measures of effect size for regression coefficients. The first is the raw regression coefficient. The coefficient tells you how much the DV changes given a 1 unit increase in the IV. Of course, you have to be careful about determining causality. It might just be an association but not causation. I cover causation vs. correlation in detail in my new Introduction to Statistics ebook by the way.

Another way to look at it is the standardized coefficient, which I also write about in my regression ebook. The standardized effect size is better for comparing the magnitude of effects across different types of IVs. This measure tells you how much the DV changes given a 1 standard deviation change in the IV. Because it’s all on a common standardized scale, you can compare the coefficients.

Finally, for the question about statistical versus practical significance, let me point you to a blog post that I’ve written all about practical vs. statistical significance. In a nutshell, statistical significance is all about whether your sample provides enough evidence to conclude that the effect exists in the population. Practical significance is about whether the estimated size of that effect is large enough to be meaningful. That’s based on subject-area knowledge and can’t be computed mathematically. Anyway, read the post on it!

I hope this helps!

eric godson says

Sir, what if all the results shown in the t-test show a negative sign, or the significance is greater than 0.05?

Jim Frost says

Hi Eric,

A negative t-value just means the coefficient is negative. If a negative coefficient is statistically significant, it indicates that as that independent variable increases, the mean of the dependent variable decreases.

I’ve written a post about t-values. It’s written in the context of t-tests for when you’re assessing group means. However, the same principles apply to t-tests in regression analysis. I suggest you read the following post, and when I write about group means, just think about regression coefficients (which are a type of mean, a mean change in the DV). Read about t-values and t-distributions.

Omoleye Ojuri says

Hi Jim,

You are a great teacher, Jim. The use of simple language and expressions drew me to your website. Please, just a quick one. My case is an MRQAP model. Do I have to plot residual plots to indicate the fit of the MRQAP model? And if yes, what are the values to use to compute the residual plots (unstandardized coefficients, standardized coefficients, etc.)? Or are p-values and R-squared enough to indicate the fit of the MRQAP model?

Omo

Jim Frost says

Hi Omo,

I have to apologize, but I don’t know MRQAP models well enough to provide an answer. I just looked into them and they sound interesting. I will need to learn more!

Julie says

Hi Jim

I just stumbled on your postings and found them to be extremely useful. Kindly help me with something. If you found a variable to be statistically insignificant for your final panel regression model, can you explain the coefficient of the insignificant variable, or once the variable is insignificant, is the coefficient sign not to be considered? I found one variable to be statistically insignificant, but its coefficient sign supports previous studies.

Jim Frost says

Hi Julie,

There are several considerations here.

First, when the p-value is not significant, the coefficient is indistinguishable from zero statistically. In other words, your sample provides insufficient evidence to conclude that the sample effect exists in the population. In that light, you don’t consider the sign.

However, there’s another question about leaving an insignificant variable in your model. Often analysts will remove insignificant variables from the model. In your case, you have theoretical expectations that this particular variable is relevant and the sign is consistent with expectations. Removing this variable would potentially bias the other coefficients. Consequently, I’d leave the variable in the model even though it is not significant. While it’s not good to include too many insignificant variables in the model (reduces the precision), it can be worse to remove one relevant variable, even when not significant, because it can bias the model.

In the write up, I’d explain that you left the variable in the model because of theoretical expectations and not wanting to bias the model. However, your sample doesn’t provide additional support for the effect of this variable.

I talk about some of these issues in my post about choosing the correct regression model.

I hope this helps!

Angeles Dorantes says

Are the coefficients in the hierarchical beta regression interpreted in a similar way?

Jim Frost says

Hi Angeles,

Your question contains several terms, hierarchical and beta, that mean different things in different settings and software packages.

If you’re referring to hierarchical regression as the practice of entering independent variables in groups, such as a group of demographic variables followed by a group of variables you’re testing, yes, you interpret them the same. However, there is one caveat. If a group that is entered into the model later has a statistically significant IV, it’s possible that the earlier groups without that significant variable can have omitted variable bias.

Beta in SPSS refers to standardized independent variables. If that’s the case for your model, then you must use a different interpretation for these coefficients. Standardized coefficients represent the mean change in the DV given a one standard deviation change in the IV. I talk about why you might use standardized values in this post about identifying the most important variables in your model.

Jose Chvaicer says

Hi Jim, your articles have helped me understand a lot of previously unclear points. A question remains in my mind, however: I’ve been asked to force the intercept to pass through the zero point in spite of the observed data giving a value for the “a” in Y = a + bx. What I noticed is that the residuals change considerably for the modified model (Y = bx). So what is the gain? What consequences are expected? What happens to the p-value?

Thank you.

Jim Frost says

Hi Jose,

In most cases, you should NOT force the regression line to go through the origin (y-intercept equals zero). The fact that you’re observing changes in the residuals suggests that you should not do this. The best-case scenario is that forcing the line through the origin does not change the residuals.

If you don’t fit the constant in your model, it forces the constant to equal zero. For more information, read my post about the regression constant. In that post, I show why it’s almost always good to include the constant in your model. I would say there are no benefits to excluding it. Excluding it can bias your coefficients and produce misleading p-values (check those residual plots). Excluding it also changes the meaning of the R-squared value. It almost always increases R-squared, but it completely changes the meaning of it. You cannot compare R-squared values between models with and without the constant.

Rashid says

How do I know if a regression coefficient is not significant at the 5% level but is at 10%, or vice versa?

Hello Sir, I hope my question finds you well.

In some articles, regression coefficients are mentioned to be significant at the 5% level and some other predictors significant at the 10% level. So how do I know if a regression coefficient is not significant at 5%, but is at 10%?

Jim Frost says

Hi Rashid,

The significance level is something that the researchers decide before they start the analysis. There are advantages and disadvantages between using higher and lower significance levels. I’ve written about significance levels in the context of hypothesis testing. In summary:

Higher significance levels (e.g., 0.10) require weaker evidence to determine that an effect is significant. The tests are more sensitive, meaning they are more likely to detect an effect when one truly exists. However, false positives are also more likely.

Lower significance levels (e.g., 0.01) require stronger evidence to determine that an effect is significant. The tests are less sensitive. They are less likely to detect an effect when one exists. On the good side, false positives are less likely to occur.

Analysts often use a significance level of 0.05 as a compromise between the pros and cons of higher and lower values.

You can read more in my posts about significance levels and p-values and errors in hypothesis testing.

Best of luck with your analysis!

Mahshameen Munawar John says

Sir, thank you so much for the prompt response. Yes, the first model is significant (P = .02). However, as you also mentioned, there seems to be no increase in the predictive capacity when I add the IV (R-squared remains almost the same in both models). Is that a negative thing? Yes, the p-value for the IV in the second model is significant.

Thank you again for all your guidance.

Jim Frost says

Hi, you’re very welcome!

It sounds like your results disagree a bit. That happens because the F-test and the t-tests for the coefficients measure different things. The F-test measures the amount of variance your model accounts for. In this case, you’re seeing whether the 2nd model accounts for significantly more variance than the first model. The t-test for the coefficient p-value assesses whether the coefficient is significantly different from zero (no effect).

While it might sound bad to say the 2nd model doesn’t account for significantly more variance than the first model, it’s actually good news overall for you. We know in the second model that your IV is statistically significant even when controlling for the demographic variables. The first model doesn’t include the IV even though we know it is significant. In other words, we know the first model is incomplete. In fact, the first model might have omitted variable bias because it does not include a significant IV.

Consequently, even though the second model doesn’t necessarily explain significantly more of the variance, it does include a significant IV and is, therefore, less likely to have biased coefficients. You should ask yourself, does the sign and magnitude of the IV coefficient match theoretical expectations and other research? If so, it looks like the IV is a good addition to the model. Of course, check your residual plots to be sure that you’re not violating any OLS assumptions.

Because you’re using regression analysis, you might consider buying my ebook about regression analysis, which includes far more information about it.

Mahshameen Munawar John says

Hello Sir, your posts have been a great help for me, thank you very much! I have been experiencing much confusion while interpreting the p-values for hierarchical regression. I have one IV and one DV, and I controlled for the demographics in the first step. The Sig. F Change value from the Model Summary output shows that it is not significant (P = .98) for the second model, where I introduced the IV. The same model is significant in the ANOVA table (F = 2.15, P = .02). Could you please explain how to interpret this result? Is the model valid and meaningful? I have searched but could not find an explanation or understand where the problem lies.

Your reply will mean a lot.

Jim Frost says

Hi Mahshameen,

Is the first model with the demographics significant?

If it is, then the results seem to indicate that both the first and second model are significant. However, adding the IV in the second model did not significantly improve the model. In other words, both models are significant but you can’t say that the second model is better.

However, I think the more crucial statistic to assess is the p-value for the IV in the second model. That statistic will tell you specifically whether that IV is significant while controlling for all the demographic variables. I think that’s what you really want to know.
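For readers who want to see the mechanics, the Sig. F Change statistic for nested (hierarchical) models can be sketched from the two models’ R-squared values. This is only an illustrative calculation; the R-squared values, sample size, and predictor counts below are invented:

```python
# Rough sketch of the F-change test for nested (hierarchical) models.
# All numbers here are made up for illustration.
from scipy import stats

def f_change_test(r2_reduced, r2_full, n, k_reduced, k_full):
    """F-test for whether the full model explains significantly more
    variance than the reduced model."""
    df1 = k_full - k_reduced              # predictors added in step 2
    df2 = n - k_full - 1                  # residual df of the full model
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, stats.f.sf(f, df1, df2)     # sf gives the upper-tail p-value

# Adding one IV that barely raises R-squared yields a large F-change p-value.
f, p = f_change_test(r2_reduced=0.20, r2_full=0.21, n=100,
                     k_reduced=3, k_full=4)
```

With numbers like these, the full model’s own overall F-test can still be significant even though the single added step contributes almost nothing, which is exactly the kind of disagreement described in the question.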

Best of luck with your analysis!

Nancy Lohalo says

Hi Jim, thank you very much for this insightful post! I have encountered a few problems with the dependent variable Y in the linear regression model. The data collected showed a decreasing trend for the past 20 years, and my hypothesis stated that X1 will have a positive impact on Y. When I ran the regression test, almost all of the independent variables had negative coefficients. How can I interpret it? Thank you!

Jim Frost says

Hi Nancy,

It’s difficult for me to say much about your specific case because there’s so little information. It sounds like your hypothesis was that X1 would have a positive coefficient but your analysis produced a negative coefficient. I’m going to assume that X1 is negative and statistically significant. If it’s negative but not significant, it’s not distinguishable from zero and you can’t assume that it has a negative value in the population. Given those assumptions about the situation, there are two general possibilities.

1) Your hypothesis was incorrect. I have no way to know about that. But, it’s something you can investigate.

2) Your hypothesis is correct, but your regression model has a problem that produces biased coefficients. This problem causes the analysis to produce a negative coefficient when it should be positive. There are a number of reasons why this can occur, including confounding variables, overfitting, data mining, and a misspecified model, among other possibilities. Be sure to go through the OLS assumptions and see if your model violates any of them. It will probably take some effort to check these potential problems.

Because you’re performing a study with regression analysis, you might consider buying my ebook about regression analysis, which provides much more information about it.

Best of luck with your study!

Karis says

Hi Jim, thank you so much for this post; it’s helped a lot! I’m learning this stuff at uni and have come across a question which has completely confused me, and I wondered if you could help? The question asks me to interpret the regression analysis result and its significance for these regression results:

R^2 = 0.74 (F = 16.82, p>0.01; t = 0.54, p<0.01).

However, the differing confidence levels have thrown me. Does the fact that the F-ratio is not within the confidence threshold mean that the regression model altogether is not statistically significant? Thank you!

Jim Frost says

Hi Karis,

So, the F-test and R-squared go together. These are measures of goodness-of-fit. I’m assuming that the F-value and its p-value are for the F-test of overall significance. That test indicates that your R-squared (0.74) is not significantly different from zero, assuming that alpha is 0.01. Your model is no better at predicting the DV than just using the mean. That’s kind of odd for a model with an R-squared as high as 0.74. There might be a very small sample size or some problem with the model. I can’t tell from these results. Read more about that in my post about the F-test of overall significance. Read my post to see how to interpret R-squared.

The t-value and its p-value are for a term in the model, such as an independent variable. That particular IV is statistically significant. This post details what that means. For this model, the overall significance and significance for a particular IV disagree. The post about the F-test of overall significance describes how this disagreement can happen.

Note that none of the statistics you provide relate to confidence, as in confidence levels or confidence intervals. However, there is a disagreement about statistical significance. Read the post about the F-test to understand that issue.

I do think it’s odd that R-squared is reasonably high but that the overall F-test is not significant. I suspect something odd is going on.
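As a rough illustration of how sample size can drive this, the overall F-statistic can be computed from R-squared alone. The predictor count (k = 3) is an assumption here, since the question doesn’t specify it:

```python
# Sketch: for a fixed R-squared, the overall F-test depends heavily on
# sample size. k = 3 predictors is an assumption for illustration.
from scipy import stats

def overall_f_pvalue(r2, n, k):
    """p-value of the F-test of overall significance with k predictors."""
    f = (r2 / k) / ((1 - r2) / (n - k - 1))
    return stats.f.sf(f, k, n - k - 1)

p_small = overall_f_pvalue(r2=0.74, n=6, k=3)    # tiny sample: not significant
p_large = overall_f_pvalue(r2=0.74, n=50, k=3)   # larger sample: significant
```

With a handful of observations, even an R-squared of 0.74 can fail the overall F-test; with a larger sample, the same R-squared is overwhelmingly significant.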

I hope this helps!

Klaudia Pająk says

The question is: when I run the regression analysis, SPSS shows the results and COEF has some value… When I describe these results on paper, should I report the coefficient value as b or β?

Thank you in advance

Jim Frost says

Hi Klaudia,

You should be able to work this out from the information provided. Coefficients are estimates of population parameters, and b is an estimate of a parameter. Therefore, b corresponds to the coefficients in this context because both are estimates. Conversely, β is a population parameter, not an estimate.

I hope this helps!

Klaudia says

Hello Jim,

I’d like to ask what does the “COEF” mean. Is it the same thing as b or β?

Klaudia 😉

Jim Frost says

Hi Klaudia,

COEF stands for coefficient. These are the values that the procedure estimates from your data. In a regression equation, these values multiply the independent variables.

Technically, β is the parameter value for the population. Your regression equation estimates these parameter values. In textbooks, these estimates are often denoted using beta-hats. That’s a β with a ^ on top. Some sources use a lower-case b to indicate that it’s an estimate. The key thing to note is that some forms (β) refer to the true population parameters while others (beta-hat and b) refer to the estimates of the parameters. The coefficients in your output are estimates of the parameters.

One caution, SPSS for some strange reason uses the term “beta” to refer to standardized coefficients!

I hope this helps!

Curt Miller says

Hi Jim,

Do we still use p-values in determining whether or not a predictor variable should remain in the model, even when we are building a model on full population data?

Thank you,

curt

Jim Frost says

Hi Curt,

When you’re working with data for an entire population, there is no need to use any p-values. P-values are an integral part of hypothesis tests that help you determine whether an apparent effect that exists in your sample also exists in the population. When you have the population data, all effects that you observe by definition do exist in the population. There’s no need to perform any hypothesis testing to confirm it because you’re looking at all the data for the population. This applies to regression analysis and other forms of hypothesis testing such as 2-sample t-tests, et al.!

Phil A. says

Hi Jim,

Quick question for a special type of regression… I have the following equation but I am not clear on the interpretation of the coefficient I obtain:

log($RealGDP) = B0 + B1(Junk-Bond Yield %) + e

My X1 data is in terms of percentage points (%) and my Y-variable (in log-scale) is in terms of dollars ($).

After I run my regression, my B1 coefficient = -0.005

As of now, I am interpreting the B1 coefficient as “A 1% increase in the Junk-Bond yield leads to a 0.5% decrease in Real GDP.” Does this sound like the correct interpretation?

My main confusion is around the “1% increase in X” …. If the junk-bond spread is currently at 5%, do I interpret “a 1% change” as the junk-bond yield moving from 5% to 6%? Or do I interpret it as a 1% change of 5% (ex: 5% to 5.05%)?
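For what it’s worth, the arithmetic behind this interpretation can be checked directly. In log(Y) = B0 + B1·X, a one-unit increase in X multiplies Y by e^B1; a minimal sketch using the estimated coefficient above:

```python
import math

b1 = -0.005  # estimated coefficient from the log-linear model above

# In log(Y) = B0 + B1*X, a one-UNIT increase in X multiplies Y by exp(B1).
multiplier = math.exp(b1)
pct_change = (multiplier - 1) * 100  # roughly -0.5, i.e., a 0.5% drop in Y
```

Note that the “one-unit” change here is one percentage point (the yield moving from 5% to 6%), not a relative 1% change of the 5% level, because the yield enters the model in percentage-point units.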

Olga Pap says

Hello Jim.

Hello All.

I have one question. Specifically, when the dependent variable (e.g., earnings) is expressed in logarithmic form (and not the independent variables) via the Mincer equation, does the interpretation of coefficients follow the rules below?

• For a one-unit increase in the independent variable “X” with coefficient b, the dependent variable “Y” is multiplied by e^b?

• And only for very small values of b (|b| < 0.1), bearing in mind that e^b ≈ 1 + b, is the percentage change in “Y” for a one-unit increase in “X” approximately (100 × b)?

Thank you in advance.

Olga Pap says

Is it possible please to answer me on the above question?

Jessica says

Thank you! It does, though, when I’m looking at a scatterplot, I’ve seen an R value. This is not to be interpreted as the same thing as the correlation coefficient, r . . . correct? Even though the R value is not R-squared, it is still not the same as r . . . right?

Jim Frost says

Hi Jessica,

Ah, yes, I jumped straight to R-squared because that is used much more frequently. R is the coefficient of multiple correlation, whereas R-squared is the coefficient of multiple determination. The use of the capital letter R for both of these statistics indicates that they are sample estimates. I’ve described R-squared, so on to R!

The calculation for R is (unsurprisingly) just taking the positive square root of R-squared. R represents the correlation between a set of variables and another variable. In the regression context, this could be the correlation between your set of independent variables and the dependent variable. The interpretation of R is not intuitive. Hence, R-squared is used more frequently.

Lower case r is the correlation between two variables and it is commonly used. R involves more than two variables.

I haven’t seen R used much at all. Perhaps it is in some specialized context. But, you probably don’t need to worry about R.

Jessica says

Hi, I know this may seem to be a very simple question, but is there a difference between R and r? Do they stand for the same thing in regression analysis?

Jim Frost says

Hi Jessica,

Yes, r and R-squared are related as they both measure the strength of relationships between variables. r is a correlation coefficient that ranges between -1 to +1. It measures the strength of the linear relationship between two continuous variables. R-squared measures the strength of the relationship between a set of independent variables and the dependent variable. It’s a percentage that ranges from 0 – 100%.

Suppose you have a pair of variables, say X and Y, and the correlation coefficient (r) is 0.7. If you perform a simple regression using these two variables, you will obtain an R-squared of 0.49 (49%). We know this because 0.7^2 = 0.49. However, unlike correlation coefficients (r), you can use R-squared when you have more than two variables.
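This relationship is easy to verify with simulated data; a minimal sketch (the data below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(size=200)  # invented data with a linear relationship

r = np.corrcoef(x, y)[0, 1]  # correlation coefficient r

# Fit the simple regression and compute R-squared from the residuals.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)
r_squared = 1 - resid.var() / y.var()

# In a one-predictor regression, r**2 equals R-squared.
```

Whatever r the simulated data produce, squaring it reproduces the regression’s R-squared exactly, just as 0.7² = 0.49 in the example.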

I write about that aspect in my post about correlation. You can also read more about R-squared.

I hope this helps!

Olga Pap says

Hi Jim. I would be very grateful if you could help me. Specifically, when the dependent variable (e.g., earnings) is expressed in logarithmic form (and not the independent variables) via the Mincer equation, does the interpretation of coefficients follow the rules below?

• For a one-unit increase in the independent variable “X” with coefficient b, the dependent variable “Y” is multiplied by e^b?

• And only for very small values of b (|b| < 0.1), bearing in mind that e^b ≈ 1 + b, is the percentage change in “Y” for a one-unit increase in “X” approximately (100 × b)?

Thank you in advance.

Tesfakiros Semere says

What a clear, simple, and easy-to-understand explanation. You saved me time from reading lots of books. It is really helpful.

Would it be possible to get them all in PDF, just to print and read when I am out of network?

THANK YOU SO MUCH Kim.

Jim Frost says

Hi Kim, thanks so much for your kind words! They made my day! While I don’t have PDFs of the blog posts, in several weeks I’ll be releasing an ebook all about regression analysis. If you like the simple and easy-to-understand approach in my blog posts, you’ll love this book. It should be out in early March 2019!

Digambar salunkhe says

Thank you so much for sharing this blog… It’s really helpful and makes it easy to understand the whole regression model.

Adu Emmanuel Ifedayo says

Thank you.

Neven says

Hi Jim! Great blog, very clear and very helpful. The best I have found in this field! Thanks.

Qmars Safikhani says

Hi Jim,

Thanks a lot for sharing your knowledge through this article. I found it very interesting, as you explained somewhat difficult concepts in an easy way. Well done!

Hans says

Hey Jim,

Great Blog! You helped us a lot preparing for our studies at university. We have a question regarding the p-value… Is there an explanation for a p-value being exactly 1.0? Does it mean that there is a 100 percent chance that the independent variable has no effect on the dependent one? Or is there anything else to consider? Thanks a lot for your help and keep that great work going!

Jim Frost says

Hi Hans, thank you so much! It’s great to hear that it’s been helpful for you all. That makes my day!

Yes, you can obtain a p-value of 1.0. To get exactly 1.0, your sample statistic would have to exactly equal the null hypothesis value. For example, suppose you perform a 1-sample t-test and your null hypothesis is that the population mean equals 10. If your sample mean is exactly 10, you obtain a p-value of exactly 1.0. In regression analysis, the null hypothesis for a coefficient is typically that it equals zero. So, if the estimated coefficient equals zero exactly, you’d again get a p-value of 1.0.
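A quick sketch of that example, using an invented sample whose mean is exactly the null value of 10:

```python
from scipy import stats

# Hypothetical sample whose mean is exactly the null value of 10.
sample = [8, 9, 10, 11, 12]

t_stat, p_value = stats.ttest_1samp(sample, popmean=10)
# t_stat is 0 and p_value is exactly 1.0, because the sample mean
# equals the null hypothesis value.
```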

The interpretation of a p-value in general is the probability of obtaining the observed sample statistic, or a more extreme one, if you assume the null hypothesis is true. The reason p = 1.0 when the sample statistic equals the null hypothesis value makes sense when you think about it with that interpretation in mind. When the sample stat equals the null value, there is a 100% probability that a sample statistic will equal the null value or be more extreme! That’s true by definition because that case covers the entire range of the sampling distribution (i.e., you’d shade the entire area beneath the sampling distribution curve).

To see these sampling distributions in action for a hypothesis test, read my post about p-values and significance levels.

Of course, the probability of obtaining a sample statistic that exactly equals your null hypothesis value is minuscule. When using statistical software in the field, if you see a p-value of 1, it’s more likely due to rounding.

Paul says

Hi. I want to find out if simple or multiple regressions can be used to explain effects (as in experimental studies)?

Thank you.

Jim Frost says

Hi Paul,

You bet they can! The coefficients describe the effects and the p-values determine whether the effects are statistically significant.

Rashan says

This is very helpful. Thank you

Surya says

Thanks Jim

Surya says

Hi Jim, I have just subscribed to your posts after reading the wonderful post on residual plots.

Could you please let me know how we interpret the SE of coefficients and the t-statistic as well? Or do you already have an article on them? Please reply. Thanks!

Jim Frost says

Hi Surya,

Thanks so much! I’m glad that post was helpful!

The standard error (SE) of the coefficient measures the precision of the coefficient estimate. Smaller values represent more precise estimates. Standard errors are the standard deviations of sampling distributions. If you were to perform your study many times, drawing samples of the same size and fitting the same model, you’d obtain a distribution of coefficient estimates. That’s the sampling distribution of a coefficient estimate. The standard error of a coefficient is the standard deviation of that sampling distribution. The SE is used to create confidence intervals for the coefficient estimate, which I find more intuitive to interpret.

The t-statistic in the context of regression analysis is the test statistic that the analysis uses to calculate the p-value. I wrote a post about how it works in the context of t-tests. It’s fairly similar for coefficient estimates. Read that post, but replace the sample mean with the coefficient estimate, and you’ll get a good idea: How t-tests work.
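As a sketch of the mechanics: the t-statistic is simply the coefficient divided by its standard error, and the p-value comes from the t-distribution with the residual degrees of freedom. The numbers below are hypothetical:

```python
from scipy import stats

# Hypothetical regression output: a coefficient and its standard error.
coef = 1.52
se = 0.61
n = 30   # observations
k = 2    # predictors in the model

t = coef / se                   # t-statistic for H0: coefficient = 0
df = n - k - 1                  # residual degrees of freedom
p = 2 * stats.t.sf(abs(t), df)  # two-tailed p-value
```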

I hope that helps!

[email protected] says

Been reading your posts all night (morning now)… I can’t stop because it’s like a light bulb keeps going off. Been studying this stuff for weeks; now I finally get it thanks to your posts. Thank you 🙂

-Extremely tired data science grad student.

Jim Frost says

Hi, I’m sorry my posts caused you to lose some sleep last night, but I love your analogy about light bulbs going off! I’m really happy to hear that they were helpful. That really makes my day! Best of luck with your studies!

Tracey says

Hi Jim. Thank you so much for this as it helped clear up some things in my mind as I prepare a research paper.

Jim Frost says

Hi Tracey, you’re very welcome. I am happy to hear that it was helpful!

Qiumei Jing says

Thank you for your explanation, Jim. That’s really great!

When I’m doing multiple linear regression, I have a question. The linear regression has three independent variables (A, B, C) and one dependent variable (D). I got a significant p-value in the ANOVA table, but in the Coefficients table, the constant p-value is 0.237, which is not significant, and one predictor (variable A) has a p-value of 0.211, while the other two predictors are highly significant (P = .000). In that case, how can I interpret the results? The hypotheses for the two significant predictors (variables B and C) are “there is a relationship between B and D” and “there is a relationship between C and D.” In this case, can I say the two hypotheses were supported? And how can I interpret the one (A) with the insignificant p-value in the Coefficients table? Thank you in advance!

Jim Frost says

Hi Qiumei,

It’s generally not worthwhile interpreting the constant, so I’d skip that. To learn why, click the link for interpreting the constant in this post.

Here’s how you can interpret the significant predictors.

The sample provides sufficient evidence to conclude that changes in both independent variables B and C are correlated with changes in the dependent variable D. Statistical significance indicates that the correlation does not equal zero. In other words, you can reject the null hypothesis that the coefficients equal zero.

For the insignificant variable, the sample provides insufficient evidence to conclude that there is a relationship between that variable and the dependent variable. In other words, you fail to reject the null hypothesis that its coefficient equals zero.

For more elaboration, reread this post where I talk about this in depth.

Appadu says

Dear Jim

Thank you for your explanations of how to interpret regression coefficients for linear relationships and p-values. It is very clear; I appreciate the time you took to put this together.

I have one question. I was looking at an example of estimated standardized OLS beta coefficient data. The results show R-squared (%) as 26.2 and an F-value of 18.14. Please advise how to interpret these two figures. Thank you!

Jim Frost says

Hi Appadu,

When you standardize the continuous independent variables in your model, the output produces standardized coefficients. Standardization is when you take the original data for each variable, subtract the variable’s mean from each observation and divide by the variable’s standard deviation. The main reason I’m aware of for performing this standardization is to reduce the multicollinearity caused by including polynomials and interaction terms in your model. I write about that in my post about multicollinearity.

In terms of interpreting the standardized coefficient: it represents the mean change in the dependent variable given a one standard deviation change in the independent variable. Another reason statisticians use it is as a possible measure for identifying which variable is the most important.
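A minimal sketch of that interpretation with simulated data, standardizing only the predictor (all variable names and values here are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=50, scale=12, size=300)    # made-up raw predictor
y = 3.0 * x + rng.normal(scale=10, size=300)  # made-up response

# Standardize the predictor: subtract its mean, divide by its std dev.
x_std = (x - x.mean()) / x.std()

slope_raw = np.polyfit(x, y, 1)[0]      # change in y per unit of x
slope_std = np.polyfit(x_std, y, 1)[0]  # change in y per std dev of x

# The standardized slope equals the raw slope times the std dev of x.
```

In other words, the standardized coefficient rescales the raw coefficient from “per unit of X” to “per standard deviation of X,” which is what makes coefficients of differently scaled predictors roughly comparable.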

As for interpreting R-squared and the F-test of overall significance, those don’t change from the usual interpretations. Click on the links to read my blog post about interpreting each statistic.

I hope this helps!

Hrishikesh Geed says

Thanks for the explanation, Jim!

I have one doubt, how do you calculate the p-value corresponding to each coefficient?

How do you decide the standard deviation and the sample mean for calculating the z-value for each coefficient?

Thanks

Hrishi

eric says

Thank you very much for the explanation Jim!

If the p-value is under the significance level, this would indicate that there is enough evidence to reject the null hypothesis. The null hypothesis here being that there is no correlation between the 2 variables (in a simple linear regression).

Here is my first question: how do we decide how to set the significance level? Is it purely arbitrary?

My second question is: since the correlation coefficient varies between -1 and 1, it is tempting to conclude that there is a significant correlation (positive or negative) between 2 variables if the correlation coefficient is close to -1 or 1, and that there is no correlation when it is close to 0. However, I think this assumption is false but can’t get the intuition to understand why.

Could you help me about those questions?

Many thanks for your time and your attention

Best regards

Eric

Hanan Shteingart says

The following claim is not true if the features are correlated, which is known as multicollinearity: “The sign of a regression coefficient tells you whether there is a positive or negative correlation between each independent variable and the dependent variable.” In fact, a feature could have a positive correlation with the target yet a negative coefficient, and vice versa.

Jim Frost says

Hi Hanan,

You raise a good point. The interpretation that I present, including the portion that you quote, is accurate when your model doesn’t contain a severe problem. However, if your model does contain a severe problem, it can produce unreliable results, which includes the possibility that the coefficients don’t accurately describe the relationship between the independent variables and the dependent variable. The problem isn’t with how to interpret coefficients, but rather with a condition in the model that causes it to produce coefficients that you can’t trust.

As you point out, multicollinearity can produce unreliable, erratic coefficients. In some cases, the sign of the coefficient can even be incorrect. However, the sign switch doesn’t necessarily have to happen when your model has multicollinearity. I write more about multicollinearity, including switched signs, in this post: Multicollinearity in Regression Analysis: Problems, Detection, and Solutions.
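The point is easy to demonstrate with simulated data; a minimal sketch in which x2 correlates positively with y yet receives a negative coefficient once its collinear partner is in the model:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)            # nearly collinear with x1
y = x1 - 0.5 * x2 + 0.1 * rng.normal(size=n)  # true coefficient of x2 is -0.5

# The marginal correlation between x2 and y is strongly positive...
r_x2_y = np.corrcoef(x2, y)[0, 1]

# ...yet x2's coefficient is negative once x1 is in the model.
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
coef_x2 = coefs[2]
```

The marginal correlation reflects x2 acting as a stand-in for x1, while the regression coefficient isolates x2’s effect holding x1 fixed, which is why the two can disagree in sign.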

By the way, there are a number of other potential problems that can cause your model to produce results that you can’t trust. Multicollinearity is just scratching the surface. These problems include an incorrectly specified model, overfitting the model, heteroscedasticity, and data mining, among others. I spend quite a bit of time talking about these problems, how they can invalidate your results, and what you can do to address them.

I hope this helps!

MN says

Thank you very much for the wonderful elaboration. Amazing!!

Jim Frost says

You’re very welcome, MN! I’m glad it’s helpful!

Rajasekar says

I am currently working on a multiple regression model where I have 4 X variables, and none of them are statistically significant. I know that when this happens I fail to reject the null hypothesis, but I’d like to know what might be wrong. Do I need to add more X variables? Also, R Square = 0.109842937

Adjusted R Square = 0.034084889

Ayush says

This is really one of the best websites I have come across for DATA SCIENCE… Great effort put up by Sir Jim…

Jim Frost says

Thank you, Ayush!

Rali says

Hi Mr. Jim

Thanks for the helpful blog

all the best

Jim Frost says

Hi Rali, you’re very welcome! I’m glad it was helpful!

ADIL HUSSAIN RESHI says

Really fabulous… it cleared all my doubts about p-values.

Jim Frost says

Hi Adil, Thanks! I’m so glad to hear that it was helpful!

Javed Iqbal says

Thanks, Jim, for the nice explanation. This regression seems to violate one of the model assumptions, namely homoscedasticity. A log transformation should work here.

Jim Frost says

Hi Javed, thanks for your comment. The residuals for this model are homoscedastic, or very close to it. Their variances are fairly equal across the entire range. The variance might appear to be lower at the very low end of the range, but there are also fewer observations in that region, which can make the dispersion appear smaller. At any rate, it is close enough. To see how a true case of heteroscedasticity appears, along with multiple methods for correcting it, read my post about heteroscedasticity. By the way, I explain in that post why I always recommend trying other methods of addressing this problem before using a transformation.

Toby says

Great blog with detailed explanation! It helped clear my doubts about p-values.

Thank you Jim! and Happy new year! 😀

Jim Frost says

Thank you, Toby! And, I’m very happy you found the blog to be helpful! Happy new year to you too!!