The difference between linear and nonlinear regression models isn’t as straightforward as it sounds. You’d think that linear equations produce straight lines and nonlinear equations model curvature. Unfortunately, that’s not correct. Both types of models can fit curves to your data—so that’s not the defining characteristic. In this post, I’ll teach you how to identify linear and nonlinear regression models.
The difference between nonlinear and linear is the “non.” OK, that sounds like a joke, but, honestly, that’s the easiest way to understand the difference. First, I’ll define what linear regression is, and then everything else must be nonlinear regression. I’ll include examples of both linear and nonlinear regression models.
Linear Regression Equations
A linear regression model follows a very particular form. In statistics, a regression model is linear when all terms in the model are one of the following:
- The constant
- A parameter multiplied by an independent variable (IV)
Then, you build the equation by only adding the terms together. These rules limit the form to just one type:
Dependent variable = constant + parameter * IV + … + parameter * IV
Statisticians say that this type of regression equation is linear in the parameters. However, it is still possible to model curvature with this type of model. While the function must be linear in the parameters, you can raise an independent variable to an exponent to fit a curve. For example, if you square an independent variable, the model can follow a U-shaped curve.
While the independent variable is squared, the model is still linear in the parameters. Linear models can also contain log terms and inverse terms to follow different kinds of curves and yet continue to be linear in the parameters.
The regression example below models the relationship between body mass index (BMI) and body fat percent. In a different blog post, I use this model to show how to make predictions with regression analysis. It is a linear model that uses a quadratic (squared) term to model the curved relationship.
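To make the "linear in the parameters" point concrete, here is a minimal sketch (with simulated, made-up numbers, not the BMI data from the example above) of fitting a model with a squared term by ordinary least squares:

```python
import numpy as np

# Simulated U-shaped data (illustrative values, not the BMI data
# from the example above).
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 2.0 + 0.5 * x + 1.5 * x**2 + rng.normal(0, 0.5, x.size)

# y = b0 + b1*x + b2*x^2 is linear in b0, b1, b2, so ordinary
# least squares applies: build the design matrix and solve.
X = np.column_stack([np.ones_like(x), x, x**2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)  # estimates of b0, b1, b2
```

Even though the fitted line is curved, the squared term is just another column in the design matrix, which is why this remains a linear model.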
Related post: Linear Regression
Nonlinear Regression Equations
I showed how linear regression models have one basic configuration. Now, we’ll focus on the “non” in nonlinear! If a regression equation doesn’t follow the rules for a linear model, then it must be a nonlinear model. It’s that simple! A nonlinear model is literally not linear.
The added flexibility opens the door to a huge number of possible forms. Consequently, nonlinear regression can fit an enormous variety of curves. However, because there are so many candidates, you may need to conduct some research to determine which functional form provides the best fit for your data.
Below, I present a handful of examples that illustrate the diversity of nonlinear regression models. Keep in mind that each function can fit a variety of shapes, and there are many nonlinear functions. Also, notice that nonlinear regression equations are not composed of only addition and multiplication! In the table, thetas are the parameters, and Xs are the independent variables.
The nonlinear regression example below models the relationship between density and electron mobility.
The equation for the nonlinear regression analysis is too long for the fitted line plot:
Electron Mobility = (1288.14 + 1491.08 * ln(Density) + 583.238 * ln(Density)^2 + 75.4167 * ln(Density)^3) / (1 + 0.966295 * ln(Density) + 0.397973 * ln(Density)^2 + 0.0497273 * ln(Density)^3)
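Models like this require an iterative fitting routine rather than a closed-form least squares solution. As a hedged sketch (using simulated data and a simple made-up power model, not the electron-mobility equation above), SciPy's `curve_fit` can handle a parameter that sits in an exponent:

```python
import numpy as np
from scipy.optimize import curve_fit

# Simulated data from y = theta1 * x**theta2 (made-up power model,
# not the electron-mobility equation above).
rng = np.random.default_rng(1)
x = np.linspace(0.5, 5.0, 60)
y = 2.0 * x**1.7 + rng.normal(0, 0.2, x.size)

def power_model(x, theta1, theta2):
    # theta2 sits in the exponent, so the model is not linear in the
    # parameters; it requires an iterative nonlinear fit.
    return theta1 * x**theta2

params, cov = curve_fit(power_model, x, y, p0=[1.0, 1.0])
print(params)  # estimates of theta1, theta2
```

Note that nonlinear fitting needs starting values (`p0`) and can converge to a local minimum, which is another practical difference from linear least squares.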
It’s important to note that R-squared is invalid for nonlinear models and statistical software can’t calculate p-values for the terms.
The defining characteristic for both types of models is the functional form. If you remember the form that defines a linear model, it’s easy to see that anything else must be nonlinear. Now that you understand the differences between the two types of regression models, learn more about fitting curves and choosing between them in the following blog posts!
- How to Choose Between Linear and Nonlinear Regression
- Curve Fitting using Linear and Nonlinear Regression
If you’re learning regression, check out my Regression Tutorial!
Note: I wrote a different version of this post that appeared elsewhere. I’ve completely rewritten and updated it for my blog site.
Wink says
Txs!
Wink Saville says
I didn’t understand what was meant by “linear in the parameters” so I prompted GPT-4 with:
‘I was reading an article on linear vs non-linear regression and at one point it says “While the function must be linear in the parameters, you can raise an independent variable by an exponent to fit a curve.”. What does “linear in the parameters” mean?’
How would you grade its response, any corrections? https://chat.openai.com/share/63461a95-7753-4a31-a791-cd8ec04a9acb
Jim Frost says
Hi Wink,
It gets it mostly correct. It basically says the same things I say in this blog post. Perhaps it scraped my blog (which was published well before GPT existed)?
But I have one slight nitpick with this portion:
“In summary, “linear in the parameters” means that the regression model’s parameters are linear even if the relationship between the independent and dependent variables is nonlinear.”
I have an issue with the “nonlinear” right at the end of the sentence I quoted. In regression analysis, nonlinear refers to the form of the model. Yet, in this sentence, it refers to nonlinear in a curved sense. A curved polynomial relationship is a linear relationship (even though the line is curved) because the form of the model is linear.
This sentence suggests that it would be considered nonlinear, which is inaccurate. If the relationship between the IV and DV is truly nonlinear, a polynomial linear relationship would not be able to fit the data adequately. To be more accurate, replace that “nonlinear” just before the period with “curvilinear” or “curved.”
I have checked other statistical content by GPT, and its infamous tendency to hallucinate is frequently present. Sometimes it’s subtle, like it is here. Sometimes it’s more blatant!
Vivian says
This is helpful, especially the nonlinear equations. Thanks
Nicola says
Brilliant!
thank you. I was about to ask the same question as Eunice. One of my PhD examiners wants me to present regression analysis results for some data (small data sets), but the data do not meet the assumptions, even when transformed. Thank you 🙂 Nicola Fraser
Meredith says
Hi Jim, I have no idea if you’re still responding to comments on this very helpful post, but I hope you are. I’m running logistic, exponential, polynomial, and sinusoidal regressions against a data set for a school paper. I know I can use R^2 on the polynomial regressions, but what about the others? Do you have any idea what statistical metric I can use to try to evaluate the non linear regressions? Thank you!
Jim Frost says
Hi Meredith,
Yes, you bet I’m still answering comments here! 🙂
I’ve written a post that will answer your question: Curve Fitting with Regression.
In that post, I fit a curve using various methods. Then I show you how to evaluate which fit is best. One of the methods I use is nonlinear regression. I show you how to compare the fits between linear and nonlinear models.
And you’re quite right to question using R-squared with nonlinear models–because you shouldn’t. Polynomial models are linear models (as you know from this post), so that’s why R-squared is valid for them. However, R-squared is not valid for nonlinear models. Click that link to learn why! (But the first link really shows you how to assess and compare different types of fits.)
Mayank Mishra says
Thanks Jim. got it.
Jim Frost says
You bet! 🙂
Mayank Mishra says
Thanks Jim, you opened my eyes.
I found following 2 equations listed under non-linear regression in a book. I believe they are linear. Requesting your expert inputs:
1) Y = b0 + e^b1 X + e. Where b0 and b1 are model parameters and e is error term. I believe it to be linear because I can substitute b2 = e^b1 and “make it appear as linear”.
2) Y = b0 + X / (1 + ln(b1)) + e. Where b0 and b1 are model parameters and e is error term. I believe it also to be linear because I can substitute b2 = 1 / (1 + ln(b1)) and “make it appear as linear”.
In both cases once I find b2 using linear regression I can easily solve for b1. My understanding is that if I can not make a regression equation “appear linear” by any substitution then I should consider it to be non-linear.
Awaiting your inputs. Thanks.
Jim Frost says
Hi Mayank,
Those two equations are in fact nonlinear. You can tell because #1 uses the parameter as an exponent and #2 takes the log of the parameter.
It’s not just “making it appear as linear.” The model truly has to be linear in the parameters, and those two equations are not. You can raise the variables (the Xs) to an exponent or take their logs, but you can’t do that with the parameters (the βs). Your two examples do exactly that with the parameters, which makes them nonlinear.
Keep in mind the difference between parameters and variables in the equation.
Eunice AD says
Hi Jim.
Thank you for your useful guidance.
I would like to know whether the assumptions of the linear model must be met in estimating a non-linear model. I have a set of panel data that has unit root. Assuming the theoretical basis for my work requires a non-linear least square model, can I proceed with the dataset and estimate my model?
Thanks in advance.
Eunice
Jim Frost says
Hi Eunice,
Yes, the residual assumptions for OLS also apply to nonlinear regression.
Partha Shankar Nayak says
Nice clarification. Thank you dear.
AP says
Oh – I see it now. Thanks for the clarification Jim. Since parameters for a given model are constant values (or so I think), say if θ2 = 2 then isn’t the equation really same as the linear one?
Can “linear in parameters” include second or third order parameters? like: θ1 + (θ2^2) * X ? Is this linear in parameters?
Really good discussions and looking forward to many more!
Jim Frost says
Hi, no, the parameters can’t be in the exponents at all because then it wouldn’t be linear in the parameters.
AP says
Jim – For the first example of Non-Linear equation in your post: y = b.X^2, where I am using “b” as parameter instead of theta, why cannot this be consider this as Linear of the form: y = C + aX + bX^2 where a and C = 0?
Jim Frost says
Hi AP,
The equation as you have it is linear. However, that’s not the equation I have in my first example.
The equation shown is actually θ1 * X^θ2. It’s not squaring the X value as in your equation. Instead, it raises X to the power θ2 and then multiplies that by θ1. Including a parameter in the exponent makes this an example of a nonlinear equation.
Jędrzej Wałęga says
This post was incredibly helpful. Many tutorials omit the difference between the two regressions, so this topic was quite muddy to me for a long time. Thank you for your time, you really cleared up a lot of my doubts here!
Jim Frost says
You’re very welcome!
Mike says
Thank you. Would it be possible to provide a citation for this definition? I believe you but I will probably need an official definition to justify using r squared for a model.
y = a*log(x)+b
y is dependent variable, x is independent variable
This is linear, right?
Joe Kelleher says
The Fourier example could be achieved with an equivalent linear model by using sin and cos terms instead of the addition within the cos, yes? Is the R2 measure still invalid for a nonlinear model that is equivalent to a linear model?
Camille Williams says
Hello,
Thank you for your articles. They are very helpful.
In this tutorial they look at the R2 of spline regressions.
http://www.sthda.com/english/articles/40-regression-analysis/162-nonlinear-regression-essentials-in-r-polynomial-and-spline-regression-models/
Is it correct to judge the R2 of cubic splines, b-splines, natural splines, and smooth splines?
Thank you!
VIVEK KUMAR YADAV YADAV says
Really it was funny …. difference between linear regression and non linear is only non 😂😂😂😜
Jim Frost says
Hi Vivek, I’m glad you appreciated the humor! But, yes, it’s true. The best way to think about it is that nonlinear is literally every type of model that doesn’t fit the definition for a linear model!
Noor Yaseen says
Hi Jim! Your blogs are outstanding, and I have read almost all of them; they proved very helpful to me, and I therefore recommend them to most of my friends working on their MSc projects. As for multiple nonlinear regression, I have a question about whether the following equation is correct to use as a multiple nonlinear regression model: T = aX^m + b*((Y+Z) / X)^n, where a, m, b, and n are the regression parameters, X, Y, and Z are the independent variables, and T is the response variable. (Please note that all these variables have the same units of m^3/sec.)
My second question is regarding the outcomes of the MINITAB software, after running multiple nonlinear regression on the above model, in which case I came up with a missing lower CI of the first parameter ‘a’ i.e., (*, 1.6323) and also the following prompt warning:
“* WARNING * Some parameter estimates are highly correlated. Consider simplifying the
expectation function or transforming predictors or parameters to reduce collinearities.”
However, this model has drastically reduced the S value from 315 m^3/sec (in linear regression) to 300 m^3/sec (in the above nonlinear model), because of which I am stubborn to use this equation. Now I modestly request you to resolve the above questions one by one. Thanks in advance!
Angie says
Hello,
many thanks for your very interesting articles. However, I still have a doubt about this topic. If I have a model
y=a+bc+x1x2c-x1x3c (where a,b and c are independent parameters, and x1,x2 and x3 are variables) and I want to plot the results of the model in the plane (y-x2),can I still use the R2? is still the model a linear regression model?
Many thanks in advance,
Best regards
Angie
Jim Frost says
Hi Angie,
Yes, that is a linear model. However, by using C for both the x1x2 and x1x3 terms, you’re forcing those terms to have the same coefficient. Maybe you meant to have x1x3d?
Mai Sáu says
Hi Jim
I understand that the following model is non-linear regression model. Am I right or wrong?
Yit = β0 + β1i*(X1it*X2it) + β2i*X2it + β3i*((X2it)^2) + β4i*X4it + Ɛit
Thank you so much.
Swagat Mishra says
Hi Jim.. Thank you for the wonderful explanation. I just had one question. As you explained that Linear always has the form ‘Dependent variable = constant + parameter * IV + … + parameter * IV’.
For example, I have an equation of the form: Y = b1*cos(X1) + b2*(X2)^2 + b3*log(sqrt(X3)) + C,
where X1, X2, X3 are the independent variables, b1, b2, b3 are parameters, and C is a constant.
We can still consider this linear, right, because it preserves the basic form of the linear equation?
Thank you.
Richard says
Hi Jim
I am doing an online course that looks at fitting linear vs. nonlinear regression models. The definition is that a model is linear if it is linear in the parameters, and it fits the general example you have shown. I understand that it can be linear in the parameters but not in the independent variables, which is fine for solving the equation of fit.
However, it also states that to be “linear” the parameters cannot be multiplied, divided, raised to powers, etc. My question is: if the model is squared in a parameter, say y = a^2X or y = abX, then the a^2 or the ab are still constants and would surely still fit a straight line (albeit you would not know it was a power or a product of two parameters). Does that just mean you would not derive the true value of each parameter exactly? But then how would you know? Often such parameters are a composite of various properties.
Many thanks
Richard
Sotiris says
Hello there, really nice work! My question is related to nonlinear regression. I am trying to create a predictive model using nls (in R). My formula goes like this: Y ~ a*X*exp(b/Z), where “Y” is my dependent variable, “X” and “Z” are my independent variables, and “a”, “b” are my coefficients. My question is what happens when the p-value of “b” is more than 0.05, meaning it is not statistically significant. Does this mean that I should exclude b from the equation like we normally do in linear regression? Thank you in advance for your time!
Jim Frost says
Hi Sotiris,
Typically, in nonlinear regression, you don’t see p-values for predictors like you do in linear regression. Linear regression can use a consistent test for each term/parameter estimate in the model because there is only a single general form of a linear model (as I show in this post). In that form, zero for a term always indicates no effect. Consequently, the test for each model term tests whether the difference between the coefficient and zero is statistically significant. All of these tests use the same null and alternative hypotheses, as shown below:
H0: bi equals 0
HA: bi does not equal 0
Nonlinear regression, by contrast, is not restricted to a single form, which gives it great flexibility for fitting many types of curves. However, because of the many different forms, you can’t assume that zero is the correct null hypothesis value for all parameter estimates. That depends on the function, the parameter’s location in it, and the study area. Consequently, statistical software does not show p-values for parameter estimates in nonlinear regression.
Instead, have your software calculate confidence intervals, and use your subject-area expertise to identify meaningful values and determine whether the CIs include or exclude them. As for whether to include or exclude a term, that should never be determined solely based on statistical significance. You can use the general process I describe in my post about model specification for guidance.
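As a rough sketch of that advice (with an illustrative exponential-decay model and simulated data, not the commenter's Y ~ a*X*exp(b/Z)), the covariance matrix from a nonlinear fit yields approximate confidence intervals for the parameters:

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

# Illustrative exponential-decay model with simulated data, not the
# commenter's Y ~ a*X*exp(b/Z).
def model(x, a, b):
    return a * np.exp(-b * x)

rng = np.random.default_rng(2)
x = np.linspace(0, 4, 40)
y = 3.0 * np.exp(-0.8 * x) + rng.normal(0, 0.05, x.size)

params, cov = curve_fit(model, x, y, p0=[1.0, 0.5])
se = np.sqrt(np.diag(cov))               # approximate standard errors
t = stats.t.ppf(0.975, df=x.size - len(params))
lower, upper = params - t * se, params + t * se

# Compare the intervals to subject-area values of interest, not just
# zero, since zero need not mean "no effect" in a nonlinear form.
for name, lo, hi in zip(["a", "b"], lower, upper):
    print(f"{name}: [{lo:.3f}, {hi:.3f}]")
```

These intervals rely on a linear approximation around the fitted parameters, so treat them as a guide rather than an exact inference.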
I hope this helps!
Steph says
Hi,
I’m considering using non-linear regression for a thesis research project as my model will not be linear.
How do you calculate the number of participants needed? I use Gpower for linear regressions, what would be the process for non-linear?
Thanks
S
Leonel Nava says
Folks, you can convert a nonlinear equation (model) into a linear one. Piece of cake; that is studied in any book of advanced calculus.
Jim Frost says
Hi Leonel,
Yes, this is a known process. I talk about the ability to include nonlinear relationships in linear regression in my post about fitting curvature. However, it’s not possible to transform all possible nonlinear relationships into a linear form. And, indeed there are benefits to fitting the raw untransformed data using nonlinear regression.
statscurious says
If I’m reading correctly, I can turn any nonlinear model into a linear one. Say my independent variable is X. I do nonlinear regression which gets me some function G(X). Now if I make G(X) a variable in a linear regression, my model is technically a linear model, yes? At least this doesn’t seem any different in principle from raising X to a power or transforming it with any other function which seems to be OK. Assuming this is OK, after I do my linear regression on G(X) is it ok for me to compare the R2 of that “linear” regression against linear regression of just X? If that’s ok I’m not sure I understand why we can’t compare R2 of linear and nonlinear models.
Jim Frost says
Hi, no, you can’t turn all nonlinear models into linear models. Yes, you can use transformations to include some nonlinear functions into a linear model. But, you have to be able to express those functions in a linear form. See the example of using log functions in my post about modelling curvature. The log functions fit the linear model specification. I also show this in my discussion about log-log plots.
As for the R-squared, if you can use a transformation in a linear model to fit an underlying nonlinear function, your software will give you an R-squared value. Be aware that the R-squared applies to the transformed data rather than the original data. The variance structure of the transformed data is completely different than the raw data. In this case, R-squared describes something fundamentally different. For transformations, use R-squared to understand how well the model fits the transformed data but do not think that it describes how well the model fits the original data.
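A small numeric illustration of that caveat (simulated data and an assumed log-log model, not from any post): R-squared computed on the transformed responses and R-squared computed on the back-transformed scale answer different questions and generally disagree:

```python
import numpy as np

# Simulated data following y = a * x^b with multiplicative noise, so a
# log-log transform makes the model linear: log y = log a + b * log x.
rng = np.random.default_rng(3)
x = np.linspace(1, 10, 80)
y = 2.0 * x**1.5 * np.exp(rng.normal(0, 0.2, x.size))

X = np.column_stack([np.ones_like(x), np.log(x)])
coefs, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

def r_squared(actual, fitted):
    ss_res = np.sum((actual - fitted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1 - ss_res / ss_tot

r2_log = r_squared(np.log(y), X @ coefs)      # fit quality in log space
r2_raw = r_squared(y, np.exp(X @ coefs))      # back-transformed scale
print(r2_log, r2_raw)  # two answers to two different questions
```

Only the first value is the R-squared your software reports for the transformed model; the second is shown purely to make the distinction visible.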
aron bereket says
thank you for your effort .but i could not differentiate the real exact difference between linear and non linear .could you please assist /
Jim Frost says
Hi Aron, the key point to remember is that linear models follow the one form that I show in this post. If a model doesn’t follow that form, it is nonlinear. So, make sure you understand that form!
Priscilla Branch says
Also…could you sort the posts in descending order?
Priscilla Branch says
Thank you SO much for this refresher on statistics. I haven’t touched this stuff in over 20 years, but your contributions are helping me dust off the cobwebs—AND have a better appreciation for the importance of outcomes that are studied and analyzed thoroughly. You present this material simply and elegantly. I pray that my analyses are also thoughtful, simple–and make sense!!
Jim Frost says
You’re very welcome! And, thanks so much for the kind comments. They really made my day!
I’m not sure what you mean by sorting the posts in descending order–most recent first to older? That should be how they’re sorted currently.
I’m also working on a book about regression analysis that should be out in the first quarter of 2019. All of the content will be in a logical order in that book!
Joy says
Thank you so much for the prompt reply. Just to clear up another small confusion: I read somewhere that linear regressions are linear in the parameters, so Y ~ $\beta_2^{2} x_2 …$ won’t be linear because we have a quadratic on $\beta_2$. Wouldn’t the same logic apply to the above equation?
Thanks for the link, going over it now
Jim Frost says
Hi Joy, quadratics are commonly used to model curvature in linear models. They’re ok! In fact, I use a quadratic in the BMI example in this very post. Also in this post, I talk about how linear models are linear in the parameters. I’d reread that section. Additionally, read the post that I mention in my previous comment. I think that’ll help!
Joy says
Hi Jim,
I have a confusion about the core definition of non-linearity. I absolutely love your definition about writing the model as IV * parameter. I am wondering what happens if someone takes a non linear function on the parameters. That is Y ~ exp(\beta_1) * x_1 …. Is it still a linear model. In my definition yes, but it would be great if you shed some light.
Jim Frost says
Yes it is! And, to see a natural log transformation, you can read my post about modeling curvature, which is related way to use a linear model to fit an otherwise nonlinear function.
Akis says
Thank you for the answer. I really appreciate it. I hope you keep spreading comprehensive knowledge.
Akis says
So, this is the first time reading your articles and I find them really interesting and comprehensive, good job with this. I have a quick question though. As you mentioned:
“Linear models can also contain log terms and inverse terms to follow different kinds of curves and yet continue to be linear in the parameters.”
Then, what helps us understand which model is linear or not? I mean, if linear models can have inverse terms too, then why is the model above (density and electron mobility) a nonlinear one?
I am also trying to find some other sources that point out this difference and it would be greatly appreciated if u could link some references.
Thanks in advance.
Jim Frost says
Hi Akis,
As I describe in this post, to be considered a linear model, the form of the model must fit a very specific format. However, you can transform the variables that fit within this format. If it doesn’t fit this very specific format, it’s a nonlinear model. Again, it is based on the form of the model that I describe in this post.
The density and electron mobility is nonlinear because it doesn’t fit that specific linear form that I describe.
Keep in mind that the difference between linear and nonlinear is the form and not whether the data have curvature. Nonlinear regression is more flexible in the types of curvature it can fit because its form is not so restricted. In fact, both types of model can sometimes fit the same type of curvature. To determine which type of model you have, assess its form.
I don’t have a reference handy for you. However, these are the basic properties of these types of models and any textbook about linear models and nonlinear models will talk about these forms.
shashi says
How you decide to use linear regression or non linear regression ?
Jim Frost says
Hi Shashi,
I wrote a blog post about this topic specifically! How to Choose Between Linear and Nonlinear Regression.
I also talk about it in a post about curve fitting.
Read those two posts and you’ll have your answer!
Ashutosh Kumar says
Dear Jim,
I have read a couple of your articles now and sharing it around as well and they are really very helpful and easy to interpret. Thank you for this !!
With regards to this post, my question is: in cases when we use a higher-order polynomial term in the linear regression model to model the curvature, does this not fail the multicollinearity assumption, because now we cannot change the value of x while keeping x^2 constant?
Jim Frost says
Hi Ashutosh,
Thank you so much for you kind words! And, thanks for sharing my articles. I really appreciate that!
Yes! You’re pretty much correct with that. However, the assumption only excludes perfect correlation. Some degree of correlation is OK but if it increases too much it becomes problematic. For information about the assumptions (including this one), read my post about the classical OLS assumptions.
In terms of polynomials specifically, yes, these terms often increase multicollinearity to problematic levels. To correct for this type of multicollinearity, you can center the continuous variables. Read about this approach in my post about multicollinearity. This method typically reduces multicollinearity caused by polynomials to acceptable levels.
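A quick numeric sketch of that centering trick (made-up values, not a real dataset): the correlation between a predictor and its square drops sharply once the predictor is centered.

```python
import numpy as np

# Centering a predictor before squaring it (made-up values, not a
# real dataset) to reduce polynomial multicollinearity.
rng = np.random.default_rng(4)
x = rng.uniform(10, 30, 200)           # a strictly positive predictor

corr_raw = np.corrcoef(x, x**2)[0, 1]  # x and x^2 nearly collinear
xc = x - x.mean()                      # center first, then square
corr_centered = np.corrcoef(xc, xc**2)[0, 1]

print(corr_raw, corr_centered)  # centering shrinks the correlation
```

Centering only shifts the predictor, so the model fits the same curve; it just makes the linear and squared columns far less correlated.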
I hope this helps!
Natalie says
You have helped me understand statistics!!!! Thank you
Jim Frost says
You’re very welcome! I’m happy to hear that my website has helped you with statistics!
jw says
Hey Jim,
I was just researching this topic and found a similar article elsewhere.
The time stamp on your article is older so I hope that they were copying from you and not the other way around. I just wanted to let you know about this, maybe you already did 🙂
Cheers
Jim Frost says
Hi JW,
Thanks so much for pointing this out to me. Everything is OK because I am the author of both. I actually completely rewrote these articles so the text is different. They’re basically different articles on the same topic. In fact, if you use the Internet Archive Wayback Machine and look at older versions, you’ll see that I am listed as the author. For unknown reasons, the organization removed most authors’ names from their blog posts.
Thanks again!
Alisha Bansal says
Hi Jim, found this article very helpful. Many many thanks for posting this!
Alee says
Hello Sir!
if the property values is dependent variable and (house characteristics including, no of beds rooms, larger area, green spaces near to home and other amenities) are independent variables, how can we apply liner and non linear regression on them?
Jim Frost says
Hi Alee!
Long, long ago I had a professor who published a study about this exact subject! By the way, he found that the single best thing a homeowner could do to increase the value of their home was to add a bathroom! He used linear regression for his analysis.
As for how to perform this analysis, you simply use the sale prices as the dependent variable. And, you include all of the house characteristics as independent variables. In this manner, you can see how changes in the independent variables relate to changes in the average sale price of a home. For example, you can see how the average sales price changes when you add a square foot or add a bathroom!
I hope this helps!
Amir says
Dear Jim,
The way you teach statistics is exceptional! It comes from deep experience, and this helps people bypass all the fear and fuss around mathematical expressions, which were developed to explain the world around us. As Einstein said, “If you cannot explain it simply, you do not understand it well enough.” Thanks for explaining simply, and thanks for understanding statistics well.
Amir
Jim Frost says
Hi Amir, thank you so much for your kind words–that means a lot to me! I strongly believe that statistics doesn’t have to be scary! I’m happy that you have found my posts helpful. By the way, that’s a great quote from Einstein! –Jim