What is Simple Linear Regression?
Simple linear regression (SLR) is a statistical method that quantifies the linear relationship between two continuous variables—a single predictor (independent variable) and an outcome (dependent variable). It estimates the relationship by finding the best fitting straight line through data points.
Simple linear regression has the following two primary goals:
- Understanding the relationship between the variables.
- Predicting the outcome based on the value of the independent variable.
This method typically estimates the best-fitting line using the least squares method, which identifies the line that minimizes the squared differences between the observed values and predicted values (i.e., the residual sum of squares (RSS)).
Simple Linear Regression Equation
The standard notation for a simple linear regression equation is the following:
Where:
- Y is the dependent variable (DV or the outcome you’re predicting).
- X is the independent variable (IV or the predictor you’re using).
- β0 is the intercept (predicted value of Y when X = 0).
- β1 is the slope (the average change in Y for a one-unit increase in X).
- ε represents the random error or residual, indicating how individual observations deviate from the predicted values.
To understand the relationship between the IV and DV, analysts examine the slope coefficient (β1) to determine the strength and direction of the predictor’s association with the outcome.
Predicting outcomes involves substituting specific values of the independent variable (X) into the simple linear regression equation.
Example Interpret Results
For example, a researcher studying the relationship between hours spent studying (predictor) and test scores (outcome) might fit a simple linear regression model. Suppose the estimated regression equation is:
The slope coefficient for hours studying (4) indicates that each additional hour spent studying corresponds with an average increase of 4 points in test score. If a student studies 5 hours, the predicted test score would be:
The graph below displays the line of best fit with the data points for this model.
Simple linear regression provides not only a way to predict the outcome but also a quantitative understanding of how strongly the predictor influences it. Analysts typically evaluate the model’s effectiveness by examining R-squared, which indicates the proportion of variation in the outcome explained by the predictor, and by assessing residual plots to verify that the linear model satisfies underlying assumptions.
« Back to Glossary Index