The residual sum of squares (RSS) measures the difference between your observed data and the model’s predictions. It is the portion of variability your regression model does not explain, also known as the model’s error. Use RSS to evaluate how well your model fits the data. [Read more…] about Residual Sum of Squares (RSS) Explained

# Regression

## Omitted Variable Bias: Definition, Avoiding & Example

## What is Omitted Variable Bias?

Omitted variable bias (OVB) occurs when a regression model excludes a relevant variable. The absence of these critical variables can skew the estimated relationships between variables in the model, potentially leading to erroneous interpretations. This bias can exaggerate, mask, or entirely flip the direction of the estimated relationship between an independent and dependent variable. [Read more…] about Omitted Variable Bias: Definition, Avoiding & Example

## What is a Parsimonious Model? Benefits and Selecting

## What is a Parsimonious Model?

A parsimonious model in statistics is one that uses relatively few independent variables to obtain a good fit to the data. [Read more…] about What is a Parsimonious Model? Benefits and Selecting

## Sum of Squares: Definition, Formula & Types

## What is the Sum of Squares?

The sum of squares (SS) is a statistic that measures the variability of a dataset’s observations around the mean. It’s the cumulative total of each data point’s squared difference from the mean. [Read more…] about Sum of Squares: Definition, Formula & Types

## Root Mean Square Error (RMSE)

## What is the Root Mean Square Error?

The root mean square error (RMSE) measures the average difference between a statistical model’s predicted values and the actual values. Mathematically, it is the standard deviation of the residuals. Residuals represent the distance between the regression line and the data points. [Read more…] about Root Mean Square Error (RMSE)

## Least Squares Regression: Definition, Formulas & Example

A least squares regression line represents the relationship between variables in a scatterplot. The procedure fits the line to the data points in a way that minimizes the sum of the squared vertical distances between the line and the points. It is also known as a line of best fit or a trend line. [Read more…] about Least Squares Regression: Definition, Formulas & Example

## Linear Regression Equation Explained

A linear regression equation describes the relationship between the independent variables (IVs) and the dependent variable (DV). It can also predict new values of the DV for the IV values you specify. [Read more…] about Linear Regression Equation Explained

## Linear Regression

## What is Linear Regression?

Linear regression models the relationships between at least one explanatory variable and an outcome variable. These variables are known as the independent and dependent variables, respectively. When there is one independent variable (IV), the procedure is known as simple linear regression. When there are more IVs, statisticians refer to it as multiple regression. [Read more…] about Linear Regression

## Mean Squared Error (MSE)

Mean squared error (MSE) measures the amount of error in statistical models. It assesses the average squared difference between the observed and predicted values. When a model has no error, the MSE equals zero. As model error increases, its value increases. The mean squared error is also known as the mean squared deviation (MSD). [Read more…] about Mean Squared Error (MSE)

## Orthogonal: Models, Definition & Finding

Orthogonality is a mathematical property that is beneficial for statistical models. It’s particularly helpful when performing factorial analysis of designed experiments. [Read more…] about Orthogonal: Models, Definition & Finding

## Independent and Dependent Variables: Differences & Examples

Independent variables and dependent variables are the two fundamental types of variables in statistical modeling and experimental designs. Analysts use these methods to understand the relationships between the variables and estimate effect sizes. What effect does one variable have on another?

In this post, learn the definitions of independent and dependent variables, how to identify each type, how they differ between different types of studies, and see examples of them in use. [Read more…] about Independent and Dependent Variables: Differences & Examples

## Understanding Historians’ Rankings of U.S. Presidents using Regression Models

Historians rank the U.S. Presidents from best to worse using all the historical knowledge at their disposal. Frequently, groups, such as C-Span, ask these historians to rank the Presidents and average the results together to help reduce bias. The idea is to produce a set of rankings that incorporates a broad range of historians, a vast array of information, and a historical perspective. These rankings include informed assessments of each President’s effectiveness, leadership, moral authority, administrative skills, economic management, vision, and so on. [Read more…] about Understanding Historians’ Rankings of U.S. Presidents using Regression Models

## Proxy Variables: The Good Twin of Confounding Variables

Proxy variables are easily measurable variables that analysts include in a model in place of a variable that cannot be measured or is difficult to measure. Proxy variables can be something that is not of any great interest itself, but has a close correlation with the variable of interest. [Read more…] about Proxy Variables: The Good Twin of Confounding Variables

## Variance Inflation Factors (VIFs)

Variance Inflation Factors (VIFs) measure the correlation among independent variables in least squares regression models. Statisticians refer to this type of correlation as multicollinearity. Excessive multicollinearity can cause problems for regression models.

In this post, I focus on VIFs and how they detect multicollinearity, why they’re better than pairwise correlations, how to calculate VIFs yourself, and interpreting VIFs. If you need a refresher about the types of problems that multicollinearity causes and how to fix them, read my post: Multicollinearity: Problems, Detection, and Solutions. [Read more…] about Variance Inflation Factors (VIFs)

## How to Perform Regression Analysis using Excel

Excel can perform various statistical analyses, including regression analysis. It is a great option because nearly everyone can access Excel. This post is an excellent introduction to performing and interpreting regression analysis, even if Excel isn’t your primary statistical software package.

[Read more…] about How to Perform Regression Analysis using Excel

## New eBook Release! Regression Analysis: An Intuitive Guide

I’m thrilled to announce the release of my first book! *Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models*.

If you like the clear writing style I use on this website, you’ll love this book! The end of the post displays the entire table of contents! [Read more…] about New eBook Release! Regression Analysis: An Intuitive Guide

## Confounding Variable: Definition & Examples

## Confounding Variable Definition

In studies examining possible causal links, a confounding variable is an unaccounted factor that impacts both the potential cause and effect and can distort the results. Recognizing and addressing these variables in your experimental design is crucial for producing valid findings. Statisticians also refer to confounding variables that cause bias as confounders, omitted variables, and lurking variables. [Read more…] about Confounding Variable: Definition & Examples

## The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates

The Gauss-Markov theorem states that if your linear regression model satisfies the first six classical assumptions, then ordinary least squares (OLS) regression produces unbiased estimates that have the smallest variance of all possible linear estimators. [Read more…] about The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates

## 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression

Ordinary Least Squares (OLS) is the most common estimation method for linear models—and that’s true for a good reason. As long as your model satisfies the OLS assumptions for linear regression, you can rest easy knowing that you’re getting the best possible estimates. [Read more…] about 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression

## Regression Tutorial with Analysis Examples

Regression analysis mathematically describes the relationship between independent variables and the dependent variable. It also allows you to predict the mean value of the dependent variable when you specify values for the independent variables. In this regression tutorial, I gather together a wide range of posts that I’ve written about regression analysis. My tutorial helps you go through the regression content in a systematic and logical order. [Read more…] about Regression Tutorial with Analysis Examples