Mean squared error (MSE) measures the amount of error in statistical models. It assesses the average squared difference between the observed and predicted values. When a model has no error, the MSE equals zero. As model error increases, its value increases. The mean squared error is also known as the mean squared deviation (MSD). [Read more…] about Mean Squared Error (MSE)
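As a quick illustration of that definition, here is a minimal sketch (with made-up observed and predicted values) that computes MSE as the average squared difference:

```python
# Minimal sketch of computing MSE for made-up observed vs. predicted values.
def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)

observed = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 7.5, 9.0]

print(mean_squared_error(observed, predicted))  # 0.125
# A model with no error (predicted == observed) gives an MSE of exactly 0.
```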

# Regression

## Orthogonality

Orthogonality is a mathematical property that is beneficial for statistical models. It’s particularly helpful when performing factorial analysis of designed experiments. [Read more…] about Orthogonality

## Independent and Dependent Variables

Independent variables and dependent variables are the two fundamental types of variables in statistical modeling and experimental designs. Analysts use these variables to understand the relationships between them and to estimate effect sizes. What effect does one variable have on another?

In this post, learn the definitions of independent and dependent variables, how to identify each type, how they differ between types of studies, and see examples of them in use. [Read more…] about Independent and Dependent Variables

## Understanding Historians’ Rankings of U.S. Presidents using Regression Models

Historians rank the U.S. Presidents from best to worst using all the historical knowledge at their disposal. Frequently, groups, such as C-Span, ask these historians to rank the Presidents and average the results together to help reduce bias. The idea is to produce a set of rankings that incorporates a broad range of historians, a vast array of information, and a historical perspective. These rankings include informed assessments of each President’s effectiveness, leadership, moral authority, administrative skills, economic management, vision, and so on. [Read more…] about Understanding Historians’ Rankings of U.S. Presidents using Regression Models

## Proxy Variables: The Good Twin of Confounding Variables

Proxy variables are easily measurable variables that analysts include in a model in place of a variable that cannot be measured or is difficult to measure. A proxy variable is often of no great interest itself, but it correlates closely with the variable of interest. [Read more…] about Proxy Variables: The Good Twin of Confounding Variables

## Variance Inflation Factors (VIFs)

Variance Inflation Factors (VIFs) measure the correlation among independent variables in least squares regression models. Statisticians refer to this type of correlation as multicollinearity. Excessive multicollinearity can cause problems for regression models.

In this post, I focus on VIFs and how they detect multicollinearity, why they’re better than pairwise correlations, how to calculate VIFs yourself, and how to interpret them. If you need a refresher about the types of problems that multicollinearity causes and how to fix them, read my post: Multicollinearity: Problems, Detection, and Solutions. [Read more…] about Variance Inflation Factors (VIFs)

## How to Perform Regression Analysis using Excel

Excel can perform various statistical analyses, including regression analysis. It is a great option because nearly everyone can access Excel. This post is an excellent introduction to performing and interpreting regression analysis, even if Excel isn’t your primary statistical software package.

[Read more…] about How to Perform Regression Analysis using Excel

## New eBook Release! Regression Analysis: An Intuitive Guide

I’m thrilled to announce the release of my first ebook! *Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models*.

If you like the clear writing style I use on this website, you’ll love this book! The end of the post displays the entire table of contents! You can also download a Free Sample that includes the complete Table of Contents and the first two chapters. Go to My Store to download the ebook sample. [Read more…] about New eBook Release! Regression Analysis: An Intuitive Guide

## Confounding Variables Can Bias Your Results

In research studies, confounding variables influence both the cause and effect that the researchers are assessing. Consequently, if analysts do not include these confounders in their statistical model, the omission can exaggerate or mask the real relationship between two other variables. By omitting confounding variables, the statistical procedure is forced to attribute their effects to variables in the model, which biases the estimated effects and confounds the genuine relationship. Statisticians refer to this distortion as omitted variable bias.
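A tiny sketch with made-up numbers shows this distortion in action. Here y depends on both x and a confounder z, and z is correlated with x; regressing y on x alone forces the model to attribute part of z’s effect to x, inflating the slope:

```python
# Sketch of omitted variable bias with made-up numbers. The true direct
# effect of x on y is 2, but a simple regression of y on x alone absorbs
# part of the omitted confounder z's effect into the x coefficient.
def slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

x = [1, 2, 3, 4, 5, 6]
z = [1, 2, 1, 2, 1, 2]                      # confounder, correlated with x
y = [2 * a + 3 * b for a, b in zip(x, z)]   # true direct effect of x is 2

print(slope(x, y))  # greater than 2: the omitted confounder biases the estimate
```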

[Read more…] about Confounding Variables Can Bias Your Results

## The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates

The Gauss-Markov theorem states that if your linear regression model satisfies the first six classical assumptions, then ordinary least squares (OLS) regression produces unbiased estimates that have the smallest variance of all possible linear estimators. [Read more…] about The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates

## 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression

Ordinary Least Squares (OLS) is the most common estimation method for linear models—and that’s true for a good reason. As long as your model satisfies the OLS assumptions for linear regression, you can rest easy knowing that you’re getting the best possible estimates. [Read more…] about 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression

## Regression Tutorial with Analysis Examples

Regression analysis mathematically describes the relationship between independent variables and the dependent variable. It also allows you to predict the mean value of the dependent variable when you specify values for the independent variables. In this regression tutorial, I gather together a wide range of posts that I’ve written about regression analysis. My tutorial helps you go through the regression content in a systematic and logical order. [Read more…] about Regression Tutorial with Analysis Examples

## Choosing the Correct Type of Regression Analysis

Regression analysis mathematically describes the relationship between a set of independent variables and a dependent variable. There are numerous types of regression models that you can use. This choice often depends on the kind of data you have for the dependent variable and the type of model that provides the best fit. In this post, I cover the more common types of regression analyses and how to decide which one is right for your data. [Read more…] about Choosing the Correct Type of Regression Analysis

## Understanding Interaction Effects in Statistics

Interaction effects occur when the effect of one variable depends on the value of another variable. Interaction effects are common in regression analysis, ANOVA, and designed experiments. In this blog post, I explain interaction effects, how to interpret them in statistical designs, and the problems you will face if you don’t include them in your model. [Read more…] about Understanding Interaction Effects in Statistics
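As a quick sketch of that idea, consider a regression model with made-up coefficients that includes an interaction term, b3 * x1 * x2. Because of that term, the effect of a one-unit increase in x1 changes depending on the value of x2:

```python
# Sketch with made-up coefficients: in a model containing an interaction
# term (b3 * x1 * x2), the effect of x1 on y depends on the value of x2.
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=-1.5):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# Effect of a one-unit increase in x1, evaluated at two values of x2:
effect_at_x2_0 = predict(1, 0) - predict(0, 0)  # b1 = 2.0
effect_at_x2_1 = predict(1, 1) - predict(0, 1)  # b1 + b3 = 0.5

print(effect_at_x2_0, effect_at_x2_1)  # 2.0 0.5
```

If you omit the interaction term when one truly exists, the model reports a single averaged effect for x1 that is wrong at every value of x2.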

## When Should I Use Regression Analysis?

Use regression analysis to describe the relationships between a set of independent variables and the dependent variable. Regression analysis produces a regression equation where the coefficients represent the relationship between each independent variable and the dependent variable. You can also use the equation to make predictions.

As a statistician, I should probably tell you that I love all statistical analyses equally—like parents with their kids. But, shhh, I have a secret! Regression analysis is my favorite because it provides tremendous flexibility, which makes it useful in so many different circumstances. In fact, I’ve described regression analysis as taking correlation to the next level!

In this blog post, I explain the capabilities of regression analysis, the types of relationships it can assess, how it controls the variables, and generally why I love it! You’ll learn when you should consider using regression analysis. [Read more…] about When Should I Use Regression Analysis?

## Using Log-Log Plots to Determine Whether Size Matters

Log-log plots display data in two dimensions where both axes use logarithmic scales. When one variable changes as a constant power of another, a log-log graph shows the relationship as a straight line. In this post, I’ll show you why these graphs are valuable and how to interpret them. [Read more…] about Using Log-Log Plots to Determine Whether Size Matters
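The reason the line is straight is simple algebra: if y = a · x^b, then log y = log a + b · log x, so the log-log slope equals the power b. A minimal sketch with an assumed power of b = 1.5 verifies this:

```python
import math

# Sketch: when y = a * x^b, taking logs gives log y = log a + b * log x,
# so a log-log plot is a straight line whose slope equals b.
# The power b = 1.5 here is an assumed value for illustration.
a, b = 2.0, 1.5
xs = [1, 2, 4, 8, 16]
ys = [a * x ** b for x in xs]

log_x = [math.log(x) for x in xs]
log_y = [math.log(y) for y in ys]

# The slope between consecutive log-log points is constant and equals b.
slopes = [(log_y[i + 1] - log_y[i]) / (log_x[i + 1] - log_x[i])
          for i in range(len(xs) - 1)]

print(slopes)  # every slope is 1.5
```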

## When Do You Need to Standardize the Variables in a Regression Model?

Standardization is the process of putting different variables on the same scale. In regression analysis, there are some scenarios where it is crucial to standardize your independent variables or risk obtaining misleading results.

In this blog post, I show when and why you need to standardize your variables in regression analysis. Don’t worry, this process is simple and helps ensure that you can trust your results. In fact, standardizing your variables can reveal essential findings that you would otherwise miss! [Read more…] about When Do You Need to Standardize the Variables in a Regression Model?
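For reference, the standardization itself is simple: subtract each variable’s mean and divide by its standard deviation, so every variable ends up centered at zero with a standard deviation of one. A minimal sketch with made-up height data:

```python
# Sketch: standardizing a variable subtracts its mean and divides by its
# sample standard deviation, putting variables measured in different units
# on the same scale (mean 0, standard deviation 1).
def standardize(values):
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in values]

heights_cm = [150, 160, 170, 180, 190]  # made-up data
z_scores = standardize(heights_cm)

print(z_scores)  # centered on 0; the middle value (170 cm) maps to exactly 0
```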

## Why Are There No P Values in Nonlinear Regression?

Nonlinear regression analysis cannot calculate P values for the independent variables in your model. Why not? And, what do you use instead? Those are the topics of this blog post. [Read more…] about Why Are There No P Values in Nonlinear Regression?

## Five Regression Analysis Tips to Avoid Common Problems

Regression is a very powerful statistical analysis. It allows you to isolate and understand the effects of individual variables, model curvature and interactions, and make predictions. Regression analysis offers high flexibility but presents a variety of potential pitfalls. Great power requires great responsibility!

In this post, I offer five tips that will not only help you avoid common problems but also make the modeling process easier. I’ll close by showing you the difference between the modeling process that a top analyst uses versus the procedure of a less rigorous analyst. [Read more…] about Five Regression Analysis Tips to Avoid Common Problems

## Understand Precision in Predictive Analytics to Avoid Costly Mistakes

Precision in predictive analytics refers to how close the model’s predictions are to the observed values. The more precise the model, the closer the data points are to the predictions. When you have an imprecise model, the observations tend to be further away from the predictions, thereby reducing the usefulness of the predictions. If you have a model that is not sufficiently precise, you risk making costly mistakes! [Read more…] about Understand Precision in Predictive Analytics to Avoid Costly Mistakes