What is a Parsimonious Model?
A parsimonious model in statistics is one that uses relatively few independent variables to obtain a good fit to the data. [Read more…] about What is a Parsimonious Model? Benefits and Selecting
Making statistics intuitive
A parsimonious model in statistics is one that uses relatively few independent variables to obtain a good fit to the data. [Read more…] about What is a Parsimonious Model? Benefits and Selecting
The sum of squares (SS) is a statistic that measures the variability of a dataset’s observations around the mean. It’s the cumulative total of each data point’s squared difference from the mean. [Read more…] about Sum of Squares: Definition, Formula & Types
The root mean square error (RMSE) measures the average difference between a statistical model’s predicted values and the actual values. Mathematically, it is the standard deviation of the residuals. Residuals represent the distance between the regression line and the data points. [Read more…] about Root Mean Square Error (RMSE)
A least squares regression line represents the relationship between variables in a scatterplot. The procedure fits the line to the data points in a way that minimizes the sum of the squared vertical distances between the line and the points. It is also known as a line of best fit or a trend line. [Read more…] about Least Squares Regression: Definition, Formulas & Example
A linear regression equation describes the relationship between the independent variables (IVs) and the dependent variable (DV). It can also predict new values of the DV for the IV values you specify. [Read more…] about Linear Regression Equation Explained
Linear regression models the relationships between at least one explanatory variable and an outcome variable. These variables are known as the independent and dependent variables, respectively. When there is one independent variable (IV), the procedure is known as simple linear regression. When there are more IVs, statisticians refer to it as multiple regression. [Read more…] about Linear Regression
Mean squared error (MSE) measures the amount of error in statistical models. It assesses the average squared difference between the observed and predicted values. When a model has no error, the MSE equals zero. As model error increases, its value increases. The mean squared error is also known as the mean squared deviation (MSD). [Read more…] about Mean Squared Error (MSE)
Orthogonality is a mathematical property that is beneficial for statistical models. It’s particularly helpful when performing factorial analysis of designed experiments. [Read more…] about Orthogonal: Models, Definition & Finding
Independent variables and dependent variables are the two fundamental types of variables in statistical modeling and experimental designs. Analysts use these methods to understand the relationships between the variables and estimate effect sizes. What effect does one variable have on another?
In this post, learn the definitions of independent and dependent variables, how to identify each type, how they differ between different types of studies, and see examples of them in use. [Read more…] about Independent and Dependent Variables: Differences & Examples
Historians rank the U.S. Presidents from best to worse using all the historical knowledge at their disposal. Frequently, groups, such as C-Span, ask these historians to rank the Presidents and average the results together to help reduce bias. The idea is to produce a set of rankings that incorporates a broad range of historians, a vast array of information, and a historical perspective. These rankings include informed assessments of each President’s effectiveness, leadership, moral authority, administrative skills, economic management, vision, and so on. [Read more…] about Understanding Historians’ Rankings of U.S. Presidents using Regression Models
Proxy variables are easily measurable variables that analysts include in a model in place of a variable that cannot be measured or is difficult to measure. Proxy variables can be something that is not of any great interest itself, but has a close correlation with the variable of interest. [Read more…] about Proxy Variables: The Good Twin of Confounding Variables
Variance Inflation Factors (VIFs) measure the correlation among independent variables in least squares regression models. Statisticians refer to this type of correlation as multicollinearity. Excessive multicollinearity can cause problems for regression models.
In this post, I focus on VIFs and how they detect multicollinearity, why they’re better than pairwise correlations, how to calculate VIFs yourself, and interpreting VIFs. If you need a refresher about the types of problems that multicollinearity causes and how to fix them, read my post: Multicollinearity: Problems, Detection, and Solutions. [Read more…] about Variance Inflation Factors (VIFs)
Excel can perform various statistical analyses, including regression analysis. It is a great option because nearly everyone can access Excel. This post is an excellent introduction to performing and interpreting regression analysis, even if Excel isn’t your primary statistical software package.
[Read more…] about How to Perform Regression Analysis using Excel
I’m thrilled to announce the release of my first book! Regression Analysis: An Intuitive Guide for Using and Interpreting Linear Models.
If you like the clear writing style I use on this website, you’ll love this book! The end of the post displays the entire table of contents! [Read more…] about New eBook Release! Regression Analysis: An Intuitive Guide
In research studies, confounding variables influence both the cause and effect that the researchers are assessing. Consequently, if the analysts do not include these confounders in their statistical model, it can exaggerate or mask the real relationship between two other variables. By omitting confounding variables, the statistical procedure is forced to attribute their effects to variables in the model, which biases the estimated effects and confounds the genuine relationship. Statisticians refer to this distortion as omitted variable bias.
[Read more…] about Confounding Variables Can Bias Your Results
The Gauss-Markov theorem states that if your linear regression model satisfies the first six classical assumptions, then ordinary least squares (OLS) regression produces unbiased estimates that have the smallest variance of all possible linear estimators. [Read more…] about The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates
Ordinary Least Squares (OLS) is the most common estimation method for linear models—and that’s true for a good reason. As long as your model satisfies the OLS assumptions for linear regression, you can rest easy knowing that you’re getting the best possible estimates. [Read more…] about 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression
Regression analysis mathematically describes the relationship between independent variables and the dependent variable. It also allows you to predict the mean value of the dependent variable when you specify values for the independent variables. In this regression tutorial, I gather together a wide range of posts that I’ve written about regression analysis. My tutorial helps you go through the regression content in a systematic and logical order. [Read more…] about Regression Tutorial with Analysis Examples
Regression analysis mathematically describes the relationship between a set of independent variables and a dependent variable. There are numerous types of regression models that you can use. This choice often depends on the kind of data you have for the dependent variable and the type of model that provides the best fit. In this post, I cover the more common types of regression analyses and how to decide which one is right for your data. [Read more…] about Choosing the Correct Type of Regression Analysis
An interaction effect occurs when the effect of one variable depends on the value of another variable. Interaction effects are common in regression models, ANOVA, and designed experiments. In this post, I explain interaction effects, the interaction effect test, how to interpret interaction models, and describe the problems you can face if you don’t include them in your model. [Read more…] about Understanding Interaction Effects in Statistics