The Gauss-Markov theorem states that if your linear regression model satisfies the first six classical assumptions, then ordinary least squares (OLS) regression produces unbiased estimates that have the smallest variance of all possible linear estimators. [Read more…] about The Gauss-Markov Theorem and BLUE OLS Coefficient Estimates
Ordinary Least Squares (OLS) is the most common estimation method for linear models—and that’s true for a good reason. As long as your model satisfies the OLS assumptions for linear regression, you can rest easy knowing that you’re getting the best possible estimates. [Read more…] about 7 Classical Assumptions of Ordinary Least Squares (OLS) Linear Regression
Typically, quality improvement analysts use control charts to assess business processes and don’t have hypothesis tests in mind. Do you know how control charts provide tremendous benefits in other settings and with hypothesis testing? Spoilers—control charts check an assumption that we often forget about for hypothesis tests! [Read more…] about Use Control Charts with Hypothesis Tests
Heteroscedasticity means unequal scatter. In regression analysis, we talk about heteroscedasticity in the context of the residuals or error term. Specifically, heteroscedasticity is a systematic change in the spread of the residuals over the range of measured values. Heteroscedasticity is a problem because ordinary least squares (OLS) regression assumes that all residuals are drawn from a population that has a constant variance (homoscedasticity).
To satisfy the regression assumptions and be able to trust the results, the residuals should have a constant variance. In this blog post, I show you how to identify heteroscedasticity, explain what produces it, the problems it causes, and work through an example to show you several solutions. [Read more…] about Heteroscedasticity in Regression Analysis
As you fit regression models, you might need to make a choice between linear and nonlinear regression models. The field of statistics can be weird. Despite their names, both forms of regression can fit curvature in your data. So, how do you choose? In this blog post, I show you how to choose between linear and nonlinear regression models. [Read more…] about How to Choose Between Linear and Nonlinear Regression
If you were able to make predictions about something important to you, you’d probably love that, right? It’s even better if you know that your predictions are sound. In this post, I show how to use regression analysis to make predictions and determine whether they are both unbiased and precise. [Read more…] about Making Predictions with Regression Analysis
In regression analysis, curve fitting is the process of specifying the model that provides the best fit to the specific curves in your dataset. Curved relationships between variables are not as straightforward to fit and interpret as linear relationships. [Read more…] about Curve Fitting using Linear and Nonlinear Regression
Nonparametric tests don’t require that your data follow the normal distribution. They’re also known as distribution-free tests and can provide benefits in certain situations. Typically, people who perform statistical hypothesis tests are more comfortable with parametric tests than nonparametric tests.
You’ve probably heard it’s best to use nonparametric tests if your data are not normally distributed—or something along these lines. That seems like an easy way to choose, but there’s more to the decision than that. [Read more…] about Nonparametric Tests vs. Parametric Tests
Nonlinear regression is an extremely flexible analysis that can fit most any curve that is present in your data. R-squared seems like a very intuitive way to assess the goodness-of-fit for a regression model. Unfortunately, the two just don’t go together. R-squared is invalid for nonlinear regression. [Read more…] about R-squared Is Not Valid for Nonlinear Regression
Use residual plots to check the assumptions of an OLS linear regression model. If you violate the assumptions, you risk producing results that you can’t trust. Residual plots display the residual values on the y-axis and fitted values, or another variable, on the x-axis. After you fit a regression model, it is crucial to check the residual plots. If your plots display unwanted patterns, you can’t trust the regression coefficients and other numeric results.
In this post, I explain the conceptual reasons why residual plots help ensure that your regression model is valid. I’ll also show you what to look for and how to fix the problems. [Read more…] about Check Your Residual Plots to Ensure Trustworthy Regression Results!
Welch’s ANOVA is an alternative to the traditional analysis of variance (ANOVA) and it offers some serious benefits. One-way ANOVA determines whether differences between the means of at least three groups are statistically significant. For decades, introductory statistics classes have taught the classic Fishers one-way ANOVA that uses the F-test. It’s a standard statistical analysis, and you might think it’s pretty much set in stone by now. Surprise, there’s a significant change occurring in the world of one-way ANOVA! [Read more…] about Benefits of Welch’s ANOVA Compared to the Classic One-Way ANOVA
How do you analyze Likert scale data? Likert scales are the most broadly used method for scaling responses in survey studies. Survey questions that ask you to indicate your level of agreement, from strongly agree to strongly disagree, use the Likert scale. The data in the worksheet are five-point Likert scale data for two groups. [Read more…] about How to Analyze Likert Scale Data
Who would’ve thought that an old TV game show could inspire a statistical problem that has tripped up mathematicians and statisticians with Ph.Ds? The Monty Hall problem has confused people for decades. In the game show, Let’s Make a Deal, Monty Hall asks you to guess which closed door a prize is behind. The answer is so puzzling that people often refuse to accept it! The problem occurs because our statistical assumptions are incorrect.