Residual plots display the residual values on the y-axis and fitted values, or another variable, on the x-axis. After you fit a regression model, it is crucial to check the residual plots. If your plots display unwanted patterns, you can’t trust the regression coefficients and other numeric results. In this post, I explain the conceptual reasons why residual plots help ensure that your regression model is valid. I’ll also show you what to look for and how to fix the problems. [Read more…] about Check Your Residual Plots to Ensure Trustworthy Regression Results!
The F-test of overall significance indicates whether your linear regression model provides a better fit to the data than a model that contains no independent variables. In this post, I look at how the F-test of overall significance fits in with other regression statistics, such as R-squared. R-squared tells you how well your model fits the data, and the F-test is related to it. [Read more…] about How to Interpret the F-test of Overall Significance in Regression Analysis
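To make the link between R-squared and the overall F-test concrete, here is a minimal sketch. It assumes an ordinary least squares model with an intercept and uses the standard identity F = (R² / k) / ((1 − R²) / (n − k − 1)), where k is the number of independent variables and n is the number of observations. The function name and arguments are hypothetical and chosen only for illustration.

```python
def overall_f_statistic(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Overall F-statistic implied by R-squared (model with an intercept assumed)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    df_model = n_predictors               # numerator degrees of freedom
    df_resid = n_obs - n_predictors - 1   # denominator degrees of freedom
    if df_model < 1 or df_resid < 1:
        raise ValueError("need at least one predictor and one residual degree of freedom")
    explained_per_df = r_squared / df_model
    unexplained_per_df = (1.0 - r_squared) / df_resid
    return explained_per_df / unexplained_per_df


if __name__ == "__main__":
    # Illustrative inputs only, not data from the post.
    print(overall_f_statistic(r_squared=0.5, n_obs=13, n_predictors=2))
```

For a fixed sample size and number of predictors, a higher R-squared always gives a larger F-statistic, which is why the two statistics tend to tell a consistent story.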
Multicollinearity occurs when independent variables in a regression model are correlated. This correlation is a problem because independent variables should be independent. If the degree of correlation between variables is high enough, it can cause problems when you fit the model and interpret the results. [Read more…] about Multicollinearity in Regression Analysis: Problems, Detection, and Solutions
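One common way to quantify how much two independent variables overlap is the variance inflation factor (VIF). The sketch below covers only the simplest case of exactly two predictors, where VIF reduces to 1 / (1 − r²) and r is their Pearson correlation. The function names are hypothetical, and models with more predictors need a regression-based VIF instead.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1: list[float], x2: list[float]) -> float:
    """VIF for a model with exactly two predictors (an assumption of this sketch)."""
    r = pearson_r(x1, x2)
    if abs(r) >= 1.0:
        raise ValueError("predictors are perfectly correlated; VIF is undefined")
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    # Illustrative values only.
    print(vif_two_predictors([1, 2, 3], [1, 2, 4]))
```

A VIF of 1 means the predictors are uncorrelated. Larger values mean the coefficient estimates are less precise.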
Welch’s ANOVA is an alternative to the traditional analysis of variance (ANOVA) and it offers some serious benefits. One-way ANOVA determines whether differences between the means of at least three groups are statistically significant. For decades, introductory statistics classes have taught the classic Fisher’s one-way ANOVA that uses the F-test. It’s a standard statistical analysis, and you might think it’s pretty much set in stone by now. Surprise, there’s a significant change occurring in the world of one-way ANOVA! [Read more…] about Benefits of Welch’s ANOVA Compared to the Classic One-Way ANOVA
When is Easter this year? I ask this question every year! This year, Easter occurs on April 16, 2017. Next year, Easter falls on April 1, 2018. I have a hard time remembering when it occurs in any given year. I think that March Easters are both early and unusual. Is that true?
Being a statistician, my first thought is to study the distribution of Easter dates. By analyzing the distribution, we can determine which dates are rare and which are common. How unusual are Easter dates in March? Are there patterns in the dates? [Read more…] about When is Easter this Year?
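As a starting point for studying the distribution of Easter dates, here is a minimal sketch. It assumes the Gregorian (Western) Easter and uses the widely published "anonymous Gregorian" computus. The function names and the year range are hypothetical choices for illustration, not necessarily the method used in the post.

```python
from collections import Counter
from datetime import date


def gregorian_easter(year: int) -> date:
    """Date of Western Easter via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day_minus_one = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day_minus_one + 1)


def easter_month_counts(first_year: int, last_year: int) -> Counter:
    """Count how many Easters fall in each month (3 = March, 4 = April)."""
    return Counter(gregorian_easter(y).month for y in range(first_year, last_year + 1))


if __name__ == "__main__":
    print(gregorian_easter(2017), gregorian_easter(2018))
    print(easter_month_counts(1900, 2100))
```

Tallying by full date instead of by month (for example, `Counter((d.month, d.day) ...)`) would show which specific dates are rare and which are common.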
How do you analyze Likert scale data? Likert scales are the most broadly used method for scaling responses in survey studies. Survey questions that ask you to indicate your level of agreement, from strongly agree to strongly disagree, use the Likert scale. The data in the worksheet are five-point Likert scale data for two groups. [Read more…] about How to Analyze Likert Scale Data
The difference between linear and nonlinear regression models isn’t as straightforward as it sounds. You’d think that linear equations produce straight lines and nonlinear equations model curvature. Unfortunately, that’s not correct. Both types of models can fit curves to your data—so that’s not the defining characteristic. In this post, I’ll teach you how to identify linear and nonlinear regression models. [Read more…] about The Difference between Linear and Nonlinear Regression Models
The standard error of the regression (S) and R-squared are two key goodness-of-fit measures for regression analysis. While R-squared is the most well-known amongst the goodness-of-fit statistics, I think it is a bit over-hyped. [Read more…] about Standard Error of the Regression vs. R-squared
The Chi-square test of independence determines whether there is a statistically significant relationship between categorical variables. It is a hypothesis test that answers the question—do the values of one categorical variable depend on the value of other categorical variables? [Read more…] about Chi-Square Test of Independence and an Example
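To show what the test measures, here is a minimal sketch that computes the Pearson chi-square statistic and degrees of freedom from a table of observed counts. The function name and the example table are hypothetical. The sketch stops short of a p-value, which requires the chi-square distribution from a statistics package.

```python
def chi_square_independence(table: list[list[float]]) -> tuple[float, int]:
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Made-up counts for two groups and two outcomes.
    print(chi_square_independence([[10, 20], [20, 10]]))
```

The larger the statistic relative to its degrees of freedom, the more the observed counts depart from what independence would predict.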
Multivariate ANOVA (MANOVA) extends the capabilities of analysis of variance (ANOVA) by assessing multiple dependent variables simultaneously. ANOVA statistically tests the differences between three or more group means. For example, if you have three different teaching methods and you want to evaluate the average scores for these groups, you can use ANOVA. However, ANOVA does have a drawback. It can assess only one dependent variable at a time. This limitation can be an enormous problem in certain circumstances because it can prevent you from detecting effects that actually exist. [Read more…] about Multivariate ANOVA (MANOVA) Benefits and When to Use It
Repeated measures designs, also known as within-subjects designs, can seem like oddball experiments. When you think of a typical experiment, you probably picture an experimental design that uses mutually exclusive, independent groups. These experiments have a control group and treatment groups that have clear divisions between them. Each subject is in only one of these groups. [Read more…] about Repeated Measures Designs: Benefits and an ANOVA Example
Happy Saint Patrick’s Day! This holiday got me thinking about four-leaf clovers and probability theory. Now, I know that four-leaf clovers are not shamrocks, and it is shamrocks that are actually associated with St. Patrick’s Day. A shamrock is a young patch of three-leaf white clover that grows in winter. Nonetheless, the holiday started me thinking about four-leaf clovers and probabilities. [Read more…] about How Probability Theory Can Help You Find More Four-Leaf Clovers
When it comes to hypothesis testing, statistics help you avoid opinions about when an effect is large and how many samples you need to collect. Opinions about these things can be way off, even among those who regularly perform experiments and collect data! This can lead you to draw incorrect conclusions. Always perform the correct hypothesis tests so you understand the strength of your evidence.
Back in 2015, House Speaker John Boehner resigned, and then Kevin McCarthy refused the position of Speaker of the House before the vote. The Republicans’ search for a new speaker ultimately led to Paul Ryan. Simultaneously, the Republican Freedom Caucus was making the news with a potential shutdown of the government that was controversial even amongst some Republicans. [Read more…] about Statistical Analysis of the Republican Establishment Split
I love astronomy! The discovery of thousands of exoplanets has made it only more exciting. You often hear about the really weird planets in the news. You know, things like low-density puffballs, hot Jupiters, rogue planets, planets that orbit their star in hours, and even a Jupiter-mass planet that is one huge diamond! As neat as these discoveries are, I also want to know how Earth fits in. [Read more…] about Statistics, Exoplanets, and the Search for Earthlike Planets
Who would’ve thought that an old TV game show could inspire a statistical problem that has tripped up mathematicians and statisticians with Ph.D.s? The Monty Hall problem has confused people for decades. In the game show, Let’s Make a Deal, Monty Hall asks you to guess which closed door a prize is behind. The answer is so puzzling that people often refuse to accept it! The problem occurs because our statistical assumptions are incorrect.
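If the answer is hard to accept, a simulation can help. The sketch below assumes the standard formulation of the problem: there are three doors, the host always opens an unchosen door that hides no prize, and you may then stay or switch. The function name, trial count, and seed are hypothetical choices for illustration.

```python
import random


def monty_hall_win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the probability of winning when you always switch or always stay."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    doors = range(3)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        # The host opens a door that is neither your pick nor the prize.
        opened = rng.choice([d for d in doors if d != pick and d != prize])
        if switch:
            # Move to the one remaining closed door.
            pick = next(d for d in doors if d != pick and d != opened)
        wins += pick == prize
    return wins / trials


if __name__ == "__main__":
    print("always switch:", monty_hall_win_rate(switch=True))
    print("always stay:  ", monty_hall_win_rate(switch=False))
```

Under these assumptions, switching should win about two-thirds of the time and staying about one-third, which matches the well-known theoretical answer.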