The likelihood ratio test (LRT) is a statistical method that compares the fit of two nested models, where one model is a simplified version of the other. The test evaluates whether the more complex model explains the data significantly better than the simpler one. It does this by comparing the likelihoods of the two models: how probable the observed data are under each model.
Analysts frequently use a likelihood ratio test in the following contexts:
- In various types of regression models, to better detect the combined effect of a group of predictors, which may be overlooked when examining each term’s p-value individually.
- To determine whether a three-parameter version of a probability distribution provides a significantly better fit than a simpler two-parameter version of the same distribution.
- In linear and generalized linear mixed models (LMMs and GLMMs), to evaluate whether including random slopes or other random effect structures improves model fit.
- In structural equation modeling (SEM), to formally compare nested models and assess whether additional paths or latent constructs enhance explanatory power.
- In generalized linear models (GLMs), where the likelihood ratio test is often called a deviance test because the test statistic equals the difference in deviance between the nested models.
The test statistic is calculated using the formula:
LRT = –2 × (log-likelihood of the simpler model – log-likelihood of the complex model)
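Under the null hypothesis that the simpler model is adequate, the LRT statistic approximately follows a chi-square distribution in large samples, with degrees of freedom equal to the number of additional parameters in the complex model. Because the complex model always fits at least as well as the simpler one, the statistic cannot be negative. A large value means the improvement in fit is greater than chance alone would explain, and the p-value tells you whether that improvement is statistically significant.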
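The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `chi2_sf` and `likelihood_ratio_test` and the log-likelihood values are hypothetical, and the chi-square tail probability is computed with a small standard-library helper that assumes integer degrees of freedom. In practice, a library routine such as SciPy's `scipy.stats.chi2.sf` would normally be used for that step.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 1 (odd case) or df = 2 (even case).
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    # Step up two degrees of freedom at a time using the recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), where a = df / 2.
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return q


def likelihood_ratio_test(loglik_simple, loglik_complex, extra_params):
    """Return (statistic, p_value) for two nested models."""
    statistic = -2.0 * (loglik_simple - loglik_complex)
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical log-likelihoods, chosen only to illustrate the arithmetic.
stat, p_value = likelihood_ratio_test(-120.0, -115.0, extra_params=1)
print(f"LRT = {stat:.2f}, p = {p_value:.4f}")
```

Here the complex model raises the log-likelihood by 5 units with one extra parameter, so the statistic is 10 on 1 degree of freedom.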
For example, suppose a researcher is modeling disease risk. The simpler model includes age and gender, while the more complex model adds smoking status. The likelihood ratio test compares how well each model fits the data. If the test is significant, it suggests that adding smoking status improves the model’s fit by more than chance alone would explain, so smoking status should be retained in the model.