What is the Chi Square Goodness of Fit Test?
The chi-square goodness of fit test evaluates whether proportions of categorical or discrete outcomes in a sample follow a population distribution with hypothesized proportions. In other words, when you draw a random sample, do the observed proportions follow the values that theory suggests?
Analysts frequently use the chi-square goodness of fit test to determine whether the proportions of categorical outcomes are all equal. The analyst can also specify a different set of proportions to test. Additionally, this test can evaluate whether observed outcomes follow a discrete probability distribution, such as the Poisson distribution.
When to Use the Chi Squared Test for Goodness of Fit?
As a hypothesis test, the chi-square goodness of fit test allows you to use your sample to draw conclusions about an entire population. For example, use this test to answer the following questions. Was your sample drawn from a population where the proportions of:
- Red, green, blue, and yellow candies are equal?
- Cards from a bottomless deck in an online poker game follow the expectations for a fair game?
- Local car colors follow the global distribution?
- Monthly car accident counts at an intersection follow the Poisson distribution?
- Leading digits of numbers in a dataset follow Benford’s law?
Suppose we theorize that a candy manufacturing process produces equal numbers of red, green, blue, and yellow candies. If this theory is correct, each color comprises 25% of the population. However, if we randomly sample the candy, our sample proportions won’t match the population proportions exactly because of random sampling error. We might find 35% red candies, 15% green, 22% blue, and 28% yellow.
Is this difference from our expectations large enough to disprove our hypothesis? Or can we chalk up the difference to random sampling error? The chi-square goodness of fit test can help us out!
Chi-Square Goodness of Fit Test Details
The chi-square goodness of fit test takes counts of observed and expected outcomes and evaluates the differences between them. The expected count for each outcome is its hypothesized proportion multiplied by the sample size.
When the differences between the observed and expected counts are sufficiently large, the test results are statistically significant. In that case, you can conclude that you did not draw the sample from a population with the hypothesized proportions.
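For readers who want the arithmetic behind "evaluates the differences," the standard form of the test statistic is written out below. The symbols are notation introduced here for illustration, not taken from the post: O_i is the observed count in category i, E_i is the expected count, p_i is the hypothesized proportion, n is the sample size, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = n \, p_i,
\qquad \mathrm{df} = k - 1
```

The degrees of freedom shown apply when the analyst fully specifies the hypothesized proportions. Each parameter estimated from the sample reduces the degrees of freedom by one more.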
The null and alternative hypotheses for the chi-square goodness of fit test are the following:
- Null: The sample data follow the hypothesized distribution.
- Alternative: The sample data do not follow the hypothesized distribution.
When the p-value for the chi-square goodness of fit test is less than your significance level, reject the null hypothesis. Your data favor the hypothesis that the sample does not follow the hypothesized distribution.
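The decision rule above can be sketched in Python with SciPy's `scipy.stats.chisquare` function. This is a minimal illustration, not the software used in this post. It reuses the candy percentages from earlier and assumes a hypothetical sample of 100 candies so that the percentages become counts.

```python
from scipy.stats import chisquare

# Candy percentages from the example above, with an ASSUMED sample size of 100
observed = [35, 15, 22, 28]   # red, green, blue, yellow
expected = [25, 25, 25, 25]   # equal proportions: 25% of 100 each
alpha = 0.05                  # significance level

result = chisquare(f_obs=observed, f_exp=expected)
reject_null = result.pvalue < alpha

print(f"chi-square = {result.statistic:.2f}, p-value = {result.pvalue:.4f}")
print("Reject the null" if reject_null else "Fail to reject the null")
```

Under that assumed sample size, the p-value falls below 0.05, so the sketch rejects the null hypothesis of equal proportions. A smaller sample with the same percentages could lead to a different conclusion.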
Let’s work through two examples using the chi-square goodness of fit test! In one example, we’ll specify the test proportions, and in the other, we’ll see whether our data follow the Poisson distribution. The data for both examples are available in this CSV file: DiscreteGOF.
Learn more about how Chi Square Tests Work.
Chi Squared Test for Goodness of Fit Example
PPG Industries researched global new car colors in 2012. In this example, we want to determine whether a random sample of local car colors follows the global distribution. To perform this chi-square goodness of fit test, we need to know the global proportions and our regional sample proportions.
In this form of the test, the global proportions are the expected values, while the local sample proportions are the observed values.
The table below contains the data.
The OurState column contains the count of car colors we observed. The global proportions are from PPG Industries.
The chi-square goodness of fit test determines whether our local distribution differs from the global distribution.
This table displays a frequency distribution and a relative frequency distribution. For more information, read my post, Relative Frequencies and Their Distributions.
Interpreting the Test Results
The chi-square goodness of fit test assesses the differences between the observed and expected proportions. Because the p-value is less than the significance level, we reject the null hypothesis and conclude that these differences are statistically significant. We conclude that we did not draw our local random sample from a population that follows the global proportions.
The Contribution to Chi-square column indicates that the largest differences occur with gray and red cars. By comparing the observed to expected counts, we can see that our sample has a higher proportion of gray cars and a lower proportion of red cars than the global test proportions.
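To show how a contribution column like this can be computed, here is a small Python sketch. The car color counts are not reproduced in the text, so the sketch reuses the candy percentages from earlier with an assumed sample size of 100. The function name `chi_square_contributions` is hypothetical.

```python
def chi_square_contributions(labels, observed_counts, hypothesized_props):
    """Return (label, observed, expected, contribution) rows, largest contribution first."""
    n = sum(observed_counts)
    rows = []
    for label, obs, prop in zip(labels, observed_counts, hypothesized_props):
        exp = n * prop                      # expected count under the null
        rows.append((label, obs, exp, (obs - exp) ** 2 / exp))
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Candy percentages from the earlier example, with an ASSUMED sample size of 100
labels = ["red", "green", "blue", "yellow"]
observed = [35, 15, 22, 28]
props = [0.25, 0.25, 0.25, 0.25]

rows = chi_square_contributions(labels, observed, props)
for label, obs, exp, contribution in rows:
    direction = "above" if obs > exp else "below" if obs < exp else "equal to"
    print(f"{label:<7} observed={obs:>3} expected={exp:>5.1f} "
          f"contribution={contribution:.2f} ({direction} expected)")

# The contributions sum to the chi-square test statistic
total = sum(row[3] for row in rows)
print(f"chi-square = {total:.2f}")
```

The categories with the largest contributions are the ones driving the test result. Comparing observed to expected counts then tells you the direction of each difference.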
Chi-Square Goodness of Fit Test for the Poisson Distribution
The chi-square goodness of fit test can evaluate a sample and see if it follows the Poisson distribution.
The Poisson distribution is a discrete probability distribution that can model counts of events or attributes in a fixed observation space. Many but not all count processes follow this distribution. Consequently, analysts often need to verify whether a set of counts follows the Poisson distribution.
The Poisson distribution is discrete because its values must be integers. Because it uses discrete counts, we can use the chi-square goodness of fit test to evaluate whether data follow the Poisson distribution.
For the Poisson version of this test, the null and alternative hypotheses are the following:
- Null: The sample data follow the Poisson distribution.
- Alternative: The sample data do not follow the Poisson distribution.
The test uses the same process as the previous example. However, instead of the analyst specifying the expected counts and proportions, the procedure uses values that the Poisson distribution expects. Typically, your software calculates them for you.
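For a sense of what the software does behind the scenes, the following Python sketch derives Poisson expected counts from a sample. The accident counts here are hypothetical values invented for illustration, not the dataset from this post. The function names and the choice to pool counts of 4 or more into one category are also assumptions.

```python
import math

def poisson_expected_counts(counts, max_category):
    """Expected counts for 0..max_category-1 plus a final '>= max_category' bin."""
    n = len(counts)
    lam = sum(counts) / n                   # estimate the Poisson mean from the sample
    probs = [math.exp(-lam) * lam ** k / math.factorial(k)
             for k in range(max_category)]
    probs.append(1.0 - sum(probs))          # open-ended top category
    return lam, [n * p for p in probs]

def tally_observed(counts, max_category):
    """Count how many observations fall in each category, pooling the upper tail."""
    tallies = [0] * (max_category + 1)
    for c in counts:
        tallies[min(c, max_category)] += 1
    return tallies

# HYPOTHETICAL data: accidents per month -> number of months (50 months total)
frequencies = {0: 6, 1: 13, 2: 14, 3: 9, 4: 5, 5: 3}
monthly_accidents = [k for k, months in frequencies.items() for _ in range(months)]

max_category = 4
lam, expected = poisson_expected_counts(monthly_accidents, max_category)
observed = tally_observed(monthly_accidents, max_category)

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1   # one extra degree of freedom lost for estimating the mean

print(f"estimated mean = {lam:.2f}")
print("observed:", observed)
print("expected:", [round(e, 2) for e in expected])
print(f"chi-square = {statistic:.3f} with {df} degrees of freedom")
```

Pooling the upper tail into a single category keeps the expected proportions summing to one. Because the mean is estimated from the data, the degrees of freedom drop by one compared with a test that uses fully specified proportions.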
Let’s work through an example where a safety inspector monitors car accidents at a bustling intersection. The inspector enters the counts of monthly accidents as shown below.
Each cell signifies the count of accidents for a month. The full dataset covers 50 months.
Now, let’s perform the test!
Related post: Using the Poisson Distribution
Interpreting the Poisson Test Results
The statistical output with its observed and expected counts looks similar to the previous example. In this example, the software calculates the expected counts using the Poisson distribution.
Because the p-value is greater than our significance level of 0.05, we fail to reject the null hypothesis. For distribution tests, failing to reject the null means the sample provides insufficient evidence that the data depart from the specified distribution. It does not prove that the distribution fits, but we can reasonably treat our count data as following the Poisson distribution.
Various analyses assume the data follow the Poisson distribution, including Poisson rate analyses and the U chart. Our data are suitable for these analyses.
Finally, the examples in this post involve comparing the p-value to the significance level. As an alternative to using the p-value, you can compare the chi-square value, which is the test statistic, to the critical value in a chi-square table to determine statistical significance. The two methodologies always agree and allow you to draw the same conclusions.
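The critical value approach can be sketched in a few lines of Python. The lookup table holds the upper-tail critical values at a 0.05 significance level for 1 to 5 degrees of freedom, as printed in standard chi-square tables. The function name `is_significant` is hypothetical, and the data are the candy percentages from earlier with an assumed sample size of 100.

```python
# Upper-tail critical values at alpha = 0.05 from a standard chi-square table
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def is_significant(statistic, df, table=CRITICAL_VALUES_05):
    """Return True when the test statistic exceeds the tabled critical value."""
    return statistic > table[df]

# Candy percentages from the earlier example, with an ASSUMED sample size of 100
observed = [35, 15, 22, 28]
expected = [25, 25, 25, 25]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1   # four categories, fully specified proportions

print(f"chi-square = {statistic:.2f}, critical value = {CRITICAL_VALUES_05[df]}")
print(is_significant(statistic, df))  # True, because 8.72 > 7.815
```

A test statistic above the critical value corresponds to a p-value below the significance level, which is why the two approaches lead to the same conclusion.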