Intervals are estimation methods in statistics that use sample data to produce ranges of values that are likely to contain the population value of interest. In contrast, point estimates are single value estimates of a population value. Of the different types of statistical intervals, confidence intervals are the most well-known. However, certain kinds of analyses and situations call for other types of ranges that provide different information.

In this post, I’ll compare confidence intervals, prediction intervals, and tolerance intervals, so you’ll know when to use each type. I’ll include an example of each type of range to make them easier to understand!

## What are Confidence Intervals?

Confidence interval calculations take sample data and produce a range of values that likely contains the population parameter that you are interested in. For example, the confidence interval of the mean [9, 11] suggests that the population mean is likely to be between 9 and 11.

Different random samples drawn from the same population are liable to produce slightly different confidence intervals. If you collect numerous random samples from the same population and calculate a confidence interval for each sample, a certain proportion of the ranges contain the population parameter. That percentage is the confidence level.

For example, a 95% confidence level indicates that if you draw 20 random samples from the same population, you’d expect 19 of the confidence intervals to include the population value. The confidence interval procedure is useful because it produces ranges that usually contain the parameter.
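
To make the repeated-sampling idea concrete, here is a minimal simulation sketch. It assumes a normally distributed population with a known standard deviation, so it uses a simple z-interval. The population values (`mu`, `sigma`), the sample size, and the function name `coverage_rate` are all made up for illustration.

```python
import random
from statistics import NormalDist, mean


def coverage_rate(n_samples=2000, sample_size=30, mu=10.0, sigma=2.0,
                  level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Assumption: sigma is treated as known, so every interval has the same width.
    half_width = z * sigma / sample_size ** 0.5
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        center = mean(sample)
        if center - half_width <= mu <= center + half_width:
            hits += 1
    return hits / n_samples


print(coverage_rate())
```

The printed proportion should land close to the confidence level you specify, which is the sense in which the procedure "usually" captures the parameter.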

Use confidence intervals to produce ranges for all types of population parameters. A confidence interval for a population mean is probably the most common type, but you can also use these ranges for the standard deviation, proportions, rates of occurrence, regression coefficients, and the differences between populations.

### Example of a Confidence Interval

Suppose that you randomly sample a product, measure the strength, and the 95% confidence interval is 100 – 120 units. You can be 95% confident that the mean strength of the entire population falls within this range. However, the 95% confidence level does not indicate that 95% of observations fall within this range. To draw that type of conclusion, we need to use a different kind of interval.

Here are some important considerations for confidence intervals.

- As you draw larger and larger random samples from the same population, the confidence intervals tend to become narrower.
- As you increase the confidence level for a given sample, say from 95% to 99%, the range becomes wider. At first, this fact might seem counter-intuitive, but think about it. To have greater confidence that an interval contains the parameter, it makes sense that the range must become wider. Conversely, a narrower range is less likely to include the parameter, which lowers your confidence.
- A confidence interval for the mean says nothing about the dispersion of values around the mean.
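
The first two points can be sketched numerically. This is a rough illustration that uses the normal approximation for the margin of error of a mean (z × standard deviation / √n) rather than the t-distribution most software uses. The standard deviation of 10 and the function name `ci_half_width` are hypothetical.

```python
from statistics import NormalDist


def ci_half_width(sd, n, level=0.95):
    """Approximate margin of error for a mean, using a normal critical value."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sd / n ** 0.5


# Larger samples give narrower intervals at a fixed confidence level.
for n in (25, 100, 400):
    print("n =", n, "half-width =", round(ci_half_width(10.0, n), 2))

# A higher confidence level gives a wider interval at a fixed sample size.
for level in (0.90, 0.95, 0.99):
    print("level =", level, "half-width =", round(ci_half_width(10.0, 100, level), 2))
```

Notice that the standard deviation only scales the margin of error. The interval itself still describes the mean, not where individual values fall.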

For a graphical representation that makes these concepts more intuitive, please read my blog post: How Confidence Intervals and Confidence Levels Work.

## What Are Prediction Intervals?

After you fit a regression model, you can obtain prediction intervals. These intervals predict the value of the dependent variable given specific settings of the independent variables. I’ll cover two types of prediction intervals that provide different types of predictions.

### Confidence interval of the prediction

A confidence interval of the prediction is a range that likely contains the mean value of the dependent variable given specific values of the independent variables. Like regular confidence intervals, these intervals provide a range for the population average. In this case, it’s a particular population defined by the values of your independent variables. Similarly, these ranges don’t tell you anything about the spread of the individual data points around the population mean.

Going back to our product strength example, let’s assume it is a plastic product, and our independent variables are the plastic type (A or B) and the processing temperature. After we fit our model, the statistical software can produce the confidence interval of the prediction for specific settings.

We want to predict the mean strength for our product if we use plastic type A with a processing temperature of 125 degrees Celsius. The resulting confidence interval of the prediction is 140 – 150. These results indicate we can be 95% confident that the population defined by plastic type A and 125C has a mean that falls within this range. However, it provides no indication of the distribution of strength values for individual products.

### Prediction interval

A prediction interval is a range that likely contains the value of the dependent variable for a single new observation given specific values of the independent variables. With this type of interval, we’re predicting ranges for individual observations rather than the mean value.

Let’s use the same model and the same values that we used above. The statistical software produces a prediction interval of 130 – 160. We can be 95% confident that the strength of the next individual item produced using our settings will fall within this range.

There is greater uncertainty when you predict an individual value rather than the mean value. Consequently, for the same model, settings, and confidence level, a prediction interval is always wider than the confidence interval of the prediction.
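
To see where the extra width comes from, it can help to look at the textbook formulas for the simplest case. The sketch here assumes a simple linear regression with one continuous predictor $x$, which is simpler than the plastic example because that model also includes a categorical variable. The notation is introduced only for illustration: $\hat{y}_0$ is the fitted value at the setting $x_0$, $s$ is the standard error of the regression, $n$ is the sample size, $S_{xx} = \sum_i (x_i - \bar{x})^2$, and $t^{*}$ is the critical value from the t-distribution with $n - 2$ degrees of freedom.

$$
\begin{aligned}
\text{Confidence interval of the prediction:}\quad & \hat{y}_0 \pm t^{*}\, s\, \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{Prediction interval:}\quad & \hat{y}_0 \pm t^{*}\, s\, \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{aligned}
$$

The only difference is the extra 1 under the square root, which represents the scatter of an individual observation around the mean. That term does not shrink as the sample size grows, so the prediction interval stays wider no matter how much data you collect.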

We can predict the range for an individual observation, but we need a model. For more information, read my post about using regression to make predictions.

## What Are Tolerance Intervals?

Use tolerance intervals to answer the question, “what range of values covers X% of the population?” If you want to know the range where most values fall, use a tolerance interval.

A tolerance interval is a range that likely contains a specific proportion of a population. For example, you might want to know where 99% of the population falls for a particular characteristic. With tolerance intervals, we are specifically dealing with the spread of individual values around the mean.

To create a tolerance interval, you need to specify both the confidence level and the proportion. The confidence level is required because we’re still working with samples and their inherent uncertainties.

For example, we want to create a tolerance interval where we’ll be 95% confident that the interval contains 99% of the population.

I think it’s a lot easier to understand tolerance intervals using an example!

### Example of a tolerance interval

As the plastic manufacturer, we need to know the strength of our product. However, we need to know more than just the mean strength. It’s important to understand the distribution of the individual values around the average.

For instance, the mean strength can be higher than our minimum requirement, which sounds great. However, if the spread around the average is too broad, too many products can fall below the minimum required strength.

To create a tolerance interval, we’ll start by randomly sampling 100 plastic products and recording their strengths. Download the CSV data file: Strength. Here is the statistical output for tolerance intervals.

Tolerance intervals are sensitive to the distribution of the data. In the output, the normality test indicates that our plastic strength data are normally distributed. Therefore, we’ll use the Normal interval, which is 110 – 140 (rounded values). We can be 95% confident that at least 99% of all strength values for the product will be between 110 and 140.
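
If you are curious how a normal tolerance interval can be computed, here is a hedged sketch. It is not the method any particular statistical package uses. It assumes normally distributed data and combines two well-known approximations: Howe's approximation for the two-sided tolerance factor and the Wilson-Hilferty approximation for the chi-square quantile (so that only the Python standard library is needed). The function names and the summary statistics in the usage lines are hypothetical and are not taken from the Strength data file.

```python
from statistics import NormalDist


def chi2_quantile_wh(q, df):
    """Wilson-Hilferty approximation to the q-th quantile of a chi-square distribution."""
    z = NormalDist().inv_cdf(q)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def tolerance_factor(n, proportion=0.99, confidence=0.95):
    """Approximate two-sided normal tolerance factor k (Howe's approximation)."""
    z = NormalDist().inv_cdf((1.0 + proportion) / 2.0)
    df = n - 1
    # Lower-tail chi-square quantile: a small value here inflates k,
    # which is how uncertainty in the sample standard deviation enters.
    chi2_low = chi2_quantile_wh(1.0 - confidence, df)
    return z * (df * (1.0 + 1.0 / n) / chi2_low) ** 0.5


def tolerance_interval(sample_mean, sample_sd, n, proportion=0.99, confidence=0.95):
    """Interval mean +/- k * sd intended to cover `proportion` of the population."""
    k = tolerance_factor(n, proportion, confidence)
    return sample_mean - k * sample_sd, sample_mean + k * sample_sd


# Hypothetical summary statistics, chosen only to illustrate the call.
low, high = tolerance_interval(sample_mean=125.0, sample_sd=5.0, n=100)
print(round(low, 1), round(high, 1))
```

The factor `k` is larger than the plain normal multiplier for the same proportion because it has to absorb the uncertainty in the sample mean and standard deviation. As `n` grows, `k` shrinks toward that multiplier.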

How do we use these tolerance interval results? As the manufacturer, we need to compare the tolerance limits to our client’s requirements. If our tolerance interval is broader than the requirements, our production process produces too many defects.

### Tolerance Intervals vs Confidence Intervals

To help distinguish confidence intervals from tolerance intervals, here are some key differences.

A confidence interval of the mean estimates only the mean, and sampling error alone determines its width. As the sample size approaches the whole population, the sampling error decreases and the width of the CI approaches zero as it converges on the single value of the population mean.

A tolerance interval reflects the spread of values around the average. Both the sampling error and the dispersion of values in the entire population determine the widths of these ranges. As the sample size approaches the whole population, tolerance intervals don’t converge on a zero width. Instead, they converge on the actual width of the population associated with the percentage you specify.

The width is based on percentiles. For example, to determine where 99% of the population lies, the software determines the data values that correspond to the 99.5th percentile and the 0.5th percentile (99.5 – 0.5 = 99% of the population). Tolerance interval calculations factor in the sampling error associated with the sample estimates of the percentiles.
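
As a small illustration of the percentile idea, this sketch computes the 0.5th and 99.5th percentiles of a normal population whose mean and standard deviation are known exactly. The values 125 and 5 and the function name `central_range` are hypothetical. With known parameters there is no sampling error, so this is the width that a tolerance interval converges toward as the sample grows.

```python
from statistics import NormalDist


def central_range(mu, sigma, proportion=0.99):
    """Percentile limits that enclose the central `proportion` of a normal population."""
    tail = (1.0 - proportion) / 2.0  # 0.005 in each tail for 99%
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


low, high = central_range(125.0, 5.0)
print(round(low, 1), round(high, 1))
```

A tolerance interval calculated from a sample will be somewhat wider than this range because it also has to allow for error in the estimated percentiles.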

Tolerance intervals can help you identify cases where excess variation can cause problems. Compare your requirements to the tolerance intervals to determine whether excessive variation is a problem for your study area.

Confidence intervals are the most well-known ranges in statistics. However, you might need to use a different type of range based on your specific needs.

Limbu M. Limbu says

19 out of 25 intervals (95%) contain the population parameter (from the diagram above). I think it should be 19 out of 20 intervals (95%).

Jim Frost says

Yes, indeed! Thank you! I’m off to make the edit now.

audiggerblog says

I really like your discussions and thoughts. As a geostatistician in mining, we tend to overcomplicate things and get caught up in the theoretical nuances of things. Your thoughts are giving me some clear and concise ways to think about some of these ideas in stats. Thank you.

Jim Frost says

Thank you very much for your kind comments. I really appreciate them!

Bruno says

Good job Jim! People usually hate statistics due to the lack of simple and direct explanations like yours. I will be following your blog and will point it to some of my friends who are in need of it!

Jim Frost says

Thank you very much, Bruno! I always strive to provide clear explanations. I don’t think statistics has to be hard!

akroy1946 says

really knowledgeable writeup.

Jim Frost says

Thank you!

Steve Maggio says

Jim, how do you create a tolerance interval around a regression model? Most software adds a prediction interval around a regression model.

Jim Frost says

Hi Steve, a prediction interval is kind of like a tolerance interval in that you’re getting down to the distribution of individual observations with a given probability. Additionally, the standard error of the regression gives you similar information–the spread of the residuals around the fitted values. I’ve never heard of tolerance intervals for regression models though. I’m not sure if that’s possible–but you can get very similar information using the other tools. I did a quick Google search and there seems to be research in that area but I can’t point you to a specific statistical package that can do that for you. Here is what seems to be the major paper on tolerance intervals for linear regression.

Steve Maggio says

Thank you, Jim, I’m using a regression analysis to define the relationship between pay rate vs a job responsibility rating. I then use the residuals analysis, which is based on prediction intervals I believe, to identify any outliers. Is that correct?

Jim Frost says

I’m not sure about that. I’ve never used PIs to detect outliers. In fact, outliers can cause PIs to be wider, which would make the outliers less detectable. However, if there was a point that was far outside of the PI, it should make you wonder about it!

There are various other diagnostics for identifying outliers amongst the residuals that I’m more familiar with. You can look at the standardized value of the residuals. Standardized residuals greater than 2 or less than −2 are usually considered large. However, you’d expect about 5% of the residuals to be unusual using this criterion, so it’s really just identifying candidates for further investigation. This approach sounds similar to a 95% PI approach. There are other measures such as Hi (assesses leverage) and Cook’s D (leverage and standardized value).

My preferred method is using the good old fashioned residuals by fitted values plot, along with the other residual plots. When it comes to assessing residuals in general, I place more weight on plots than the various numeric measures. It’s very easy to see unusual values on graphs. If you haven’t already, check out my post about residual plots. I suppose when it comes to justifying removing an outlier, it’s nice to have those numbers as support rather than just a perception from a graph. Although, even when you have the numbers, you need more of a justification than just the numbers. You need a reason for why the data point is truly invalid. At some point, I need to write a blog post about outliers!