## What is a Confidence Interval?

A confidence interval (CI) is a range of values that is likely to contain the value of an unknown population parameter. These intervals represent a plausible domain for the parameter given the characteristics of your sample data. Confidence intervals are derived from sample statistics and are calculated using a specified confidence level.

Population parameters are typically unknown because it is usually impossible to measure entire populations. By using a sample, you can estimate these parameters. However, the estimates rarely equal the parameter precisely thanks to random sampling error. Fortunately, inferential statistics procedures can evaluate a sample and incorporate the uncertainty inherent when using samples. Confidence intervals place a margin of error around the point estimate to help us understand how wrong the estimate might be.

You’ll frequently use confidence intervals to estimate the population mean and standard deviation. But you can also create them for regression coefficients, proportions, rates of occurrence (Poisson), and the differences between populations.

**Related post**: Populations, Parameters, and Samples in Inferential Statistics

## What is the Confidence Level?

The confidence level is the long-run proportion of confidence intervals, each calculated from a new random sample of the same population, that contain the true value of the population parameter.

Different random samples drawn from the same population are likely to produce slightly different intervals. If you draw many random samples and calculate a confidence interval for each sample, a percentage of them will contain the parameter.

The confidence level is the percentage of the intervals that contain the parameter. For 95% confidence intervals, an average of 19 out of 20 include the population parameter, as shown below.
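To make the long-run idea concrete, here is a minimal simulation sketch. It assumes a hypothetical normal population (mean 100, standard deviation 15) that I made up for illustration, draws many samples of 25 observations, and builds a t-based 95% confidence interval of the mean for each one using the critical t-value for 24 degrees of freedom (2.064). The share of intervals that contain the population mean should land close to 95%.

```python
import math
import random
import statistics

# Assumed (hypothetical) population: normal with mean 100 and SD 15.
POP_MEAN, POP_SD = 100.0, 15.0
N = 25                 # observations per sample
T_CRIT = 2.064         # two-sided 95% critical t-value for 24 degrees of freedom
NUM_SAMPLES = 10_000   # number of repeated random samples


def coverage(seed=1):
    """Return the fraction of simulated 95% CIs that contain POP_MEAN."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(NUM_SAMPLES):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        mean = statistics.fmean(sample)
        sem = statistics.stdev(sample) / math.sqrt(N)  # standard error of the mean
        moe = T_CRIT * sem                             # margin of error
        if mean - moe <= POP_MEAN <= mean + moe:
            hits += 1
    return hits / NUM_SAMPLES


print(f"Share of intervals containing the parameter: {coverage():.3f}")
```

In a real study you have only one sample and do not know the population mean, so you cannot tell whether your particular interval is one of the hits. The simulation only illustrates how the procedure behaves across many samples.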

The CI procedure provides meaningful estimates because it produces ranges that usually contain the parameter. Hence, they present plausible values for the parameter.

Technically, you can create CIs using any confidence level between 0 and 100%. However, the most common confidence level is 95%. Analysts occasionally use 99% and 90%.

**Related posts**: Populations and Samples and Parameters vs. Statistics

## How to Interpret Confidence Intervals

A confidence interval indicates where the population parameter is likely to reside. For example, a 95% confidence interval of the mean [9 11] suggests you can be 95% confident that the population mean is between 9 and 11.

Confidence intervals also help you navigate the uncertainty of how well a sample estimates a value for an entire population.

These intervals start with the point estimate for the sample and add a margin of error around it. The point estimate is the best guess for the parameter value. The margin of error accounts for the uncertainty involved when using a sample to estimate an entire population.

The width of the confidence interval around the point estimate reveals the precision. If the range is narrow, the margin of error is small, and there is only a tiny range of plausible values. That’s a precise estimate. However, if the interval is wide, the margin of error is large, and the actual parameter value is likely to fall *somewhere* within that more extensive range. That’s an imprecise estimate.

Ideally, you’d like a narrow confidence interval because you’ll have a much better idea of the actual population value!

For example, imagine we have two different samples, each with a sample mean of 10. It appears both estimates are the same. Now let’s assess the 95% confidence intervals. One interval is [5 15] while the other is [9 11]. The latter range is narrower, suggesting a more precise estimate.

That’s how CIs provide more information than the point estimate (e.g., sample mean) alone.

**Related post**: Precision vs. Accuracy

### Confidence Intervals for Effect Sizes

Confidence intervals are similarly helpful for understanding an effect size. For example, if you assess a treatment and control group, the mean difference between these groups is the estimated effect size. A 2-sample t-test can construct a confidence interval for the mean difference.

In this scenario, consider both the size and precision of the estimated effect. Ideally, an estimated effect is both large enough to be meaningful *and* sufficiently precise for you to trust. CIs allow you to assess both of these considerations! Learn more about this distinction in my post about Practical vs. Statistical Significance.

Learn more about how confidence intervals and hypothesis tests are similar.

**Related post**: Effect Sizes in Statistics

### Avoid a Common Misinterpretation of Confidence Intervals

A frequent misuse is applying confidence intervals to the distribution of sample values. Remember that these ranges apply only to population parameters, not the data values.

For example, a 95% confidence interval [10 15] indicates that we can be 95% confident that the *parameter* is within that range.

However, it does NOT indicate that 95% of the sample values occur in that range.

If you need to use your sample to find the proportion of data values likely to fall within a range, use a tolerance interval instead.
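A quick sketch can show why the two ideas differ. The sample below is simulated from a hypothetical normal population (mean 12.5, standard deviation 5), so the specific numbers are assumptions for illustration only. The code builds a 95% confidence interval of the mean and then counts how many of the sample's own values fall inside it. That share is typically far below 95% because the interval describes the mean, not the spread of the data.

```python
import math
import random
import statistics

rng = random.Random(7)
# Hypothetical sample: 25 values from a normal population (mean 12.5, SD 5).
sample = [rng.gauss(12.5, 5.0) for _ in range(25)]

mean = statistics.fmean(sample)
# 2.064 is the two-sided 95% critical t-value for 24 degrees of freedom.
moe = 2.064 * statistics.stdev(sample) / math.sqrt(len(sample))
low, high = mean - moe, mean + moe

# Fraction of individual data values that happen to fall inside the CI.
share_inside = sum(low <= x <= high for x in sample) / len(sample)

print(f"95% CI of the mean: [{low:.2f}, {high:.2f}]")
print(f"Share of sample values inside that range: {share_inside:.0%}")
```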

**Related post**: See how confidence intervals compare to prediction intervals and tolerance intervals.

## What Affects the Widths of Confidence Intervals?

Ok, so you want narrower CIs for their greater precision. What conditions produce tighter ranges?

Sample size, variability, and the confidence level affect the widths of confidence intervals. The first two are characteristics of your sample, which I’ll cover first.

### Sample Variability

Variability present in your data affects the precision of the estimate. Your confidence intervals will be broader when your sample standard deviation is high.

It makes sense when you think about it. When there is a lot of variability present in your sample, you’re going to be less sure about the estimates it produces. After all, a high standard deviation means your sample data are really bouncing around! That’s not conducive to finding precise estimates.

Unfortunately, you often don’t have much control over data variability. You can institute measurement and data collection procedures that reduce outside sources of variability, but after that, you’re at the mercy of the variability inherent in your subject area. But, if you can reduce external sources of variation, that’ll help you reduce the width of your confidence intervals.

### Sample Size

Increasing your sample size is the primary way to reduce the widths of confidence intervals because, in most cases, you can control it more than the variability. If you don’t change anything else and only increase the sample size, the ranges tend to narrow. Need even tighter CIs? Just increase the sample size some more!

Theoretically, there is no limit, and you can dramatically increase the sample size to produce remarkably narrow ranges. However, logistics, time, and cost issues will constrain your maximum sample size in the real world.

In summary, larger sample sizes and lower variability reduce the margin of error around the point estimate and create narrower confidence intervals. I’ll point out these factors again when we get to the formula later in this post.

**Related post**: Sample Statistics Are Always Wrong (to Some Extent)!

## Changing the Confidence Level

The confidence level also affects the confidence interval width. However, this factor is a methodology choice separate from your sample’s characteristics.

If you increase the confidence level (e.g., 95% to 99%) while holding the sample size and variability constant, the confidence interval widens. Conversely, decreasing the confidence level (e.g., 95% to 90%) narrows the range.

I’ve found that many students find the effect of changing the confidence level on the width of the range to be counterintuitive.

Imagine you take your knowledge of a subject area and indicate you’re 95% confident that the correct answer lies between 15 and 20. Then I ask you to give me your confidence for it falling between 17 and 18. The correct answer is less likely to fall within the narrower interval, so your confidence naturally decreases.

Conversely, I ask you about your confidence that it’s between 10 and 30. That’s a much wider range, and the correct value is more likely to be in it. Consequently, your confidence grows.

Confidence levels involve a tradeoff between confidence and the interval’s spread. To have more confidence that the parameter falls within the interval, you must widen the interval. Conversely, your confidence necessarily decreases if you use a narrower range.
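Here is a small sketch of that tradeoff. It assumes a hypothetical standard error of 2.0 that stays fixed, and it uses critical z-values from Python's standard library purely to keep the example short. The same pattern holds for t-values.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical z-value for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


sem = 2.0  # hypothetical standard error of the mean, held constant
widths = {}
for level in (0.90, 0.95, 0.99):
    # Full interval width is twice the margin of error.
    widths[level] = 2 * z_critical(level) * sem
    print(f"{level:.0%} CI width: {widths[level]:.2f}")
```

With the sample held constant, the only thing that changes is the critical value, so the width grows as the confidence level rises.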

## Confidence Interval Formula

Confidence intervals account for sampling uncertainty by using critical values, sampling distributions, and standard errors. The precise formula depends on the type of parameter you’re evaluating. The most common type is for the mean, so I’ll stick with that.

You’ll use critical Z-values or t-values to calculate your confidence interval of the mean. T-values produce more accurate confidence intervals when you do not know the population standard deviation. That’s particularly true for sample sizes smaller than 30. For larger samples, the two methods produce similar results. In practice, you’d usually use a t-value.

Below are the confidence interval formulas for both Z and t. However, you’d only use one of them.

$$\bar{x} \pm Z \cdot \frac{s}{\sqrt{n}}$$

$$\bar{x} \pm t \cdot \frac{s}{\sqrt{n}}$$

Where:

- x̄ = the sample mean, which is the point estimate.
- Z = the critical z-value
- t = the critical t-value
- s = the sample standard deviation
- s / √n = the standard error of the mean

The only difference between the two formulas is the critical value. If you’re using the critical z-value, you’ll always use 1.96 for 95% confidence intervals. However, for the t-value, you’ll need to know the degrees of freedom and then look up the critical value in a t-table or online calculator.

To calculate a confidence interval, take the critical value (Z or t) and multiply it by the standard error of the mean (SEM). This value is known as the margin of error (MOE). Then add and subtract the MOE from the sample mean (x̄) to produce the upper and lower limits of the range.
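The steps above can be sketched as a short function. The function name and the summary values in the usage line are hypothetical, chosen only to illustrate the arithmetic. You still need to supply the critical value yourself from a table or calculator.

```python
import math


def mean_confidence_interval(sample_mean, sample_sd, n, critical_value):
    """Return (lower, upper, margin_of_error) for a CI of the mean."""
    sem = sample_sd / math.sqrt(n)   # standard error of the mean
    moe = critical_value * sem       # margin of error
    return sample_mean - moe, sample_mean + moe, moe


# Hypothetical summary values; 2.064 is the 95% critical t-value for 24 df.
lower, upper, moe = mean_confidence_interval(10.0, 2.5, 25, 2.064)
print(f"MOE = {moe:.3f}, CI = [{lower:.3f}, {upper:.3f}]")
```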

**Related posts**: Critical Values, Standard Error of the Mean, and Sampling Distributions

### Interval Widths Revisited

Think back to the discussion about the factors affecting the confidence interval widths. The formula helps you understand how that works. Recall that the critical value * SEM = MOE.

Smaller margins of error produce narrower confidence intervals. By looking at this equation, you can see that the following conditions create a smaller MOE:

- Smaller critical values, which you obtain by decreasing the confidence level.
- Smaller standard deviations, because they’re in the numerator of the SEM.
- Larger sample sizes, because the square root of the sample size is in the denominator of the SEM.
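To see these conditions at work, the sketch below compares a baseline margin of error with two variations. All of the inputs are hypothetical, and I use the 95% critical z-value (1.96) to keep the comparison simple. Because the sample size enters through its square root, you need four times as many observations to get the same improvement as halving the standard deviation.

```python
import math


def margin_of_error(critical_value, sd, n):
    """Critical value multiplied by the standard error of the mean."""
    return critical_value * sd / math.sqrt(n)


Z_95 = 1.96  # 95% critical z-value

baseline = margin_of_error(Z_95, sd=10.0, n=25)   # starting point
half_sd = margin_of_error(Z_95, sd=5.0, n=25)     # halve the standard deviation
four_n = margin_of_error(Z_95, sd=10.0, n=100)    # quadruple the sample size

print(f"Baseline MOE:         {baseline:.2f}")
print(f"Half the SD:          {half_sd:.2f}")
print(f"Four times the n:     {four_n:.2f}")
```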

## How to Find a Confidence Interval

Let’s move on to using these formulas to find a confidence interval! For this example, I’ll use a fuel cost dataset that I’ve used in other posts: FuelCosts. The dataset contains a random sample of 25 fuel costs. We want to calculate the 95% confidence interval of the mean.

However, imagine we have only the following summary information instead of the dataset.

- Sample mean: 330.6
- Standard deviation: 154.2
- N = 25

Fortunately, that’s all we need to calculate our 95% confidence interval of the mean.

We need to decide on using the critical Z or t-value. I’ll use a critical t-value because the sample size (25) is less than 30. However, if the summary didn’t provide the sample size, we could use the Z-value method for an approximation.

My next step is to look up the critical t-value using my t-table. In the table, I’ll choose the alpha that equals 1 – the confidence level (1 – 0.95 = 0.05) for a two-sided test. Below is a truncated version of the t-table. Click for the full t-distribution table.

In the table, I see that for a two-sided interval with 25 – 1 = 24 degrees of freedom and an alpha of 0.05, the critical value is 2.064.

### Entering Values into the Confidence Interval Formula

Let’s enter all of this information into the formula.

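First, I’ll calculate the margin of error:

$$MOE = t \cdot \frac{s}{\sqrt{n}} = 2.064 \cdot \frac{154.2}{\sqrt{25}} = 2.064 \cdot 30.84 \approx 63.6$$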

Next, I’ll take the sample mean and add and subtract the margin of error from it:

- 330.6 + 63.6 = 394.2
- 330.6 – 63.6 = 267.0

The 95% confidence interval of the mean for fuel costs is 267.0 – 394.2. We can be 95% confident that the population mean falls within this range.

If you had used the critical z-value (1.96), you would enter that into the formula instead of the t-value (2.064) and obtain a slightly different confidence interval. However, t-values produce more accurate results, particularly for smaller samples like this one.

As an aside, the Z-value method always produces narrower confidence intervals than t-values when your sample size is less than infinity. So, basically always! However, that’s not good because Z-values underestimate the uncertainty when you’re using a sample estimate of the standard deviation rather than the actual population value. And you practically never know the population standard deviation.
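As a rough check, the sketch below reruns the fuel cost example with both critical values, using only the summary statistics given earlier (mean 330.6, standard deviation 154.2, N = 25). The variable names are my own. The last digit can differ slightly from the hand calculation because the code does not round the margin of error before adding and subtracting it.

```python
import math

# Summary statistics from the fuel cost example.
mean, sd, n = 330.6, 154.2, 25
sem = sd / math.sqrt(n)  # standard error of the mean

intervals = {}
for label, critical_value in (("t", 2.064), ("z", 1.96)):
    moe = critical_value * sem
    intervals[label] = (mean - moe, mean + moe)
    low, high = intervals[label]
    print(f"{label}-based 95% CI: [{low:.1f}, {high:.1f}] (MOE = {moe:.2f})")

t_width = intervals["t"][1] - intervals["t"][0]
z_width = intervals["z"][1] - intervals["z"][0]
print(f"The z-based interval is narrower by {t_width - z_width:.2f}")
```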

## Reference

Neyman, J. (1937). Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability. *Philosophical Transactions of the Royal Society A*. **236** (767): 333–380.

Stephen Brown says

Hi Jim,

This was an excellent article, thank you! I have a question: when computing a CI in its single-sample t-test module, SPSS appears to use the difference between population and sample means as a starting point (so the formula would be (X-bar-mu) +/- tcv(SEM)). I’ve consulted multiple stats books, but none of them compute a CI that way for a single-sample t-test. Maybe I’m just missing something and this is a perfectly acceptable way of doing things (I mean, SPSS does it :-)), but it yields substantially different lower and upper bounds from a CI that uses the traditional X-bar as a starting point. Do you have any insights? Many thanks in advance!

Stephen

Jim Frost says

Hi Stephen,

I’m not an SPSS user but that formula is confusing. They presented this formula as being for the CI of a sample mean?

I’m not sure why they’re subtracting Mu. For one thing, you almost never know what Mu is because you’d have to measure the entire population. And, if you knew Mu, you wouldn’t need to perform a t-test! Why would you use a sample mean (X-bar) if you knew the population mean? None of it makes sense to me. It must be an error of some kind even if just of documentation.

D says

Are there strict distinctions between the terms “confident”, “likely”, and “probability”? I’ve seen a number of other sources exclaim that for a given calculated confidence interval, the frequentist interpretation of that is the parameter is either in or not in that interval. They say another frequent misinterpretation is that the parameter lies within a calculated interval with a 95% probability.

It’s very confusing to balance that notion with practical casual communication of data in non-research settings.

Jim Frost says

Hi,

It is a confusing issue.

In the strictest technical sense, the confidence level is a probability that applies to the process but NOT to an individual confidence interval. There are several reasons for that.

In the frequentist framework, the probability that an individual CI contains the parameter is either 100% or 0%. It’s either in it or out. The parameter is not a random variable. However, because you don’t know the parameter value, you don’t know which of those two conditions is correct. That’s the conceptual approach. And the mathematics behind the scenes are complementary to that. There’s just no way to calculate the probability that an individual CI contains the parameter.

On the other hand, the process behind creating the intervals will cause X% of the CIs at the Xth confidence level to include that parameter. So, for all 95% CIs, you’d expect 95% of them to contain the parameter value. The confidence level applies to the process, not the individual CIs. Statisticians intentionally used the term “confidence” to describe that as opposed to “probability” hoping to make that distinction.

So, the 95% confidence applies to the process but not to individual CIs.

However, you might think that if 95% of many CIs contain the parameter, then surely a single CI has a 95% probability of containing it. From a technical standpoint, that is NOT true. However, it sure sounds logical. Most statistics make intuitive sense to me, but I struggle with that one myself. I’ve asked other statisticians to get their take on it. The basic gist of their answers is that there might be other information available which can alter the actual probability. Not all CIs produced by the process have the same probability. For example, if an individual CI is a bit higher or lower than most other CIs for the same thing, the CIs with the unusual values will have lower probabilities of containing the parameter.

I think that makes sense. The only problem is that you often don’t know where your individual CI fits in. That means you don’t know the probability for it specifically. But you do know the overall probability for the process.

The answer to this question is never totally satisfying. Just remember that there is no mathematical way in the frequentist framework to calculate the probability that an individual CI contains the parameter. However, the overall process is designed such that all CIs using a particular confidence level will have the specified proportion containing the parameter. You can’t apply that overall proportion to your individual CI because, on the technical side, there’s no mathematical way to do that and, conceptually, you don’t know where your individual CI fits in the entire distribution of CIs.