The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena. For example, heights, blood pressure, measurement error, and IQ scores approximately follow the normal distribution. It is also known as the Gaussian distribution and the bell curve.

The normal distribution is a probability function that describes how the values of a variable are distributed. It is a symmetric distribution where most of the observations cluster around the central peak and the probabilities for values further away from the mean taper off equally in both directions. Extreme values in both tails of the distribution are similarly unlikely.

In this blog post, you’ll learn how to use the normal distribution, what its parameters mean, and how to calculate Z-scores to standardize your data and find probabilities.

## Example of Normally Distributed Data: Heights

Height data are normally distributed. The distribution in this example fits real data that I collected from 14-year-old girls during a study.

As you can see, the distribution of heights follows the typical pattern for all normal distributions. Most girls are close to the average (1.512 meters). Small differences between an individual’s height and the mean occur more frequently than substantial deviations from the mean. The standard deviation is 0.0741m, which indicates the typical distance that individual girls tend to fall from mean height.

The distribution is symmetric. The number of girls shorter than average equals the number of girls taller than average. In both tails of the distribution, extremely short girls occur as infrequently as extremely tall girls.

## Parameters of the Normal Distribution

As with any probability distribution, the parameters for the normal distribution define its shape and probabilities entirely. The normal distribution has two parameters, the mean and standard deviation. The normal distribution does not have just one form. Instead, the shape changes based on the parameter values, as shown in the graphs below.

### Mean

The mean is the central tendency of the distribution. It defines the location of the peak for normal distributions. Most values cluster around the mean. On a graph, changing the mean shifts the entire curve left or right on the X-axis.

### Standard deviation

The standard deviation is a measure of variability. It defines the width of the normal distribution. The standard deviation determines how far away from the mean the values tend to fall. It represents the typical distance between the observations and the average.

On a graph, changing the standard deviation either tightens or spreads out the width of the distribution along the X-axis. Larger standard deviations produce distributions that are more spread out.

When you have narrow distributions, the probabilities are higher that values won’t fall far from the mean. As you increase the spread of the distribution, the likelihood that observations will be further away from the mean also increases.
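To put numbers on this, here is a minimal Python sketch that uses the standard library’s `statistics.NormalDist`. The two standard deviations (1 and 2) and the distance of 1 unit are arbitrary values chosen for illustration, and `prob_within` is a hypothetical helper name.

```python
from statistics import NormalDist


def prob_within(dist, distance):
    """Probability that a value falls within `distance` of the mean."""
    return dist.cdf(dist.mean + distance) - dist.cdf(dist.mean - distance)


# Two normal distributions with the same mean but different spreads
# (illustrative values, not taken from any dataset).
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)

p_narrow = prob_within(narrow, 1.0)
p_wide = prob_within(wide, 1.0)

print(f"narrow (sigma=1): {p_narrow:.1%} of values within 1 unit of the mean")
print(f"wide   (sigma=2): {p_wide:.1%} of values within 1 unit of the mean")
```

With these assumed values, the narrower distribution keeps roughly 68% of its values within 1 unit of the mean, compared with roughly 38% for the wider one.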

### Population parameters versus sample estimates

The mean and standard deviation are parameter values that apply to entire populations. For the normal distribution, statisticians signify the parameters by using the Greek symbol μ (mu) for the population mean and σ (sigma) for the population standard deviation.

Unfortunately, population parameters are usually unknown because it’s generally impossible to measure an entire population. However, you can use random samples to calculate estimates of these parameters. Statisticians represent sample estimates of these parameters using x̅ for the sample mean and s for the sample standard deviation.
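As a small sketch of how these estimates are calculated, the Python snippet below uses the standard library’s `statistics` module. The sample values are made up purely for illustration and are not the study data.

```python
from statistics import mean, pstdev, stdev

# Hypothetical random sample of heights in meters (made-up values).
sample_heights = [1.45, 1.52, 1.58, 1.49, 1.61, 1.50, 1.47, 1.55]

x_bar = mean(sample_heights)  # sample mean, the estimate of mu
s = stdev(sample_heights)     # sample standard deviation (n - 1 denominator)

# For comparison: the formula that treats the data as a whole population
# (n denominator). It is always a little smaller than s for the same data.
sigma_if_population = pstdev(sample_heights)

print(f"x-bar = {x_bar:.4f} m, s = {s:.4f} m")
```

The `stdev` function divides by n - 1, which is the usual choice when the data are a sample rather than the full population.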

**Related posts**: Measures of Central Tendency and Measures of Variability

## Common Properties for All Forms of the Normal Distribution

Despite the different shapes, all forms of the normal distribution have the following characteristic properties.

- They’re all symmetric. The normal distribution cannot model skewed distributions.
- The mean, median, and mode are all equal.
- Half of the population is less than the mean and half is greater than the mean.
- The Empirical Rule allows you to determine the proportion of values that fall within certain distances from the mean. More on this below!
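These properties can be checked numerically. The sketch below assumes Python 3.8 or later for `statistics.NormalDist` and borrows the mean and standard deviation from the height example. The offset of 0.1 m is an arbitrary choice.

```python
from math import isclose
from statistics import NormalDist

# Parameters taken from the height example earlier in the post.
heights = NormalDist(mu=1.512, sigma=0.0741)

# The mean, median, and mode coincide.
same_center = heights.mean == heights.median == heights.mode

# Half of the distribution lies below the mean.
below_mean = heights.cdf(heights.mean)

# Symmetry: the curve has the same height at equal distances
# on either side of the mean (0.1 m is an arbitrary offset).
offset = 0.1
left_density = heights.pdf(heights.mean - offset)
right_density = heights.pdf(heights.mean + offset)
is_symmetric = isclose(left_density, right_density)

print(same_center, below_mean, is_symmetric)
```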

While the normal distribution is essential in statistics, it is just one of many probability distributions, and it does not fit all populations. To learn how to determine whether the normal distribution provides the best fit to your sample data, read my post about How to Identify the Distribution of Your Data.

## The Empirical Rule for the Normal Distribution

When you have normally distributed data, the standard deviation becomes particularly valuable. You can use it to determine the proportion of the values that fall within a specified number of standard deviations from the mean. For example, in a normal distribution, 68% of the observations fall within +/- 1 standard deviation from the mean. This property is part of the Empirical Rule, which describes the percentage of the data that fall within specific numbers of standard deviations from the mean for bell-shaped curves.

| Mean +/- standard deviations | Percentage of data contained |
|---|---|
| 1 | 68% |
| 2 | 95% |
| 3 | 99.7% |

Let’s look at a pizza delivery example. Assume that a pizza restaurant has a mean delivery time of 30 minutes and a standard deviation of 5 minutes. Using the Empirical Rule, we can determine that 68% of the delivery times are between 25-35 minutes (30 +/- 5), 95% are between 20-40 minutes (30 +/- 2*5), and 99.7% are between 15-45 minutes (30 +/-3*5). The chart below illustrates this property graphically.
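The Empirical Rule percentages are rounded values. As a rough check, the Python sketch below computes the intervals and the exact areas for the pizza example with the standard library’s `statistics.NormalDist`. The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Parameters from the pizza delivery example: mean 30 minutes, SD 5 minutes.
delivery = NormalDist(mu=30, sigma=5)


def share_within(dist, k):
    """Return the interval mean +/- k SDs and the share of values inside it."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return low, high, dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low, high, share = share_within(delivery, k)
    print(f"+/- {k} SD: {low:.0f}-{high:.0f} minutes, {share:.1%}")
```

The computed shares are about 68.3%, 95.4%, and 99.7%, which the Empirical Rule rounds to 68%, 95%, and 99.7%.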

## Standard Normal Distribution and Standard Scores

As we’ve seen above, the normal distribution has many different shapes depending on the parameter values. However, the standard normal distribution is a special case of the normal distribution where the mean is zero and the standard deviation is 1. This distribution is also known as the Z-distribution.

A value on the standard normal distribution is known as a standard score or a Z-score. A standard score represents the number of standard deviations above or below the mean that a specific observation falls. For example, a standard score of 1.5 indicates that the observation is 1.5 standard deviations above the mean. On the other hand, a negative score represents a value below the average. The mean has a Z-score of 0.

Suppose you weigh an apple and it weighs 110 grams. There’s no way to tell from the weight alone how this apple compares to other apples. However, as you’ll see, after you calculate its Z-score, you know where it falls relative to other apples.

## Standardization: How to Calculate Z-scores

Standard scores are a great way to understand where a specific observation falls relative to the entire distribution. They also allow you to take observations drawn from normally distributed populations that have different means and standard deviations and place them on a standard scale. This standard scale enables you to compare observations that would otherwise be difficult.

This process is called standardization, and it allows you to compare observations and calculate probabilities across different populations. In other words, it permits you to compare apples to oranges. Isn’t statistics great!

To standardize your data, you need to convert the raw measurements into Z-scores.

To calculate the standard score for an observation, take the raw measurement, subtract the mean, and divide by the standard deviation. Mathematically, the formula for that process is the following:

$$Z = \frac{X - \mu}{\sigma}$$

X represents the raw value of the measurement of interest. Mu (μ) and sigma (σ) represent the parameters for the population from which the observation was drawn.

After you standardize your data, you can place them within the standard normal distribution. In this manner, standardization allows you to compare different types of observations based on where each observation falls within its own distribution.
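Standardization is a one-line calculation, and it can be reversed. The following minimal sketch uses the mean and standard deviation from the height example. The observation of 1.60 m is a hypothetical value, and the function names `z_score` and `raw_value` are my own.

```python
def z_score(x, mu, sigma):
    """Standardize a raw value: subtract the mean, divide by the SD."""
    return (x - mu) / sigma


def raw_value(z, mu, sigma):
    """Reverse the standardization to recover the raw measurement."""
    return mu + z * sigma


# Parameters from the height example; the observation is hypothetical.
mu, sigma = 1.512, 0.0741
height = 1.60

z = z_score(height, mu, sigma)
print(f"Z = {z:.2f}")  # number of standard deviations above the mean
print(f"back to meters: {raw_value(z, mu, sigma):.2f}")
```

A value equal to the mean always has a Z-score of 0, and positive or negative Z-scores indicate values above or below the mean.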

## Example of Using Standard Scores to Make an Apples to Oranges Comparison

Suppose we literally want to compare apples to oranges. Specifically, let’s compare their weights. Imagine that we have an apple that weighs 110 grams and an orange that weighs 100 grams.

If we compare the raw values, it’s easy to see that the apple weighs more than the orange. However, let’s compare their standard scores. To do this, we’ll need to know the properties of the weight distributions for apples and oranges. Assume that the weights of apples and oranges follow a normal distribution with the following parameter values:

| | Apples | Oranges |
|---|---|---|
| Mean weight (grams) | 100 | 140 |
| Standard deviation (grams) | 15 | 25 |

Now we’ll calculate the Z-scores:

- Apple = (110 - 100) / 15 = 0.667
- Orange = (100 - 140) / 25 = -1.6

The Z-score for the apple (0.667) is positive, which means that our apple weighs more than the average apple. It’s not an extreme value by any means, but it is above average for apples. On the other hand, the orange has a fairly negative Z-score (-1.6). It’s pretty far below the mean weight for oranges. I’ve placed these Z-values in the standard normal distribution below.

While our apple weighs more than our orange, we are comparing a somewhat heavier than average apple to a downright puny orange! Using Z-scores, we’ve learned how each fruit fits within its own distribution and how they compare to each other.
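The comparison can be reproduced in a few lines of Python. This is only a sketch: the dictionary layout and the `z_score` helper are my own choices, while the weights, means, and standard deviations come from the example.

```python
def z_score(x, mu, sigma):
    """Standardize a raw value: subtract the mean, divide by the SD."""
    return (x - mu) / sigma


# Observed weights and assumed population parameters, all in grams.
fruits = {
    "apple": {"weight": 110, "mean": 100, "sd": 15},
    "orange": {"weight": 100, "mean": 140, "sd": 25},
}

z_scores = {
    name: z_score(f["weight"], f["mean"], f["sd"])
    for name, f in fruits.items()
}

for name, z in z_scores.items():
    print(f"{name}: Z = {z:.3f}")
```

Although the apple is only 10 grams heavier in raw terms, the two Z-scores are more than two standard deviations apart on the common scale.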

## Finding Areas Under the Curve of a Normal Distribution

The normal distribution is a probability distribution. As with any probability distribution, the proportion of the area that falls under the curve between two points on a probability distribution plot indicates the probability that a value will fall within that interval. To learn more about this property, read my post about Understanding Probability Distributions.

Typically, I use statistical software to find areas under the curve. However, when you’re working with the normal distribution and convert values to standard scores, you can calculate areas by looking up Z-scores in a Standard Normal Distribution Table.

Because there are an infinite number of different normal distributions, publishers can’t print a table for each distribution. However, you can transform the values from any normal distribution into Z-scores, and then use a table of standard scores to calculate probabilities.

### Using a Table of Z-scores

Let’s take the Z-score for our apple (0.667) and use it to determine its weight percentile. A percentile is the proportion of a population that falls below a specific value. Consequently, to determine the percentile, we need to find the area that corresponds to the range of Z-scores that are less than 0.667. In the portion of the table below, the closest Z-score to ours is 0.65, which we’ll use.

The trick with these tables is to use the values in conjunction with the properties of the normal distribution to calculate the probability that you need. The table value indicates that the area of the curve between -0.65 and +0.65 is 48.43%. However, that’s not what we want to know. We want the area that is less than a Z-score of 0.65.

We know that the two halves of the normal distribution are mirror images of each other. So, if the area for the interval from -0.65 to +0.65 is 48.43%, then the range from 0 to +0.65 must be half of that: 48.43/2 = 24.215%. Additionally, we know that the area for all scores less than zero is half (50%) of the distribution.

Therefore, the area for all scores up to 0.65 = 50% + 24.215% = 74.215%

Our apple is at approximately the 74th percentile.

Below is a probability distribution plot produced by statistical software that shows the same percentile along with a graphical representation of the corresponding area under the curve. The value is slightly different because we used a Z-score of 0.65 from the table while the software uses the more precise value of 0.667.
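If you don’t have a table handy, the same steps can be sketched in Python. This example assumes Python 3.8 or later for `statistics.NormalDist`, and the variable names are my own.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area between -0.65 and +0.65, the quantity the table reports.
central_area = standard_normal.cdf(0.65) - standard_normal.cdf(-0.65)

# Table method: 50% below zero plus half of the central area.
below_table_z = 0.5 + central_area / 2

# Direct calculation with the unrounded Z-score for the apple.
apple_z = (110 - 100) / 15
below_exact_z = standard_normal.cdf(apple_z)

print(f"area between -0.65 and +0.65: {central_area:.2%}")
print(f"area below Z = 0.65:          {below_table_z:.2%}")
print(f"area below Z = {apple_z:.3f}:         {below_exact_z:.2%}")
```

The table method and the direct calculation for Z = 0.65 agree at about 74.2%. Using the unrounded Z-score gives a slightly larger area, a little under 75%.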

## Other Reasons Why the Normal Distribution is Important

In addition to all of the above, there are several other reasons why the normal distribution is crucial in statistics.

- Some statistical hypothesis tests assume that the data follow a normal distribution. However, as I explain in my post about parametric and nonparametric tests, there’s more to it than only whether the data are normally distributed.
- Linear and nonlinear regression both assume that the residuals follow a normal distribution. Learn more in my post about assessing residual plots.
- The central limit theorem states that as the sample size increases, the sampling distribution of the mean approaches a normal distribution even when the underlying distribution of the original variable is non-normal (provided that variable has a finite variance).
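The central limit theorem in the last bullet can be illustrated with a small simulation. The sketch below is only one way to do it. It draws from an exponential distribution, which is strongly right skewed, and uses skewness as a rough gauge of how asymmetric the sample means are. The sample sizes, the number of samples, and the seed are all arbitrary choices.

```python
import random
from statistics import mean, pstdev

random.seed(42)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Average cubed standardized deviation; 0 for a symmetric distribution."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


def sample_means(sample_size, n_samples=2000):
    """Means of repeated samples drawn from an exponential distribution."""
    return [
        mean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


# Skewness of the sampling distribution of the mean for each sample size.
skew_by_size = {n: skewness(sample_means(n)) for n in (1, 5, 30)}

for n, skew in skew_by_size.items():
    print(f"sample size {n:>2}: skewness of sample means = {skew:.2f}")
```

As the sample size grows, the skewness of the sample means should shrink toward zero, which is what you would expect if their distribution is becoming more symmetric and closer to normal.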

That was quite a bit about the normal distribution! Hopefully, you can understand that it is crucial because of the many ways that analysts use it.

MG says

Thank you very much for your great post. Cheers from MA

Jim Frost says

You’re very welcome! I’m glad it was helpful!

Fernando Antunez says

Jim, it is my understanding that the normal distribution is unique and it is the one that follows to perfection the 68 95 99.7%. The rest of the distributions are “approximately” normal, as you say when they get wider. They are still symmetric but not normal because they lost perfection to the empirical rule. I was taught this by a professor when I was doing my master’s in Stats

Jim Frost says

Hi Fernando, all normal distributions (for those cases where you input any values for the mean and standard deviation parameters) follow the Empirical Rule (68%, 95%, 99.7%). There are other symmetric distributions that aren’t quite normal distributions. I think you’re referring to these symmetric distributions that have thicker or thinner tails than normal distributions should. Kurtosis measures the thickness of the tails. Distributions with high kurtosis have thicker tails and those with low kurtosis have thinner tails. If a distribution has thicker or thinner tails than the true normal distribution, then the Empirical Rule doesn’t hold true. How off the rule is depends on how different the distribution is from a true normal distribution. Some of these distributions can be considered approximately normal.

However, this gets confusing because you can have true normal distributions that have wider spreads than other normal distributions. This spread doesn’t necessarily make them non-normal. The example of the wider distribution that I show in the Standard Deviation section is a true normal distribution. These wider normal distributions follow the Empirical Rule. If you have sample data and are trying to determine whether they follow a normal distribution, perform a normality test.

On the other hand, there are other distributions that are not symmetrical at all and very different from the normal distribution. They’re different by more than just the thickness of the tails. For example, the lognormal distribution can model very skewed distributions. Some of these distributions are nowhere close to being approximately normal!

So, you can have a wide variety of non-normal distributions that range from approximately normal to not close at all!

Khursheed Ahmad Ganaie says

I was eagerly waitng fr ths topic ..

Normal distribution

Thnks a lott ,,,,,,

Jim Frost says

You’re very welcome, Khursheed!

John-Harold says

Another great post. Simple, clear and direct language and logic.

Jim Frost says

Thanks so much! That’s always my goal–so your kind words mean a lot to me!

Masum Ahmed says

your are far better than my teachers. Thank you Jim

Jim Frost says

Thank you, Masum!

Muhammad Arif says

dear Jim, tell me please what is normality?. and how we can understand to use normal or any other distribution for a data set?

Jim Frost says

Hi Muhammad, you’re in the right place to find the information that you need! This blog post tells you all about the normal distribution. Normality simply refers to data that are normally distributed (i.e., the data follow the normal distribution).

I have links in this post to another post called Understand Probability Distributions that tells you about other distributions. And yet another link to a post that tells you How to Determine the Distribution of Your Data.

Muhammad Arif says

Many Many thanks for help dear Jim sir!

Jim Frost says

You’re very welcome!

Asis Kumar Dirghangi says

Excellent…..

Jim Frost says

Thank you, Asis!

Josh Pius says

I’m glad I stumbled across your blog. Wonderful work!! I’ve gained a new perspective on what statistics could mean to me

Jim Frost says

Hi Josh, that is awesome! My goal is to show that statistics can actually be exciting! So, your comment means a lot to me! Thanks!

Aashay Sukhthankar says

Hi Jim. What exactly do you mean by a true normal distribution. You’ve not used the word “true” anywhere in your post. Just plain normal distribution.

Jim Frost says

Hi Aashay, sorry about the confusing terminology. What I meant by true normal distribution is one that follows a normal distribution to mathematically perfect degree. For example, the graphs of all the normal distributions in this post are true normal distributions because the statistical software graphs them based on the equation for the normal distribution plus the parameter values for the inputs.

By the way, there is not one shape that corresponds to a true normal distribution. Instead, there are an infinite number and they’re all based on the infinite number of different means and standard deviations that you can input into the equation for the normal distribution.

Typically, data don’t follow the normal distribution exactly. A distribution test can determine whether the deviation from the normal distribution is statistically significant.

In the comment where I used this terminology, I was just trying to indicate how as a distribution deviated from a true normal distribution, the Empirical Rule also deviates.

I hope this helps.

Carlos says

Hello Jim, first of all, your page is very good, it has helped me a lot to understand statistics.

Query, then when I have a data set that is not distributed normally, should I first transform them to normal and then start working them? Greetings from Chile, CLT

Jim Frost says

Hi Carlos,

This gets a little tricky. For one thing, it depends what you want to do with the data. If you’re talking about hypothesis tests, you can often use the regular tests with non-normal data when you have a sufficiently large sample size. “Sufficiently large” isn’t really even that large. You can also use nonparametric tests for nonnormal data. There are several issues to consider, which I write about in my post that compares parametric and nonparametric hypothesis tests.

That should help clarify some of the issues. After reading that, let me know if you have any additional questions. Generally, I’m not a fan of transforming data because it completely changes the properties of your data.

Mona says

Do natural phenomena such as hemoglobin levels or the weight of ants really follow a normal distribution? If you add up a large number of random events, you get a normal distribution.

Jim Frost says

To obtain a normal distribution, you need the random errors to have an equal probability of being positive and negative and the errors are more likely to be small than large.

Many datasets will naturally follow the normal distribution. For example, the height data in this blog post are real data and they follow the normal distribution. However, not all datasets and variables have that tendency. The weight data for the same subjects that I used for the height data are not normally distributed. Those data are right skewed–which you can read about in my post about identifying the distribution of a dataset.

Qaz says

Sir kindly guide me. I have panel data. My all variables are not normally distributed. data is in ratios form. My question is that , For descriptive statistics and correlation analysis, do i need to use raw data in its original form?? and transformed data for regression analysis only?

Moreover, which transformation method should be used for ratios, when data is highly positively or negatively skewed. I tried, log, difference, reciprocal, but could not get the normality.

Kindly help me. Thank You

Sanjay Sinha says

Fantastic way of explaining

Jim Frost says

Thank you, Sanjay!

Noor Nawaz says

Nice work sir…