The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena. For example, heights, blood pressure, measurement error, and IQ scores tend to follow the normal distribution, at least approximately. It is also known as the Gaussian distribution and the bell curve.
The normal distribution is a probability function that describes how the values of a variable are distributed. It is a symmetric distribution where most of the observations cluster around the central peak and the probabilities for values further away from the mean taper off equally in both directions. Extreme values in both tails of the distribution are similarly unlikely.
In this blog post, you’ll learn how to use the normal distribution, about its parameters, and how to calculate Z-scores to standardize your data and find probabilities.
Example of Normally Distributed Data: Heights
Height data are normally distributed. The distribution in this example fits real data that I collected from 14-year-old girls during a study.
As you can see, the distribution of heights follows the typical pattern for all normal distributions. Most girls are close to the average (1.512 meters). Small differences between an individual’s height and the mean occur more frequently than substantial deviations from the mean. The standard deviation is 0.0741m, which indicates the typical distance that individual girls tend to fall from mean height.
The distribution is symmetric. The number of girls shorter than average equals the number of girls taller than average. In both tails of the distribution, extremely short girls occur as infrequently as extremely tall girls.
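The symmetry described above can be checked numerically. The following is a minimal sketch, assuming the fitted mean (1.512 m) and standard deviation (0.0741 m) reported in this post and using `NormalDist` from Python's standard library. The `offset` value is a hypothetical distance chosen only for illustration.

```python
from statistics import NormalDist

# Parameters reported in the post for the height data (meters).
heights = NormalDist(mu=1.512, sigma=0.0741)

# Symmetry: half of the fitted distribution lies below the mean.
share_below_mean = heights.cdf(1.512)

# Equal distances on either side of the mean have equal tail areas.
offset = 0.15  # hypothetical distance from the mean, in meters
share_much_shorter = heights.cdf(1.512 - offset)
share_much_taller = 1 - heights.cdf(1.512 + offset)

print(f"Below the mean: {share_below_mean:.3f}")
print(f"More than {offset} m shorter: {share_much_shorter:.4f}")
print(f"More than {offset} m taller:  {share_much_taller:.4f}")
```

The two tail shares should match, which is what "extremely short girls occur as infrequently as extremely tall girls" means in terms of areas under the curve.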
Parameters of the Normal Distribution
As with any probability distribution, the parameters for the normal distribution define its shape and probabilities entirely. The normal distribution has two parameters, the mean and standard deviation. The normal distribution does not have just one form. Instead, the shape changes based on the parameter values, as shown in the graphs below.
Mean
The mean is the central tendency of the distribution. It defines the location of the peak for normal distributions. Most values cluster around the mean. On a graph, changing the mean shifts the entire curve left or right on the X-axis.
Standard deviation
The standard deviation is a measure of variability. It defines the width of the normal distribution. The standard deviation determines how far away from the mean the values tend to fall. It represents the typical distance between the observations and the average.
On a graph, changing the standard deviation either tightens or spreads out the width of the distribution along the X-axis. Larger standard deviations produce distributions that are more spread out.
When you have narrow distributions, the probabilities are higher that values won’t fall far from the mean. As you increase the spread of the distribution, the likelihood that observations will be further away from the mean also increases.
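To make the effect of each parameter concrete, here is a small sketch using `NormalDist` from Python's standard library. The three distributions and their parameter values are made up for illustration and do not come from any dataset in this post.

```python
from statistics import NormalDist

# Hypothetical parameter values, chosen only to contrast the shapes.
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)
shifted = NormalDist(mu=3, sigma=1)

# Changing the mean moves the peak without changing its height.
peak_narrow = narrow.pdf(0)
peak_shifted = shifted.pdf(3)

# A larger standard deviation lowers the peak and spreads the curve out.
peak_wide = wide.pdf(0)

# Probability of landing more than 2 units from the mean (both tails).
far_narrow = 2 * narrow.cdf(-2)
far_wide = 2 * wide.cdf(-2)

print(f"Peak heights: narrow={peak_narrow:.3f}, shifted={peak_shifted:.3f}, wide={peak_wide:.3f}")
print(f"P(more than 2 units from mean): narrow={far_narrow:.3f}, wide={far_wide:.3f}")
```

The wider distribution puts noticeably more probability far from the mean, which is the pattern described in the paragraph above.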
Population parameters versus sample estimates
The mean and standard deviation are parameter values that apply to entire populations. For the normal distribution, statisticians signify the parameters by using the Greek symbol μ (mu) for the population mean and σ (sigma) for the population standard deviation.
Unfortunately, population parameters are usually unknown because it’s generally impossible to measure an entire population. However, you can use random samples to calculate estimates of these parameters. Statisticians represent sample estimates of these parameters using x̅ for the sample mean and s for the sample standard deviation.
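As a rough illustration of estimating the parameters from a sample, the sketch below computes x̅ and s with Python's standard library. The sample values are invented for this example. They are not the height data from the study.

```python
from statistics import mean, stdev

# Hypothetical sample of heights in meters (made-up values for illustration).
sample = [1.48, 1.55, 1.51, 1.60, 1.44, 1.53, 1.49, 1.57]

x_bar = mean(sample)  # sample mean, an estimate of mu
s = stdev(sample)     # sample standard deviation (n - 1 denominator), an estimate of sigma

print(f"x-bar = {x_bar:.4f} m, s = {s:.4f} m")
```

Note that `stdev` uses the n − 1 denominator appropriate for samples, whereas `pstdev` in the same module treats the data as an entire population.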
Related posts: Measures of Central Tendency and Measures of Variability
Common Properties for All Forms of the Normal Distribution
Despite the different shapes, all forms of the normal distribution have the following characteristic properties.
- They’re all symmetric. The normal distribution cannot model skewed distributions.
- The mean, median, and mode are all equal.
- Half of the population is less than the mean and half is greater than the mean.
- The Empirical Rule allows you to determine the proportion of values that fall within certain distances from the mean. More on this below!
While the normal distribution is essential in statistics, it is just one of many probability distributions, and it does not fit all populations. To learn how to determine whether the normal distribution provides the best fit to your sample data, read my posts about How to Identify the Distribution of Your Data and Assessing Normality: Histograms vs. Normal Probability Plots.
The Empirical Rule for the Normal Distribution
When you have normally distributed data, the standard deviation becomes particularly valuable. You can use it to determine the proportion of the values that fall within a specified number of standard deviations from the mean. For example, in a normal distribution, 68% of the observations fall within +/- 1 standard deviation from the mean. This property is part of the Empirical Rule, which describes the percentage of the data that fall within specific numbers of standard deviations from the mean for bell-shaped curves.
Mean +/- standard deviations | Percentage of data contained |
1 | 68% |
2 | 95% |
3 | 99.7% |
Let’s look at a pizza delivery example. Assume that a pizza restaurant has a mean delivery time of 30 minutes and a standard deviation of 5 minutes. Using the Empirical Rule, we can determine that 68% of the delivery times are between 25-35 minutes (30 +/- 5), 95% are between 20-40 minutes (30 +/- 2*5), and 99.7% are between 15-45 minutes (30 +/-3*5). The chart below illustrates this property graphically.
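The Empirical Rule percentages are rounded. As a quick check, this sketch computes the exact areas for the pizza example (mean 30 minutes, standard deviation 5 minutes) using `NormalDist` from Python's standard library. The variable names are my own.

```python
from statistics import NormalDist

mu, sigma = 30, 5  # minutes, from the pizza delivery example
delivery = NormalDist(mu=mu, sigma=sigma)

# Share of delivery times within k standard deviations of the mean.
shares = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    shares[k] = delivery.cdf(high) - delivery.cdf(low)
    print(f"{low}-{high} minutes: {shares[k]:.1%}")
```

The computed shares are about 68.3%, 95.4%, and 99.7%, which the Empirical Rule rounds to 68%, 95%, and 99.7%.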
Standard Normal Distribution and Standard Scores
As we’ve seen above, the normal distribution has many different shapes depending on the parameter values. However, the standard normal distribution is a special case of the normal distribution where the mean is zero and the standard deviation is 1. This distribution is also known as the Z-distribution.
A value on the standard normal distribution is known as a standard score or a Z-score. A standard score represents the number of standard deviations above or below the mean that a specific observation falls. For example, a standard score of 1.5 indicates that the observation is 1.5 standard deviations above the mean. On the other hand, a negative score represents a value below the average. The mean has a Z-score of 0.
Suppose you weigh an apple and it weighs 110 grams. There’s no way to tell from the weight alone how this apple compares to other apples. However, as you’ll see, after you calculate its Z-score, you know where it falls relative to other apples.
Standardization: How to Calculate Z-scores
Standard scores are a great way to understand where a specific observation falls relative to the entire distribution. They also allow you to take observations drawn from normally distributed populations that have different means and standard deviations and place them on a standard scale. This standard scale enables you to compare observations that would otherwise be difficult.
This process is called standardization, and it allows you to compare observations and calculate probabilities across different populations. In other words, it permits you to compare apples to oranges. Isn’t statistics great!
To standardize your data, you need to convert the raw measurements into Z-scores.
To calculate the standard score for an observation, take the raw measurement, subtract the mean, and divide by the standard deviation. Mathematically, the formula for that process is the following:
Z = (X − μ) / σ
X represents the raw value of the measurement of interest. μ (mu) and σ (sigma) represent the mean and standard deviation of the population from which the observation was drawn.
After you standardize your data, you can place them within the standard normal distribution. In this manner, standardization allows you to compare different types of observations based on where each observation falls within its own distribution.
Example of Using Standard Scores to Make an Apples to Oranges Comparison
Suppose we literally want to compare apples to oranges. Specifically, let’s compare their weights. Imagine that we have an apple that weighs 110 grams and an orange that weighs 100 grams.
If we compare the raw values, it’s easy to see that the apple weighs more than the orange. However, let’s compare their standard scores. To do this, we’ll need to know the properties of the weight distributions for apples and oranges. Assume that the weights of apples and oranges follow a normal distribution with the following parameter values:
Parameter | Apples | Oranges |
Mean weight (grams) | 100 | 140 |
Standard deviation (grams) | 15 | 25 |
Now we’ll calculate the Z-scores:
- Apple = (110-100) / 15 = 0.667
- Orange = (100-140) / 25 = -1.6
The Z-score for the apple (0.667) is positive, which means that our apple weighs more than the average apple. It’s not an extreme value by any means, but it is above average for apples. On the other hand, the orange has a fairly negative Z-score (-1.6). It’s pretty far below the mean weight for oranges. I’ve placed these Z-values in the standard normal distribution below.
While our apple weighs more than our orange, we are comparing a somewhat heavier than average apple to a downright puny orange! Using Z-scores, we’ve learned how each fruit fits within its own distribution and how they compare to each other.
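The same comparison can be written as a few lines of code. This is a minimal sketch of the standardization step, where `z_score` is a helper name I made up for the example and the means and standard deviations are the assumed values from the table above.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma

# Assumed population parameters from the apples and oranges table (grams).
apple_z = z_score(110, mu=100, sigma=15)
orange_z = z_score(100, mu=140, sigma=25)

print(round(apple_z, 3), round(orange_z, 3))  # prints: 0.667 -1.6
```

Because both scores are on the same standard scale, they can be compared directly even though the two fruits come from different distributions.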
Finding Areas Under the Curve of a Normal Distribution
The normal distribution is a probability distribution. As with any probability distribution, the proportion of the area that falls under the curve between two points on a probability distribution plot indicates the probability that a value will fall within that interval. To learn more about this property, read my post about Understanding Probability Distributions.
Typically, I use statistical software to find areas under the curve. However, when you’re working with the normal distribution and convert values to standard scores, you can calculate areas by looking up Z-scores in a Standard Normal Distribution Table.
Because there are an infinite number of different normal distributions, publishers can’t print a table for each distribution. However, you can transform the values from any normal distribution into Z-scores, and then use a table of standard scores to calculate probabilities.
Using a Table of Z-scores
Let’s take the Z-score for our apple (0.667) and use it to determine its weight percentile. A percentile is the proportion of a population that falls below a specific value. Consequently, to determine the percentile, we need to find the area that corresponds to the range of Z-scores that are less than 0.667. In the portion of the table below, the closest Z-score to ours is 0.65, which we’ll use.
The trick with these tables is to use the values in conjunction with the properties of the normal distribution to calculate the probability that you need. The table value indicates that the area of the curve between -0.65 and +0.65 is 48.43%. However, that’s not what we want to know. We want the area that is less than a Z-score of 0.65.
We know that the two halves of the normal distribution are mirror images of each other. So, if the area for the interval from -0.65 to +0.65 is 48.43%, then the range from 0 to +0.65 must be half of that: 48.43/2 = 24.215%. Additionally, we know that the area for all scores less than zero is half (50%) of the distribution.
Therefore, the area for all scores up to 0.65 = 50% + 24.215% = 74.215%
Our apple is at approximately the 74th percentile.
Below is a probability distribution plot produced by statistical software that shows the same percentile along with a graphical representation of the corresponding area under the curve. The value is slightly different because we used a Z-score of 0.65 from the table while the software uses the more precise value of 0.667.
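If you don't have a Z-table handy, the table arithmetic above can be reproduced in code. The sketch below uses `NormalDist` from Python's standard library as a stand-in for the statistical software mentioned in the post. It is not the software that produced the original plot.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Table-style lookup: area between -0.65 and +0.65, then the symmetry argument.
central_area = standard_normal.cdf(0.65) - standard_normal.cdf(-0.65)
percentile_table = 0.5 + central_area / 2

# Direct lookup of the area below Z = 0.65 gives the same value.
percentile_direct = standard_normal.cdf(0.65)

# Using the unrounded Z-score (110 g apple, mean 100 g, standard deviation 15 g).
percentile_precise = NormalDist(mu=100, sigma=15).cdf(110)

print(f"Central area:       {central_area:.4f}")
print(f"Percentile (table): {percentile_table:.4f}")
print(f"Percentile (exact): {percentile_precise:.4f}")
```

The percentile based on the unrounded Z-score comes out slightly higher than the table-based figure, which is consistent with the small difference between the table calculation and the software plot.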
Related post: Percentiles: Interpretations and Calculations
Other Reasons Why the Normal Distribution is Important
In addition to all of the above, there are several other reasons why the normal distribution is crucial in statistics.
- Some statistical hypothesis tests assume that the data follow a normal distribution. However, as I explain in my post about parametric and nonparametric tests, there’s more to it than only whether the data are normally distributed.
- Linear and nonlinear regression both assume that the residuals follow a normal distribution. Learn more in my post about assessing residual plots.
- The central limit theorem states that as the sample size increases, the sampling distribution of the mean approaches a normal distribution even when the underlying distribution of the original variable is non-normal (provided it has a finite variance).
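The central limit theorem in the last bullet can be illustrated with a small simulation. This is only a sketch under assumptions of my own: the exponential distribution as the skewed population, a sample size of 40, 5,000 repetitions, and a fixed random seed for repeatability.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential distribution (strongly right-skewed)."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

n = 40  # hypothetical sample size
means = [sample_mean(n) for _ in range(5000)]

center = mean(means)   # should be close to the population mean of 1
spread = stdev(means)  # should be close to 1 / sqrt(n)

# Share of sample means within one standard deviation of their center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"center={center:.3f}, spread={spread:.3f}, within 1 SD={within_one_sd:.1%}")
```

Even though individual exponential values are heavily skewed, the share of sample means within one standard deviation should land near the 68% that the Empirical Rule predicts for a normal distribution.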
That was quite a bit about the normal distribution! Hopefully, you can understand that it is crucial because of the many ways that analysts use it.
If you’re learning about statistics and like the approach I use in my blog, check out my Introduction to Statistics eBook!
rebecca chaison says
What does random variable X̄ (capital x-bar) mean? How would you describe it?
Jim Frost says
X-bar refers to the variable’s mean.
Seidu says
Very helpful
sookhooreea luqmaan says
hi
how can i compare between binomial, normal and poisson distribution?
Oumayma Bounouh says
Dear jim
Thank you very much for your post. It clarifies many notions.
I have an issue I hope you have the answer.
To combine forecasting models, I have chosen to calculate the weights based on the normal distribution. The latter is fitted on the past observations of the data I am forecasting. In this case, are the weights equal to the PDF, or should I treat it as an error measure, so that they would be equal to 1/PDF?
Jude says
My problem in interpreting normal and poisson distribution remains. When you want to calculate the probability of selling a random number of apples in a week for instance and you want to work this out with excel spreadsheet How do you know when to subtract your answer from one or not? Is the mean the sole reference?
Thibert says
Hi Jim,
Thank you for your post.
I have one small question concerning the Empirical Rule (68%, 95%, 99.7%):
In a normal distribution, 68% of the observations will fall between +/- 1 standard deviation from the mean.
For example, the lateral deviation of a dart from the middle of the bullseye is defined by a normal distribution with a mean of 0 cm and a standard deviation of 5 cm. Would it be possible to affirm that there is a probability of 68% that the dart will hit the board inside a ring of radius of 5 cm?
I’m confused because for me the probability of having a lateral deviation smaller than the standard deviation (x < 1 m ) is 84%.
Thank you,
Jim Frost says
Hi Thibert,
If it was me playing darts, the standard deviation would be much higher than 5 cm!
So, your question is really about two aspects: accuracy and precision.
Accuracy has to do with where the darts fall on average. The mean of zero indicates that on average the darts center on the bullseye. If it had been a non-zero value, the darts would have been centered elsewhere.
The standard deviation has to do with precision, which is how close to the target the darts tend to hit. Because the darts clustered around the bullseye and have a standard deviation of 5cm, you’d be able to say that 68% of darts will fall within 5cm of the bullseye assuming the distances follow a normal distribution (or at least fairly close).
I’m not sure what you’re getting at with the lateral deviation being less than the standard deviation? I thought you were defining it as the standard deviation? I’m also not sure where you’re getting 84% from? It’s possible I’m missing something that you’re asking about.
Silvia says
Hi Jim,
I hope this question is relevant. I’ve been trying to find an answer to this question for quite some time.
Is it possible to correlate two samples if one is normally distributed and the other is not normally distributed?
Many thanks for your time.
Jim Frost says
Hi Silvia,
When you’re talking about Pearson’s correlation between two continuous variables, it assumes that the two variables follow a bivariate normal distribution. Defining that is a bit complicated! Read here for a technical definition. However, when you have more than 25 observations, you can often disregard this assumption.
Additionally, as I write in my post about correlation, you should graph the data. Sometimes a graph is an obvious way to know when you won’t get good results!
I hope that helps!
ARCHANA DIXIT says
Awesome explanation Jim, all doubts about Z score got cleared up. by any chance do you have a soft copy of your book. or is it available in India?
Thanks.
Jim Frost says
Hi Archana, I’m glad this post was helpful!
You can get my ebooks from anywhere in the world. Just go to My Store.
My Introduction to Statistics book, which is the one that covers the normal distribution among others, is also available in print. You should be able to order that from your preferred online retailer or ask a local bookstore to order it (ISBN: 9781735431109).
ararc says
super explanation
Atharva says
you can use Python Numpy library random.normal
adnan says
Experimentalists always aspire to have normally distributed data, but in reality the data deviate from normal distribution behaviour. How is this issue addressed to approximate the values?
Jim Frost says
Hi Adnan,
I’m always surprised at how often the normal distribution actually fits real data. And, in regression analysis, it is often not hard to get the residuals to follow a normal distribution. However, when the data/residuals absolutely don’t follow the normal distribution, all is not lost! For one thing, the central limit theorem allows you to use many parametric tests even with nonnormal data. You can also use nonparametric tests with nonnormal data. And, while I always consider it a last resort, you can transform the data so it follows the normal distribution.
Javeria says
Blood pressure of 150 doctors was recorded. The mean BP was found to be 12.7 mmHG. The standard deviation was calculated to be 6mmHG. If blood pressure is normally distributed then how many doctors will have systolic blood pressure above 133 mmHG?
Jim Frost says
Hi,
Calculate the Z-score for the value in question. I’m guessing that is 13.3 mmHG rather than 133! I show how to do that in this article. Then use a Z-table to look up that Z-score, which I also show in this article. You can find online Z-tables to help you out.
DOSA says
Good day professor
I would like to know what the difference is between “sampling on the mean of value” and “normal distribution”.
I really appreciate any help from you
Thank
Jim Frost says
Hi Dosa,
I’m not really clear about what you’re asking. Normal distribution is a probability function that explains how values of a population/sample are distributed. I’m not sure what you mean by “sampling on the mean of value”? However, if you take a sample, you can calculate the mean for that sample. If you collected a random sample, then the sample mean is an unbiased estimator of the population mean. Further, if the population follows a normal distribution, then the mean also serves as one of the two parameters for the normal distribution, the other being the standard deviation.
Jeff says
Hello sir! I am a student, and have little knowledge about statistics and probability. How can I answer this (normal curve analysis), given by my teacher, here as follows: A production machine has a normally distributed daily output in units. The average daily output is 4000 and daily output standard deviation is 500. What is the probability that the production of one random day will be below 3580?
Thank you so much and God bless you! 🙂
Jim Frost says
Hi Jeff,
You’re looking at the right article to calculate your answer! The first step is for you to calculate your Z-score. Look for the section titled–Standardization: How to Calculate Z-scores. You need to calculate the Z-score for the value of 3580.
After calculating your z-score, look at the section titled–Using a Table of Z-scores. You won’t be able to use the little snippet of a table that I include there, but there are online Z-score tables. You need to find the proportion of the area under the curve to the left of your z-score. That proportion is your probability! Hint: Because the value you’re considering (3580) is below the mean (4000), you will have a negative Z-score.
If you’d like me to verify your answer, I’d be happy to do that. Just post it here.
Chanachok chokwitthaya says
I would like to cite your book in my journal paper but I can’t find its ISBN. Could you please provide me the ISBN?
Myo Hein says
Yours works really helped me to understand about normal distributions. Thank you so much
Michael says
Wow I loved this post, for someone who knows nothing about statistics, it really helped me understand why you would use this in a practical sense. I’m trying to follow a course on Udemy on BI that simply describes Normal Distribution and how it works, but without giving any understanding of why its used and how it could be used with examples. So, having the apples and oranges description really helped me!
Jim Frost says
Hi Michael,
Your kind comment totally made my day! Thanks so much!
Jay says
Hi Sir,
I am still a newbie in statistics. I have curious question.
I have always heard people saying they need to make data to be of normal distribution before running prediction models.
And i have heard many methods, one which is standardisation and others are log transformation/cube root etc.
If i have a dataset that has both age and weight variables and the distribution are not normal. Should i use transform them using the Z score standardisation or can i use other methods such as log transformation to normalise them? Or should i log transform them first, and then standardise them again using Z score?
I can’t really wrap my head around these..
Thank you much!
Jim Frost says
Hi Jay,
Usually for predictive models, such as using regression analysis, it’s the residuals that have to be normally distributed rather than the dependent variable itself. If the residuals are not normal, transforming the dependent variable is one possible solution. However, that should be a last resort. There are other possible solutions you should try first, which I describe in my post about least squares assumptions.
Nitin Manepalli says
Sir, Can I have the reference ID of yours to add to my paper
Dr. Ramnath Takiar says
In case of any skewed data, some transformation like log transformation can be attempted. In most of the cases, the log transformation reduces the skewness. With transformed Mean and SD, find the 95% confidence Interval that is Mean – 2SD to Mean+2SD. Having obtained the transformed confidence interval, take antilog of the lower and upper limit. Now, any value not falling in the confidence interval can be treated as an outlier.
Uma Shankar says
Hi Jim, Thanks for the wonderful explanation. I have been doing a target-setting exercise and my data is skewed. In this scenario, how should I approach target setting? Also, how should I approach outlier detection for skewed data? Thanks in advance.
dbadrysys says
Why do we need to use z-score when the apple and orange have the same unit measurement (gram) ?
Jim Frost says
Even when you use the same measurement, z-scores can still provide helpful information. In the example, I show how the z-scores for each show where they fall within their own distribution and they also highlight the fact that we’re comparing a very underweight orange to a somewhat overweight apple.
dbadrysys says
Hi Jim,
In the “Example of Using Standard Scores to Make an Apples to Oranges Comparison” section, Could you explain detail the meaning when we have a z-score of apple and orange ?
Thanks.
Jim Frost says
I compare those two scores and explain what they mean. I’m not sure what more you need?
Khalid Rashid says
Hi!
I have a data report which gives Mean = 1.91, S.D. = 1.06, N=22. The data range is between 1 and 5. Is it possible to generate the 22 points of the data from this information.
Thanks.
Jim Frost says
Hi Khalid,
Unfortunately, you can’t reconstruct a dataset using those values.
Midhat Zahra says
Okay..now I’ve got it. Thank you so much. And your post is really helpful to me. Actually because of this I can complete my notes..thank you..✨
Midhat Zahra says
In different posts about the normal distribution, they have written variance as a parameter, and even my teacher includes variance as the parameter.
So it’s really confusing on what basis the standard deviation is the parameter and on what basis others say variance is the parameter.
And I’m really sorry for bothering you again and again…🙂
Jim Frost says
Hi Midhat,
Standard deviation and variance are different but closely related: the standard deviation is the square root of the variance. Because either one determines the other, both conventions appear in practice. Many textbooks write the normal distribution as N(μ, σ²), with the variance as the second parameter, while this post uses the standard deviation σ. The two parameterizations describe exactly the same distributions.
Midhat Zahra says
Hi!
It’s really helpful.. thank you so much.
But I am confused: one of the parameters of the normal distribution is the standard deviation. Can we also say that the parameter of the standard deviation is the “variance”?
Jim Frost says
Hi Midhat,
Standard deviations and variances are two different measures of variation. They are related but different. The standard deviation is the square root of the variance. Read my post about measures of variability and focus on the sections about those measures for more information.
Kanchana Baradhwaj says
Hey Jim,
This is a great explanation for why we standardize values and the significance of a z-score. You managed to explain a concept that multiple professors and online trainings were unable to explain.
Though I was able to understand the formulae and how to calculate all these values, I was unable to understand WHY we needed to do it. Your post made that very clear to me!
Thank you for taking the time to put this together and for picking examples that make so much sense!
Kanchana
Will says
Hi Jim, thanks for an awesome blog. Currently I am busy with an assignment for university where I got a broad task, I have to find out if a specific independent variable and a specific dependent variable are linearly related in a hedonic pricing model.
In plain English, would checking for the linear relationship mean that I check the significance level of the specific independent variable within the broader hedonic pricing model? If so, should I check for anything else? If I am completely wrong, what would you advise me to do instead?
Sorry for such a long question, but me and classmates are a bit lost over the ambiguity of the assignment, as we are all not that familiar with statistics.
I thank you for your time!
Jim Frost says
Hi Will,
Measures of statistical significance won’t indicate the nature of the relationship between two variables. For example, if you have a curved, positive relationship between X and Y, you might still obtain a significant result if you fit a straight line relationship between the two. To really see the nature of the relationship between variables, you should graph them in a scatterplot.
I hope this helps!
Ibrahim says
your blog is awesome
I’m confused ...
When do we add or subtract the 0.5 area?
Jim Frost says
Hi Ibrahim,
Sorry, but I don’t understand what you’re asking. Can you provide more details?
Anupama Balan Menon says
Hello Jim!
How did (30+-2)*5 = 140-160 become 20 to 40 minutes?
Looking forward to your reply..
Thanks!
Jim Frost says
Hi Anupama,
You have to remember your order of operations in math! You put your parentheses in the wrong place. What I wrote is equivalent to 30 +/- (2*5). Remember, multiplication before addition and subtraction. 🙂
Mark Anthony Sulite says
what are the three different ways to find probabilities for normal distribution?
Jim Frost says
Hi Mark, if I understand your question correctly, you’ll find your answers in this blog post.
MBOOWA says
for really you have opened my eyes
Daniel says
Hi Jim, why is the normal distribution important?
How can you assess normality using graphical techniques like histograms and box plots?
Jim Frost says
Hi Daniel,
I’d recommend using a normal probability plot to graphically assess normality. I write about it in this post that compares histograms and normal probability plots.
I hope this helps!
Cynthia Dickerson says
Check your Pearson’s coefficient of skew. 26 “high outliers” sounds to me like you have right-tailed aka positive skew, possibly. Potentially, it is only moderately skewed so you can still assume normality. If it is highly skewed, you need to transform it and then do calculations. Transforming is way easier than it sounds; Google can show you how to do that.
Jim Frost says
Hi Cynthia,
This is a case where diagnosing the situation can be difficult without the actual dataset. For others, here’s the original comment in question.
On the one hand, having 26 high outliers and only 3 low outliers does give the impression of a skew. However, we can’t tell the extremeness of the high versus low outliers. Perhaps the high outliers are less extreme?
On the other hand, the commenter wrote that a normality test indicated the distribution is normally distributed and that a histogram also looks normally distributed. Furthermore, the fact that the mean and median are close together suggests it is a symmetric distribution rather than skewed.
There are a number of uncertainties as well. I don’t know the criteria the original commenter is using to identify outliers. And, I was hoping to determine the sample size. If it’s very large, then even 26 outliers is just a small fraction and might be within the bounds of randomness.
On the whole, the bulk of the evidence suggests that the data follow a normal distribution. It’s hard to say for sure. But, it sounds like we can rule out a severe skew at the very least.
You mention using a data transformation. And, you’re correct, they’re fairly easy to use. However, I’m not a big fan of transforming data. I consider it a last resort and not my “go to” option. The problem is that you’re analyzing the transformed data rather than the original data. Consequently, the results are not intuitive. Fortunately, thanks to the central limit theorem, you often don’t need to transform the data even when they are skewed. That’s not to say that I’d never transform data, I’d just look for other options first.
You also mention checking the Pearson’s coefficient of skewness, which is a great idea. However, for this specific case, it’s probably pretty low. You calculate this coefficient by finding the difference between the mean and median, multiplying that by three, and then dividing by the standard deviation. For this case, the commenter indicated that the mean and median were very close together, which means the numerator in this calculation is small and, hence, the coefficient of skewness is small. But, you’re right, it’s a good statistic to look at in general.
Thanks for writing and adding your thoughts to the discussion!
Prudence says
Thank you very much Jim,I understand this better
Umang Gada says
Hi Jim,
What if my distribution has a like 26 outliers on the high end and 3 on the low end and still my mean and median happen to be pretty close. the distribution on a histogram looks normal too. and the ryan joiner test produces the p-value of >1.00. will this distribution be normal?
Jim Frost says
Hi Umang,
Based on what you write, it sure sounds like it’s normally distributed. What’s your sample size and how are you defining outliers?
Eider Diaz says
i just want to say thanks a lot Jim, greetings from mexico
Jim Frost says
Hi Eider,
You’re very welcome!!! 🙂
HERBERT says
Thank you very much Jim. You have simplified this for me and I found it very easy to understand everything.
Sameer says
Hi, Jim,
Your blog is wonderful. Thanks a lot.
Nikola says
Dear JIm,
I want to compare trends of R&D expenditures before and after the crisis, and I was planning to use a paired t-test or its nonparametric alternative. Before that, I ran normality tests, and I have one problem. The normality tests show that one variable has a normal distribution and the other has a non-normal distribution. So, my question is: should I use the paired t-test or its nonparametric alternative? You can see the results in the table.
Thank you.
                Kolmogorov-Smirnov stat (p)   Shapiro-Wilk stat (p)
Before crisis   0.131 (0.200)                 0.994 (0.992)
After crisis    0.431 (0.003)                 0.697 (0.009)
Jim Frost says
Hi Nikola,
There are several other issues in addition to normality that you should consider. And, nonnormally distributed data doesn’t necessarily indicate you can’t use a parametric test, such as a paired t-test. I detail the various things you need to factor into your decision in this blog post: Parametric vs. Nonparametric tests. That post should answer your questions!
Chankey Pathak says
I’m trying to refresh Stats & Probability after being away from it for about 10 years. Your blog is really helping me out.
PAKA says
Very useful post. I will be visiting your blog again!
Ushawu says
Great. The simple yet practical explanation helped me a lot
Michel says
You made an error with the calculation for the apples and oranges: You said 110-100/15 = 0.667 but that is wrong because 110-100/15=110-6.667=103.333
Jim Frost says
Hi Michel,
Thanks for catching that! Actually, the answer is correct (0.667), but I should have put parentheses in the correct places. I’ll add those. Although, I did define how to calculate Z-scores with an equation in the previous section.
For that example, a score of 110 in a population that has a mean of 100 and a standard deviation of 15 has a Z-score of 0.667. It is two-thirds of a standard deviation above the mean. If it truly had a Z-score of 103.333, it would be 103 standard deviations above the mean which is remarkably far out in the tail of the distribution!
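To make the order-of-operations point concrete, here is a minimal Python sketch. The function name `z_score` is just an illustrative choice; the numbers are the ones from this example.

```python
def z_score(x, mean, sd):
    """Standardize x: subtract the mean first, then divide by the standard deviation."""
    return (x - mean) / sd

# With parentheses: (110 - 100) / 15
z = z_score(110, 100, 15)
print(round(z, 3))      # 0.667

# Without parentheses, division happens first: 110 - (100 / 15)
wrong = 110 - 100 / 15
print(round(wrong, 3))  # 103.333
```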
Alana says
Thank you Jim
Is there a tutorial that you know of that explains how to do this please
Alana says
Hi Jim
How did you produce your apa style graphs I need to show where my score lies on a normal distribution
Did you use Spss to produce the graphs shown here please
Jim Frost says
Hi Alana,
I used Minitab statistical software to create the graphs.
L Udaya Simha says
Material is very informative. Taken the extract of this for my lecture.
Udaya Simha
Alana says
Hello Jim
I was hoping that you could help me with my z scores for my assignment. Specifically I need help with interpreting the data!!!!
I need to compare my my z scores for each of the big five personality traits to that of my peers in the unit
The population mean for Openness was 85.9 with standard deviation of 11.8. My score was 71. Which gives me a z score of -1.26
The population mean for Agreeableness was 91.5, standard deviation was 11. My score was 94. Which gives me a z score of 0.23
Now the part I am having trouble with is I need to work out how much higher, lower or approximately average I am on each trait domain compared to my peers and I literally have no idea how I go about this!
I understand that a score of 0.23 is in the range of one SD above the mean but it is only slightly above the mean which would make my agreeableness score approximately average to my peers, is this correct ? and is there a more statistical way of determining how far above or below the mean say in % way or via percentile rank
please help
P.S I think your site is wonderful and I am now able to graph my assignment appropriately because of you! your site is fantastic
safin ghoghabori says
Pretty much good..😊
Elizabeth says
Hi Jim,
This is great. I’ve got a class of kids with chrome books and I’m trying to teach with tools we have. Namely Google sheets. Excel uses many of the same Stats functions. I don’t like to have them use any function unless I can really explain what it does. I want to know the math behind it. But some of the math is beyond what they would have. Still I like them to have a visual idea of what’s happening. I think we rely too much on calculator/ spreadsheet functions without really understanding what they do and how they work. Most of the time the functions are straight forward. But this one was weird.
I ran through 8 Stats books and I really didn’t get a good feeling of how it worked. I can approximate a normal distribution curve of a dataset using norm.dist(), but I wanted to know more about why it worked.
First we will look at a few generic datasets. Then they will pull in stock data and they will tell me if current stock prices fall within 1 standard deviation of a years worth of data. Fun.
Thanks!!
Elizabeth
Jim Frost says
Hi Elizabeth,
That sounds fantastic that you’re teaching them these tools! And, I entirely agree that we often rely too much on functions and numbers without graphing what we’re doing.
For this particular function, a graph would make it very clear. I do explain probability functions in the post that I link you to in my previous comment, and I use graphs for both discrete and continuous distributions. Unfortunately, I don’t show a cumulative probability function (I should really add that!). For the example I describe, imagine the bell curve of a normal distribution, the value of 42 is above the mean, and you shade the curve for all values less than or equal to 42. You’re shading about 90.87% of the distribution for the cumulative probability.
That does sound like fun! 🙂
Elizabeth W Dillard says
Hi Jim,
This is really neat.
I’ve been looking at the formula norm.dist(x, Mean, StandardDev, False) in Excel and Google Sheets.
I’m trying to understand what it is actually calculating.
I’m just getting back into Statistics – and this one is stumping me.
This is where x is a point in the dataset
Thanks!
Jim Frost says
Hi Elizabeth,
I don’t use Excel for statistics, but I did take a look into this function.
Basically, you’re defining the parameters of a normal distribution (mean and standard deviation) and supplying an X-value that you’re interested in. You can use this Excel function to derive the cumulative probability for your X-value or the probability density at that specific value. Here’s an example that Microsoft uses on its Help page for the norm.dist function.
Suppose you have a normal distribution with a mean of 40 and a standard deviation of 1.5, and you’re interested in the properties of the value 42 for this distribution. This function indicates that the cumulative probability for this value is about 0.9087. In other words, the probability that values in this distribution will be less than or equal to 42 is 90.87%. Said in another way, values of 42 and less comprise about 90.87% of this distribution.
Alternatively, with the final argument set to FALSE, this Excel function returns the value of the probability density function (PDF) at 42. There’s a caveat because this distribution is for a continuous variable, and the probability that an observation has a value of exactly 42 out to an infinite number of decimal places is effectively zero. For continuous variables, probabilities apply to ranges of values and correspond to areas under the curve. The PDF value is the height of the curve at 42, which is approximately 0.109 in this case. To approximate the probability that a value falls within a small range that includes 42, multiply that height by the width of the range.
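If it helps to see both calculations outside of a spreadsheet, here is a small sketch using Python’s standard library. It assumes that `NormalDist.cdf` and `NormalDist.pdf` correspond to the TRUE and FALSE settings of the spreadsheet function’s last argument; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal distribution from the example: mean 40, standard deviation 1.5
dist = NormalDist(mu=40, sigma=1.5)

# Cumulative probability: proportion of the distribution at or below 42
cumulative = dist.cdf(42)
print(round(cumulative, 4))  # 0.9088

# Density: height of the curve at 42 (not a probability by itself)
density = dist.pdf(42)
print(round(density, 4))     # 0.1093

# Probability of landing in a narrow range around 42 (area under the curve)
narrow_range = dist.cdf(42.05) - dist.cdf(41.95)
print(narrow_range)
```

The last value is an area over a range that is 0.1 wide, so it is roughly the density multiplied by 0.1.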
For more information about PDFs, please read my post about Understanding Probability Distributions.
Z Table says
Hey Jim. This is a fantastic post. I came across a lot of people asking the significance of normal distribution (more people should) and I was looking for an answer that puts it as eloquently as you did. Thank you for writing this.
Jim Frost says
Hi, thank you so much! I really appreciate your kind words! 🙂
Sudhakar says
Excellent Jim, great explanation. I have a doubt, you used some software to calculate Z-score and to display graphs right, can you please let me know which software you used for the same?
Jim Frost says
Hi Sudhakar,
I’m using Minitab statistical software.
Thanks for reading!
Sule Suleiman Taura says
Great to have met someone like Jim who can explain Statistics in plain language for everyone to understand. Another questions are; a) what is the function of probability distribution and would one use a probability distribution?
Jim Frost says
Hi,
I’ve written a post all about probability distributions. I include the link to it in this post, but here it is again: Understanding Probability Distributions.
Bhaskar says
Very nice explanation .
Xavier says
Finally I found a post which explains normal distribution in plain english. It helped me a lot to understand the basic concepts. Thank you very much, Jim
Jim Frost says
You’re very welcome, Xavier! It’s great to hear that it was helpful!
Jimmy says
Hi Jim thanks for this. How large a number makes normal distribution?
Jim Frost says
Hi, I don’t understand your question. A sample of any size can follow a normal distribution. However, when your sample is very small, it’s hard to determine which distribution it follows. Additionally, there is no sample size that guarantees your data follows a normal distribution. For example, you can have a very large sample size that follows a skewed, non-normal distribution.
Are you possibly thinking about the central limit theorem? This theorem states that the sampling distribution of the mean approximates a normal distribution when your sample size is sufficiently large, even if the underlying data are not normally distributed. If this is what you’re asking about, read my post on the central limit theorem for more information.
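Here is a rough simulation sketch of that idea in Python. Everything in it is an illustrative assumption: the skewed population (an exponential distribution), the sample size of 30, the 5,000 repetitions, and the helper name `skewness`.

```python
import random
import statistics

def skewness(values):
    """Simple moment-based skewness: the average cubed standardized deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

random.seed(1)  # fixed seed so the run is repeatable

# Individual observations from a strongly right-skewed population
raw_values = [random.expovariate(1.0) for _ in range(5000)]

# Means of 5,000 samples, each with 30 observations from the same population
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(30)])
    for _ in range(5000)
]

print(skewness(raw_values))    # strongly skewed
print(skewness(sample_means))  # much closer to zero (more symmetric)
```

The individual values stay skewed no matter how many you collect. It is the distribution of the sample means that becomes more symmetric and bell-shaped.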
Ranjan venkatesh says
best post ever, thanks a lot
Jim Frost says
Thanks, Ranjan!
Arjul Islam says
Great work
So many confusion cleared
mt says
thank you very much for this very good explanation of normal distribution 👍🙌🏻
Jim Frost says
Thank you!
Rajendra Prabhu says
During my B.E (8 semester course), we had “Engg. Maths.” for four semesters, and in one semester we had Prob & Stat. (along with other topics), which was purely theoretical. Even though we had lots of exercises and problems, I could not digest it and didn’t know its practical significance (i.e., how and where to apply and use it). Again in MTech (3 sem course) we had one subject, “Reliability Analysis and Design of Structures”, but this was relatively more practically oriented. While working in the Ready Mix Concrete industry and while doing a PhD in concrete, I came across this Normal Distribution concept, where concrete mix design is purely based on Std Dev and Z score, and concrete test results are also assessed statistically for their performance monitoring, acceptance criteria, non-compliance etc., where the normal distribution is the backbone. However, my thirst to gain knowledge and fully understand, along with a habit of browsing the internet (I wanted the Confidence Interval concept), led me to your website accidentally.
I observed your effort in explaining the topic in a simple, meaningful and understandable manner, even for a person with non-science or Engg background can learn from scratch with zero-background. That’s great.
My heart felt gratitude and regards and appreciate you for your volunteering mentality (broad mind) in sharing your knowledge from your experience to the needy global society.
Thank you once again,
Rajendra Prabhu
NASI says
THANK YOU FOR YOUR HELP
VERY USEFUL
williams kwarah says
thank you, very useful
Ali says
Jim, you truly love what you are doing, and saving us at the same time. i just want to say thank you i was about to give up on statistics because of formulas with no words
Jim Frost says
Hi Ali, Thank you so much! I really appreciate your kind words! Yes, I absolutely love statistics. I also love helping others learn and appreciate statistics as well. I don’t always agree with the usual way statistics is taught, so I wanted to provide an alternative!
Lucyna says
I was frustrated in my statistics learning by the lecturer’s focus on formulae. While obviously critical, they were done in isolation so I could not see the underlying rationale and where they fit in. Your posts make that very clear, and explain the context, the connections and limitations while also working through the calculations. Thank you.
Jim Frost says
Hi Lucyna,
First, I’m so happy to hear that my posts have been helpful! What you describe are exactly my goals for my website. So, your kind words mean so much to me! Thank you!
Noor Nawaz says
Nice work sir…
Sanjay Sinha says
Fantastic way of explaining
Jim Frost says
Thank you, Sanjay!
Qaz says
Sir kindly guide me. I have panel data. My all variables are not normally distributed. data is in ratios form. My question is that , For descriptive statistics and correlation analysis, do i need to use raw data in its original form?? and transformed data for regression analysis only?
Moreover, which transformation method should be used for ratios, when data is highly positively or negatively skewed. I tried, log, difference, reciprocal, but could not get the normality.
Kindly help me. Thank You
Mona says
Do natural phenomena such as hemoglobin levels or the weight of ants really follow a normal distribution? If you add up a large number of random events, you get a normal distribution.
Jim Frost says
To obtain a normal distribution, you need the random errors to have an equal probability of being positive and negative, and the errors need to be more likely to be small than large.
Many datasets will naturally follow the normal distribution. For example, the height data in this blog post are real data and they follow the normal distribution. However, not all datasets and variables have that tendency. The weight data for the same subjects that I used for the height data are not normally distributed. Those data are right skewed, which you can read about in my post about identifying the distribution of a dataset.
Carlos says
Hello Jim, first of all, your page is very good, it has helped me a lot to understand statistics.
Query, then when I have a data set that is not distributed normally, should I first transform them to normal and then start working them? Greetings from Chile, CLT
Jim Frost says
Hi Carlos,
This gets a little tricky. For one thing, it depends on what you want to do with the data. If you’re talking about hypothesis tests, you can often use the regular tests with non-normal data when you have a sufficiently large sample size. “Sufficiently large” isn’t really even that large. You can also use nonparametric tests for nonnormal data. There are several issues to consider, which I write about in my post that compares parametric and nonparametric hypothesis tests.
That should help clarify some of the issues. After reading that, let me know if you have any additional questions. Generally, I’m not a fan of transforming data because it completely changes the properties of your data.
Aashay Sukhthankar says
Hi Jim. What exactly do you mean by a true normal distribution. You’ve not used the word “true” anywhere in your post. Just plain normal distribution.
Jim Frost says
Hi Aashay, sorry about the confusing terminology. What I meant by true normal distribution is one that follows a normal distribution to a mathematically perfect degree. For example, the graphs of all the normal distributions in this post are true normal distributions because the statistical software graphs them based on the equation for the normal distribution plus the parameter values for the inputs.
By the way, there is not one shape that corresponds to a true normal distribution. Instead, there are an infinite number and they’re all based on the infinite number of different means and standard deviations that you can input into the equation for the normal distribution.
Typically, data don’t follow the normal distribution exactly. A distribution test can determine whether the deviation from the normal distribution is statistically significant.
In the comment where I used this terminology, I was just trying to indicate that as a distribution deviates from a true normal distribution, it also deviates from the Empirical Rule.
I hope this helps.
Josh Pius says
I’m glad I stumbled across your blog 🙂 Wonderful work!! I’ve gained an new perspective on what statistics could mean to me
Jim Frost says
Hi Josh, that is awesome! My goal is to show that statistics can actually be exciting! So, your comment means a lot to me! Thanks!
Asis Kumar Dirghangi says
Excellent…..
Jim Frost says
Thank you, Asis!
Muhammad Arif says
Many Many thanks for help dear Jim sir!
Jim Frost says
You’re very welcome! 🙂
Muhammad Arif says
dear Jim, tell me please what is normality?. and how we can understand to use normal or any other distribution for a data set?
Jim Frost says
Hi Muhammad, you’re in the right place to find the information that you need! This blog post tells you all about the normal distribution. Normality simply refers to data that are normally distributed (i.e., the data follow the normal distribution).
I have links in this post to another post called Understand Probability Distributions that tells you about other distributions. And yet another link to a post that tells you How to Determine the Distribution of Your Data.
Masum Ahmed says
You are far better than my teachers. Thank you Jim
Jim Frost says
Thank you, Masum!
John-Harold says
Another great post. Simple, clear and direct language and logic.
Jim Frost says
Thanks so much! That’s always my goal–so your kind words mean a lot to me!
Khursheed Ahmad Ganaie says
I was eagerly waitng fr ths topic ..
Normal distribution
Thnks a lott ,,,,,,
Jim Frost says
You’re very welcome, Khursheed!
Fernando Antunez says
Jim, it is my understanding that the normal distribution is unique and it is the one that follows to perfection the 68 95 99.7%. The rest of the distributions are “approximately” normal, as you say when they get wider. They are still symmetric but not normal because they lost perfection to the empirical rule. I was taught this by a professor when I was doing my master;s in Stats
Jim Frost says
Hi Fernando, all normal distributions (for those cases where you input any values for the mean and standard deviation parameters) follow the Empirical Rule (68%, 95%, 99.7%). There are other symmetric distributions that aren’t quite normal distributions. I think you’re referring to these symmetric distributions that have thicker or thinner tails than normal distributions should. Kurtosis measures the thickness of the tails. Distributions with high kurtosis have thicker tails and those with low kurtosis have thinner tails. If a distribution has thicker or thinner tails than the true normal distribution, then the Empirical Rule doesn’t hold true. How far off the rule is depends on how different the distribution is from a true normal distribution. Some of these distributions can be considered approximately normal.
However, this gets confusing because you can have true normal distributions that have wider spreads than other normal distributions. This spread doesn’t necessarily make them non-normal. The example of the wider distribution that I show in the Standard Deviation section is a true normal distribution. These wider normal distributions follow the Empirical Rule. If you have sample data and are trying to determine whether they follow a normal distribution, perform a normality test.
On the other hand, there are other distributions that are not symmetrical at all and very different from the normal distribution. They’re different by more than just the thickness of the tails. For example, the lognormal distribution can model very skewed distributions. Some of these distributions are nowhere close to being approximately normal!
So, you can have a wide variety of non-normal distributions that range from approximately normal to not close at all!
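To illustrate the point that narrow and wide normal distributions follow the Empirical Rule equally well, here is a short Python sketch. The helper name `share_within` and the first two parameter sets are illustrative choices; the third uses the mean and standard deviation of the height data from this post.

```python
from statistics import NormalDist

def share_within(dist, k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)

# Three normal distributions with very different centers and spreads
parameter_sets = [(0, 1), (100, 15), (1.512, 0.0741)]

results = []
for mean, sd in parameter_sets:
    dist = NormalDist(mean, sd)
    shares = [round(share_within(dist, k), 4) for k in (1, 2, 3)]
    results.append(shares)
    print(mean, sd, shares)  # [0.6827, 0.9545, 0.9973] each time
```

The proportions depend only on how many standard deviations you move from the mean, not on the particular mean or standard deviation.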
MG says
Thank you very much for your great post. Cheers from MA
Jim Frost says
You’re very welcome! I’m glad it was helpful! 🙂