
Statistics By Jim

Making statistics intuitive

Lognormal Distribution

By Jim Frost

The lognormal distribution is a continuous probability distribution that models right-skewed data. The shape of the lognormal distribution is comparable to the Weibull and loglogistic distributions.

Graph of the lognormal distribution.

Statisticians use this distribution to model growth rates that are independent of size, which frequently occur in biology and finance. It also models time to failure in reliability studies, rainfall amounts, species abundance, and the number of moves in chess games. Read my post to see how it models global income distributions. In my post about how to identify the distribution of your data, I found that the lognormal distribution provides the best fit to my data for the body fat percentages of middle school girls.

As the name implies, the lognormal distribution is related to logs and the normal distribution. Let’s see how that works!

If your data follow a lognormal distribution and you transform them by taking the natural log of all values, the new values will fit a normal distribution. In other words, when your variable X follows a lognormal distribution, ln(X) follows a normal distribution. Hence, you take the logs and get a normal distribution . . . lognormal.

Conversely, if a variable Y follows a normal distribution, exponentiating it (exp(Y)) produces a lognormal distribution. In this manner, you can transform back and forth between pairs of related lognormal and normal distributions.
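The round trip between the two distributions can be sketched with Python's standard library. The parameter values and sample size below are illustrative assumptions, not values from this post.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Illustrative (assumed) parameters of the underlying normal distribution.
MU, SIGMA = 1.0, 0.5

# Exponentiating normal draws yields lognormal draws.
normal_draws = [random.gauss(MU, SIGMA) for _ in range(50_000)]
lognormal_draws = [math.exp(y) for y in normal_draws]

# Taking natural logs of the lognormal values recovers the normal values.
logged = [math.log(x) for x in lognormal_draws]

logged_mean = statistics.fmean(logged)
logged_sd = statistics.stdev(logged)
raw_mean = statistics.fmean(lognormal_draws)
raw_median = statistics.median(lognormal_draws)

print("mean of ln(X):   ", round(logged_mean, 3))
print("std dev of ln(X):", round(logged_sd, 3))
print("mean of X:       ", round(raw_mean, 3))
print("median of X:     ", round(raw_median, 3))
```

The mean and standard deviation of the logged values should land close to the assumed µ and σ. On the original scale, the mean sits above the median, which is the right skew described above.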

The sum of many independent and identically distributed (IID) variables frequently produces an approximately normal distribution. However, the product of many positive IID variables tends toward a lognormal distribution. Consider the following to understand why:

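If $y = x_1 x_2 x_3$, then $\ln(y) = \ln(x_1) + \ln(x_2) + \ln(x_3)$

Taking logs turns the product into a sum. That sum tends toward a normal distribution, so y itself tends toward a lognormal distribution.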
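A small simulation can illustrate this multiplicative process. This is a sketch under assumed settings: the factors are drawn uniformly from 0.5 to 1.5, each product uses 30 factors, and the `skewness` helper is a hypothetical convenience function defined here.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def product_of_factors(n_factors):
    """Multiply n_factors positive IID draws (uniform on 0.5 to 1.5, an assumed choice)."""
    return math.prod(random.uniform(0.5, 1.5) for _ in range(n_factors))

products = [product_of_factors(30) for _ in range(20_000)]
log_products = [math.log(p) for p in products]

skew_products = skewness(products)
skew_logs = skewness(log_products)

print("skewness of the products:     ", round(skew_products, 2))
print("skewness of the log products: ", round(skew_logs, 2))
```

The products themselves should be strongly right-skewed, while their logs should be close to symmetric, as expected if the logs are approximately normal.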

Because of the multiplication process behind lognormal distributions, the geometric mean can be a better measure of central tendency than the arithmetic mean for this distribution.
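To see the difference between the two means, here is a minimal sketch using simulated lognormal data with assumed parameters (µ = 2, σ = 0.8, threshold of zero). For such data, the geometric mean estimates $e^{\mu}$, which is also the median, while the arithmetic mean is pulled upward by the right tail.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Illustrative (assumed) parameters of the logged distribution.
MU, SIGMA = 2.0, 0.8

data = [random.lognormvariate(MU, SIGMA) for _ in range(50_000)]

arithmetic_mean = statistics.fmean(data)
geometric_mean = statistics.geometric_mean(data)
sample_median = statistics.median(data)

print("arithmetic mean:", round(arithmetic_mean, 2))
print("geometric mean: ", round(geometric_mean, 2))
print("sample median:  ", round(sample_median, 2))
print("exp(mu):        ", round(math.exp(MU), 2))
```

The geometric mean and the sample median should both sit near exp(µ), with the arithmetic mean noticeably higher.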

Lognormal Distribution Parameters

There are several ways to parameterize the lognormal distribution. I’ll use the location, scale, and threshold parameters. The values of the location and scale parameters relate to the normal distribution that the log-transformed data follow, which statisticians also refer to as the logged distribution.

Specifically, when you have a normal distribution with the mean of µ and a standard deviation of σ, the lognormal distribution uses these values as its location and scale parameters, respectively.

Threshold Parameter

The threshold parameter defines the lower bound of a lognormal distribution. All values must be greater than the threshold. Therefore, a negative threshold lets the distribution handle both positive and negative values, while a threshold of zero restricts it to positive values.

When you hold the location and scale parameters constant, the threshold shifts the distribution left and right, as shown below.

Graph that illustrates the effect of changing the threshold in the lognormal distribution.
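The shifting effect of the threshold can be sketched in code. The functions `lognormal_pdf` and `lognormal_sample` below are hypothetical helpers written for this illustration, and the parameter values are assumptions.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is repeatable

def lognormal_pdf(x, location, scale, threshold=0.0):
    """Density of a three-parameter lognormal at x (zero at or below the threshold)."""
    if x <= threshold:
        return 0.0
    shifted = x - threshold
    z = (math.log(shifted) - location) / scale
    return math.exp(-0.5 * z * z) / (shifted * scale * math.sqrt(2 * math.pi))

def lognormal_sample(location, scale, threshold=0.0):
    """Draw one value: a standard lognormal draw shifted by the threshold."""
    return threshold + random.lognormvariate(location, scale)

# Same location and scale, two thresholds: the curve only shifts sideways.
density_at_zero_threshold = lognormal_pdf(2.0, 0.0, 0.5, threshold=0.0)
density_at_shifted_threshold = lognormal_pdf(-3.0, 0.0, 0.5, threshold=-5.0)
print(density_at_zero_threshold, density_at_shifted_threshold)

# With a negative threshold, draws can be negative but never reach the threshold.
samples = [lognormal_sample(0.0, 0.5, threshold=-5.0) for _ in range(10_000)]
print("smallest draw:", round(min(samples), 3))
```

The two densities match because x = 2 with a threshold of 0 and x = -3 with a threshold of -5 are the same distance above their thresholds.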

Lognormal Location Parameter (µ)

The location represents the peak (mean, median, and mode) of the normally distributed logged data. To find the median of the lognormal distribution, raise e to the power of the location value ($e^{\text{location}}$). This holds when the threshold is zero; otherwise, add the threshold to the result.

In the graph below, I hold the threshold and scale parameters constant to highlight the effect of changing the location parameter.

Graph that illustrates the effect of changing the location parameter in the lognormal distribution.

The plot below is from my post where I use these distributions to model global incomes. It illustrates how the location parameter determines the median of the lognormal distribution.

Income distribution for the United States in 2006.

I’ve shaded 50% of the distribution, which corresponds to the median value of 28,788. You can also obtain this value by raising e to the power of the location value. In this case, $e^{10.2677} \approx 28{,}788$.
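The median calculation can be checked in a few lines. The location value 10.2677 comes from the post. The scale values are assumptions, because the post does not state the scale for this income distribution, and `lognormal_cdf` is a hypothetical helper. The result does not depend on the scale: the cumulative probability at $e^{\text{location}}$ is 0.5 for any scale.

```python
import math
from statistics import NormalDist

LOCATION = 10.2677  # location value quoted in the post

median_income = math.exp(LOCATION)
print("median:", round(median_income))  # approximately 28,788

def lognormal_cdf(x, location, scale, threshold=0.0):
    """P(X <= x) for a three-parameter lognormal, via the normal CDF of the log."""
    if x <= threshold:
        return 0.0
    return NormalDist(mu=location, sigma=scale).cdf(math.log(x - threshold))

# Assumed scale values, for illustration only.
cdf_at_median = [lognormal_cdf(median_income, LOCATION, scale) for scale in (0.5, 1.0, 1.5)]
print(cdf_at_median)
```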

Scale Parameter (σ)

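The scale represents the standard deviation of the normally distributed logged data. Larger scale values stretch the right tail of the lognormal distribution and increase its skew.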

In the chart below, I hold the threshold and location parameters constant to emphasize the effect of changing the scale parameter.

Graph that highlights the effect of changing the scale parameter in the lognormal distribution.
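The effect of the scale can also be expressed numerically. For a lognormal distribution with a threshold of zero, the standard formulas are mean = $e^{\mu + \sigma^2/2}$, median = $e^{\mu}$, and mode = $e^{\mu - \sigma^2}$. The sketch below evaluates them for a few assumed scale values. The `lognormal_summary` helper and the location value are illustrative, not taken from the post.

```python
import math

def lognormal_summary(location, scale):
    """Mean, median, and mode of a lognormal distribution with threshold 0."""
    return {
        "mean": math.exp(location + scale ** 2 / 2),
        "median": math.exp(location),
        "mode": math.exp(location - scale ** 2),
    }

LOCATION = 1.0  # illustrative (assumed) location

summaries = {scale: lognormal_summary(LOCATION, scale) for scale in (0.25, 0.5, 1.0)}

for scale, summary in summaries.items():
    print(
        f"scale={scale}: "
        f"mode={summary['mode']:.3f}, "
        f"median={summary['median']:.3f}, "
        f"mean={summary['mean']:.3f}"
    )
```

As the scale grows, the median stays fixed at exp(location), the mode moves left, and the mean moves right. That widening gap reflects the increasing right skew.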

Comments

  1. vanceh says

    February 16, 2022 at 8:04 pm

    Hi Jim,
    I’m interested in forecasting expected values of processes that have lognormal distributions. The literature seems to always use the arithmetic mean (AM) for these forecasts, but my understanding is that the probability of meeting or exceeding forecasts based on the probability-weighted mean drops with increased variance. For example, if I throw a die 10K times and multiply the values from the throws together, it appears to me that the standard methods for this problem forecast the “Expected Value” as E[X]^10K, where E[X] = 3.5. This seems absurd to me, because the Law of Large Numbers would insist that the distribution of those throws would be quite close to uniform. This would give a result close to the geometric mean of a standard die’s values to the Nth power, ~2.9938^10K. The 3.5^10K result by my calculations would require a 26+ sigma event (!) to meet or exceed, hardly something one should “expect”. Where am I going wrong here? Thanks, Vance

    Copyright © 2022 · Jim Frost · Privacy Policy