The beta distribution is a continuous probability distribution that models random variables with values falling inside a finite interval. Use it when the quantity you are modeling has both a lower and an upper bound on its possible values. Analysts commonly use it to model the time to complete a task, the distribution of order statistics, and the prior distribution for binomial proportions in Bayesian analysis.
The standard beta distribution uses the interval [0,1]. This range is ideal for modeling probabilities, particularly for experiments with only two outcomes. However, other intervals are possible.
Related post: Understanding Probability Distributions
Beta Distribution Parameters and Notation
Unlike other distributions with shape and scale parameters, the beta distribution has two shape parameters, α and β. Both parameters must be positive values.
Additionally, statisticians denote the finite interval’s lower and upper bounds as a and b, respectively.
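To make the role of the bounds concrete, here is a minimal sketch of drawing values from a beta distribution on an arbitrary interval by rescaling the standard [0,1] form. The function name, the shape values, and the 2-to-10-hour task-time interval are hypothetical choices for illustration only.

```python
import random

def sample_scaled_beta(alpha, beta, lower, upper, rng=random):
    """Draw one value from a beta distribution stretched onto [lower, upper].

    Assumption: the general form is just the standard [0, 1] beta variate
    shifted and scaled to the requested interval.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if lower >= upper:
        raise ValueError("lower bound must be less than upper bound")
    return lower + (upper - lower) * rng.betavariate(alpha, beta)

# Hypothetical example: task completion times bounded between 2 and 10 hours.
rng = random.Random(42)  # fixed seed so the run is repeatable
task_hours = [sample_scaled_beta(2, 5, 2.0, 10.0, rng) for _ in range(1000)]
print(min(task_hours), max(task_hours))  # both fall inside [2, 10]
```

Every simulated value stays inside the chosen bounds, which is the property that makes the beta distribution suitable for bounded quantities.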
How to Calculate Beta Distribution Parameters
Statistical software can use maximum likelihood estimation to find the parameters for the beta distribution. This process estimates the parameters that produce the best fitting curve for your data. Alternatively, you can perform simple calculations using the outcome of a binomial experiment to find the appropriate parameters, which I show in the next section.
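For a feel of what maximum likelihood estimation does here, the sketch below searches for the α and β that maximize the beta log-likelihood of some data on (0, 1). It uses a very simple step-halving coordinate search, which is an assumption made for readability; real statistical packages use more sophisticated optimizers. The data values are made up for illustration.

```python
import math

def beta_log_likelihood(data, alpha, beta):
    """Log-likelihood of standard beta(alpha, beta) for data strictly inside (0, 1)."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return sum(
        log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x)
        for x in data
    )

def fit_beta_mle(data, start=(1.0, 1.0), step=1.0, tol=1e-6):
    """Crude MLE: try moving alpha or beta by +/- step, halve step when stuck."""
    if any(not 0 < x < 1 for x in data):
        raise ValueError("data must lie strictly between 0 and 1")
    alpha, beta = start
    best = beta_log_likelihood(data, alpha, beta)
    while step > tol:
        improved = False
        for d_alpha, d_beta in ((step, 0), (-step, 0), (0, step), (0, -step)):
            a, b = alpha + d_alpha, beta + d_beta
            if a <= 0 or b <= 0:
                continue  # both shape parameters must stay positive
            log_lik = beta_log_likelihood(data, a, b)
            if log_lik > best:
                alpha, beta, best = a, b, log_lik
                improved = True
        if not improved:
            step /= 2  # no neighbor is better, so refine the search
    return alpha, beta

# Hypothetical proportions observed in eight small studies.
observed = [0.62, 0.71, 0.55, 0.80, 0.68, 0.74, 0.66, 0.59]
alpha_hat, beta_hat = fit_beta_mle(observed)
print(alpha_hat, beta_hat)
```

The fitted pair is the one whose curve makes the observed values most probable, which is what "best fitting" means in this context.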
The beta distribution is particularly flexible at modeling different curves within the interval, including symmetrical, left and right-skewed, U and inverted U shapes, and straight lines. The standard form can illustrate all curves, so I’ll use it in the examples below. However, keep in mind that the upper and lower bounds do not need to be 0 and 1.
Note that my software refers to α and β as First and Second, respectively.
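If you want to explore these shapes without graphing software, the sketch below evaluates the standard beta density at a few points for several parameter pairs. The specific α and β values are illustrative picks, not the only ones that produce each shape.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of the standard beta distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x))

# Illustrative (alpha, beta) pairs for each shape.
shapes = {
    "symmetric inverted U": (2, 2),
    "left-skewed": (5, 2),
    "right-skewed": (2, 5),
    "U-shaped": (0.5, 0.5),
    "flat line": (1, 1),
    "rising straight line": (2, 1),
}

for label, (a, b) in shapes.items():
    densities = [round(beta_pdf(x, a, b), 3) for x in (0.1, 0.3, 0.5, 0.7, 0.9)]
    print(f"{label:22s} alpha={a}, beta={b}: {densities}")
```

Reading across each row shows the pattern: equal parameters give symmetry, a larger α pushes the mass toward 1, a larger β pushes it toward 0, and values below 1 pile the density up at the ends.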
Beta vs Binomial Distribution and Updating Prior Probabilities
The beta distribution has a close relationship with the binomial distribution. First, remember that the binomial distribution models the number of successes in a specific number of trials when you have binary data. Now, consider that the number of successes divided by the number of trials is a binomial proportion, which is a probability. The beta distribution models the probability of success in Bernoulli trials and captures the uncertainty about that probability. Learn more about the Binomial Distribution.
Suppose you sell breakfast cereal and perform a simple experiment. You randomly select ten people to try your cereal and a competitor’s. When a subject says your cereal is better, it’s a success. Seven out of 10 (70%) said your cereal is better. Because there are only two possible outcomes (success/failure), it’s a binomial experiment. Let’s use the beta distribution to model the results.
For this type of experiment, calculate the beta parameters as follows:
- α = k + 1
- β = n – k + 1
- k = number of successes
- n = number of trials.
Additionally, use this method to update your prior probabilities in a Bayesian analysis after you obtain additional information from a new binomial experiment. Simply add the new successes to α and the new failures (trials minus successes) to β of the prior probability’s beta distribution. The formulas above are the special case of starting from a uniform prior with α = 1 and β = 1. As α and β increase, the distribution narrows, reflecting the greater precision of the larger sample size.
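The updating step is simple enough to sketch in a few lines. The version below adds new successes to α and new failures (trials minus successes) to β, then reports the standard deviation so you can see the distribution narrow. The function names are hypothetical, and the second experiment (14 successes in 20 trials) is invented purely to show a second update.

```python
import math

def update_beta(alpha, beta, successes, trials):
    """Return posterior (alpha, beta) after observing a new binomial experiment."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    failures = trials - successes
    return alpha + successes, beta + failures

def beta_sd(alpha, beta):
    """Standard deviation of the standard beta distribution."""
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))

prior = (1, 1)                         # uniform prior: no initial preference
first = update_beta(*prior, 7, 10)     # the cereal taste test: 7 of 10
second = update_beta(*first, 14, 20)   # hypothetical follow-up experiment

for label, params in (("prior", prior), ("after 10 trials", first), ("after 30 trials", second)):
    print(label, params, round(beta_sd(*params), 4))
```

Each update leaves you with another beta distribution, so the posterior from one experiment can serve directly as the prior for the next.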
For our experiment, we have 7 successes and 10 trials:
- α = 7 + 1 = 8
- β = 10 – 7 + 1 = 4
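The same arithmetic can be wrapped in a small helper, along with the mean and mode of the resulting distribution. This is a minimal sketch; the function names are hypothetical, and the mode formula used here applies only when both parameters exceed 1.

```python
def beta_params_from_binomial(successes, trials):
    """Beta parameters from a binomial result: alpha = k + 1, beta = n - k + 1."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return successes + 1, trials - successes + 1

def beta_mean(alpha, beta):
    """Mean of the standard beta distribution."""
    return alpha / (alpha + beta)

def beta_mode(alpha, beta):
    """Peak of the standard beta distribution (valid only for alpha, beta > 1)."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("mode formula requires alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)

alpha, beta = beta_params_from_binomial(7, 10)
print(alpha, beta)             # 8 4
print(beta_mode(alpha, beta))  # 0.7, the observed proportion
print(beta_mean(alpha, beta))  # slightly below the mode, pulled toward 0.5
```

With these parameters, the curve peaks exactly at the sample proportion k / n, while the mean sits a little closer to 0.5.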
Assessing the Results
Using these parameter values produces the following beta distribution.
In closing, I’ll emphasize the relationship between the beta and binomial distributions. The graph below displays the binomial distribution for our experiment. Notice the similarities? Both curves peak at 0.7 or 7/10 and are similarly left-skewed. In the binomial distribution, take the number of successes and divide by 10 (the number of trials) to obtain the corresponding proportion on the beta distribution’s horizontal axis.
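You can check the matching peaks numerically. The sketch below finds the most likely number of successes under a binomial distribution and the highest point of the beta density on a grid of proportions. It assumes the binomial distribution uses n = 10 and a success probability of 0.7, and the 0.01 grid spacing is an arbitrary choice.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def beta_pdf(x, alpha, beta):
    """Density of the standard beta distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x))

n, p = 10, 0.7  # assumed binomial settings for the cereal experiment

# Most likely number of successes under the binomial distribution.
binom_peak = max(range(n + 1), key=lambda k: binomial_pmf(k, n, p))

# Highest point of the beta(8, 4) density on a grid of proportions.
grid = [i / 100 for i in range(1, 100)]
beta_peak = max(grid, key=lambda x: beta_pdf(x, 8, 4))

print(binom_peak, binom_peak / n, beta_peak)
```

Dividing the binomial peak by the number of trials lands on the same proportion where the beta curve is highest.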