What is a Random Variable?
A random variable is a variable whose value is determined by chance. Random variables can take on either discrete or continuous values, and understanding the properties of each type is essential in many statistical applications. They are a key concept in statistics and probability theory.
While randomness defines both discrete and continuous variables, their values are not entirely unpredictable. The probability of each value is well-defined and quantifiable using probability functions. By understanding the properties of these probability functions, you can make predictions and draw conclusions about real-world phenomena. These quantifiable properties make random variables a useful concept in statistics.
In this blog post, you’ll learn about their properties and see examples of both discrete and continuous random variables.
Discrete Random Variable
A discrete random variable has distinct values that are countable and finite or countably infinite. This data type often occurs when you are counting the number of event occurrences. For example, discrete random variables include the following:
- The number of heads that come up during a series of coin tosses.
- The number of library books checked out per hour.
Analysts denote the variable as X and its k possible values as x1, x2, …, xk.
The probability of X taking its ith possible value, xi, equals pi: P(X = xi) = pi.
Using this notation, discrete random variables must satisfy these conditions:
- All possible discrete values must have probabilities between zero and one: 0 < pi ≤ 1.
- The total probability for all possible k values must equal 1: p1 + p2 + p3 + . . . + pk = 1.
When these conditions are satisfied, one of the possible values will occur during every opportunity.
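The two conditions above can be checked mechanically. The following is a minimal sketch, not a standard library routine: the helper name `is_valid_pmf`, the tolerance, and the two example lists are assumptions made for illustration.

```python
import math

def is_valid_pmf(probabilities, tol=1e-9):
    """Return True if the list satisfies both PMF conditions."""
    # Condition 1: every possible value has a probability in (0, 1].
    each_in_range = all(0 < p <= 1 for p in probabilities)
    # Condition 2: the probabilities sum to 1 (within floating-point tolerance).
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

fair_die = [1 / 6] * 6           # hypothetical example: six equally likely faces
incomplete = [0.5, 0.3, 0.1]     # hypothetical example: sums to 0.9, not 1

print(is_valid_pmf(fair_die))    # True
print(is_valid_pmf(incomplete))  # False
```

The tolerance is there because sums of floating-point fractions such as 1/6 rarely equal 1 exactly.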
The probability distribution of a discrete random variable is called a probability mass function (PMF). The following are two common types of PMFs:
Choosing the correct PMF depends on the nature of your data and the probabilities you want to find. Learn more about Probability Mass Functions.
The number of heads that appear during a series of five coin tosses is a discrete random variable that follows the binomial distribution. We can use that distribution to determine the likelihood of obtaining 0 to 5 heads. The graph below displays the probability for each possible outcome.
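To see where those probabilities come from, here is a short sketch that evaluates the binomial PMF for five tosses. It assumes a fair coin (probability of heads 0.5), which the text above does not state explicitly, and the function name `binomial_pmf` is chosen only for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_tosses = 5
p_heads = 0.5  # assumption: the coin is fair

# Probability of each possible number of heads, 0 through 5.
pmf = {k: binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1)}

for k, prob in pmf.items():
    print(f"P(X = {k}) = {prob:.5f}")

# The probabilities over all possible values add up to 1.
print(sum(pmf.values()))
```

Under the fair-coin assumption, two and three heads are the most likely outcomes at 0.3125 each, while zero and five heads are the least likely at 0.03125 each.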
Continuous Random Variable
A continuous random variable has uncountably many possible values that form a continuous range. It can take on any value within that range. In fact, there are infinitely many values between any two values.
This data type often occurs when you measure a quantity on a scale. For example, continuous random variables include the following:
- Height and weight.
- Time and duration.
Analysts denote a continuous random variable as X and its possible values as x, just like the discrete version. However, unlike discrete random variables, the probability of a continuous X taking on any specific value is zero. In other words: P(X = x) = 0, where x is any specific value.
Instead, probabilities greater than zero only exist for ranges of values, such as P(a ≤ X ≤ b), where a and b are the lower and upper bounds of the range.
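One way to build intuition for this is to watch the probability of a range shrink as the range narrows around a single value. The sketch below uses Python's `statistics.NormalDist`. The choice of a normal distribution with mean 170 and standard deviation 10 (think of heights in centimeters) is purely a hypothetical illustration, not something taken from this post.

```python
from statistics import NormalDist

# Hypothetical distribution for illustration only.
heights = NormalDist(mu=170, sigma=10)

def interval_probability(dist, a, b):
    """P(a <= X <= b), computed as the difference of two CDF values."""
    return dist.cdf(b) - dist.cdf(a)

# Narrow the range around 170 and watch the probability approach zero.
for half_width in (10, 1, 0.1, 0.001):
    low, high = 170 - half_width, 170 + half_width
    p = interval_probability(heights, low, high)
    print(f"P({low} <= X <= {high}) = {p:.6f}")

# A range of zero width has zero probability.
print(interval_probability(heights, 170, 170))
```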
A probability density function (PDF) describes the probability distribution of a continuous random variable. A PDF is a curve whose height at each value is the probability density there, that is, probability per unit of X rather than a probability itself.
Continuous random variables must satisfy the following:
- Probabilities for all ranges of X are greater than or equal to zero: P(a ≤ X ≤ b) ≥ 0.
- The total area under the curve equals one: P(−∞ < X < +∞) = 1.
The likelihood of X falling within a particular range of values corresponds to the range’s area under the PDF curve, which requires using an integral. Read about Integrals (Wikipedia).
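As a rough illustration of "probability equals area under the curve," the sketch below approximates the integral of a PDF with the trapezoid rule and compares it with the difference of two CDF values. It assumes a standard normal distribution purely for convenience. The helper name `area_under_pdf` and the step count are also assumptions made for this example.

```python
from statistics import NormalDist

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the integral of dist.pdf from a to b with the trapezoid rule."""
    width = (b - a) / steps
    # Endpoints get half weight; interior points get full weight.
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

standard_normal = NormalDist(mu=0, sigma=1)  # assumption: chosen for illustration

approx = area_under_pdf(standard_normal, -1, 1)
exact = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(f"trapezoid area: {approx:.6f}")
print(f"CDF difference: {exact:.6f}")
```

The two numbers should agree to several decimal places. In practice, statistical software uses the cumulative distribution function rather than integrating the PDF by hand.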
The following are common types of PDFs:
Before using a PDF for a continuous random variable, Identify the Distribution of Your Data. Learn more about Probability Distributions: Definition & Calculations.
Body fat percentage is a continuous random variable. In preteen girls, it follows a lognormal distribution. Suppose a researcher needs to find participants with a body fat percentage between 20 and 24 percent. What is the likelihood that the next candidate will fall within that range?
The lognormal distribution graph indicates that the probability of body fat falling in the range of 20 to 24% is 0.1864. This information can help the researcher determine how many candidates they’ll need to assess to obtain a sufficient sample size.
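A back-of-the-envelope version of that planning calculation might look like the sketch below. The probability 0.1864 comes from the example above. The target of 30 participants is a hypothetical number chosen only to show the arithmetic.

```python
import math

p_in_range = 0.1864        # probability from the lognormal example above
target_participants = 30   # hypothetical target sample size

# If each candidate independently falls in the range with probability p,
# the expected number of candidates needed is target / p.
expected_candidates = target_participants / p_in_range

print(f"{expected_candidates:.1f}")    # 160.9
print(math.ceil(expected_candidates))  # 161
```

This is an expected value, so the actual number of candidates will vary from study to study. A researcher would likely plan to assess somewhat more than this figure.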
Random variables are a fundamental concept in statistics and probability theory. Discrete random variables take on countable, specific values, while continuous random variables assume uncountably infinite values. Understanding the properties of both types is crucial in many statistical applications.