What is Cumulative Frequency?
Cumulative frequency is the running total of frequencies in a table. Use cumulative frequencies to answer questions about how often a characteristic occurs above or below a particular value. A table of cumulative frequencies is also known as a cumulative frequency distribution.
For example, how many students are in the 4th grade or lower at a school?
Cumulative frequency builds on the concepts of frequency and frequency distribution.
- Frequency: The number of times a value occurs in a dataset. For example, there are 12 4th graders in the school.
- Frequency distribution: A table that lists all values in the dataset and how many times each one occurs. Learn more about Frequency Tables.
In this post, learn how to find and construct cumulative frequency distributions for both discrete and continuous data. I’ll also show you how to create less than and greater than versions of these tables.
How to Find Cumulative Frequency
Finding a cumulative frequency distribution makes the most sense when your data have a natural order. The natural ordering allows the cumulative running total to be meaningful. With a minor change, the process works with both discrete and continuous data. Learn more about the differences between Discrete vs. Continuous Data.
For example, the grades in a school, months of a year, or age in years are discrete values with a logical order. Alternatively, when you have continuous data, you can create ranges of values known as classes. In this case, frequencies are counts of how often continuous data fall within each class.
Calculate cumulative frequency by starting at the top of a frequency table and working your way down. For each row, take its frequency and add the frequencies of all preceding rows. Equivalently, add the row’s frequency to the previous row’s cumulative frequency. Either way, you get the running total.
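In symbols, the same rule can be written as follows. The notation is introduced here for illustration only: the rows have frequencies f_1, f_2, ..., f_n from top to bottom, and C_k is the cumulative frequency of row k.

$$C_k = \sum_{i=1}^{k} f_i = C_{k-1} + f_k, \qquad C_0 = 0$$

The last value, C_n, is the sum of all the frequencies, which is the total number of observations.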
Let’s use this method to find cumulative frequency for discrete and continuous data.
Construct the Cumulative Frequency Distribution for Discrete Data
The example below shows you how to construct a cumulative frequency distribution for a discrete dataset of school grades (1 – 6). Notice how each row takes the previous cumulative frequency and then adds the frequency for that row to calculate the running total.
For example, if we look at the 3rd grade row of the table, we’ll see that the cumulative frequency is 58. This result tells us that 58 students are in 3rd grade or lower.
In this table, the cumulative frequency for the highest value equals the total number of observations in the dataset because all values are less than or equal to it. 6th grade is the highest value, and 88 students are in 6th grade or lower. Hence, we know there are 88 students in this dataset.
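Here is a minimal Python sketch of that running total for the grade-level example. The per-grade counts are assumptions: the text states only the 4th grade count (12) and the cumulative totals 58, 70, and 88, so the other counts are illustrative values chosen to agree with those figures.

```python
from itertools import accumulate

# Hypothetical per-grade counts. Only the 4th grade count (12) and the
# cumulative totals 58, 70, and 88 come from the text; the rest are
# illustrative values picked to be consistent with those figures.
grades = [1, 2, 3, 4, 5, 6]
frequencies = [23, 20, 15, 12, 10, 8]

# accumulate() yields the running total: each entry is the current
# frequency plus the sum of all preceding frequencies.
cumulative = list(accumulate(frequencies))

print("Grade  Frequency  Cumulative")
for grade, freq, cum in zip(grades, frequencies, cumulative):
    print(f"{grade:>5}  {freq:>9}  {cum:>10}")

# cumulative is [23, 43, 58, 70, 80, 88]
```

The final entry equals the sum of all the frequencies, which is a quick check that the table was built correctly.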
Construct the Cumulative Frequency Distribution for Continuous Data
When you have continuous data, you might not have any repeating values.
For example, no values repeat in the portion of the height data below. Consequently, you’d have a series of values, each having a frequency of one. These are actual data from a study I conducted involving preteen girls. The full dataset has 88 values. You can download the Excel file with the data and table: HeightFrequencyTable.
However, you can obtain meaningful information by grouping the values into ranges and finding the frequency for each class, as shown below.
Then, to create the cumulative frequency table, sum each row with all preceding rows just as we did for the discrete data example.
For example, by looking at the row for 1.46 – 1.51m, we know that 49 preteen girls (just over half the sample of 88) have heights that are less than or equal to 1.51m.
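The grouping step can be sketched in Python as well. Both the heights and the class boundaries below are hypothetical. They are not the study data, and only the 1.46 – 1.51m class appears in the text. The sketch assumes heights are recorded to two decimal places, so the inclusive class bounds leave no value unassigned.

```python
from itertools import accumulate

# Hypothetical heights in meters (NOT the study data), recorded to 2 decimals.
heights = [1.32, 1.38, 1.41, 1.44, 1.47, 1.49, 1.50, 1.53, 1.55, 1.61]

# Hypothetical classes as inclusive (lower, upper) bounds in meters.
classes = [
    (1.28, 1.33),
    (1.34, 1.39),
    (1.40, 1.45),
    (1.46, 1.51),
    (1.52, 1.57),
    (1.58, 1.63),
]

# Frequency of a class = number of heights that fall within its bounds.
frequencies = [
    sum(1 for h in heights if low <= h <= high) for low, high in classes
]

# Running total down the table, exactly as for discrete data.
cumulative = list(accumulate(frequencies))

for (low, high), freq, cum in zip(classes, frequencies, cumulative):
    print(f"{low:.2f} - {high:.2f}m  {freq:>3}  {cum:>3}")
```

Each cumulative value counts the observations at or below the upper bound of its class.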
Less Than vs. Greater Than Forms of the Table
Both the preceding examples use the “less than” form of the table. When you look at those cumulative frequency tables, the value indicates the total number of observations that are less than or equal to a specific value. For example, 70 students are in 4th grade or lower.
However, what should you do when you need to understand frequencies that are greater than or equal to a particular value? Simply switch the order of values in the table to list them from highest to lowest. This process constructs a greater than cumulative frequency distribution.
In the example below, I’ll recreate the grade level table, but instead of listing the grades 1 → 6, I’ll switch it to 6 → 1. From that point, I’ll use the same method of summing the current row with all previous rows.
In this greater than distribution, the cumulative frequencies indicate the number of observations greater than or equal to a particular value. For example, 30 students are in 4th grade or higher.
In this table, the cumulative frequency for the lowest value equals the total number of observations because all observations are greater than or equal to it. 1st grade is the lowest value, and all 88 students are in 1st grade or higher.
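The greater than form can be sketched in Python by reversing the row order before taking the running total. The per-grade counts are again assumptions: they are illustrative values chosen to agree with the totals cited in the text (12 students in 4th grade, 30 in 4th grade or higher, 88 overall).

```python
from itertools import accumulate

# Hypothetical per-grade counts, chosen to match the totals cited in the text.
grades = [1, 2, 3, 4, 5, 6]
frequencies = [23, 20, 15, 12, 10, 8]

# List the grades from highest to lowest, then take the running total.
grades_desc = grades[::-1]
frequencies_desc = frequencies[::-1]
greater_than = list(accumulate(frequencies_desc))

# Map each grade to the number of students in that grade or higher.
at_or_above = dict(zip(grades_desc, greater_than))

for grade in grades_desc:
    print(f"Grade {grade} or higher: {at_or_above[grade]}")

# greater_than is [8, 18, 30, 45, 65, 88]
```

The method is unchanged from the less than form. Only the order of the rows differs.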
The decision to use a less than or greater than cumulative frequency table depends on which form is most helpful for your subject area.
You can also show cumulative frequency on graphs. In the bar chart below, I added the orange cumulative line. Displaying it in a chart makes it easy to see where most observations occur. Learn more about Bar Charts: Using, Examples and Interpreting.
In the graph, first and second graders make up nearly half the school. As the grade levels progress from low to high, the orange line rises to the total number of students, 88.
Relative frequencies are a related concept. Click the link to learn about similarities and differences!