Discrete and continuous data are the two broad categories of numeric variables. Numeric variables represent characteristics that you can express as numbers rather than descriptive language.

When you have a numeric variable, you need to determine whether it is discrete or continuous.

In broad strokes, the critical factor is the following:

- You count discrete data.
- You measure continuous data.

Let’s dig a little deeper! I’ll explain the differences and provide examples of discrete vs. continuous data.

**Related post**: What is a Variable?

## What is Discrete Data?

Discrete variables can only assume specific values that you cannot subdivide. Typically, you count them, and the results are integers. For example, if you work at an animal shelter, you’ll count the number of cats.

For example, you might count 20 cats at the animal shelter. These variables cannot have fractional or decimal values. You can have 20 or 21 cats, but not 20.5! Like the natural numbers, counts jump from one value to the next with nothing in between.

Other examples of discrete variables include the following:

- The number of books you check out from the library.
- The number of heads in a sequence of coin tosses.
- The result of rolling a die.
- The number of patients in a hospital.
- The population of a country.

While discrete data have no decimal places, the average of these values can be fractional. For example, families can have only a discrete number of children: 1, 2, 3, etc. However, the average number of children per family can be 2.2.
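To make the point about averages concrete, here is a minimal Python sketch. The family data are hypothetical and chosen only so that the mean works out to 2.2.

```python
from statistics import mean

# Hypothetical counts of children in five families.
# Each observation is discrete: a whole number you obtain by counting.
children_per_family = [1, 2, 3, 2, 3]

average = mean(children_per_family)

# Every individual value is an integer, yet the mean is fractional.
print(all(isinstance(n, int) for n in children_per_family))  # True
print(average)  # 2.2
```

The average is a summary of the data, not an observation, so it does not need to be a value that any single family could have.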

Frequently, you’ll use bar charts to graph discrete data because the separate bars emphasize the distinct nature of each value. However, it’s appropriate to use other graphs as well.

When you have discrete values of a qualitative nature (i.e., attributes rather than numbers), they’re called categorical data. When the categories have no natural order, they’re also called nominal data.

## What is Continuous Data?

Continuous variables can assume any numeric value within a range and can be meaningfully split into smaller parts. Consequently, they have valid fractional and decimal values. In fact, continuous data have an infinite number of potential values between any two points. Generally, you measure them using a scale.

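When you see decimal places for individual values, you’re usually looking at a continuous variable. The reverse doesn’t always hold: a continuous measurement, such as weight, might be recorded rounded to whole numbers.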

Examples of continuous data include weight, height, length, time, and temperature.

Frequently, you’ll use histograms and scatterplots to graph continuous data. These graphs are designed to handle values that fall on a continuous spectrum and have decimal places.
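The difference between the two graph types comes down to how you tally the values before drawing them. The following Python sketch illustrates the idea with hypothetical die rolls and cat weights. The 1 kg bin width is an arbitrary assumption, and real plotting software picks bins in its own way.

```python
from collections import Counter

# Hypothetical data: die rolls (discrete) and cat weights in kg (continuous).
die_rolls = [1, 3, 3, 6, 2, 3, 6]
cat_weights_kg = [3.2, 4.1, 3.8, 5.0, 4.4, 3.9]

# Bar chart logic: tally each distinct discrete value directly.
roll_counts = Counter(die_rolls)

# Histogram logic: continuous values rarely repeat exactly, so group them
# into intervals first. Each bin here is 1 kg wide: [3, 4), [4, 5), [5, 6).
# int() truncates toward zero, which matches the bin's lower edge for
# positive weights.
bin_counts = Counter(int(w) for w in cat_weights_kg)

print(sorted(roll_counts.items()))  # [(1, 1), (2, 1), (3, 3), (6, 2)]
print(sorted(bin_counts.items()))   # [(3, 3), (4, 2), (5, 1)]
```

Each bar in a bar chart corresponds to one possible value, while each bar in a histogram covers a range of values.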

## Discrete vs. Continuous Data Summary

| Discrete Data | Continuous Data |
|---|---|
| Specific values that you cannot divide. | Infinite number of fractional values between any two values. |
| Counting | Measuring |

Both types of variables are essential in statistics. At the animal shelter, after counting the cats, you’ll weigh them. The counts are discrete values while their weights are continuous. Chances are you’ll need to analyze both types of variables.

It’s vital to recognize discrete vs continuous data because there are different ways to graph and analyze them. To learn more about how to assess different types of variables, read the following posts:

- Levels of Measurement: Nominal, Ordinal, Interval, and Ratio Scales
- Variable Types and How to Graph Them
- Comparing Hypothesis Tests by Types of Variables
- Choosing Regression Analysis Based on Data Types
- Probability Distributions for Discrete and Continuous Variables

Richard says

Why does one have to add 1 before dividing by 2 to estimate the median position for discrete data, but not for continuous? Surely the middle of N samples, is the (N+1)/2 th sample, irrespective of whether the actual data samples themselves are discrete or continuous.

E.g. if I have 10 shoe sizes, the median would be the 5.5th value. But if I had 10 temperatures, the median value would be the 5th. How can the middle move? Surely the middle is the middle, irrespective of what it’s the actual middle of?

Any enlightenment, gratefully received!

Jim Frost says

Hi Richard,

I don’t know where you heard that rule for discrete vs. continuous data. It certainly wasn’t on my blog! There is no such rule for discrete vs. continuous. However, it does vary depending on whether you have an even vs. odd number of data points. The reason? A dataset with an even number of data points has no single middle value.

As I point out in this post, when you have an even number of data points (continuous or discrete), you take the average of the two innermost points. So, if you have 10 data points, it’s the average of the 5th and 6th values because there is no value at the 5.5th position! Data rank is itself a discrete variable, which means you cannot have a 5.5th ranked data point.

When you have an odd number, it’s the middle point. So, if you have 11 data points, the median is the 6th data point. It’s in this case where you can use the method of (N + 1) / 2. (11 + 1) / 2 = 6.

But that doesn’t work for an even number whether you add one or not. There is no middle point, so you need to take the average.

Reread this blog post carefully and see how you calculate the median for even and odd numbers of data points. It varies based on even/odd but not discrete vs. continuous.
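As a minimal sketch of that even/odd rule, the Python function below computes a median the same way for any numeric data. The function name and all of the sample values are hypothetical, chosen only for illustration.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value, at 1-based position (n + 1) / 2.
        return ordered[middle]
    # Even count: average the two innermost values.
    return (ordered[middle - 1] + ordered[middle]) / 2

# Ten discrete values and ten continuous values (both hypothetical).
shoe_sizes = [7, 8, 8, 9, 9, 10, 10, 11, 11, 12]
temperatures = [18.5, 19.0, 19.25, 20.0, 20.5, 21.5, 21.75, 22.5, 23.0, 24.5]

# Eleven values: the median is the 6th value after sorting.
eleven_values = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

print(median(shoe_sizes))     # 9.5  (average of the 5th and 6th values)
print(median(temperatures))   # 21.0 (average of the 5th and 6th values)
print(median(eleven_values))  # 4
```

The function never checks whether the data are discrete or continuous. Only the parity of the count changes which branch runs.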

I hope that helps!

John Nikola says

Thank you so much, Jim! I appreciate your help. Rest assured, I’ll share this information with my peers.

Jim Frost says

You’re very welcome! And thanks for sharing. You were quite right in being concerned by that practice!

John Nikola says

Hello Jim! Would it be considered “statistically valid” to transform a categorical variable into discrete values (for instance, from 7 categories to a numerical range of 1-7), incorporate it as an axis in a scatter plot, and then add a trendline and assume a potential relationship with the variable it’s plotted against? I’ve observed these types of visualizations among my peers in my data science capstone course.

Jim Frost says

Hi John,

No, that’s not valid. Categorical variables are divided into mutually exclusive categories that statisticians call levels. These levels do not have a natural order, nor do they provide any quantitative information. They are category names and all you can do is name the group to which each observation belongs. In short, you can’t place them on an X-axis with a meaningful order or meaningful distances between the groups. (You can show them spread along the X-axis but the distances between levels are meaningless and the groups have no natural order along it.)

Think of categorical variables like college major, profession, and literature genre. There’s no natural order to list them. There’s no distance between them. They are just different types.

Now, you can measure another variable, say income, and calculate averages based on a categorical variable. For example, average income by profession or college major. You can sort the categories, say college major, by income. However, that’s just for convenience and not based on a natural order for college major. It’s inappropriate to use a trendline in that example because there is no distance along the X-axis for the categorical variable. A trendline has a slope, which is the rise/run. However, with a categorical independent variable, the “run” is meaningless/incalculable.

In these cases, you need to use an analysis like ANOVA, which determines whether the group means are unequal. Using a trendline is inappropriate in cases where you have a categorical independent variable and a continuous dependent variable.
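To illustrate, here is a rough Python sketch of the F statistic that one-way ANOVA is built on, using only the standard library. The incomes and majors are made-up numbers, and the helper name `one_way_anova_f` is hypothetical. Turning the F statistic into a p-value requires the F distribution, which a statistics package such as SciPy (`scipy.stats.f_oneway`) handles for you.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)      # number of categories (levels)
    n = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical incomes (in thousands) grouped by college major.
incomes_by_major = {
    "History": [40, 42, 44],
    "Biology": [50, 52, 54],
    "Engineering": [60, 62, 64],
}

groups = list(incomes_by_major.values())
f_stat = one_way_anova_f(groups)
f_stat_reordered = one_way_anova_f(groups[::-1])

print(f_stat)            # 75.0
print(f_stat_reordered)  # 75.0
```

Reordering the categories leaves the result unchanged, which reflects that the analysis uses no order or distance between the levels. A trendline's slope, by contrast, would change with every reordering.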

I hope that helps!