What is Ordinal Data?
Ordinal data have at least three categories that have a natural rank order. The categories are ranked, but the differences between ranks may not be equal. These data indicate the order of values but not the degree of difference between them. For example, first, second, and third places in a race are ordinal data. You can clearly understand the order of finishes. However, the time difference between first and second place might not be the same as between second and third place.
Ordinal data are prevalent in social science and survey research. Ordinal response options are relatively easy for respondents to choose from even when the underlying characteristic is complex, and they still allow you to compare participants. For example, subject-area expertise can be tricky to measure using a continuous scale. However, ordinal data can make this evaluation much easier by using Beginner, Intermediate, and Expert ranking choices in a survey.
Likert scale items in a survey are ordinal data. These items typically have 5 or 7 possible responses.
While this data type is expedient, it has downsides that limit the valid summary values and analyses you can use. More on this later!
Learn more about Likert Scale: Survey Use & Examples.
Ordinal Data Examples
The key concept behind ordinal data is that they rank observations. However, these ranks don’t indicate the relative degree of difference between two observations. For instance, you know that a high-income person earns more than a middle-income individual, but you don’t know how much more they make. Keep that in mind as you consider the following examples.
Ordinal Data | Ranks |
Expertise | |
Education level | |
Income | |
Agreement level | |
Frequency of activity | |
Comparisons to Other Types of Variables
Ordinal data share properties with both nominal and continuous variables yet are distinct from either.
Nominal vs Ordinal Data
Ordinal and nominal data are discrete variables that define categories. Consequently, statisticians consider both types to be qualitative data.
However, you can rank ordinal data, which is impossible with nominal data.
For example, college major is nominal data; you can’t rank those categories using that variable alone. They’re simply names of distinct groups, such as statistics, political science, and psychology.
Conversely, ordinal data form groups that you can inherently rank. For example, the relative size of college majors at an institution can be small, medium, or large.
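The distinction can be sketched in code. This is a minimal illustration, assuming Python's standard `enum` module; the class names `Major` and `MajorSize` and the numeric codes 1 to 3 are hypothetical choices, not part of any standard. The nominal variable has names only, so comparing two majors fails, while the ordinal variable can be sorted by rank.

```python
from enum import Enum, IntEnum


class Major(Enum):
    """Nominal: distinct names with no inherent order (hypothetical example)."""
    STATISTICS = "statistics"
    POLITICAL_SCIENCE = "political science"
    PSYCHOLOGY = "psychology"


class MajorSize(IntEnum):
    """Ordinal: the codes 1-3 carry rank order only, not distances."""
    SMALL = 1
    MEDIUM = 2
    LARGE = 3


# Ordinal values can be ranked.
sizes = [MajorSize.LARGE, MajorSize.SMALL, MajorSize.MEDIUM]
ranked_names = [size.name for size in sorted(sizes)]

# Nominal values cannot: a plain Enum defines no ordering.
try:
    Major.STATISTICS < Major.PSYCHOLOGY
    nominal_comparable = True
except TypeError:
    nominal_comparable = False

print(ranked_names, nominal_comparable)
```

The integer codes are only a convenience for sorting. They say nothing about how much larger a large major is than a medium one.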
Related post: Discrete vs Continuous Variables
Continuous vs Ordinal
Ordinal and continuous data (both interval and ratio scale) can rank observations on a scale. In other words, you can record that one observation has more of a characteristic than another observation. However, as discussed earlier, ordinal data can’t describe the degree of difference between values, while a continuous variable can.
For example, the size of a college major at an institution can be small, medium, or large. If one major is large and another medium, you see that the former is larger than the latter. However, you don’t know the degree of difference.
Conversely, if you measure size using a continuous variable such as the number of students or budget, you can determine the degree of difference between two observations.
In some cases, you can choose to measure a variable either as continuous or ordinal data. Whenever practical, choose the continuous form because it retains more information and gives you more options during the analysis.
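To see why the continuous form retains more information, consider this small sketch. The enrollment numbers and the cutoffs for small, medium, and large are made up for illustration, and `size_category` is a hypothetical helper.

```python
def size_category(n_students, small_max=100, large_min=500):
    """Collapse a student count into an ordinal size (hypothetical cutoffs)."""
    if n_students <= small_max:
        return "small"
    if n_students >= large_min:
        return "large"
    return "medium"


# Made-up enrollment counts, for illustration only.
enrollment = {"statistics": 80, "political science": 520, "psychology": 2400}

categories = {major: size_category(n) for major, n in enrollment.items()}

# The continuous form tells you how far apart two majors are...
difference = enrollment["psychology"] - enrollment["political science"]

# ...while the ordinal form puts both in the same category.
same_category = categories["psychology"] == categories["political science"]

print(categories, difference, same_category)
```

You can always derive the ordinal categories from the counts, but you cannot recover the counts from the categories.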
Amongst the various measurement scales, ordinal data fall between the nominal and interval scales. For more information, read Nominal, Ordinal, Interval, and Ratio Scales.
Ordinal Data Limitations
The inability to know the precise differences between observations limits the mathematical functions and summary statistics you can calculate for ordinal data.
While analysts often record values for these variables using numbers, such as 1-5 for a Likert scale of agreement, that doesn’t mean all numeric calculations are valid.
You cannot meaningfully add and subtract values. For example, if you take ordinal data values of 1 and 2, you can’t trust that summing them to 3 is a valid result. Why?
When adding 1 and 2 to get 3, you’re assuming the difference between 1 and 2 equals the difference between 2 and 3 because they’re both one unit apart. However, that is not a safe assumption with this data type.
Because addition isn’t valid, neither is subtraction, which is its inverse operation. Calculating the mean is also invalid because it requires both addition and division, and division is meaningful only for continuous variables measured on a ratio scale.
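One way to see the problem is to recode the same ranks with different spacing. In the sketch below, the two groups and both codings are hypothetical. Each coding preserves the rank order, so each is an equally legitimate way to number the categories. Yet the comparison of means flips between codings, while the comparison of medians does not.

```python
from statistics import mean, median

# Hypothetical responses on a 5-level ordinal scale, stored as ranks 1-5.
group_a = [1, 1, 5, 5, 5]
group_b = [3, 3, 4, 4, 4]

# Two order-preserving codings (both are assumptions for illustration).
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
unequal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}


def recode(ranks, coding):
    """Map each rank to the numeric code assigned by the coding."""
    return [coding[rank] for rank in ranks]


for name, coding in [("equal", equal_spacing), ("unequal", unequal_spacing)]:
    a, b = recode(group_a, coding), recode(group_b, coding)
    print(
        f"{name:>7} spacing: "
        f"mean A > mean B is {mean(a) > mean(b)}, "
        f"median A > median B is {median(a) > median(b)}"
    )
```

Because you don't know the true spacing between ordinal categories, a conclusion that depends on the spacing you happened to pick is not trustworthy.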
Analyzing Ordinal Data
So, what can you do with these variables? Which summary statistics are valid? And what kind of analyses can you perform?
Graphing
Bar graphs are great for displaying discrete variables. Consequently, they’re an excellent choice for visually understanding ordinal data.
The bar chart below displays a Likert scale item for service ratings from Very Poor to Very Good.
It’s easy to see that most patrons rated the service as Good.
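If you want to build a similar display yourself, the sketch below tallies responses and prints a simple text bar chart. The response counts are made up, and the three middle labels are an assumption, since only the endpoints Very Poor and Very Good are given here. The key design choice is to list the bars in rank order rather than alphabetically or by frequency.

```python
from collections import Counter

# Rating labels in rank order; the middle three labels are assumed.
LEVELS = ["Very Poor", "Poor", "Fair", "Good", "Very Good"]

# Made-up responses, for illustration only.
responses = (
    ["Very Poor"] * 2
    + ["Poor"] * 4
    + ["Fair"] * 7
    + ["Good"] * 12
    + ["Very Good"] * 5
)

counts = Counter(responses)

# One bar per level, kept in rank order.
chart_lines = [
    f"{level:<10}{'#' * counts[level]} {counts[level]}" for level in LEVELS
]
print("\n".join(chart_lines))

# The tallest bar is the mode of the ordinal variable.
most_common_level = max(LEVELS, key=lambda level: counts[level])
print("Most common rating:", most_common_level)
```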
Learn more about Bar Charts and Data Types and How to Graph Them.
Summary Statistics
Measures of central tendency and variability are two standard summary statistics to report with your results. However, the mean and standard deviation are questionable for ordinal data. Consequently, consider using the following alternatives:
- Mode and median for central tendency.
- Range, interquartile range, and the interval between two percentiles for variability.
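Here is a minimal sketch of these summaries using Python's standard `statistics` module. The ratings are hypothetical, and so are the middle labels of the scale. It uses `median_low` and a nearest-rank percentile helper (`nearest_rank_percentile`, a name made up for this example) so that every result is an actual category rather than an average of two categories.

```python
from statistics import median_low, mode

# Hypothetical labels for codes 1-5; only the rank order matters.
LABELS = {1: "Very Poor", 2: "Poor", 3: "Fair", 4: "Good", 5: "Very Good"}

# Made-up ratings, for illustration only.
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]


def nearest_rank_percentile(values, pct):
    """Return the observed value at the given percentile (nearest-rank method)."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling of pct * n / 100
    return ordered[rank - 1]


summary = {
    "mode": LABELS[mode(ratings)],
    "median": LABELS[median_low(ratings)],
    "range": (LABELS[min(ratings)], LABELS[max(ratings)]),
    "interquartile range": (
        LABELS[nearest_rank_percentile(ratings, 25)],
        LABELS[nearest_rank_percentile(ratings, 75)],
    ),
}

for statistic, value in summary.items():
    print(f"{statistic}: {value}")
```

The range and interquartile range are reported as pairs of categories rather than as subtracted numbers, because subtraction assumes equal spacing between ranks.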
Hypothesis Testing
Similarly, the standard hypothesis tests for the mean (e.g., t-tests and ANOVA) are questionable for this type of variable. Means tests are parametric hypothesis tests.
Instead, consider using nonparametric hypothesis tests as an alternative. They assess medians and ranks, making them perfect for ordinal data. These tests include Mood’s Median, Mann-Whitney, Wilcoxon, Friedman’s Test, and Spearman’s rho.
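To illustrate why rank-based methods suit ordinal data, here is a rough sketch of Spearman’s rho written in plain Python: it converts each variable to ranks (averaging ranks for ties) and then computes the Pearson correlation of those ranks. The data, the recoding, and the function names `average_ranks` and `spearman_rho` are assumptions for illustration. The sketch does not compute a p-value, and in practice you would likely use a statistics package.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # average of positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (spread_x * spread_y)


# Made-up ordinal data: expertise (1-3) and satisfaction (1-5).
expertise = [1, 1, 2, 2, 3, 3, 3]
satisfaction = [2, 1, 3, 3, 4, 5, 5]

# An order-preserving recoding with very unequal spacing.
recoded_expertise = [{1: 1, 2: 5, 3: 50}[value] for value in expertise]

rho = spearman_rho(expertise, satisfaction)
rho_recoded = spearman_rho(recoded_expertise, satisfaction)
print(rho, rho_recoded)
```

Because only the ranks enter the calculation, the recoded variable gives the same rho as the original. That is the property that makes rank-based statistics safe when the spacing between categories is unknown.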
For more information, read my post about Parametric vs Nonparametric Hypothesis Tests and Spearman’s rho.
Finally, statisticians have some disputes over using hypothesis tests for the mean with Likert scale items. I discussed the critical reason in this post—the mean is not valid. However, some make the case that for specific Likert scales the differences between values are equal by design. If that is true, then the mean might be valid.
However, for parametric hypothesis testing, there are additional concerns for ordinal data. Specifically, these variables are less likely to satisfy the analysis’ assumptions. To learn more about this issue, including some answers about what to do, read my post about Analyzing Likert Scale Data.
Hi Jim, I have a question regarding a variable I am working on. What type of variable would you consider “grade average in science” to be? I can see it being both ordinal and continuous (interval). In my study I am going to investigate whether students’ confidence in science and the amount of time they spend reading can predict their grade average at the end of the school year.
Hi, thank you for the great post! I also have a question. You say that mean and standard deviation are questionable for ordinal data. Although my groups are statistically different, the median for all groups is equal (=2), meaning that all my graph bars, which show the median, look the same. What should I do instead? I would have preferred to use the mean and SD since it’s easier for the reader to see a difference among the groups.
Hi Frida,
When you say that your groups are statistically different, what do you mean? How are they different? That’ll help me answer your question.
If the frequency distributions for the groups look similar, they might not be significantly different. You can perform a nonparametric test to assess whether the groups differ. If the distributions look the same, as you say, the means might not be different either.
If you can safely assume that the values in your ordinal scale are equidistant, some analysts think it’s OK to assess the mean, although that practice is debated. So, if you can make that assumption, you might be OK with the mean and standard deviation.
Great post! Question. I have several ordinal variables which theoretically are all associated with well-being. They are measured on different scales: some are 5-point and some are 7-point. I want to create a composite variable. Your post made me wonder about the use of z-scores for ordinal variables. I had thought about creating z-scores for each variable and summing them to create one score.
Hi Julie,
Sorry about the delay in replying. Sometimes things slip through the cracks!
Z-scores are specifically for continuous data that follow a normal distribution. So, they’re not appropriate for ordinal data. I suppose if your ordinal data followed a normal distribution, you might be able to use them for that purpose, but ordinal data are frequently non-normal.
Frequently, analysts will add a set of ordinal/Likert scores together, or average them, to create a composite variable. I’d probably go with that approach.
However, there is debate about whether an average is appropriate with ordinal data unless you’re sure that the values are equidistant. If the values are not equidistant, the average can be meaningless. The median is universally accepted as appropriate for ordinal data.
The technical details of ordinal/Likert can be a bit complicated and there is some debate about which measures and analyses are acceptable. But often analysts use a composite sum, average, or median of a set of Likert/ordinal values.
Please read my post about analyzing Likert scale data for a comparison between parametric and nonparametric analyses. Note, however, that the post applies to individual items rather than a sum.
One of the reasons I point out these complications is that what is considered acceptable for ordinal data can vary by subject area and analyst. So, I’d also look into the accepted norms for your situation.