What is Content Validity?
Content validity is the degree to which a test or assessment instrument evaluates all aspects of the topic, construct, or behavior that it is designed to measure. Do the items fully cover the subject? High content validity indicates that the test fully covers the topic for the target audience. Lower content validity suggests that the test is missing relevant facets of the subject matter.
Investigators frequently assess content validity when creating a test or survey that appraises knowledge of a subject area. Does the test cover the full range of topics? In psychology, researchers often consider content validity when developing a psychological scale, such as a depression scale. Does the instrument cover the full range of dimensions related to the psychological construct or only a portion?
For example, imagine that I designed a test that evaluates how well students understand statistics at a level appropriate for an introductory college course. Content validity assesses my test to see if it covers suitable material for that subject area at that level of expertise. In other words, does my test cover all pertinent facets of the content area? Is it missing concepts?
Learn more about other Types of Validity and Reliability vs. Validity.
Content Validity Examples
Evaluating content validity is crucial for the following kinds of instruments. It ensures that tests cover the full range of knowledge and that scales cover all aspects of the psychological construct:
- A test to obtain a license, such as driving or selling real estate.
- Standardized testing for academic purposes, such as the SAT and GRE.
- Tests that evaluate knowledge of subject area domains, such as biology, physics, and literature.
- A scale for assessing anger management.
- A questionnaire that evaluates coping abilities.
- A scale to assess problematic drinking.
How to Measure Content Validity
Measuring content validity involves assessing individual questions on a test and asking experts whether each one targets characteristics that the instrument is designed to cover. This process compares the test against its goals and the theoretical properties of the construct. Researchers systematically determine whether each item contributes and whether any aspect has been overlooked.
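One way to make that comparison concrete is a simple coverage check. The sketch below is only an illustration: the topic blueprint, the item labels, and the `coverage_report` function are all hypothetical, and it assumes experts have already mapped each question to the facets it targets.

```python
def coverage_report(blueprint, item_facets):
    """Compare expert item-to-facet mappings against a content blueprint.

    blueprint:   list of facets the test is supposed to cover (hypothetical).
    item_facets: dict mapping each item label to the facets experts say it targets.
    Returns (facets no item covers, items that target no blueprint facet).
    """
    wanted = set(blueprint)
    covered = {facet for facets in item_facets.values() for facet in facets}
    missing = sorted(wanted - covered)
    off_target = sorted(
        item for item, facets in item_facets.items() if not wanted & set(facets)
    )
    return missing, off_target


# Made-up blueprint for an introductory statistics test.
blueprint = [
    "descriptive statistics",
    "probability",
    "hypothesis testing",
    "confidence intervals",
]
# Made-up expert mappings for four questions.
item_facets = {
    "Q1": ["descriptive statistics"],
    "Q2": ["probability"],
    "Q3": ["hypothesis testing"],
    "Q4": ["calculus"],
}

missing, off_target = coverage_report(blueprint, item_facets)
print(missing)     # ['confidence intervals']
print(off_target)  # ['Q4']
```

Here the check flags one facet that no question covers and one question that does not contribute to any intended facet.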
Advanced content validity assessments use multivariate factor analysis to find the number of underlying dimensions that the test items cover. In this context, analysts can use factor analysis to determine whether the items collectively measure a sufficient number and type of fundamental factors. If the measurement instrument does not sufficiently cover the dimensions, the researchers should improve it. Learn more in my Guide to Factor Analysis with an Example.
Content Validity Ratio
For this overview, let’s look at a more intuitive approach.
Most assessment processes in this realm obtain input from subject matter experts. Lawshe* proposed a standard method for measuring content validity in psychology that incorporates expert ratings. This approach involves asking experts to determine whether the knowledge or skill that each item on the test assesses is “essential,” “useful, but not essential,” or “not necessary.”
His method essentially gauges inter-rater agreement about the importance of each item. You want all or most experts to agree that each item is “essential.”
Lawshe then proposes that you calculate the content validity ratio (CVR) for each question:

$$\mathrm{CVR} = \frac{N_e - N/2}{N/2}$$

- $N_e$ = Number of experts who rate the item “essential.”
- $N$ = Total number of experts.
Using this formula, you’ll obtain values ranging from -1 (no expert rates the item “essential”) to +1 (every expert rates it “essential”) for each question. A value of 0 means exactly half the experts rate the item “essential,” and values above 0 indicate that more than half do.
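As a quick illustration, here is a minimal Python sketch of Lawshe's ratio, CVR = (N_e - N/2) / (N/2), for a single question. The function name, the rating labels, and the panel of 10 experts are made up for the example.

```python
from collections import Counter


def content_validity_ratio(ratings):
    """Return Lawshe's CVR for one item, given one rating per expert."""
    n_experts = len(ratings)
    if n_experts == 0:
        raise ValueError("at least one expert rating is required")
    # Count only the experts who chose the "essential" category.
    n_essential = Counter(ratings)["essential"]
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical panel: 8 of 10 experts rate the item "essential".
item_ratings = ["essential"] * 8 + ["useful", "not necessary"]
print(round(content_validity_ratio(item_ratings), 2))  # 0.6
```

With 8 of 10 experts choosing “essential,” the ratio is (8 - 5) / 5 = 0.6.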
However, it’s essential to consider whether the agreement might be due to chance. Don’t worry! Critical values for the ratio can help you make that determination. These critical values depend on the number of experts. You can find them here: Critical Values for Lawshe’s CVR.
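To get a feel for what “due to chance” means here, the sketch below computes a binomial tail probability. It rests on an assumption that is mine, not something stated above: each expert independently has a 50% chance of choosing “essential.” The function name is hypothetical, and this is not the published table of critical values, so use the table for actual decisions.

```python
from math import comb


def chance_probability(n_essential, n_experts, p=0.5):
    """P(at least n_essential of n_experts say "essential") under chance.

    Assumes independent experts who each pick "essential" with probability p.
    """
    return sum(
        comb(n_experts, k) * p**k * (1 - p) ** (n_experts - k)
        for k in range(n_essential, n_experts + 1)
    )


# Hypothetical case: 9 of 10 experts rate an item "essential".
print(round(chance_probability(9, 10), 4))  # 0.0107
```

Under that assumption, agreement this strong among 10 experts would arise by chance only about 1% of the time.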
The content validity index (CVI) is the mean CVR across items, and it provides an overall assessment of the measurement instrument. Lawshe calculated it using the items retained in the final instrument. Values closer to 1 are better.
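Here is a small sketch of that averaging step. The counts of “essential” ratings and the function name are hypothetical, and the example assumes the same 10 experts rated every item.

```python
from statistics import mean


def content_validity_index(essential_counts, n_experts):
    """Return the mean CVR across items.

    essential_counts: number of "essential" ratings each item received.
    n_experts:        size of the expert panel (same panel for every item).
    """
    half = n_experts / 2
    cvrs = [(n_e - half) / half for n_e in essential_counts]
    return mean(cvrs)


# Hypothetical "essential" counts for five items rated by 10 experts.
counts = [10, 9, 8, 9, 10]
print(round(content_validity_index(counts, 10), 2))  # 0.84
```

The item CVRs are 1.0, 0.8, 0.6, 0.8, and 1.0, so their mean is 0.84.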
Finally, CVR distinguishes between necessary and unnecessary questions, but it does not identify missing facets.
*Lawshe, C. H. (1975). A Quantitative Approach to Content Validity. Personnel Psychology, 28, 563-575.