Statistics By Jim

Making statistics intuitive

Covariates: Definition & Uses

By Jim Frost

What is a Covariate?

Covariates are continuous independent variables (or predictors) in a regression or ANOVA model. These variables can explain some of the variability in the dependent variable.

That definition of covariates is simple enough. However, the usage of the term has changed over time. Consequently, analysts can have drastically different contexts in mind when discussing covariates.

Historically, statisticians considered covariates to be a subtype of continuous predictors that appears only in ANOVA models, usually relating to designed experiments (DOE). Originally, they were part of experimental designs where the primary variables of interest are categorical factors that the researchers control.

In these designs, most other potential explanatory variables (confounders) are addressed by controlling the experimental environment and using a randomized design. In some studies, however, analysts are aware of uncontrollable variables that could still influence the outcome.

These nuisance variables are covariates. They’re a nuisance because they can increase both variability and bias.

Including these nuisance variables as covariates in the model statistically controls their impact on the dependent variable, which can increase statistical power and reduce confounder bias. Learn more about How Confounders Can Bias Your Results.

So, the historical definition of a covariate is that it is:

  • Part of an experimental design where researchers set the categorical factors of primary interest.
  • A continuous, independent variable that researchers measure (as opposed to setting).
  • Uncontrollable and can’t be randomized (i.e., a nuisance).
  • Not a primary variable of interest even though it correlates with the outcome.

When you include a covariate in an ANOVA model, it becomes an ANCOVA model (Analysis of Covariance).
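
For readers who prefer notation, a one-factor ANCOVA model is commonly written as below. This is a textbook-style sketch rather than anything specific to this post, and the symbols are my own labels: \(y_{ij}\) is the outcome for observation \(j\) at factor level \(i\), \(\mu\) is the overall mean, \(\tau_i\) is the effect of factor level \(i\), \(x_{ij}\) is the measured covariate, \(\bar{x}\) is its overall mean, \(\beta\) is the common slope for the covariate, and \(\varepsilon_{ij}\) is random error.

```latex
y_{ij} = \mu + \tau_i + \beta \, (x_{ij} - \bar{x}) + \varepsilon_{ij}
```

Dropping the \(\beta (x_{ij} - \bar{x})\) term gives the ordinary one-way ANOVA model, which is why adding a covariate turns an ANOVA into an ANCOVA.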

I’ve heard long-time researchers stick steadfastly to this definition and even firmly proclaim that the analytical procedure must enter a covariate into the model last to calculate the sums of squares correctly!

Learn more about Experimental Designs and Independent vs. Dependent Variables.

Modern Usage

In current times, the historical definition of covariate has faded somewhat. Many analysts use this term as a synonym for a continuous predictor—not only for the specific subset of experimental design cases I describe above.

In current usage, a covariate might be a primary variable of interest in a non-DOE context!

In an analytical sense, the modern usage is valid. Covariates in the stricter context perform the same function as continuous predictors in the broader definition.

Just be aware that some analysts will have an extremely specific context in mind when discussing covariates. Others will be thinking in much broader terms!

Covariate Example

Let’s look at a covariate example that fits the original definition involving an experimental design.

Consider a manufacturing process where temperature and pressure are experimental factors. The experimenters set the temperature controls at A, B, and C and the pressure controls at X, Y, and Z. While temperature and pressure are continuous variables, the experiment treats them as categorical factors because the researchers set them to several specific values.

To minimize sources of variation and the effect of other variables, the researchers control the experimental environment as much as possible and use randomization to determine the settings for each experimental run. All in all, it’s a highly controlled, randomized experiment.

However, the researchers know from experience that humidity levels also affect the outcome. Unfortunately, humidity is much harder to control because it depends on outdoor conditions and is impossible to regulate throughout the manufacturing environment. Consequently, they record humidity as a covariate during each experimental run so the ANCOVA model can account for its effect.

The manufacturer is primarily interested in how temperature and pressure affect their manufacturing outcome. However, by including humidity as a covariate, the model can control for changing humidity conditions during the experiment.
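
To make the humidity example concrete, here is a minimal simulation sketch in Python. Everything numeric in it is an assumption invented for illustration: the effect sizes, the humidity range, the noise level, and the four replicates per setting are not from a real experiment. It treats each temperature and pressure combination as a cell, then compares the leftover (error) variation with and without a common humidity slope, using the classical within-cell sums of squares and cross-products.

```python
import random
from itertools import product
from statistics import mean

random.seed(42)  # fixed seed so the sketch is reproducible

temperatures = ["A", "B", "C"]
pressures = ["X", "Y", "Z"]
replicates = 4  # assumed number of runs per temperature/pressure cell

# Hypothetical "true" effects, made up purely for this simulation.
temp_effect = {"A": 0.0, "B": 2.0, "C": 4.0}
pres_effect = {"X": 0.0, "Y": 1.0, "Z": 3.0}
humidity_slope = 0.5

runs = []
for t, p in product(temperatures, pressures):
    for _ in range(replicates):
        humidity = random.uniform(30.0, 70.0)  # measured, not set
        noise = random.gauss(0.0, 1.0)
        y = 50.0 + temp_effect[t] + pres_effect[p] + humidity_slope * humidity + noise
        runs.append((t, p, humidity, y))


def within_cell_sums(data):
    """Pool sums of squares and cross-products around each cell's means."""
    cells = {}
    for t, p, h, y in data:
        cells.setdefault((t, p), []).append((h, y))
    sxx = sxy = syy = 0.0
    for obs in cells.values():
        h_bar = mean(h for h, _ in obs)
        y_bar = mean(y for _, y in obs)
        for h, y in obs:
            sxx += (h - h_bar) ** 2
            sxy += (h - h_bar) * (y - y_bar)
            syy += (y - y_bar) ** 2
    return sxx, sxy, syy, len(cells)


sxx, sxy, syy, n_cells = within_cell_sums(runs)
n = len(runs)

slope_hat = sxy / sxx              # pooled within-cell humidity slope
sse_anova = syy                    # error SS when humidity is ignored
sse_ancova = syy - sxy ** 2 / sxx  # error SS after adjusting for humidity
mse_anova = sse_anova / (n - n_cells)
mse_ancova = sse_ancova / (n - n_cells - 1)  # one df spent on the slope

print(f"Estimated humidity slope: {slope_hat:.3f}")
print(f"Error mean square without covariate: {mse_anova:.2f}")
print(f"Error mean square with covariate:    {mse_ancova:.2f}")
```

In this simulated setup, the error mean square shrinks once humidity is in the model, which is the mechanism behind the gain in statistical power. The closed-form calculation is used here only to keep the sketch dependency-free. In practice, you would fit the ANCOVA model in your statistical software instead.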


Filed Under: ANOVA Tagged With: conceptual, data types

Reader Interactions

Comments

  1. Collin says

    November 27, 2022 at 8:16 am

    Can’t the type of experimental design, say, a randomized complete block design (RCBD) or a Latin square design, rule out the effects of a potential confounding factor? In other words, isn’t the consideration of a covariate only applicable to RCD trials?

    • Jim Frost says

      November 27, 2022 at 6:54 pm

      Hi Collin,

      Blocked designs, including Latin square designs, are one way to handle nuisance variables. Blocks are essentially a categorical nuisance variable. For example, a block might represent days when you think the experimental conditions might change across the different days on which experimental runs occur. With blocks, you might not even be sure exactly what the nuisance is, or it might be a combination of variables, such as with blocking by day. That said, blocks can certainly represent known factors, such as material batches, shifts, etc. Either way, blocks are categorical.

      Covariates are another method for handling continuous nuisance variables. You’ll enter the nuisance variables with continuous values. Humidity is a good example of a covariate. It’s not categorical but quite clearly a continuous variable where you’d enter the percentage.

      And you can use blocks and covariates together too!

Copyright © 2023 · Jim Frost