What is a Control Variable?
Control variables, also known as controlled variables, are properties that researchers hold constant for all observations in an experiment. While these variables are not the primary focus of the research, keeping their values consistent helps the study establish the true relationships between the independent and dependent variables. The capacity to control variables directly is highest in experiments that researchers conduct under lab conditions. In observational studies, researchers can’t directly control the variables. Instead, they record the values of control variables and then use statistical procedures to account for them.
Note that control variables are different from control groups.
In science, researchers assess the effects that the independent variables have on the dependent variable. However, other variables can also affect the outcome. If the scientists do not control these other variables, those variables can distort the primary results of interest. In other words, left uncontrolled, they become confounders that can bias the findings. The uncontrolled variables, rather than the treatment or experimental variables, may be responsible for the changes in the outcomes. Consequently, researchers control the values of these other variables.
Suppose you are performing an experiment involving different types of fertilizers and plant growth. Those are your primary variables of interest. However, you also know that soil moisture, sunlight, and temperature affect plant growth. If you don’t hold these variables constant for all observations, they might explain the plant growth differences you observe. Consequently, moisture, sunlight, and temperature are essential control variables for your study.
If you perform the study in a controlled lab setting, you can control these environmental conditions and keep their values constant for all observations in your experiment. That way, if you do see plant growth differences, you can be more confident that the fertilizers caused them.
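To make the fertilizer example concrete, here is a minimal simulation sketch. The growth model, the effect sizes, and the names `grow` and `run_experiment` are all hypothetical and chosen only for illustration. It compares the estimated fertilizer effect when soil moisture is held constant with the estimate when moisture happens to differ between the two fertilizer groups.

```python
import random
from statistics import mean

def grow(fertilizer_effect, moisture, rng):
    """Hypothetical growth model (cm): baseline + fertilizer + moisture + noise."""
    return 10.0 + fertilizer_effect + 0.5 * moisture + rng.gauss(0, 0.5)

def run_experiment(control_moisture, seed=1, n=200):
    """Return the observed mean growth difference, fertilizer B minus fertilizer A."""
    rng = random.Random(seed)
    true_effect = 2.0  # assumed true benefit of fertilizer B over A
    growth_a, growth_b = [], []
    for _ in range(n):
        if control_moisture:
            moist_a = moist_b = 20.0       # held constant for every plant
        else:
            moist_a = rng.uniform(10, 20)  # plants on A happen to be drier
            moist_b = rng.uniform(20, 30)  # plants on B happen to be wetter
        growth_a.append(grow(0.0, moist_a, rng))
        growth_b.append(grow(true_effect, moist_b, rng))
    return mean(growth_b) - mean(growth_a)

controlled = run_experiment(control_moisture=True)
uncontrolled = run_experiment(control_moisture=False)
print(f"Estimated effect, moisture held constant: {controlled:.2f}")
print(f"Estimated effect, moisture uncontrolled:  {uncontrolled:.2f}")
```

Under these assumptions, the controlled run should land close to the assumed true effect of 2 cm, while the uncontrolled run mixes the moisture difference into the estimate and overstates the fertilizer's benefit.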
When researchers use control variables, they should identify them, record their values, and include the details in their write-up. This process helps other researchers understand and replicate the results.
Related posts: Independent and Dependent Variables, and Confounding Variables
Control Variables and Internal Validity
By controlling variables, you increase the internal validity of your research. Internal validity is the degree of confidence that a causal relationship exists between the treatment and the difference in outcomes. In other words, how likely is it that your treatment caused the differences you observe? Are the researcher’s conclusions correct? Or, can changes in the outcome be attributed to other causes?
If the relevant variables are not controlled, you might need to attribute the changes to confounders rather than the treatment. Control variables reduce the impact of confounding variables.
Controlled Variable Examples
Experiment | Control Variables
Does a medicine reduce illness? |
Are different weight loss programs effective? |
Do kiln time and temperature affect clay pot quality? |
Does a supplement improve memory recall? |
How to Control Variables in Science
Scientists can control variables using several methods. In some cases, variables can be controlled directly. For example, researchers can control the growing conditions for the fertilizer experiment. They can also use standardized procedures and processes for all subjects to reduce other sources of variation. These efforts attempt to eliminate all differences between the treatment and control groups other than the treatments themselves.
However, sometimes that’s not possible. Fortunately, there are other approaches.
Random assignment
In some experiments, there can be too many variables to control. Additionally, the researchers might not even know all the potential confounding variables. In these cases, they can randomly assign subjects to the experimental groups. This process controls variables by averaging out all traits across the experimental groups, making them roughly equivalent when the experiment begins. The randomness helps prevent any systematic differences between the experimental groups. Learn more in my post about Random Assignment in Experiments.
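As a rough illustration of how random assignment averages out traits, the sketch below shuffles a list of hypothetical subjects and splits it into two groups. The `age` trait, the sample size, and the seed are assumptions made for the example. The trait is never used in the assignment, yet the group means should come out roughly equal.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical subjects, each with a trait the researcher never uses for assignment.
subjects = [{"id": i, "age": rng.uniform(20, 70)} for i in range(1000)]

# Random assignment: shuffle a copy, then split it into two equal groups.
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment, control = shuffled[:half], shuffled[half:]

mean_age_treatment = mean(s["age"] for s in treatment)
mean_age_control = mean(s["age"] for s in control)
age_gap = abs(mean_age_treatment - mean_age_control)

print(f"Mean age, treatment group: {mean_age_treatment:.1f}")
print(f"Mean age, control group:   {mean_age_control:.1f}")
print(f"Gap between groups:        {age_gap:.1f}")
```

The same balancing applies to traits nobody thought to record, which is why random assignment helps even when the potential confounders are unknown. With small samples, the groups can still differ by chance.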
Statistical control
Directly controlled variables and random assignment are methods that equalize the experimental groups. However, they aren’t always feasible. In some cases, there are too many variables to control. In other situations, random assignment might not be possible. Try randomly assigning people to smoking and non-smoking groups!
Fortunately, statistical techniques such as multiple regression analysis offer an alternative. They don't balance the groups. Instead, they use a model that statistically controls the variables and accounts for confounders.
In multiple regression analysis, including a variable in the model holds it constant while the treatment variable fluctuates. This process allows you to isolate the role of the treatment while accounting for confounders. You can also use ANOVA and ANCOVA.
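The sketch below illustrates statistical control with simulated observational data. Everything in it is an assumption made for the example: the data-generating process, the true treatment effect of 1.0, and the helper names `solve` and `ols`, which fit ordinary least squares with the standard library only. In practice you would use statistical software rather than hand-rolled code.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

rng = random.Random(0)
n = 2000
true_effect = 1.0  # assumed true treatment effect

# The confounder influences both who gets treated and the outcome.
confounder = [rng.gauss(0, 1) for _ in range(n)]
treatment = [1.0 if c + rng.gauss(0, 1) > 0 else 0.0 for c in confounder]
outcome = [true_effect * t + 2.0 * c + rng.gauss(0, 1)
           for t, c in zip(treatment, confounder)]

naive = ols([treatment], outcome)[1]                 # confounder left out
adjusted = ols([treatment, confounder], outcome)[1]  # confounder in the model

print(f"Treatment coefficient without the control variable: {naive:.2f}")
print(f"Treatment coefficient with the control variable:    {adjusted:.2f}")
```

Under these assumptions, the model that omits the confounder overstates the treatment effect, while the model that includes it should recover a coefficient close to the assumed value of 1.0.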
For more information, read my posts, When to Use Regression and ANOVA Overview.
SAM says
Dear Jim
Sir, you are doing a good job. Much appreciated. Could you please tell us how to read the values of control variables, such as ranges, and what they mean? For instance, how should we read this: (F=1.83; p=0.07)? Thank you.
Alfred Wassler says
In your explanation of control variables, you use the example of a research study of plant fertilizers and plant growth, wanting to control for moisture, sunshine, and temperature. You state, “Consequently, moisture, sunlight, and temperature are essential control variables for your study. These variables can be controlled by keeping their values constant for all observations in your experiment.” You do not go further as to how you control for these values, particularly when such variables are continually changing.
Al Wassler
Jim Frost says
Hi Al,
Presumably, this experiment would occur in a lab setting where you can control these variables. Plants would be raised with the same humidity, soil moisture, and light conditions.
I’ll add some text to the article to clarify that. Thanks!
Dave says
Dear Jim,
I have a question, please, about when a control variable is also itself part of the dependent variable. I see this referred to in the medical research literature as ‘mathematical coupling’, where – for example – the beats per minute (BPM) is the dependent variable and researchers also want to include minutes as a control variable. This seems to create a problem because ‘minutes’ appears on both sides of the equation, and the medical literature talks about spurious correlation and the model needing to be redesigned. But do you have a simple text or reference – ideally just plain statistics/OLS rather than linked to medical research – where this could be explained in theoretical terms? What goes wrong in the regression when a variable is both a control variable and part of the dependent variable (perhaps as part of a ratio or measurement of change)? I just haven’t found a textbook reference that says definitively ‘you can’t have the same variable on both sides of the regression simultaneously’, so I’m not sure whether this violates OLS and so is something to avoid entirely (with a new model design or different research question) or to live with.
Any help would be great, thank you for your work,
Dave