What is Experimental Design?
An experimental design is a detailed plan for collecting and using data to identify causal relationships. Through careful planning, the design of experiments allows your data collection efforts to have a reasonable chance of detecting effects and testing hypotheses that answer your research questions.
An experiment is a data collection procedure that occurs in controlled conditions to identify and understand causal relationships between variables. Researchers can use many potential designs. The ultimate choice depends on their research question, resources, goals, and constraints. In some fields of study, researchers refer to experimental design as the design of experiments (DOE). Both terms are synonymous.
At its most basic level, an experiment involves researchers manipulating at least one independent variable (also called a factor or input) under controlled conditions and measuring the changes in the dependent variable (the outcome). An effective experimental design develops a systematic procedure that increases the ability to draw meaningful conclusions from the data and reduces the interference of other variables that the researchers aren’t studying.
Ultimately, the design of experiments helps ensure that your procedures and data will evaluate your research question effectively. Without an experimental design, you might waste your efforts in a process that, for many potential reasons, can’t answer your research question. In short, it helps you trust your results.
Learn more about Independent and Dependent Variables.
Design of Experiments: Goals & Settings
Experiments occur in many settings, including psychology, the social sciences, medicine, physics, engineering, and the industrial and service sectors. Typically, experimental goals are to discover a previously unknown effect, confirm a known effect, or test a hypothesis.
Effects represent causal relationships between variables. For example, in a medical experiment, does the new medicine cause an improvement in health outcomes? If so, the medicine has a causal effect on the outcome.
An experimental design’s focus depends on the subject area and can include the following goals:
- Understanding the relationships between variables.
- Identifying the variables that have the largest impact on the outcomes.
- Finding the input variable settings that produce an optimal result.
For example, psychologists have conducted experiments to understand how conformity affects decision-making. Sociologists have performed experiments to determine whether ethnicity affects the public reaction to staged bike thefts. These experiments map out the causal relationships between variables, and their primary goal is to understand the role of various factors.
Conversely, in a manufacturing environment, the researchers might use an experimental design to find the factors that most effectively improve their product’s strength, identify the optimal manufacturing settings, and do all that while accounting for various constraints. In short, a manufacturer’s goal is often to use experiments to improve their products cost-effectively.
In a medical experiment, the goal might be to quantify the medicine’s effect and find the optimum dosage.
Developing an Experimental Design
Developing an experimental design involves planning that maximizes the potential to collect data that is both trustworthy and able to detect causal relationships. Specifically, these studies aim to detect effects when they exist in the population the researchers are studying, favor causal explanations over mere associations, isolate each factor’s true effect from potential confounders, and produce conclusions that you can generalize to the real world.
To accomplish these goals, experimental designs carefully manage data validity and reliability, and internal and external experimental validity. When your experiment is valid and reliable, you can expect your procedures and data to produce trustworthy results.
An excellent experimental design involves the following:
- Lots of preplanning.
- Developing experimental treatments.
- Determining how to assign subjects to treatment groups.
The remainder of this article focuses on how experimental designs incorporate these essential items to accomplish their research goals.
Learn more about Data Reliability vs. Validity and Internal and External Experimental Validity.
Preplanning, Defining, and Operationalizing for Design of Experiments
Due to the numerous complex objectives associated with experimental designs, there are many issues, constraints, and tradeoffs to consider before you even start collecting data. This process involves an in-depth literature review so you can understand the current state of scientific knowledge surrounding your research question.
This phase of the design of experiments helps you identify critical variables, know how to measure them while ensuring reliability and validity, and understand the relationships between them. The review can also help you find ways to reduce sources of variability, which increases your ability to detect treatment effects. Notably, the literature review allows you to learn how similar studies designed their experiments and the challenges they faced.
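To see why reducing variability matters, here is a rough sketch of a sample size calculation. It assumes a simple two-group comparison of means and uses the standard normal approximation; the function name `n_per_group` and the example numbers are illustrative assumptions, not values from a real study.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided, two-group
    comparison of means (normal approximation).

    effect: smallest difference in means you want to detect (assumed)
    sd:     standard deviation of the outcome within each group (assumed)
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Hypothetical planning values: the same effect size with two noise levels.
print(n_per_group(effect=0.5, sd=1.0))
print(n_per_group(effect=0.5, sd=0.5))
```

Under these assumptions, halving the standard deviation cuts the required sample size to roughly a quarter. That is the payoff of finding ways to reduce variability before you collect any data.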
Operationalizing a study involves taking your research question, using the background information you gathered, and formulating an actionable plan.
This process should produce a specific and testable hypothesis using data that you can reasonably collect given the resources available to the experiment. For example, a study of a jumping exercise intervention might state its hypotheses as follows:
- Null hypothesis: The jumping exercise intervention does not affect bone density.
- Alternative hypothesis: The jumping exercise intervention affects bone density.
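Hypotheses like these eventually need a statistical test. The sketch below uses an exact permutation test on the difference in group means, which is only one of several reasonable choices and is my assumption rather than the analysis any particular study used. The function name `permutation_p_value` and the data values are made up purely for illustration.

```python
from itertools import combinations
from statistics import fmean


def permutation_p_value(treatment, control):
    """Two-sided exact permutation test for a difference in means.

    Enumerates every way to relabel the pooled observations into groups of
    the original sizes and counts how often the relabeled difference is at
    least as large as the observed one. Suitable for small samples only.
    """
    pooled = list(treatment) + list(control)
    observed = abs(fmean(treatment) - fmean(control))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(treatment)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical percent changes in bone density (illustrative values only).
jumping = [1.8, 2.4, 0.9, 2.1, 1.5, 2.7]
control = [0.2, -0.4, 0.6, 0.1, -0.1, 0.8]
print(f"p-value: {permutation_p_value(jumping, control):.4f}")
```

A small p-value would lead you to reject the null hypothesis that the intervention does not affect bone density. The test leans on random assignment: if the treatment does nothing, every relabeling of the subjects was equally likely.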
To learn more about this early phase, read Five Steps for Conducting Scientific Studies with Statistical Analyses.
Formulating Treatments in Experimental Designs
In an experimental design, treatments are variables that the researchers control. They are the primary independent variables of interest. Researchers administer the treatment to the subjects or items in the experiment and want to know whether it causes changes in the outcome.
As the name implies, a treatment can be medical in nature, such as a new medicine or vaccine. But it’s a general term that applies to other things such as training programs, manufacturing settings, teaching methods, and types of fertilizers. I helped run an experiment where the treatment was a jumping exercise intervention that we hoped would increase bone density. All these treatment examples are things that potentially influence a measurable outcome.
Even when you know your treatment generally, you must carefully consider the amount. How large of a dose? If you’re comparing three different temperatures in a manufacturing process, how far apart are they? For my bone mineral density study, we had to determine how frequently the exercise sessions would occur and how long each lasted.
How you define the treatments in the design of experiments can affect your findings and the generalizability of your results.
Assigning Subjects to Experimental Groups
A crucial decision for all experimental designs is determining how researchers assign subjects to the experimental conditions: the treatment and control groups. The control group often, but not always, receives no treatment. It serves as a basis for comparison by showing outcomes for subjects who don’t receive the treatment. Learn more about Control Groups.
How your experimental design assigns subjects to the groups affects how confident you can be that the findings represent true causal effects rather than mere correlation caused by confounders. Indeed, the assignment method influences how you control for confounding variables, and that control is what lets you move from correlation to causation.
Imagine a study finds that vitamin consumption correlates with better health outcomes. As a researcher, you want to be able to say that vitamin consumption causes the improvements. However, with the wrong experimental design, you might only be able to say there is an association. A confounder, and not the vitamins, might actually cause the health benefits.
Let’s explore some of the ways to assign subjects in design of experiments.
Completely Randomized Designs
A completely randomized experimental design randomly assigns all subjects to the treatment and control groups. You simply take each participant and use a random process to determine their group assignment. You can flip coins, roll a die, or use a computer. Randomized experiments must be prospective studies because the researchers need to control group assignment.
Random assignment in the design of experiments helps ensure that the groups are roughly equivalent at the beginning of the study. This equivalence at the start increases your confidence that any differences you see at the end were caused by the treatments. The randomization tends to equalize confounders between the experimental groups and, thereby, cancels out their effects, leaving only the treatment effects.
For example, in a vitamin study, the researchers can randomly assign participants to either the control or vitamin group. Because the groups are approximately equal when the experiment starts, if the health outcomes are different at the end of the study, the researchers can be confident that the vitamins caused those improvements.
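If you use a computer for the random assignment, it might look something like the following minimal sketch. The function name `completely_randomized`, the participant IDs, and the choice to keep group sizes equal are all assumptions for illustration.

```python
import random


def completely_randomized(subjects, groups=("control", "treatment"), seed=None):
    """Randomly split subjects into groups whose sizes differ by at most one."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    # Deal the shuffled subjects out to the groups in turn.
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment


# Hypothetical participant IDs for the vitamin study.
participants = [f"P{i:02d}" for i in range(1, 21)]
assignment = completely_randomized(participants, ("control", "vitamin"), seed=42)
for group, members in assignment.items():
    print(group, sorted(members))
```

Recording the seed lets you document and reproduce the assignment later, which is helpful when you write up your methods.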
Statisticians consider randomized experimental designs to be the best for identifying causal relationships.
If you can’t randomly assign subjects but want to draw causal conclusions about an intervention, consider using a quasi-experimental design.
Learn more about Randomized Controlled Trials and Random Assignment in Experiments.
Randomized Block Designs
Nuisance factors are variables that can affect the outcome, but they are not the researcher’s primary interest. Unfortunately, they can hide or distort the treatment results. When experimenters know about specific nuisance factors, they can use a randomized block design to minimize their impact.
This experimental design takes subjects with a shared “nuisance” characteristic and groups them into blocks. The participants in each block are then randomly assigned to the experimental groups. This process allows the experiment to control for known nuisance factors.
Blocking in the design of experiments reduces the impact of nuisance factors on experimental error. The analysis assesses the effects of the treatment within each block, which removes the variability between blocks. The result is that blocked experimental designs can reduce the impact of nuisance variables, increasing the ability to detect treatment effects accurately.
Suppose you’re testing various teaching methods. Because grade level likely affects educational outcomes, you might use grade level as a blocking factor. To use a randomized block design for this scenario, divide the participants by grade level and then randomly assign the members of each grade level to the experimental groups.
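Here is one way the grade level scenario could be sketched in code. The function name `randomized_block`, the student records, and the two teaching method labels are hypothetical; the key idea is that the shuffling happens separately inside each block.

```python
import random
from collections import Counter, defaultdict


def randomized_block(subjects, block_of, treatments, seed=None):
    """Randomly assign subjects to treatments separately within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)  # group by nuisance factor
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # randomize only within the block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment


# Hypothetical students as (id, grade level) records; grade is the block.
students = [(f"G{grade}-S{i}", grade) for grade in (3, 4, 5) for i in range(6)]
result = randomized_block(
    students, block_of=lambda s: s[1], treatments=["method A", "method B"], seed=0
)
print(Counter((s[1], method) for s, method in result.items()))
```

In this sketch, every grade level contributes the same number of students to each teaching method, so grade level can’t be confounded with the treatment.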
A standard guideline for an experimental design is to “Block what you can, randomize what you cannot.” Use blocking for a few primary nuisance factors. Then use random assignment to distribute the unblocked nuisance factors equally between the experimental conditions.
You can also use covariates to control nuisance factors. Learn about Covariates: Definition and Uses.
Observational Studies
In some experimental designs, randomly assigning subjects to the experimental conditions is impossible or unethical. The researchers simply can’t assign participants to the experimental groups. However, they can observe them in their natural groupings, measure the essential variables, and look for correlations. These observational studies are closely related to, and often grouped with, quasi-experimental designs. Retrospective studies must be observational in nature because they look back at past events.
Imagine you’re studying the effects of depression on an activity. Clearly, you can’t randomly assign participants to the depression and control groups. But you can observe participants with and without depression and see how their task performance differs.
Observational studies let you perform research when you can’t control the treatment. However, quasi-experimental designs increase the problem of confounding variables. For this design of experiments, correlation does not necessarily imply causation. While special procedures can help control confounders in an observational study, you’re ultimately less confident that the results represent causal findings.
Learn more about Observational Studies.
For a good comparison, learn about the differences and tradeoffs between Observational Studies and Randomized Experiments.
Between-Subjects vs. Within-Subjects Experimental Designs
When you think of the design of experiments, you probably picture a treatment and control group. Researchers assign participants to only one of these groups, so each group contains entirely different subjects than the other groups. Analysts compare the groups at the end of the experiment. Statisticians refer to this method as a between-subjects, or independent measures, experimental design.
In a between-subjects design, you can have more than one treatment group, but each subject is exposed to only one condition, the control group or one of the treatment groups.
A potential downside to this approach is that differences between groups at the beginning can affect the results at the end. As you’ve read earlier, random assignment can reduce those differences, but it is imperfect. There will always be some variability between the groups.
In a within-subjects experimental design, also known as repeated measures, subjects experience all treatment conditions and are measured for each. Each subject acts as their own control, which reduces variability and increases the statistical power to detect effects.
In this experimental design, you minimize pre-existing differences between the experimental conditions because they all contain the same subjects. However, the order of treatments can affect the results. Beware of practice and fatigue effects. Learn more about Repeated Measures Designs.
| Between-Subjects | Within-Subjects |
| --- | --- |
| Each subject is assigned to one experimental condition | Each subject participates in all experimental conditions |
| Requires more subjects | Requires fewer subjects |
| Differences between subjects in the groups can affect the results | Uses the same subjects in all conditions, which minimizes pre-existing differences |
| No treatment order effects | Order of treatments can affect the results |
Design of Experiments Examples
For example, a bone density study has three experimental groups—a control group, a stretching exercise group, and a jumping exercise group.
In a between-subjects experimental design, scientists randomly assign each participant to one of the three groups.
In a within-subjects design, all subjects experience the three conditions sequentially while the researchers measure bone density repeatedly. The procedure can switch the order of treatments for the participants to help reduce order effects.
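Switching the order of treatments is often called counterbalancing. The sketch below shows one simple approach, assuming full counterbalancing, where every possible order is used equally often. The function name `counterbalanced_orders` and the participant IDs are hypothetical.

```python
import random
from collections import Counter
from itertools import permutations


def counterbalanced_orders(participants, conditions, seed=None):
    """Give each participant a treatment order, cycling through all orders."""
    rng = random.Random(seed)
    orders = list(permutations(conditions))  # 3 conditions -> 6 orders
    shuffled = list(participants)
    rng.shuffle(shuffled)  # which participant gets which order is random
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}


# Hypothetical participants in the three-condition bone density study.
participants = [f"P{i:02d}" for i in range(1, 13)]
conditions = ["control", "stretching", "jumping"]
schedule = counterbalanced_orders(participants, conditions, seed=3)
for participant in participants:
    print(participant, " -> ".join(schedule[participant]))
print(Counter(schedule.values()))
```

With 12 participants and six possible orders, each order appears exactly twice. Full counterbalancing becomes impractical as the number of conditions grows, so treat this as a sketch for small designs.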
Matched Pairs Experimental Design
A matched pairs experimental design is a between-subjects study that uses pairs of similar subjects. Researchers use this approach to reduce pre-existing differences between experimental groups. It’s yet another design of experiments method for reducing sources of variability.
Researchers identify variables likely to affect the outcome, such as demographics. When they pick a subject with a set of characteristics, they try to locate another participant with similar attributes to create a matched pair. Scientists randomly assign one member of a pair to the treatment group and the other to the control group.
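As a rough illustration, the sketch below pairs subjects on a single characteristic and then flips a coin within each pair. Sorting on one variable and pairing neighbors is a deliberately simple matching rule that I’m assuming for the example; real studies usually match on several variables. The function name `matched_pairs` and the subject records are hypothetical.

```python
import random


def matched_pairs(subjects, key, seed=None):
    """Pair subjects with similar key values, then randomize within pairs."""
    if len(subjects) % 2:
        raise ValueError("need an even number of subjects to form pairs")
    rng = random.Random(seed)
    ordered = sorted(subjects, key=key)  # neighbors have similar key values
    pairs = []
    for i in range(0, len(ordered), 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)  # coin flip decides who receives the treatment
        pairs.append({"treatment": pair[0], "control": pair[1]})
    return pairs


# Hypothetical subjects as (name, age) records, matched on age.
subjects = [("A", 34), ("B", 61), ("C", 36), ("D", 58), ("E", 47), ("F", 45)]
for pair in matched_pairs(subjects, key=lambda s: s[1], seed=7):
    print(pair)
```

Each pair contains two subjects of similar age, and chance alone decides which member ends up in the treatment group.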
On the plus side, this process creates two similar groups, and it doesn’t create treatment order effects. While a matched pairs design does not produce the perfectly matched groups of a within-subjects design (which uses the same subjects in all conditions), it aims to reduce variability between groups relative to a standard between-subjects study.
On the downside, finding matched pairs is very time-consuming. Additionally, if one member of a matched pair drops out, the other subject must leave the study too.
Learn more about Matched Pairs Design: Uses & Examples.
Another consideration is whether you’ll use a cross-sectional design (one point in time) or use a longitudinal study to track changes over time.
A case study is a research method that often serves as a precursor to a more rigorous experimental design by identifying research questions, variables, and hypotheses to test. Learn more about What is a Case Study? Definition & Examples.
In conclusion, the design of experiments is extremely sensitive to subject area concerns and the time and resources available to the researchers. Developing a suitable experimental design requires balancing a multitude of considerations. A successful design is necessary to obtain trustworthy answers to your research question and to have a reasonable chance of detecting treatment effects when they exist.
Miguel Petrere Jr says
Dear Jim
You wrote a superb document. I will use it in my Biostatistics course, along with your three books.
Thank you very much!
Miguel
Jim Frost says
Thanks so much, Miguel! Glad this post was helpful and I trust the books will be as well.
Jona kafunga says
What are the purpose and uses of experimental research design?