What are Interaction Effects?
An interaction effect occurs when the effect of one variable depends on the value of another variable. Interaction effects are common in regression models, ANOVA, and designed experiments. In this post, I explain interaction effects, the interaction effect test, how to interpret interaction models, and describe the problems you can face if you don’t include them in your model.
In any study, whether it’s a taste test or a manufacturing process, many variables can affect the outcome. Changing these variables can affect the outcome directly. For instance, changing the food condiment in a taste test can affect the overall enjoyment. In this manner, analysts use models to assess the relationship between each independent variable and the dependent variable. This kind of an effect is called a main effect. While main effects are relatively straightforward, it can be a mistake to assess only main effects.
In more complex study areas, the independent variables might interact with each other. Interaction effects indicate that a third variable influences the relationship between an independent and dependent variable. In this situation, statisticians say that these variables interact because the relationship between an independent and dependent variable changes depending on the value of a third variable. This type of effect makes the model more complex, but if the real world behaves this way, it is critical to incorporate it in your model. For example, the relationship between condiments and enjoyment probably depends on the type of food—as we’ll see in this post!
Example of Interaction Effects with Categorical Independent Variables
I think of interaction effects as an “it depends” effect. You’ll see why! Let’s start with an intuitive example to help you understand these effects in an interaction model conceptually.
Imagine that we are conducting a taste test to determine which food condiment produces the highest enjoyment. We’ll perform a two-way ANOVA where our dependent variable is Enjoyment. Our two independent variables are both categorical variables: Food and Condiment.
Our ANOVA model with the interaction term is:
Satisfaction = Food Condiment Food*Condiment
To keep things simple, we’ll include only two foods (ice cream and hot dogs) and two condiments (chocolate sauce and mustard) in our analysis.
Given the specifics of the example, an interaction effect would not be surprising. If someone asks you, “Do you prefer ketchup or chocolate sauce on your food?” Undoubtedly, you will respond, “It depends on the type of food!” That’s the “it depends” nature of an interaction effect. You cannot answer the question without knowing more information about the other variable in the interaction term—which is the type of food in our example!
That’s the concept. Now, I’ll show you how to include an interaction term in your model and how to interpret the results.
How to Interpret Interaction Effects
Let’s perform our analysis. All statistical software allow you to add interaction terms in a model. Download the CSV data file to try it yourself: Interactions_Categorical.
Use the p-value for an interaction term to test its significance. In the output below, the circled p-value tells us that the interaction effect test (Food*Condiment) is statistically significant. Consequently, we know that the satisfaction you derive from the condiment depends on the type of food.
But how do we interpret the interaction in a model and truly understand what the data are saying? The best way to understand these effects is with a special type of line chart—an interaction plot. This type of plot displays the fitted values of the dependent variable on the y-axis while the x-axis shows the values of the first independent variable. Meanwhile, the various lines represent values of the second independent variable.
On an interaction plot, parallel lines indicate that there is no interaction effect while different slopes suggest that one might be present. Below is the plot for Food*Condiment.
The crossed lines on the graph suggest that there is an interaction effect, which the significant p-value for the Food*Condiment term confirms. The graph shows that enjoyment levels are higher for chocolate sauce when the food is ice cream. Conversely, satisfaction levels are higher for mustard when the food is a hot dog. If you put mustard on ice cream or chocolate sauce on hot dogs, you won’t be happy!
Which condiment is best? It depends on the type of food, and we’ve used statistics to demonstrate this effect.
Overlooking Interaction Effects is Dangerous!
When you have statistically significant interaction effects, you can’t interpret the main effects without considering the interactions. In the previous example, you can’t answer the question about which condiment is better without knowing the type of food. Again, “it depends.”
Suppose we want to maximize satisfaction by choosing the best food and the best condiment. However, imagine that we forgot to include the interaction effect and assessed only the main effects. We’ll make our decision based on the main effects plots below.
Based on these plots, we’d choose hot dogs with chocolate sauce because they each produce higher enjoyment. That’s not a good choice despite what the main effects show! When you have statistically significant interactions, you cannot interpret the main effect without considering the interaction effects.
Given the intentionally intuitive nature of our silly example, the consequence of disregarding the interaction effect is evident at a passing glance. However, that is not always the case, as you’ll see in the next example.
Example of an Interaction Effect with Continuous Independent Variables
For our next example, we’ll assess continuous independent variables in a regression model for a manufacturing process. The independent variables (processing time, temperature, and pressure) affect the dependent variable (product strength). Here’s the CSV data file if you want to try it yourself: Interactions_Continuous. To learn how to recreate the continuous interaction plot using Excel, download this Excel file: Continuous Interaction Excel.
In the interaction model, I’ll include temperature*pressure as an interaction effect. The results are below.
As you can see, the interaction effect test is statistically significant. But how do you interpret the interaction coefficient in the regression equation? You could try entering values into the regression equation and piece things together. However, it is much easier to use interaction plots!
In the graph above, the variables are continuous rather than categorical. To produce the plot, the statistical software chooses a high value and a low value for pressure and enters them into the equation along with the range of values for temperature.
As you can see, the relationship between temperature and strength changes direction based on the pressure. For high pressures, there is a positive relationship between temperature and strength while for low pressures it is a negative relationship. By including the interaction term in the model, you can capture relationships that change based on the value of another variable.
If you want to maximize product strength and someone asks you if the process should use a high or low temperature, you’d have to respond, “It depends.” In this case, it depends on the pressure. You cannot answer the question about temperature without knowing the pressure value.
Important Considerations for Interaction Effects
While the plots help you interpret the interaction effects, use a hypothesis test to determine whether the effect is statistically significant. Plots can display non-parallel lines that represent random sampling error rather than an actual effect. P-values and hypothesis tests help you sort out the real effects from the noise.
The examples in this post are two-way interactions because there are two independent variables in each term (Food*Condiment and Temperature*Pressure). It’s equally valid to interpret these effects in two ways. For example, the relationship between:
- Satisfaction and Condiment depends on Food.
- Satisfaction and Food depends on Condiment.
You can have higher-order interactions. For example, a three-way interaction has three variables in the term, such as Food*Condiment*X. In this case, the relationship between Satisfaction and Condiment depends on both Food and X. However, this type of effect is challenging to interpret. In practice, analysts use them infrequently. However, in some models, they might be necessary to provide an adequate fit.
Finally, when an interaction effect test is statistically significant, do not attempt to interpret the main effects without considering the interaction effects. As the examples show, you will draw the wrong the conclusions!
If you’re learning regression and like the approach I use in my blog, check out my Intuitive Guide to Regression Analysis book! You can find it on Amazon and other retailers.