Interaction effects occur when the effect of one variable depends on the value of another variable. Interaction effects are common in regression analysis, ANOVA, and designed experiments. In this blog post, I explain interaction effects, how to interpret them in statistical designs, and the problems you will face if you don’t include them in your model.

In any study, whether it’s a taste test or a manufacturing process, many variables can affect the outcome. Changing these variables can affect the outcome directly. For instance, changing the food condiment in a taste test can affect the overall enjoyment. In this manner, analysts use models to assess the relationship between each independent variable and the dependent variable. This kind of effect is called a main effect. However, it can be a mistake to assess only main effects.

In more complex study areas, the independent variables might interact with each other. Interaction effects indicate that a third variable influences the relationship between an independent and dependent variable. This type of effect makes the model more complex, but if the real world behaves this way, it is critical to incorporate it in your model. For example, the relationship between condiments and enjoyment probably depends on the type of food—as we’ll see in this post!

## Example of Interaction Effects with Categorical Independent Variables

I think of interaction effects as an “it depends” effect. You’ll see why! Let’s start with an intuitive example to help you understand these effects conceptually.

Imagine that we are conducting a taste test to determine which food condiment produces the highest enjoyment. We’ll perform a two-way ANOVA where our dependent variable is Enjoyment. Our two independent variables are both categorical variables: Food and Condiment.

Our ANOVA model with the interaction term is:

Enjoyment = Food + Condiment + Food*Condiment

To keep things simple, we’ll include only two foods (ice cream and hot dogs) and two condiments (chocolate sauce and mustard) in our analysis.
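To make the interaction term concrete, here is a minimal sketch of how a Food*Condiment column can be built, assuming simple 0/1 dummy coding. The coding scheme and the names `FOOD_CODES`, `CONDIMENT_CODES`, and `design_row` are my own illustrative assumptions; statistical software may code the factors differently behind the scenes.

```python
from itertools import product

# Hypothetical 0/1 dummy coding. Which level gets 0 or 1 is an arbitrary choice.
FOOD_CODES = {"ice cream": 0, "hot dog": 1}
CONDIMENT_CODES = {"chocolate sauce": 0, "mustard": 1}


def design_row(food, condiment):
    """Return [intercept, Food, Condiment, Food*Condiment] for one observation."""
    f = FOOD_CODES[food]
    c = CONDIMENT_CODES[condiment]
    # The interaction column is simply the product of the two coded columns.
    return [1, f, c, f * c]


# One design row for each of the four Food x Condiment combinations.
rows = {
    (food, condiment): design_row(food, condiment)
    for food, condiment in product(FOOD_CODES, CONDIMENT_CODES)
}

for combo, row in rows.items():
    print(combo, row)
```

Under this coding, the interaction column equals 1 only for hot dogs with mustard, so its coefficient captures how much that combination departs from what the two main effects alone would predict.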

Given the specifics of the example, an interaction effect would not be surprising. If someone asks you, “Do you prefer mustard or chocolate sauce on your food?”, you will undoubtedly respond, “It depends on the type of food!” That’s the “it depends” nature of an interaction effect. You cannot answer the question without knowing more information about the other variable in the interaction term, which is the type of food in our example!

That’s the concept. Now, I’ll show you how to include an interaction term in your model and how to interpret the results.

## How to Interpret Interaction Effects

Let’s perform our analysis. Most statistical software allows you to add interaction terms to a model. Download the CSV data file to try it yourself: Interactions_Categorical.

The p-values in the output below tell us that the interaction effect (Food*Condiment) is statistically significant. Consequently, we know that the satisfaction you derive from the condiment *depends* on the type of food.

But, how do we interpret the interaction effect and truly understand what the data are saying? The best way to understand these effects is with a special type of graph—an interaction plot. This type of plot displays the fitted values of the dependent variable on the y-axis while the x-axis shows the values of the first independent variable. Meanwhile, the various lines represent values of the second independent variable.

On an interaction plot, parallel lines indicate that there is no interaction effect while different slopes suggest that one might be present. Below is the plot for Food*Condiment.

The crossed lines on the graph suggest that there is an interaction effect, which the significant p-value for the Food*Condiment term confirms. The graph shows that enjoyment levels are higher for chocolate sauce when the food is ice cream. Conversely, satisfaction levels are higher for mustard when the food is a hot dog. If you put mustard on ice cream or chocolate sauce on hot dogs, you won’t be happy!
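The crossed lines can also be expressed numerically as a difference between differences. The sketch below uses hypothetical cell means that I made up for illustration (they are not the values from the data file) and compares the condiment effect within each food.

```python
# Hypothetical mean enjoyment for each Food x Condiment cell (made-up values).
cell_means = {
    ("ice cream", "chocolate sauce"): 93.0,
    ("ice cream", "mustard"): 61.0,
    ("hot dog", "chocolate sauce"): 65.0,
    ("hot dog", "mustard"): 90.0,
}


def condiment_effect(food):
    """Mean enjoyment with chocolate sauce minus mean enjoyment with mustard."""
    return cell_means[(food, "chocolate sauce")] - cell_means[(food, "mustard")]


effect_ice_cream = condiment_effect("ice cream")  # positive: chocolate sauce wins
effect_hot_dog = condiment_effect("hot dog")      # negative: mustard wins

# With no interaction, the two effects would be equal and this would be zero
# (parallel lines on the interaction plot).
interaction_contrast = effect_ice_cream - effect_hot_dog

print(effect_ice_cream, effect_hot_dog, interaction_contrast)
```

With these made-up numbers, the condiment effect changes sign between the two foods, which is what crossed lines look like in numeric form. Whether a contrast like this is larger than random error would produce is a question for the hypothesis test, not the plot.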

Which condiment is best? It depends on the type of food, and we’ve used statistics to demonstrate this effect.

## Overlooking Interaction Effects is Dangerous!

When you have statistically significant interaction effects, you can’t interpret the main effects without considering the interactions. In the previous example, you can’t answer the question about which condiment is better without knowing the type of food. Again, “it depends.”

Suppose we want to maximize satisfaction by choosing the best food and the best condiment. However, imagine that we forgot to include the interaction effect and assessed only the main effects. We’ll make our decision based on the main effects plots below.

Based on these plots, we’d choose hot dogs with chocolate sauce because they each produce higher enjoyment. That’s not a good choice despite what the main effects show! When you have statistically significant interactions, you cannot interpret the main effect without considering the interaction effects.
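Here is a small sketch of how main effects alone can point to a poor combination. The cell means are hypothetical values I chose so that the pattern matches the story above, and averaging cell means to get main effects assumes a balanced design with equal numbers of tasters per cell.

```python
from statistics import mean

# Hypothetical mean enjoyment for each Food x Condiment cell (made-up values).
cell_means = {
    ("ice cream", "chocolate sauce"): 93.0,
    ("ice cream", "mustard"): 61.0,
    ("hot dog", "chocolate sauce"): 65.0,
    ("hot dog", "mustard"): 90.0,
}
foods = ["ice cream", "hot dog"]
condiments = ["chocolate sauce", "mustard"]

# Main effects: average over the levels of the other factor.
food_main = {f: mean(cell_means[(f, c)] for c in condiments) for f in foods}
condiment_main = {c: mean(cell_means[(f, c)] for f in foods) for c in condiments}

# Choosing each factor separately, as the main effects plots would suggest.
main_effects_pick = (
    max(food_main, key=food_main.get),
    max(condiment_main, key=condiment_main.get),
)

# Choosing the best combination directly from the cell means.
best_cell = max(cell_means, key=cell_means.get)

print("Main effects pick:", main_effects_pick, cell_means[main_effects_pick])
print("Best combination:", best_cell, cell_means[best_cell])
```

In this sketch, picking the best level of each factor separately lands on hot dogs with chocolate sauce, even though that cell scores far below the best combination.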

Given the intentionally intuitive nature of our silly example, the consequence of disregarding the interaction effect is evident at a passing glance. However, that is not always the case, as you’ll see in the next example.

## Example of an Interaction Effect with Continuous Independent Variables

For our next example, we’ll assess continuous independent variables in a regression model for a manufacturing process. The independent variables (processing time, temperature, and pressure) affect the dependent variable (product strength). Here’s the CSV data file if you want to try it yourself: Interactions_Continuous.

In the regression model, I’ll include temperature*pressure as an interaction effect. The results are below.

As you can see, the interaction term is statistically significant. But, how do you interpret the interaction coefficient in the regression equation? You could try entering values into the regression equation and piece things together. However, it is much easier to use interaction plots!

**Related post**: How to Interpret Regression Coefficients and Their P-values for Main Effects

In the graph above, the variables are continuous rather than categorical. To produce the plot, the statistical software chooses a high value and a low value for pressure and enters them into the equation along with the range of values for temperature.

As you can see, the relationship between temperature and strength changes direction based on the pressure. For high pressures, there is a positive relationship between temperature and strength while for low pressures it is a negative relationship. By including the interaction term in the model, you can capture relationships that change based on the value of another variable.
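To see how an interaction coefficient produces slopes that change direction, consider this rough sketch. The coefficients and the low and high pressure values are hypothetical numbers I picked for illustration, not the fitted values from this model, and processing time is assumed to be held constant and absorbed into the intercept.

```python
# Hypothetical coefficients for illustration only (not the post's fitted model).
B0 = 20.0       # intercept, with processing time assumed held constant
B_TEMP = -0.5   # temperature coefficient
B_PRESS = -0.3  # pressure coefficient
B_INT = 0.01    # temperature*pressure interaction coefficient


def predicted_strength(temp, pressure):
    """Fitted strength from the hypothetical equation with an interaction term."""
    return B0 + B_TEMP * temp + B_PRESS * pressure + B_INT * temp * pressure


def temperature_slope(pressure):
    """Change in fitted strength per one-unit increase in temperature."""
    return B_TEMP + B_INT * pressure


LOW_PRESSURE, HIGH_PRESSURE = 30.0, 80.0  # assumed plotting values
slope_low = temperature_slope(LOW_PRESSURE)
slope_high = temperature_slope(HIGH_PRESSURE)

# Pressure at which the temperature slope is zero and the direction flips.
crossover_pressure = -B_TEMP / B_INT

# Fitted values for the two lines that an interaction plot would draw.
temps = [90, 100, 110]
for pressure in (LOW_PRESSURE, HIGH_PRESSURE):
    fitted = [round(predicted_strength(t, pressure), 2) for t in temps]
    print(f"pressure={pressure}: slope={temperature_slope(pressure):+.2f}, fitted={fitted}")
```

The key point is that the temperature slope is not a single number. It equals the temperature coefficient plus the interaction coefficient times the pressure, so it can be negative at low pressures and positive at high pressures.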

If you want to maximize product strength and someone asks you if the process should use a high or low temperature, you’d have to respond, “It depends.” In this case, it depends on the pressure. You cannot answer the question about temperature without knowing the pressure value.

## Important Considerations for Interaction Effects

While the plots help you interpret the interaction effects, use a hypothesis test to determine whether the effect is statistically significant. Plots can display non-parallel lines that represent random sample error rather than an actual effect. P-values and hypothesis tests help you sort out the real effects from the noise.
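For readers curious about what the hypothesis test computes, here is a minimal sketch of the interaction F-statistic for a balanced 2×2 design. The scores are hypothetical (three made-up tasters per cell), and the shortcut formula for the interaction sum of squares applies only to a balanced 2×2 layout. I leave the p-value to statistical software, which compares F to an F-distribution with 1 and `df_error` degrees of freedom.

```python
from statistics import mean

# Hypothetical enjoyment scores, three tasters per cell (made-up values).
scores = {
    ("ice cream", "chocolate sauce"): [90, 93, 96],
    ("ice cream", "mustard"): [58, 61, 64],
    ("hot dog", "chocolate sauce"): [62, 65, 68],
    ("hot dog", "mustard"): [87, 90, 93],
}
n = 3  # observations per cell; the design must be balanced for this shortcut

means = {cell: mean(values) for cell, values in scores.items()}

# Interaction contrast: the difference between differences of the cell means.
contrast = (
    means[("ice cream", "chocolate sauce")]
    - means[("ice cream", "mustard")]
    - means[("hot dog", "chocolate sauce")]
    + means[("hot dog", "mustard")]
)

# Balanced 2x2 design: SS(interaction) = n * contrast^2 / 4, with 1 degree of freedom.
ss_interaction = n * contrast ** 2 / 4

# Error sum of squares: variation of the scores around their own cell means.
ss_error = sum((y - means[cell]) ** 2 for cell, ys in scores.items() for y in ys)
df_error = 4 * (n - 1)

f_stat = (ss_interaction / 1) / (ss_error / df_error)
print(f"F(1, {df_error}) = {f_stat:.2f}")
```

A large F indicates that the non-parallel lines are bigger than the within-cell noise would produce by chance. A small F suggests the apparent pattern could simply be random sample error.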

The examples in this post are two-way interactions because there are two independent variables in each term (Food*Condiment and Temperature*Pressure). It’s equally valid to interpret these effects in two ways. For example, the relationship between:

- Satisfaction and Condiment depends on Food.
- Satisfaction and Food depends on Condiment.

You can have higher-order interactions. For example, a three-way interaction has three variables in the term, such as Food*Condiment*X. In this case, the relationship between Satisfaction and Condiment depends on both Food and X. This type of effect is challenging to interpret, and in practice analysts use them infrequently. However, in some models, they might be necessary to provide an adequate fit.

Finally, when you have interaction effects that are statistically significant, do not attempt to interpret the main effects without considering the interaction effects. As the examples show, you will draw the wrong conclusions!

If you’re learning regression, check out my Regression Tutorial!

Neha says

Thank you for amazing posts. the way you express concepts is matchless.

Jim Frost says

You’re very welcome! I’m glad they’re helpful!

Mona says

what does it mean when I have a significant interaction effect only when i omit the main effects of the independent variables (by choosing the interaction effect in “MODEL” in SPSS). it is “legal” to report the interaction effect without reporting the main effects?

Jim Frost says

Hi Mona,

That is a bit tricky.

If you had one model where the main effects are not significant, but the interaction effects are significant, that is perfectly fine.

However, it sounds like in your case you have to decide between the main effects or the interaction effects. Models where the statistical significance of terms changes based on the specific terms in the model are always difficult cases. This problem often occurs in (but is not limited to) cases where you have multicollinearity, so you might check on that.

This type of decision always comes down to subject area knowledge. Use your expertise, theory, other studies, etc. to determine what course of action is correct. It might be OK to do what you suggest. On the other hand, perhaps including the main effects is the correct route.

Jim

Apple says

What is the command for a continuous-by-continuous interaction plot in Stata?

Thanks

Jim Frost says

Hi, I’ve never used Stata myself, but I’ve seen people use “twoway contour” to plot two-way interaction effects in Stata. Might be a good place to start!

Sol says

Hi Jim, thank you very much for your post. My question is how do you interpret an insignificant interaction of a categorical and a continuous variable, when the main effects for both variables are significant? For the sake of simplicity, suppose our logit equation is as follows: Friendliness = α + β₁Age + β₂Dog + β₃Age*Dog, where Friendliness and Dog are coded as dummy variables that take the values of either 1 or 0 depending on the case. So if all but the interaction term, β₃Age*Dog, is significant, does that mean the probability of a dog being friendly is independent of its age?

Jim Frost says

If the Age variable is significant, then you know that friendliness is associated with age, and dog is as well if that variable is significant. A significant interaction effect indicates that the effect of one variable on the dependent variable depends on the value of another variable. In your example, let’s assume that the interaction effect was significant. This tells you that the relationship between age and friendliness changes based on the value of the dog variable. In that case, it’s not a fixed relationship or effect size. (It’s also valid to say that the relationship between dog and friendliness changes based on the value of age.)

Now, in your case, the interaction effect is not significant but the two main effects are significant. This tells you that there is a relationship between age and friendliness and a relationship between dog and friendliness. However, the exact nature of those relationships DOES NOT change based on the value of the other variable. Those two variables affect the probability of observing the event in the outcome variable, but one independent variable doesn’t affect the relationship between the other independent variable and the dependent variable.

The fact that you have one categorical variable and a continuous variable makes it easier to picture. Picture a different regression line for each level of the categorical variable. These fitted lines display the relationship between the continuous independent variable and the response for each level of dog. A significant interaction effect indicates that the differences between those slopes are statistically significant. An insignificant interaction effect indicates that there is insufficient evidence to conclude that the slopes are different. I actually show an example of this situation (though not with a logistic model) that should help.

I hope that makes it more clear!

Luka says

Hello,

I am interested how to read for interaction effect if we just have a table of observations, for example

A B C

2 4 7

4 7 8

6 9 13

In the lecture I attended this was explained as “differences between differences” but I didn’t get what this refers to.

Thanks

Jim Frost says

Hi Luka, it’s impossible for me to interpret those observations because I don’t know the relationships between the variables and there are far too few observations.

In general, you can think of an interaction effect as an “it depends” effect as I describe in this blog post. Suppose you have two independent variables X1 and X2 and the dependent variable Y. If the relationship between X1 and Y changes based on the value of X2, that’s an interaction effect. The size of the X1 effect depends on the value of X2. Read through the post to see how this works in action. The value of the interaction term for each observation is the product of X1 and X2 (X1*X2).

An effect is the difference in the mean value of Y for different values of X. So, if the interaction effect is significant, you know that the differences of Y based on X will vary based on some other variable. I think that’s what your instructor meant by the differences between differences. I tend to think of it more as the relationship between X1 and Y depends on the value of X2. If you plot a fitted line for X1 and Y, you can think of it as the slope of the line changes based on X2. There’s a link in this blog post to another blog post that shows how that works.

I hope this helps!

Syahmi says

Your explanation is really great! Thank you so much. I totally will recommend you to my friends

Jim Frost says

You’re very welcome! Thank you for recommending me to your friends!

Luka says

Thanks for help, I appreciate it!

Yeasin says

Great work Jim! People get very vague idea whenever they look at google to learn the basic about interaction in statistics. Your writing is a must see and excellent work that demonstrated the basic of interaction. Thanks heaps.

Jim Frost says

Hi Yeasin, thank you! That means a lot to me!

Tanikan says

Hi Jim,

Thank for the valuable tutorial.

I have 2 questions as follows:

1. In more complex study areas, the independent variables might interact with each other. What do you mean by complex area? Is it social science?

2. I have run Mancova and observed that results of two-way = interaction. I found that SPSS does not run post-hoc. Can I use the t-test after that?

My model is factorial design (2 levels of X1, 2 levels of X2, and 2 levels of X3) on Y.

I report in paper for two-way and three way interaction on below. Is it ok?

Two-way interaction

Among the X2 level 1 group, the mean of Y among subjects who viewed X3 level 2 (adjusted M = xxx, SE =xxx) is significantly higher than those who viewed X3 level 1 (adjusted M = xxx, SE = xxx) with t(xx) = xx, p < xx.

three-way interaction

Among the subjects who viewed the X3 level 2, the mean of Y of the subjects who expressed X1 level 2 (adjusted M = xxx, SE = xxx) is significantly greater than those who expressed X1 level 1 (adjusted M = xxx, SE = xxx) for those who had X2 level 1 [t(xx) = xxx, p < xxx].

Thank you in advance

Jim Frost says

Hi Tanikan,

Thanks for the great questions!

Regarding more and less complex study areas, in the context of this post, I’m simply referring to subject matter where only main effects are statistically significant as being simpler. And, subject areas where interaction effects are significant as more complex. I’m calling them more complex because the relationship between X and Y is not constant. Instead, that relationship depends on at least one other variable. It’s just not as simple.

I would not use t-tests for that purpose. I’m surprised if SPSS can’t perform post-hoc tests when there are interaction effects–but I use other statistical software more frequently. With your factorial design, there will be multiple groups based on the interactions of your factors. As you compare more groups, the need for controlling the family/joint/simultaneous error rate becomes even more important. Without controlling for that joint error rate, the probability that at least one of the many comparisons will be a false positive increases. T-tests don’t control that joint error rate. It’s important to use a post hoc test.

At least for the two-way interaction effects, I highly recommend using an interaction plot (as shown in this post) to accompany your findings. I find that those graphs are particularly helpful in clarifying the results. Of course, that graph doesn’t tell you which specific differences between groups are statistically significant. The post hoc tests for those groups will identify the significant differences.

I hope this helps!

Alicia says

Hi, Jim!

I have a sort of somehow interaction-related question, but I didn’t know where to post it, so this entry seemed the most adequate to me.

I work with R and I would like to use an ANCOVA to evaluate the effect of a factor (age, for example, with two levels, adult and subadult) in the regression of body length (log transformed, logLCC) and weight (log transformed, logweight). This regression measures body condition of an individual (higher weights at same lenghts indicate a better condition, that is, sort of “fluffyness”).

So, when I run the analysis:

aov(logweight~logLCC*age)

I obtain a significant interaction between logLCC:age (p=0.0068). I understand this means that the slopes for each age class are not parallel. However, the factor age alone is not significant (p=0.2059).

What does this mean? How is it interpreted?

I have tried deleting the interaction from the model, but it loses a lot of explicative power (p=0.0068). So, what should I do? I am quite lost with this issue…

Thank you so much in advance,

Alicia

Jim Frost says

Hi Alicia!

First, before I get into the interaction effect, a comment about the model in general. I don’t know if you’re analyzing human weight or not. But, I’ve modeled Percent Body Fat and BMI. While I was doing that, I had to decide whether to use Height, Weight, and Height*Weight as the independent variables and interaction effect or should I use body mass index (BMI). I found that both models fit equally as well but I went ahead with BMI because I could graph it. I did have to include a polynomial term because the relationship was curvilinear. I notice that you’re using a log transformation. That might well be just fine and necessary. But, I found that I didn’t need to go that route. Just some food for thought. You can read about this BMI and %body fat model.

Ok, so on to your interaction effect. It’s not problematic at all that the main effect for age is not significant. In fact, when you have a significant interaction you shouldn’t try to interpret the main effect alone anyway. Now, if it had been significant and you wanted to determine the entire effect of age, you would’ve had to assess both the main effect and the interaction effect together. Now, you just need to assess the interaction effect alone. But, it’s always easiest to interpret interaction effects with graphs, as I do in this blog post.

In the post, I show examples of interaction plots with two factors and another with two continuous variables. However, you can certainly create an interaction plot for a factor * continuous variable. For your model, this type of graph will display two lines–one for each level of the age factor. Because you already know the interaction term is significant, the difference between the two slopes is statistically significant. (If the main effect had been significant, the interaction plot would have included it in the calculations as well–but it is fine that it’s not significant.)

It sounds like you should leave the interaction effect in the model. Some analysts will also include the main effects in the model when they are included in a significant interaction effect even if the main effect is not significant by itself (e.g., age). I could go either way on that issue myself. Just be sure that the interaction makes theoretical/common sense for your study area. But, I don’t see any reason for concern. The insignificant main effect is not a problem.

I hope this helps!

Alicia says

Hi Jim,

first of all… thank you very much for your early response!

And after that… I am so sorry! I forgot to explain that I work with lizards, not with humans. My measurement of body length (logLCC) corresponds to the log-transformed Snout-Vent Length (logSVL, whose acronym in spanish, given that it’s my mothertongue, is LCC; I forgot to translate it!). The relationship among these two variables tend to be linear.

So, in these animals, the regression of logSVL and logweight is a common and standardized method to assess body condition. Residuals from this regression are used to assess body condition; if they’re positive the animal is more “chubby” (better condition) and, if they’re negative, the animal is more “skinny” (worse condition). The aim of my ANCOVA is to compare the effect of age on this regression.

Anyway, following your advice I created an interaction plot which displays two lines, one for each level of the age factor. The two lines cross in a certain middle point, diverging prior and after that point. Thanks to your detailed answer, I understand that this means that age interacts somehow with body length (what sounds logical, as lizards grow together with aging), but I still don’t know how to interpret this in relation to body condition (regression).

Thanks again for your detailed, kind and early response!

Jim Frost says

You’re very welcome! And, subject area knowledge and standards definitely should guide your model building. I always enjoy learning how others use these types of analysis. And, that’s interesting actually using the residuals to assess a specimen’s health!

If you can, and are willing, post the interaction plot, I can take a stab at interpreting it. (I know I can post images in these comments but I’m not sure about other users.) Basically, the relationship between body length and weight depends on the age factor. Or, stated another way, you can’t use body length to predict weight until you know the age.

Alicia says

Hi, Jim!

Thank you again for your willingness! Unfortunately, I can’t /don’t know how to post the plot in the comments… If you are willing, you can contact me by email so I can send it to you, plus the results of the regression or whatever information that could be helpful.

Thank you!

Shruti says

Hi Jim,

Thanks for your explanation! It was really useful. I have a couple of follow-up questions. Let’s suppose a situation with 2 regression models, both of which have the exact same variables, except the second model has an additional interaction term between two variables already in the first model.

1. Now comparing the 2 regression equations, why do coefficients of other variables (apart from the interaction term and the 2 variables used to create the interaction term) change?

2. How do we compare and interpret the change in coefficients of variables which were used to create the interaction term in the first and second models?

Let me know in case it’s better for me to explain with an example here.

Thanks!

Jim Frost says

Hi Shruti,

I think I understand your questions.

1) Any time you add new terms in the model, the coefficients can change. Some of this occurs because the new term accounts for some of the variance that was previously accounted for by the other terms, which causes their coefficients to change. So, some change is normal. The changes can tend to be larger and more erratic when the model terms are correlated. The interaction term is clearly correlated with the variables that are included in the interaction. When you include an interaction term, you can help avoid this by standardizing your continuous variables.

2) I have heard about cases where analysts try to interpret the changes in coefficients when you add new terms. My take on this is that the changes are not very informative. Let’s assume that your interaction term is a valuable addition to the model. In that case, you can conclude the model without the interaction term is not as good of a model and its coefficient estimates might well be biased. Consequently, I wouldn’t attribute much meaning to the change in coefficient values other than your new model with the interaction term is likely to be better.

However, one caveat, I believe there are fields that do place value in understanding those changes. I’m not sure that I agree, but if your field is one that has this practice, you should probably check with an expert.

I hope I covered everything!

Susanne says

Hello Jim!

Thanks for making such very clear posts. I tutor students with stats and its really tough to find good easy to follow material that EVERYONE can get. So to stumble on such a clear explanation is a breath of fresh air 😀

Now I recently saw in one of my students powerpoints that they are taught they have to redo the ANOVA analysis without the interaction if the interaction is not significant. Maybe i’ve always missed something but I have never heard of this before. Does this sound familiar to you and if so can you explain to me why this is?

thanks!

Susanne

Jim Frost says

Hi Susanne, thanks so much for your kind words. They mean a lot to me–especially coming from a stats tutor!

I have always heard that you should not include the interaction term when it is not significant. The reason is that when you include insignificant terms in your model, it can reduce the precision of the estimates. Generally, you want to leave as many degrees of freedom for the error as you can.

Courtney Barrs says

Hi Jim,

Thankyou for this post, I found it incredibly helpful.

I am having trouble interpreting my own results of a two-way repeated ANOVA and was wondering if you could help me out.

Participants were exposed to two different videos, controlled with a counter balance. Video 1 consisted of a comedy sketch, while video 2 was of a nature documentary. Every 2 mins the participants had to indicate on a likert scale how Bored they felt at the time. For the analysis I averaged the boredom score over the first and second half of the video.

IV1: Video (Comedy vs Nature)

IV2: Time (Time 1 vs Time 2)

DV: Boredom score

My analysis output reveals a significant main effect of video p<.000, and non significant effect for time p=.192. However I have an effect of interaction for video*time, p<.000.

How would you go about interpreting these results?

Thanks in advance!

Jim Frost says

Hi Courtney,

I’m happy to hear that you found this post helpful!

The first thing that I’d recommend is graphing your results using an interactions plot like I do in this post. That’s the easiest way to understand interactions. It’s great that you’ve done the ANOVA test because you already know that whatever pattern you see in the plot is, in fact, statistically significant. Given the significance, I can conclude the lines on your plot won’t be parallel.

For your results, you can state them one of two ways. Both ways are equally valid from a statistical standpoint. However, one way might make more sense than the other given your study area or what you’re trying to emphasize.

1) The relationship between Video and Boredom depends on Time. Or:

2) The relationship between Time and Boredom depends on Video.

For the sake of illustration, let’s go with #2. You might be able to draw the conclusion along the lines of: As subjects progress from time 1 to time 2, the average boredom score increases more slowly for those who watch comedy compared to those who watch a nature documentary. Of course, you’d have to adapt the wording to match your actual results. That’s the type of conclusion that you can draw, and you’re able to say that it is statistically significant given the p-value for the interaction term.

Given that the interaction term is significant, you don’t need to interpret the main effects terms at all. And, it’s no problem that one of the main effects is not significant.

I hope this helps!

Courtney says

Hi Jim,

Thankyou so much for your quick and helpful response, it really means a lot!

This is what initially confused me when it came to interpreting my results, looking at my interaction graph there was no cross over. Both conditions are more or less parallel with one another, the gradient between time 1 and time 2 for comedy is almost 0. However, there is quite the drop for the nature video in the boredom rating at time 2.

Because the interaction graph does not cross over, does this mean that only in the Nature video does the boredom decrease significantly at Time 2? Will I need to conduct a t-test to check this?

Many thanks!

Courtney

Courtney Barrs says

Hi Jim,

Thankyou for such a quick and helpful response!

Graphing the interaction effect is actually what confused me when it came to interpretting my results. The conditions are actually parallel to one another, there is no cross over. The gradient for the comedy condition is almost zero, whereas, there is a dramatic drop in rating of boredom between time 1 and time 2 for the nature video.

With this in mind does the interpretation then mean: A difference in boredom is found across time depending on condition. Therefore, only if you are watching the nature video will you become significantly more bored at time 2. Will I need to conduct a t-test to confirm this?

Many thanks!

Courtney

Jim Frost says

Hi Courtney,

You bet! 🙂

Technically, a significant interaction effect means that the difference in slopes is statistically significant. The lines don’t actually have to cross on your graph–just have different slopes. Well, having different slopes means that the lines must cross at some point theoretically even if that point isn’t displayed on your graph.

As for the interpretation, the zero slope for comedy indicates that as time passes, there is no tendency to become more or less bored. However, for nature videos, as time passes, there is a tendency to become more bored. (I’m assuming that the drop in rating that you mention corresponds to “becomes more bored”.) This difference in tendencies is statistically significant. The significant interaction indicates that the relationship between the passage of time and boredom depends on the type of video the subjects watch.

Again, an interaction effect is an “it depends” effect. Do the subjects become more bored over time? It depends on what type of video they watch! You can’t answer that question without knowing which video they watch.

So, the interaction tells you that the difference in slopes is statistically significant, which is different from whether the differences between group means are statistically significant. To identify the specific differences between group means that are statistically significant, you’ll need to perform a post hoc test, such as Tukey’s test. These tests control the joint error rate because as you increase the number of group comparisons, the chance of a Type I error (false positive) increases if you don’t control it. I don’t have a blog post on this topic yet but plan to write one.

The interaction term tells you that the relationship changes while the post hoc test tells you whether the difference between specific group means is statistically significant.

Saheeda says

This is one of the best explanations I have read to explain ‘interactions’. Thanks!

Jim Frost says

Thanks so much, Saheeda! Your kind words mean a lot to me! I’m glad it was helpful.

Bill says

Hello, Jim. Thanks for your great article.

Sorry in advance for my English. Moreover, my understanding of SPSS and statistics is quite limited, so some questions might be silly.

I’m doing a 4×5 factorial ANOVA. One of the tests has a significant interaction effect, but I don’t know exactly which method I should use to interpret it. Some told me that I need to do a simple main effects test, and some told me that a post hoc test is enough, so I’m quite confused.

In another test, the graph shows some crossed lines (because there are a lot of levels of the IV), but the sig. value is 0.069, so the interaction effect is not significant, right? However, I’ve read that if the lines cross, an interaction exists. So how should I summarize?

I’m willing to send the information to you if you need it.

Thank you.

Bill

Jim Frost says

Hi Bill,

You have some excellent questions!

When you have a significant interaction effect, you know you can’t interpret the main effects without considering the interaction effects. As I show in the post, interaction effects are an “it depends” effect. The interpretation for one factor depends on the value of another factor. If you don’t assess the interaction effect, you might end up putting ketchup on your ice cream!

Assessing the post hoc test results can be fine by itself as long as you include the interaction term in the post hoc test. Taking that approach, you’ll see the groupings based on the interaction term and know which groups are significantly different from each other. I also like to graph the interaction plots (as I do in this post) because it provides a great graphical overview of the interaction effect.

There’s an important point about graphs. They can be very valuable in helping you understand your results. However, they are not a statistical test for your data. An interaction plot can show non-parallel lines even when the interaction effect is not significant. When you work with sample data, there’s always the chance that sample error can produce patterns in your sample that don’t exist in the population. Statistical tests help you distinguish between real effects and sample error. These tests indicate when you have sufficient evidence to conclude that an effect exists in the population.

When you have crossed lines in an interaction plot but the test results are not statistically significant, it tells you that you don’t have enough evidence to conclude that the interaction effect actually exists in the population. Basically, the graph says that the effect exists in the sample data but the statistical test says that you don’t have enough evidence to conclude that it also exists in the population. If you were to collect another random sample from the same population, it would not be surprising if that pattern went away!
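A small simulation can make this concrete. The sketch below is only an illustration under stated assumptions: the population means, the standard deviation of 5, and the 20 observations per cell are all made up, and the population is built with no interaction at all. Each simulated sample still shows some difference in slopes purely from sampling error.

```python
import random
from statistics import mean


def sample_interaction_contrast(rng, n_per_cell=20, sigma=5.0):
    """Draw one 2x2 sample from a population with NO interaction and
    return the sample difference of differences between cell means."""
    # Hypothetical additive population: factor A adds 4, factor B adds 2.
    population_means = {(0, 0): 50.0, (1, 0): 54.0, (0, 1): 52.0, (1, 1): 56.0}
    cell_means = {
        cell: mean(rng.gauss(mu, sigma) for _ in range(n_per_cell))
        for cell, mu in population_means.items()
    }
    # This contrast is exactly zero in the population, so any nonzero
    # value here is sampling error.
    return (cell_means[1, 1] - cell_means[0, 1]) - (cell_means[1, 0] - cell_means[0, 0])


rng = random.Random(1)  # fixed seed so the run is repeatable
contrasts = [sample_interaction_contrast(rng) for _ in range(2000)]

share_nonparallel = sum(abs(c) > 1.0 for c in contrasts) / len(contrasts)
print(f"average contrast across samples: {mean(contrasts):.2f}")
print(f"share of samples whose slopes differ by more than 1 unit: {share_nonparallel:.2f}")
```

Across many samples the contrast averages out near zero, yet a large share of individual samples show visibly non-parallel lines. That is the pattern a significance test is designed to guard against.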

I hope this helps!

Bill says

Thanks for your help. I really appreciate it.

I might need your help again after I finish the post hoc test.

Hope you’re okay with that. Haha.

Again, THANK YOU.

Sincere,

Bill

Hakim says

Thanks Jim, your explanation is very easy to follow. By the way, I have a model, i.e., growth = average years of schooling + political stability + average years of schooling*political stability. The Stata output gives positive individual coefficients while the interaction coefficient is negative. Unfortunately, I have been asked by the reviewer to explain why the interaction sign is negative. Is there any statistical or theoretical explanation, please?

Jim Frost says

Hi Hakim, it’s difficult to interpret the coefficients for interaction terms directly. However, I can tell you that there is nothing at all odd about having a negative sign for an interaction term. Interaction terms modify the main effects. Sometimes it adds to them while other times it subtracts. It all depends on the nature of what you’re studying.

I’d suggest creating interaction plots, like I do in this post, because they’re much easier to understand than the interaction coefficients. Look through the plots to see whether they make sense given your understanding of the subject-area. These plots are a graphical representation of the interaction terms. Therefore, if the plots make sense, your model is good to go. If they don’t, then you need to figure out what happened. I think the reviewers will find the plots easier to understand than the coefficient.
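As a rough illustration of how a negative interaction coefficient modifies a positive main effect, here is a minimal Python sketch. The coefficient values are invented for the example and are not taken from any actual output; `schooling_effect` is a hypothetical helper name.

```python
def schooling_effect(stability, b_schooling, b_interaction):
    """Effect of one more year of schooling on growth at a given level of
    political stability: b_schooling + b_interaction * stability."""
    return b_schooling + b_interaction * stability


# Hypothetical coefficients: positive main effect, negative interaction.
B_SCHOOLING = 0.80
B_INTERACTION = -0.10

for stability in (0, 2, 4, 6):
    effect = schooling_effect(stability, B_SCHOOLING, B_INTERACTION)
    print(f"stability={stability}: effect of schooling on growth = {effect:+.2f}")
```

In this made-up case the effect of schooling stays positive but shrinks as stability rises. A negative interaction sign by itself only says that the two variables dampen each other's effect, not that either one is harmful.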

I hope this helps!

Hakim says

Thanks Jim for your quick response and comprehensive explanation..

Ting-Chun Chen says

Hi Jim,

May I ask what reference on interaction effects you would suggest studying?

I want to know more about interaction effects in clinical trials.

Thank you.

Sincere,

Ting-Chun

Jim Frost says

Hi Ting-Chun, most any textbook about regression analysis, ANOVA, or linear models in general will explain interaction effects. My preferred source is Applied Linear Regression Models. That’s a huge textbook of 1400 pages, but that’s why I like it! I don’t have a reference specifically for interaction effects, but would recommend something that discusses linear models in all of its aspects.

I hope this helps!

Jim

Ting-Chun Chen says

Thanks for your help and your quick response. I really appreciate it.

Again, THANK YOU.

Sincere,

Ting-Chun

demmie says

how does interaction affect my study statistically

Jim Frost says

Hi Demmie, this is the correct post for finding that answer. Read through it and you’ll find the answer you’re looking for. If you have a more specific question, please don’t hesitate to ask!

Anoop says

Hello jim,

What if I want to know: 1. How do ice cream and hot dogs affect enjoyment by themselves?

2. How do ice cream and hot dogs affect enjoyment when condiments are included?

In this case, aren’t both the main effect and the interaction equally important for a researcher?

Jim Frost says

Hi Anoop,

Great questions! You can see how ice cream and hot dog affect enjoyment by themselves by looking at the main effects plot. This plot shows that the enjoyment level each food produces is approximately equal.

Yes, understanding main effects like these is important. However, when there are significant interaction effects, you know that the main effects don’t tell the full story. In this case, the main effect for, say hot dog, doesn’t describe the full effect on enjoyment. The interaction term includes hot dog, so you know that some of hot dog’s effect is also in the interaction. If you ignore that, you risk misinterpreting the results. As I point out in the blog, if you go only by main effects, you’ll choose a hot dog . . . with chocolate sauce. You’d pick the chocolate sauce because its main effect is larger than mustard’s main effect.

To see how ice cream and hot dogs affect enjoyment when you include the interaction effect, just look at the interaction plot. The four points on that plot show the mean enjoyment for all four possible combinations of hot dog/ice cream with chocolate sauce/mustard. It displays the total effects of main effects plus interaction effects. For example, the interaction plot shows that for hot dogs with mustard, the mean enjoyment is about 90 units (the top-left red dot in the graph). Alternatively, you could enter values into the equation to obtain the same values.

I’d agree that understanding both main effects and interaction effects is important. My point is that when you have significant interaction effects, don’t look at only the main effects because that can lead you astray!

Satu says

Hi Jim!

Thank you very much for your blog site. You explain things well and understandably, thank you for that!!

I would still like to make sure that I understand correctly what you said before. I am running a repeated measures ANOVA and I am struggling with interpreting interactions. So, is it the case that if the interaction effect is not significant, then you should not interpret the multivariate comparisons between groups? I have a model with 5 groups and I am trying to see if there are any differences between them in the change of the X variable across two time points. The multivariate tests show that the change would be different in one of the groups (the plot also shows that), but the overall interaction effect is not significant. So what would be the right way to interpret the results? Just say that there was no significant interaction, i.e., the change was similar in all groups, or say that one group was different but the interaction effect was not significant (for some reason)?

Thank you already for your answer!

Satu

Michela says

Hi Jim,

This blog post is so useful, thank you very much! However, I still fail to interpret one of my statistics outputs. I carried out a two-way mixed ANOVA and inputted these data:

– between-subject variable: two therapy techniques (MD and RT)

– within-subject variable: Time with 3 levels (pre, mid and post)

– dependent variable: well-being scores.

I ran the analysis and found that for the between-subject variable there was no significant difference between the well-being scores for the MD and RT therapies. However, when looking at my within-subject variable, the table stated that there was a significant main effect of Time on well-being scores but no significant Time*Therapy interaction on well-being scores.

Am I right in inferring that the significant main effect of time basically states that, over time, well-being scores improved independent of the therapy technique? Can I then conclude that RT and MD positively improved well-being in general and that neither one is better than the other? Or is that wrong? One of my hypotheses states that MD and RT will have a positive effect on well-being scores.

Thank you so much for taking time to read this and helping me !!

Michela

Jim Frost says

Hi Michela,

Your interpretation sounds correct. The data you have suggests that as time passes, well being increases. You don’t have evidence that either treatment works better than the other. Often you’d include a control group (no treatment). It’s possible that there is only a time effect and no treatment effect. A control group would help you establish whether it was the passage of time and/or the treatments.

In other words, right now it could be that both treatments are equally effective. But, it’s also possible that neither treatment is effective and it’s only the passage of time–as the saying goes, time heals all wounds!

Michela says

Hi Jim,

Thanks for your reply. Yes, that was one of the problems pointed out in my dissertation: it did not have a control group to compare against :/ Alongside time constraints, the sample size was already so small that it was difficult to get enough people to make 3 separate groups :/ So am I wrong to accept the hypothesis that both RT and MD have a positive effect on well-being levels? Or do I have to reject that because I did not have a control group?

Kind Regards,

Michela

Jim Frost says

Hi Michela,

Unfortunately, it is hard to draw any conclusions about the treatments. It’s possible that both had the same positive effect on well being. However, it’s also possible that neither had an effect and instead it was entirely the passage of time. I definitely understand how it is hard to work with a small sample size!

If other researchers have studied the same treatments, you can evaluate their results. That might lend support towards your notion. But, that’s a tenuous connection without a control group.

Best wishes to you!

Jim

Anoop says

Hi Jim,

I have a significant interaction (0.004) between supplement use and physical activity. The nonusers had a hazard ratio of 0.61 (0.46-0.80) (lower risk), whereas users had an HR of 1.40 (0.85-2.3) (higher risk). My question is: although it looks like a qualitative interaction (opposite in direction), since the users’ CI crosses the margin of no difference, how do you interpret it? Can we say users had a higher hazard when combined with PA?

Thank you

Jim Frost says

Hi Anoop,

I can’t interpret the main effect of supplement use without understanding the interaction effect. Can you share the hazard ratios for your interaction? In other words, the ratios for the following groups: user/high activity, user/low activity, non-user/high activity, and non-user/low activity.

I don’t know how you recorded activity, so those groups are just an example. Then we can see how to interpret it!

Thanks!

Jim

Anoop says

Hey Jim,

Not sure why your posting doesn’t show. But it shows in my email.

This is a trial looking at whether physical activity vs. control can reduce physical disability. We are looking at users vs. nonusers of a certain supplement in the trial. The interaction was significant (p=.003).

| | PA | C | HR (CI) |
| --- | --- | --- | --- |
| Users | 7.1 | 6.1 | 1.40 (.85 – 2.3) |
| Nonusers | 5.4 | 10.2 | 0.61 (.46 – 0.80) |

How do you interpret this result?

Thank you so much. Also you should start a youtube page. We need more people like you in this world 🙂

Jim Frost says

Hi again Anoop,

I checked and I see my comment showing up under yours. I think it might be a browser caching thing that is causing you not to see my reply on the blog post. Refresh might do the trick.

At any rate, this example will also show the importance of several other concerns in statistics–namely understanding the subject area, the variables involved, and statistical vs. practical significance. So, with that said, let’s take a look at your results!

I’m not sure what the dependent variable is, but I’ll assume that higher values represent a greater degree of disability. If that’s not the case, you got really strange results! In the interaction table you provided, I see three group means that are roughly equal and one that stands out. I’m not sure if the differences between any of those three group means (5.4, 6.1, and 7.1) are statistically significant. You can perform a post hoc analysis in ANOVA to check this (I plan to write a blog post about that at some point). Even if they are significant, you have to ask yourself if those differences are practically significant given your knowledge of the subject area and the dependent variable. I don’t know the answer to that.

And then there is the one group mean (10.2) that is noticeably different from the other three. To me, it looks like subjects in the control group who don’t use the supplement have particularly bad results. And, the other three groups might represent a better outcome. Again, use your subject-area knowledge to determine how meaningful this result is in a practical sense.

If that’s the case, it suggests to me that subjects have better outcomes as long as they use the supplement and/or engage in physical activity. In other words, the worst case is to not do either the activity or use the supplement. If you do one or both of physical activity and supplement usage, you seem to be better off in an approximately equal manner. And, again, I don’t know if the differences between the other three outcomes are statistically significant and practically significant. In other words, those differences could just represent random sample error and/or not be meaningful in a practical sense.

I hope this clarifies things! And, yes, I do plan to start a YouTube channel at some point. I need to finish a book that I’m working on first though!

Take care,

Jim

Anoop says

Thank you for the long post Jim!

I used a Cox regression model and the results are hazard ratios. The trial is physical activity vs. control. And we are doing a subgroup analysis with the supplement.

The above table shows that for users the HR is 1.40 (CI .85 to 2.3) and not significant.

For nonusers, the HR is 0.61 (CI 0.46-.80) and significant.

And the interaction between these two is significant. My question is: isn’t this an example of a qualitative interaction, where the direction is opposite for users vs. non-users? For example, if you draw the forest plot, the lines are on two sides of the no-difference line.

Jim Frost says

Hi Anoop,

The interesting thing about statistics is that the analyses are used in a wide array of fields. Often, these fields develop their own terminology for things. In this case, I wasn’t familiar with the term qualitative interaction, but it seems to be used in medical research. I’ve read that a qualitative interaction occurs when one treatment is better for some subsets of patients while another treatment is better for another subset of patients. It sounds like a qualitative interaction occurs when there is a change in direction of treatment effects. A non-crossover interaction applies to situations where there is a change in magnitude among subsets but not of direction.

So, I learned something new about how different fields apply different terminology to statistical concepts!

I’m not sure why you’d have only two hazard ratios when you know that the interaction effect is significant? Right there you know that you can’t interpret the main effect for supplement usage without knowing the physical activity level. It seems like you’d need 4 hazard ratios.

As for whether this classifies as a qualitative interaction given the definition above, you’ll first have to determine whether those differences between the three groups I identified before are both statistically significant and practically significant. If the answers to both questions are yes, then it would seem to be a qualitative interaction. However, if either answer is no, then I don’t think it would. And, I’m going by your dependent variable. If you want to answer that using hazard ratios, you’d need four of them as I indicate above. You can’t answer that question with only two ratios.

I hope this helps!

Mei says

Hi Sir. Thank you for this wonderful post, as it is very helpful. But I still can’t seem to understand or interpret my interaction plot. My main effects are significant and my interaction effect is also significant, but looking at the regression coefficients (results from SPSS), the moderator (IV2) is a negative significant predictor of the DV, while in my interaction plot both appear to be positive significant predictors. I’m not sure if you get it, because I am also having difficulty explaining the situation; I am just a beginner when it comes to psychological statistics. Thank you in advance, Sir!

Jim Frost says

Hi Mei, I don’t understand your scenario completely. However, there is nothing wrong with having positive coefficients for main effects and negative coefficients for interaction effects. When you have significant interaction effects, then the total effect is the main effect plus interaction effect. In some cases, the interaction effect adds to the main effect but sometimes it subtracts from it. It’s ok either way. I find that assessing the interaction plots is the easiest way to interpret the results when you have significant interaction effects.

Habtamu Tolera says

I have 20 binary or categorical IVs and one binary DV. My question is: should I check collinearity first and then run the bivariate analysis, or the other way around? Help me please.

Mei says

Thank you for the reply, Sir. I will do my best to interpret the interaction plot. 🙂

Erick Turner says

Jim, like many others here, I love your intuitive explanation.

I thought it would be a good exercise to replicate what you did in your example. (I’m using Stata, and I understand you don’t use that, but the results should still be the same.) Unfortunately, I’m having trouble replicating your results and I don’t know why. Using values of 0 and 1 for each of the IVs, I’m getting significant results for both of them and for the interaction variable, while you got NS results for one of the IVs.

I’ll paste the output below. (Sorry, the formatting got lost.)

. regress enjoyment food_01 condiment_01 food_cond

| Source | SS | df | MS |
| --- | --- | --- | --- |
| Model | 15974.9475 | 3 | 5324.98248 |
| Residual | 1905.09733 | 76 | 25.0670701 |
| Total | 17880.0448 | 79 | 226.329681 |

Number of obs = 80, F(3, 76) = 212.43, Prob > F = 0.0000, R-squared = 0.8935, Adj R-squared = 0.8892, Root MSE = 5.0067

| enjoyment | Coef. | Std. Err. | t | P>\|t\| | [95% Conf. Interval] |
| --- | --- | --- | --- | --- | --- |
| food_01 | -28.29677 | 1.583258 | -17.87 | 0.000 | -31.45011, -25.14344 |
| condiment_01 | -24.28908 | 1.583258 | -15.34 | 0.000 | -27.44241, -21.13574 |
| food_cond | 56.02826 | 2.239065 | 25.02 | 0.000 | 51.56877, 60.48774 |
| _cons | 89.60569 | 1.119533 | 80.04 | 0.000 | 87.37594, 91.83543 |

Any clue as to what I’m doing wrong?

Jim Frost says

Hi Erick, offhand I don’t know what could have happened. As you say, the results should be the same. I’ll take a closer look and see if I can figure anything out.

Michela says

Dear Jim,

I found your blog while trying to find an answer to a reviewer comment to a paper I submitted.

So now I am looking for answers.

One of my hypotheses was on a moderated mediation model.

Considering the moderation I have (measured as continuous variables):

X=job demands

M (moderator)= team identification

Y= workplace bullying

The fact is that when I looked at the results, the effect of X on Y is positive and the effect of M on Y is negative, but my problem is that the interaction term (X*M) is positive, while I (and especially the reviewer) was expecting a negative effect.

The graph makes sense to me (and partly to the reviewer), but he/she expects me to give some explanation for this positive interaction effect.

I hope you can help me understand why so that I can explain it to the reviewer!

Jim Frost says

Hi Michela,

I seem to have been encountering this question frequently as of late! The answer is that the coefficient for an interaction term really doesn’t mean much by itself. After all, the interaction term is a product of multiple variables in the model and the coefficient. Depending on the combination of variable values and the coefficient, a positive coefficient can actually represent a negative effect (i.e., if the product of the variable values is negative). Additionally, the overall combined effect of the main effect and interaction effect can be negative. It might be that the interaction effect just makes it a little less negative than it would’ve been otherwise. The interaction term is basically an adjustment to the main effects.

Also, realize that there is a bit of arbitrariness in the coefficient sign and value for the interaction effect when you use categorical variables. Linear models need to create indicator variables (0s and 1s) to represent the levels of the categorical variable. Then, the model leaves out the indicator variable for one level to avoid perfect multicollinearity. Suppose you have group A and group B. If the model includes the indicator variable for group A, then 1s represent group A and 0 represents not group A. Or, it could include the indicator variable for group B, then 1s represent group B and 0 represents not group B. If you have only two groups A and B, then the 1s and 0s are entirely flipped depending on which indicator variable the model includes. You can include either indicator variable and the overall results would be the same. However, the coefficient value will change including conceivably the sign! You can try changing which categorical level the software leaves out of the model, which doesn’t change the overall interpretation/significance of the results but can make the interpretation more intuitive.
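To illustrate how the choice of the left-out level can flip the sign of an interaction coefficient, here is a minimal Python sketch for a 2x2 design. The cell means and level names are hypothetical, and the helper simply solves for the 0/1-coded coefficients that reproduce those four means exactly.

```python
def dummy_coded_coefficients(cell_means, ref_a, ref_b):
    """Coefficients of a 2x2 model with interaction under 0/1 coding, given
    which level of each factor is left out as the reference."""
    other_a = next(a for a, _ in cell_means if a != ref_a)
    other_b = next(b for _, b in cell_means if b != ref_b)
    base = cell_means[ref_a, ref_b]               # intercept: both indicators are 0
    coef_a = cell_means[other_a, ref_b] - base    # effect of A at the reference level of B
    coef_b = cell_means[ref_a, other_b] - base    # effect of B at the reference level of A
    interaction = cell_means[other_a, other_b] - base - coef_a - coef_b
    return {"intercept": base, "A": coef_a, "B": coef_b, "A*B": interaction}


# Hypothetical cell means for factor A (a1, a2) and factor B (b1, b2).
means = {("a1", "b1"): 10.0, ("a2", "b1"): 14.0,
         ("a1", "b2"): 12.0, ("a2", "b2"): 22.0}

print(dummy_coded_coefficients(means, ref_a="a1", ref_b="b1"))
print(dummy_coded_coefficients(means, ref_a="a2", ref_b="b1"))
```

Both sets of coefficients describe the same four cell means, but the interaction coefficient is +6 with a1 as the reference level and -6 with a2 as the reference level.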

Finally, it’s really hard to gain much meaning from an interaction coefficient itself for all the reasons above. However, you can see the effect of this term in the interaction plot. As long as the interaction plot makes sense theoretically, I wouldn’t worry much about the specific sign or value of the coefficient. I’d only be worried if the interaction plots didn’t make sense.

I hope this helps!

Erick Turner says

Mystery solved! It wasn’t an issue of the difference in software but rather in the type of model. I had asked Stata to run a regression model and got output that didn’t match up. However, when I asked Stata to run ANOVA (including the interaction term), I got output that matched yours. For other Stata users, the syntax to use is “anova enjoyment food condiment food#condiment”.

Jim Frost says

Hi Erick, thanks so much for the update! I had rerun the analysis to be sure that I hadn’t made a mistake, but it produced the same results as in the blog post. I guess this goes to show how crucial it is to know exactly what your statistical software is doing! I still wonder what produced the difference between the regression and the ANOVA model, because they both use the same math underneath the hood. In other words, what is different between Stata’s regression and ANOVA models?

Erick Turner says

However, I’m still puzzled as to why I got such different output when I transformed the data to 0/1 dummy variables, created an interaction variable, and then ran regression.

Erick Turner says

I see our replies crossed in cyberspace and that we are similarly puzzled. I’m assuming you ran an ANOVA routine and that it gives you regression output automatically. Just out of curiosity, what if you were to convert your variables to 0/1 and ask your software to just run regression?

Jim Frost says

I used regression analysis in Minitab and it automatically creates the indicator variables behind the scenes. So, I just told it to fit the model. Depending on which level of each categorical variable the software leaves out, you’d expect different numeric results (although, they’d tell the same story). You wouldn’t expect differences in what is and is not significant though. I wonder if Stata possibly uses sequential SS for one of its analyses? Minitab by default uses adjusted SS. Using Seq. SS could change which variables are significant. I was going to test that but haven’t tried yet.
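One possible contributor to the mismatch, offered here as an assumption rather than something confirmed in this thread, is the coding itself. With 0/1 dummy variables plus a product term, the coefficients on food_01 and condiment_01 describe each variable’s effect when the other variable equals 0, not its effect averaged over the other variable’s levels. The sketch below plugs the coefficients from Erick’s regression output into the equation and averages over the other factor with equal weights, which assumes a balanced design.

```python
# Coefficients copied from Erick's Stata regression output earlier in the thread.
CONS = 89.60569
B_FOOD = -28.29677
B_CONDIMENT = -24.28908
B_INTERACTION = 56.02826


def predicted_enjoyment(food, condiment):
    """Fitted cell mean for 0/1 coded food and condiment."""
    return (CONS + B_FOOD * food + B_CONDIMENT * condiment
            + B_INTERACTION * food * condiment)


def averaged_effect(factor):
    """Effect of switching one factor from 0 to 1, averaged with equal
    weights over both levels of the other factor (assumes balanced cells)."""
    if factor == "food":
        diffs = [predicted_enjoyment(1, c) - predicted_enjoyment(0, c) for c in (0, 1)]
    else:
        diffs = [predicted_enjoyment(f, 1) - predicted_enjoyment(f, 0) for f in (0, 1)]
    return sum(diffs) / len(diffs)


for food in (0, 1):
    for condiment in (0, 1):
        print(food, condiment, round(predicted_enjoyment(food, condiment), 2))

print("food, averaged over condiments:", round(averaged_effect("food"), 2))
print("condiment, averaged over foods:", round(averaged_effect("condiment"), 2))
```

Under these assumptions, the averaged effect of food is close to zero (about -0.3) even though the food_01 coefficient is about -28.3. That would be consistent with a non-significant main effect for Food in an ANOVA table alongside a significant food_01 coefficient in the dummy-coded regression, because the two numbers answer different questions.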

Dan Mark says

First of all, thank you for the clear explanation. It is hard sometimes to find someone who can explain it in plain English!

Secondly, I still face an issue with what to put on the axes in my research. I saw in your explanation that you put the dependent variable, the interaction term and one independent variable on the axes. My question is why you did not put both the independent variables that are in the interaction term, and the interaction term, on the axes.

Already many thanks!

Jim Frost says

Hi Dan,

Thanks so much. I work really hard to find the simplest way to explain these concepts while staying accurate!

Graphing relationships for multiple regression can get tricky. The problem is that the typical two-dimensional graph has only two-axes. So, you have to figure out the best way to arrange these two axes to produce a meaningful graph. This isn’t a problem for simple regression where you have one dependent variable and one independent variable. You can graph those two variables easily on fitted line plots. You have as many variables as you have axes.

Once you get to multiple regression, you will have more variables (one DV, at least 2 IVs, and possibly other terms such as interaction terms) than axes. You definitely want to include the dependent variable on an axis (typically the y-axis) because that is the variable you are creating the model for. Then, you can include one IV on the X-axis. At this point, you’ve used up your available axes! The solution is to use separate lines to represent another variable (as shown in the legend). That’s how you get the two IVs into the graph that you need for a two-way interaction. Then you just assess the patterns of those lines.

Instead, if I had put an IV on both X and Y-axes, the graph would not display the value of the DV. The whole point of regression/ANOVA is to explore the relationships with the DV. Consequently, the DV has to be on the graph.

I hope this helps clarify the graphs! The interaction plots I show in this post are the standard form for two-way interactions.

Marlie Greeff says

Dear Jim

Your blog is amazing! Makes everything more understandable for someone with no stats background! Thank you!

Jim Frost says

Hi Marlie, thanks so much for your nice comment. It means a lot to me because that’s my goal for the blog! I’m glad it’s been helpful for you.

Joe R says

Hi Jim,

Thanks for this blog post, really appreciate your efforts to break things down in a simple, intuitive and visual way.

I am a bit confused by the continuous variable example (regarding interactions), specifically your interpretation.

I used your linear model, plotting the coefficients in Excel and manually calculating the Strength for several points of ‘test’ data.

In the article you write – “For high pressures, there is a positive relationship between temperature and strength while for low pressures it is a negative relationship”.

This is what your interaction plot also shows, but plugging actual values (see below) into the equation – using your coefficients outlined above – suggests that this is not true.

Test Data

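| Temperature | Pressure | Time | Temperature*Pressure | Predicted Strength Values | Difference |
| --- | --- | --- | --- | --- | --- |
| 95 | 81 | 32 | 7695 | 3,891 | |
| 115 | 81 | 32 | 9315 | 4,258 | 367 |
| 95 | 63 | 32 | 5985 | 3,477 | |
| 115 | 63 | 32 | 7245 | 3,800 | 323 |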

As you can see, for 2 ‘sets’ of data above, each with a low (63) and high (81) pressure setting, Predicted Strength increases as Temperature Increases.

Am I missing something?

Joe

Jim Frost says

Hi Joe,

I can’t quite tell from your comment how you set up your data. So, I’m unable to figure out how things are not working correctly. However, I can assure you that when you plug the values in the equation, the fitted values behave according to the interpretation (i.e., that the relationship changes direction for low and high values of pressure).

To illustrate how this works, I put together an Excel spreadsheet. In the spreadsheet, there are two tables–one for low pressure and the other for high pressure. Both tables contain the same values for Temperature and Time. However, each table uses a different value for Pressure. The low pressure table uses 63.68 while the high pressure table uses 81.1. I then take these values and plug them into the equation in the Strength column to calculate the fitted values for strength.

As you can see from the numbers in the tables and associated graphs, there is a negative relationship between Strength and Temperature when you use a low pressure but a positive relationship when you use a high pressure.

You can find all of this in my spreadsheet with the calculations for the continuous interaction. The two graphs below are also in the spreadsheet.

I hope this helps clarify how this works!

Michela says

Thanks for this, very helpful!

I hope the reviewer will be satisfied as well 🙂