## What are Interaction Effects?

An interaction effect occurs when the effect of one variable depends on the value of another variable. Interaction effects are common in regression models, ANOVA, and designed experiments. In this post, I explain interaction effects and the interaction effect test, show how to interpret interaction models, and describe the problems you can face if you don’t include them in your model.

In any study, whether it’s a taste test or a manufacturing process, many variables can affect the outcome. Changing these variables can affect the outcome directly. For instance, changing the food condiment in a taste test can affect the overall enjoyment. Analysts use models to assess this kind of relationship between each independent variable and the dependent variable. This kind of effect is called a main effect. While main effects are relatively straightforward, it can be a mistake to assess only main effects.

In more complex study areas, the independent variables might interact with each other. Interaction effects indicate that a third variable influences the relationship between an independent and dependent variable. In this situation, statisticians say that these variables interact because the relationship between an independent and dependent variable changes depending on the value of a third variable. This type of effect makes the model more complex, but if the real world behaves this way, it is critical to incorporate it in your model. For example, the relationship between condiments and enjoyment probably depends on the type of food—as we’ll see in this post!

## Example of Interaction Effects with Categorical Independent Variables

I think of interaction effects as an “it depends” effect. You’ll see why! Let’s start with an intuitive example to help you understand these effects in an interaction model conceptually.

Imagine that we are conducting a taste test to determine which food condiment produces the highest enjoyment. We’ll perform a two-way ANOVA where our dependent variable is Enjoyment (labeled Satisfaction in the model below). Our two independent variables are both categorical variables: Food and Condiment.

Our ANOVA model with the interaction term is:

Satisfaction = Food Condiment Food*Condiment

To keep things simple, we’ll include only two foods (ice cream and hot dogs) and two condiments (chocolate sauce and mustard) in our analysis.
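If it helps to see what the Food*Condiment term looks like as numbers, here is a minimal Python sketch. The scores are hypothetical (they are not the post's data file), and the dummy coding with ice cream and chocolate sauce as reference levels is an assumption made for illustration. The interaction column is simply the product of the two dummy variables.

```python
import numpy as np

# Hypothetical enjoyment scores, two tasters per cell (not the post's CSV data).
rows = [
    # (food, condiment, enjoyment)
    ("ice cream", "chocolate sauce", 92), ("ice cream", "chocolate sauce", 94),
    ("ice cream", "mustard", 60), ("ice cream", "mustard", 62),
    ("hot dog", "chocolate sauce", 64), ("hot dog", "chocolate sauce", 66),
    ("hot dog", "mustard", 89), ("hot dog", "mustard", 91),
]

# Dummy coding (assumed): reference levels are ice cream and chocolate sauce.
hot_dog = np.array([1.0 if food == "hot dog" else 0.0 for food, _, _ in rows])
mustard = np.array([1.0 if cond == "mustard" else 0.0 for _, cond, _ in rows])
y = np.array([float(score) for _, _, score in rows])

# Columns: intercept, Food, Condiment, Food*Condiment (product of the dummies).
X = np.column_stack([np.ones_like(y), hot_dog, mustard, hot_dog * mustard])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b_food, b_condiment, b_interaction = coef

# The effect of switching to mustard depends on the food.
mustard_effect_on_ice_cream = b_condiment                # about -32
mustard_effect_on_hot_dog = b_condiment + b_interaction  # about +25
print(np.round(coef, 2))
```

With these made-up numbers, mustard lowers enjoyment on ice cream but raises it on hot dogs, and the interaction coefficient is what carries that reversal.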

Given the specifics of the example, an interaction effect would not be surprising. If someone asks you, “Do you prefer mustard or chocolate sauce on your food?” you will undoubtedly respond, “It depends on the type of food!” That’s the “it depends” nature of an interaction effect. You cannot answer the question without knowing more information about the other variable in the interaction term, which is the type of food in our example!

That’s the concept. Now, I’ll show you how to include an interaction term in your model and how to interpret the results.

## How to Interpret Interaction Effects

Let’s perform our analysis. Statistical software packages generally let you add interaction terms to a model. Download the CSV data file to try it yourself: Interactions_Categorical.

Use the p-value for an interaction term to test its significance. In the output below, the circled p-value tells us that the interaction effect test (Food*Condiment) is statistically significant. Consequently, we know that the satisfaction you derive from the condiment *depends* on the type of food.

But how do we interpret the interaction in a model and truly understand what the data are saying? The best way to understand these effects is with a special type of line chart—an interaction plot. This type of plot displays the fitted values of the dependent variable on the y-axis while the x-axis shows the values of the first independent variable. Meanwhile, the various lines represent values of the second independent variable.

On an interaction plot, parallel lines indicate that there is no interaction effect while different slopes suggest that one might be present. Below is the plot for Food*Condiment.

The crossed lines on the graph suggest that there is an interaction effect, which the significant p-value for the Food*Condiment term confirms. The graph shows that enjoyment levels are higher for chocolate sauce when the food is ice cream. Conversely, satisfaction levels are higher for mustard when the food is a hot dog. If you put mustard on ice cream or chocolate sauce on hot dogs, you won’t be happy!
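To make the “parallel versus crossed lines” idea concrete, here is a small Python sketch that reads an interaction plot numerically. The fitted means are hypothetical, not values from the output in this post. On a plot with Food on the x-axis and one line per Condiment, the lines are parallel only when the gap between them is the same at every food.

```python
# Hypothetical fitted cell means (not taken from the post's output).
fitted = {
    ("ice cream", "chocolate sauce"): 93.0,
    ("ice cream", "mustard"): 61.0,
    ("hot dog", "chocolate sauce"): 65.0,
    ("hot dog", "mustard"): 90.0,
}

def gap_between_lines(food):
    """Vertical gap between the chocolate sauce line and the mustard line at one food."""
    return fitted[(food, "chocolate sauce")] - fitted[(food, "mustard")]

gap_ice_cream = gap_between_lines("ice cream")  # 32.0
gap_hot_dog = gap_between_lines("hot dog")      # -25.0

# Parallel lines would make this 0; a nonzero value is the interaction contrast.
interaction_contrast = gap_ice_cream - gap_hot_dog  # 57.0

# The lines cross when the gap changes sign between the two foods.
lines_cross = gap_ice_cream * gap_hot_dog < 0

print(interaction_contrast, lines_cross)
```

A nonzero contrast only describes the sample. Whether it reflects a real effect is still a question for the hypothesis test.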

Which condiment is best? It depends on the type of food, and we’ve used statistics to demonstrate this effect.

## Overlooking Interaction Effects is Dangerous!

When you have statistically significant interaction effects, you can’t interpret the main effects without considering the interactions. In the previous example, you can’t answer the question about which condiment is better without knowing the type of food. Again, “it depends.”

Suppose we want to maximize satisfaction by choosing the best food and the best condiment. However, imagine that we forgot to include the interaction effect and assessed only the main effects. We’ll make our decision based on the main effects plots below.

Based on these plots, we’d choose hot dogs with chocolate sauce because they each produce higher enjoyment. That’s not a good choice despite what the main effects show! When you have statistically significant interactions, you cannot interpret the main effect without considering the interaction effects.
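The trap is easy to reproduce with a few lines of Python. The cell means below are hypothetical and were picked only to mimic the pattern described here. The main effects are just averages over the other factor, so choosing the best level of each factor separately can land on a poor combination.

```python
from statistics import mean

# Hypothetical cell means (not the post's data).
cell_means = {
    ("ice cream", "chocolate sauce"): 93.0,
    ("ice cream", "mustard"): 61.0,
    ("hot dog", "chocolate sauce"): 65.0,
    ("hot dog", "mustard"): 90.0,
}
foods = ["ice cream", "hot dog"]
condiments = ["chocolate sauce", "mustard"]

# Main effects: average each factor level over the levels of the other factor.
food_main = {f: mean(cell_means[(f, c)] for c in condiments) for f in foods}
condiment_main = {c: mean(cell_means[(f, c)] for f in foods) for c in condiments}

# Decision based on main effects only.
main_effects_pick = (
    max(food_main, key=food_main.get),
    max(condiment_main, key=condiment_main.get),
)

# Decision based on the full table of combinations.
best_combination = max(cell_means, key=cell_means.get)

print("Main effects pick:", main_effects_pick, cell_means[main_effects_pick])
print("Best combination:", best_combination, cell_means[best_combination])
```

In this made-up table, the main effects point to hot dogs with chocolate sauce, which scores far below the best combination.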

Given the intentionally intuitive nature of our silly example, the consequence of disregarding the interaction effect is evident at a passing glance. However, that is not always the case, as you’ll see in the next example.

## Example of an Interaction Effect with Continuous Independent Variables

For our next example, we’ll assess continuous independent variables in a regression model for a manufacturing process. The independent variables (processing time, temperature, and pressure) affect the dependent variable (product strength). Here’s the CSV data file if you want to try it yourself: Interactions_Continuous. To learn how to recreate the continuous interaction plot using Excel, download this Excel file: Continuous Interaction Excel.

In the interaction model, I’ll include temperature*pressure as an interaction effect. The results are below.

As you can see, the interaction effect test is statistically significant. But how do you interpret the interaction coefficient in the regression equation? You could try entering values into the regression equation and piece things together. However, it is much easier to use interaction plots!

**Related post**: How to Interpret Regression Coefficients and Their P-values for Main Effects

In the graph above, the variables are continuous rather than categorical. To produce the plot, the statistical software chooses a high value and a low value for pressure and enters them into the equation along with the range of values for temperature.

As you can see, the relationship between temperature and strength changes direction based on the pressure. For high pressures, there is a positive relationship between temperature and strength while for low pressures it is a negative relationship. By including the interaction term in the model, you can capture relationships that change based on the value of another variable.
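Here is a rough Python sketch of how such a plot can be built from a regression equation with a temperature*pressure term. The coefficients, the two pressure settings, and the temperature range are all hypothetical. They are not the estimates from this analysis and were chosen only so that the slope changes sign. The key point is that the slope for temperature is not a single number: it equals the temperature coefficient plus the interaction coefficient times pressure.

```python
# Hypothetical coefficients for:
#   strength = b0 + b_temp*temp + b_press*pressure + b_tp*temp*pressure
# These are NOT the post's estimates.
b0, b_temp, b_press, b_tp = 10.0, -0.9, -1.2, 0.012

def predicted_strength(temp, pressure):
    """Fitted value from the assumed regression equation."""
    return b0 + b_temp * temp + b_press * pressure + b_tp * temp * pressure

def temperature_slope(pressure):
    """Change in strength per one-unit increase in temperature at a given pressure."""
    return b_temp + b_tp * pressure

# Assumed low and high pressure settings and temperature range for the plot.
low_pressure, high_pressure = 60.0, 90.0
temperatures = [90.0, 100.0, 110.0, 120.0]

# One line per pressure setting, evaluated across the temperature range.
lines = {
    pressure: [predicted_strength(t, pressure) for t in temperatures]
    for pressure in (low_pressure, high_pressure)
}

for pressure, values in lines.items():
    print(pressure, round(temperature_slope(pressure), 3), [round(v, 1) for v in values])
```

With these assumed values, the temperature slope is negative at the low pressure and positive at the high pressure, which is the “it depends” pattern described above.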

If you want to maximize product strength and someone asks you if the process should use a high or low temperature, you’d have to respond, “It depends.” In this case, it depends on the pressure. You cannot answer the question about temperature without knowing the pressure value.

## Important Considerations for Interaction Effects

While the plots help you interpret the interaction effects, use a hypothesis test to determine whether the effect is statistically significant. Plots can display non-parallel lines that represent random sampling error rather than an actual effect. P-values and hypothesis tests help you sort out the real effects from the noise.

The examples in this post are two-way interactions because there are two independent variables in each term (Food*Condiment and Temperature*Pressure). It’s equally valid to interpret these effects in two ways. For example, the relationship between:

- Satisfaction and Condiment depends on Food.
- Satisfaction and Food depends on Condiment.

You can have higher-order interactions. For example, a three-way interaction has three variables in the term, such as Food*Condiment*X. In this case, the relationship between Satisfaction and Condiment depends on both Food and X. However, this type of effect is challenging to interpret. In practice, analysts use them infrequently. However, in some models, they might be necessary to provide an adequate fit.

Finally, when an interaction effect test is statistically significant, do not attempt to interpret the main effects without considering the interaction effects. As the examples show, you will draw the wrong conclusions!

If you’re learning regression and like the approach I use in my blog, check out my Intuitive Guide to Regression Analysis book! You can find it on Amazon and other retailers.

Omar N says

Hi Jim,

Thanks a lot for your patience and for getting back to me. I was referring to Section 6.2.2 of the posted paper. I hope you can look at it. Otherwise, your previous responses were helpful.

Jim Frost says

Hi Omar,

I could only see the abstract using the link you provided. But I found the full article elsewhere.

In general, yes, their interpretation of main effects is correct. More difficult tasks tended to take longer than simpler tasks while controlling for the other variables in the model. Additionally, there’s a significant difference in mean time to perform tasks on the different interfaces, again while controlling for other variables in the model. That provides some useful information by itself.

However, given the interactions, it’s not quite that simple. Although, I note that these interactions don’t flip things on their head as much as my hotdog/sundae example. But you do see some complexities. For starters, I do disagree with one point of their interpretation of the interaction effects.

They write, “Looking closer, we see that while the static-simple interface outperforms the others for easy and medium tasks, there is no significant difference between the interfaces for hard tasks. This indicates that there may be performance differences among the interfaces that are overwhelmed by the additional time it takes to perform hard tasks, but that remain detectable for easy and medium tasks.”

That wouldn’t be the case because task difficulty is included in the model. So, it’s controlling for difficulty (and for the other variables) and determining there is no significant difference. Interestingly the error bars are narrower for the hard task than the easy and medium tasks. I would’ve expected them to be wider (given higher variability due to longer times and harder tasks) and that might’ve accounted for the lack of significance. But that doesn’t seem to be the case. It most definitely DOES NOT indicate performance differences between the interfaces as they state.

The results are interesting because the static simple interface is clearly the best. It produces the fastest times for most conditions. There are a few conditions where other interfaces are as good, but not better. Those exceptions are the significant interaction effects we’re discussing. They’re the “it depends” factor of this study. Which interface is best? It depends on Movement and Task Difficulty.

The simple static interface is best for easy and medium tasks and when the user is standing.

For hard tasks, all interfaces are equal.

For walking, the simple static and adaptive interfaces are equal and both are better than the static complex interface.

And that’s why these interactions are not as convoluted as they might’ve been. Given these results, you might as well use the simple-static interface for all cases because it’s always the best or tied with the best. The interactions don’t point to cases where a different interface is better.

And that’s interesting because the adaptive interface was apparently developed to change based on movement. It’s only tied for the best in specific conditions and not the best in all other conditions. I guess it’s back to the drawing board for the interface designers!

Omar N says

Thanks a lot for the elaboration.

Do you mind looking at the following concrete example that can be found on the following paper in Section 6.2.2 :

https://dl.acm.org/doi/pdf/10.1145/1409240.1409253?casa_token=6TkTBbCHEycAAAAA:a-rNNdBvVvJ64uxC_V-DDH3wZoGOlxFJwaRbPOIPkRNQwcoHozg1K9b30Y5GzWhWCYFmLG-9Qii_WA

Where Main effects were interpreted although there are interaction effects. Is their interpretation correct?

Jim Frost says

Hi Omar,

This is the same study you posted before and I commented on in my previous comment. I don’t really see much discussion about main effects in the Abstract.

Omar N says

A more clear concrete example can be found on the following paper in Section 6.2.2 : https://dl.acm.org/doi/pdf/10.1145/1409240.1409253?casa_token=6TkTBbCHEycAAAAA:a-rNNdBvVvJ64uxC_V-DDH3wZoGOlxFJwaRbPOIPkRNQwcoHozg1K9b30Y5GzWhWCYFmLG-9Qii_WA

Where Main effects were interpreted although there are interaction effects

Omar N says

Thanks for the great explanation. I just have one question: although the main effects could be misleading when there are interaction effects, can we still use them to infer something? For example, if the input technique (touch vs audio) has a significant main effect, can we infer that these two techniques are significantly different?

Jim Frost says

Hi Omar,

They do provide some information. How useful that information is depends on why you’re constructing the model. There are two general purposes for statistical models–understanding the nature of the relationships between the variables and predicting outcomes.

If you primarily want to understand the relationships, then understanding the main effects when there are significant interaction effects can be informative to a degree. If the main effects and interaction effects are all significant, you know that an independent variable has an effect independent of all other variables. And, there is also an additive or subtractive effect in combination with another variable. So, you understand more about the subject area. How it all works together.

However, when it comes to predicting or modeling the outcome, they’re not so helpful. Refer back to the hot dog and sundae example in the post to see why. For understanding outcomes, you really must consider the interaction effects.

Let’s look at that article you found. I’ve only read the abstract. I’ll assume that their model has two IVs: Speed and Icon Size. And the outcome is something like User satisfaction with the interface. I’ll also assume that both main effects and the interaction effect are all significant.

In terms of pure understanding, sure, it’s nice knowing that speed and size both contribute to user satisfaction totally independently of each other. However, suppose you want to optimize for user satisfaction. Someone asks you, what size should the icons be to optimize it? You can’t answer that question without knowing how fast the user is moving.

I hope that helps!

Tamori says

Many thanks for the detailed reply. I was hoping that there’d be a way to do the simple main effects analysis relatively straightforwardly in Python (with Pingouin, Statsmodels or other libraries) but that doesn’t seem to be the case. Oh well, I guess that’s how SPSS can justify its price.

Jim Frost says

Hi Tamori,

It might well be possible using Python. I just don’t know enough about Python to say!

Best of luck!

Tamori says

Thanks for the reply.

I was asking about how to analyse *simple* main effects (which your article doesn’t address), not general main effects. This article (https://ezspss.com/simple-main-effects-tests-for-two-way-anova-with-significant-interaction-in-spss/) explains how to do that analysis in SPSS, but I was asking about the proper method more generally (I actually want to do this in Python with Pingouin or Statsmodels).

Taking my example again, with my two-way ANOVA I established that there’s an interaction effect between my two IVs, and now I want to do a detailed analysis of IV1 for each level of IV2. What I’m currently doing is splitting the experiment into two separate experiments, one for each level of IV2. In experiment 1, I take the data corresponding to the first value of IV2, and do a one-way ANOVA on IV1. If the results are significant, I proceed to do pairwise comparisons. I then move on to experiment 2, where I set IV2 to the second value, and repeat the process.

What I want to know is whether this is the right way to do it.

Jim Frost says

Hi Tamori,

Apologies, I completely misread your question. The terminology used in statistics changes dramatically by subject area, region, when someone learned, etc. I don’t use the term “simple main effects” or “simple effects” and so I just went straight to main effects. While the term simple main effects is used, I’m not sure it makes sense because it’s entirely based on an interaction effect. Personally, I think of it as an interaction effect where you’re comparing pairwise differences between the various factor level combinations. This perspective helps keep the focus on how the interaction influences the relationship between the factors and the dependent variable.

At any rate, on to your question! Even though I misread your question, some of my previous response still stands.

While your approach of splitting the data and performing separate one-way ANOVAs for each level of IV2 is a common method, it does not fully account for the other factor and the interaction effect. As I stated in my previous reply, one-way ANOVAs do not account for the variance that the 2nd IV and interaction term explain in the full model. Hence, one-way ANOVA can bias your results in a context where you *know* the 2nd IV and interaction are significant. A more rigorous approach would involve using the results of your two-way ANOVA to analyze the simple main effects while still considering both factors and their interaction.

I can’t help you with how to do that in Python or the others, but the approach is to fit the model with both IVs and the interaction term, and then use that model with a post hoc multiple comparison method to compare the combinations of factor level means. With all the group comparisons, you need a post hoc test to control the family-wise error rate. That approach is how the SPSS example does it: a two-way model with an interaction term plus a post hoc comparison method, which is least significant difference (LSD).

I hope that at least points you in the right methodological direction!
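For readers who want to try this direction in Python, here is a minimal sketch using statsmodels. It rests on several assumptions: the data are hypothetical, the column names `IV1`, `IV2`, and `DV` are placeholders, and Tukey's HSD is swapped in for the LSD method used in the SPSS example. Each factor-level combination is treated as one group, so the comparisons use the error variance pooled across all cells rather than the variance from a split-off subset of the data.

```python
from itertools import combinations

import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical data: IV1 has three levels, IV2 has two, four observations per cell.
cell_means = {
    ("a", "x"): 10.0, ("b", "x"): 10.0, ("c", "x"): 20.0,
    ("a", "y"): 10.0, ("b", "y"): 20.0, ("c", "y"): 10.0,
}
offsets = [-1.0, -0.5, 0.5, 1.0]  # made-up within-cell scatter
records = [
    {"IV1": iv1, "IV2": iv2, "DV": cell_mean + offset}
    for (iv1, iv2), cell_mean in cell_means.items()
    for offset in offsets
]
df = pd.DataFrame(records)

# One group label per factor-level combination, e.g. "a:x".
df["cell"] = df["IV1"] + ":" + df["IV2"]

# Tukey HSD across all cell pairs, using the pooled within-cell variance.
result = pairwise_tukeyhsd(endog=df["DV"], groups=df["cell"], alpha=0.05)
print(result.summary())

# Keep only comparisons of IV1 levels within the same IV2 level.
# Assumption: results are ordered like combinations() over the sorted group labels.
pairs = combinations(result.groupsunique, 2)
simple_effects = {
    (str(g1), str(g2)): bool(rejected)
    for (g1, g2), rejected in zip(pairs, result.reject)
    if str(g1).split(":")[1] == str(g2).split(":")[1]
}
print(simple_effects)
```

Note that this controls the family-wise error rate across every pair of cells, which is stricter than needed if only the within-level comparisons are of interest. Check the printed summary table against the filtered dictionary before relying on the ordering assumption.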

Tamori says

I’d like to know how to do the simple main effect analysis on each level of one of the independent variables if the two-way ANOVA shows an interaction effect.

Let’s say I have two independent variables IV1 and IV2 and a corresponding dependent variable DV. IV1 has 5 levels and IV2 has 2 levels.

The two-way ANOVA on IV1xIV2 reveals an interaction effect and I’m interested in analysing the simple main effects of IV1 on each level of IV2. How does that work in practice? Do I do one-way ANOVAs on IV1 for each subset of DV corresponding to each level of IV2 and then post-hoc tests for each ANOVA that is significant?

Jim Frost says

Hi Tamori,

It’s possible to assess the main effect in the normal fashion even when you have a significant interaction effect. I talk about that in this post a bit. However, I don’t recommend it. When you have a significant interaction you must focus on that interaction otherwise you risk misinterpreting the results. Reread my article and focus on the section where I talk about main effects in the context of the condiment example. You’ll see why you shouldn’t focus on the main effects alone!

But to answer your question, you don’t need to perform the analysis any differently. Fit your model as you normally do. Then you can assess the main effects, but again focus on the interaction effects. You don’t want to fit separate one-way models because those models don’t control for the other known significant variables, which can create additional biases.

Kari says

Hi Jim, great examples and very helpful, especially to my students! This is maybe an obvious question, but I find myself overthinking it and not finding a clear answer elsewhere online, so I wanted to ask – I’m using some interaction terms in SEM/path analysis (both variables are continuous, so it makes more sense than a multi-group analysis, and all variables are observed, b/c I know AMOS won’t do interaction terms of latent variables). I’ve created the interaction terms using centering, but conceptually, it still seems like it would make sense to correlate the error of the centered interaction term with the errors of each of its component variables? But is this correct? So in other words, if I am testing an interaction between depression and age and their joint influence on marital satisfaction (not my actual variables, making these up), and I then create an interaction term by multiplying centered versions of depression and age and specifying a path from that interaction term to marital satisfaction, as well as from uncentered depression to MS and uncentered age to MS, should I also correlate the error of the interaction term with the errors of depression and age? Or is that inappropriate once I’ve centered the variables to create the interaction term? Thanks!

Dave says

Thank you for your blog. Should I remove a non-significant interaction term between row and column factors while doing a two-way ANOVA analysis and only keep the row and column main effects?

Jim Frost says

Hi Dave,

Yes, that’s generally the standard approach. An exception would be if you have strong theoretical reasons or previous research that suggests that the interaction effect exists despite the non-significant results.

Rajni Soni says

Hi Jim,

I liked your blog about interactions. I am working with the interaction of two dummy variables and want to interpret its results. Can you please guide me how to do it?

Jim Frost says

Hi Rajni

In this post, pay particular attention to the mustard and chocolate sauce example. Those are two dummy variables and I work through the interpretation of their interaction effect.

BE Hartzema says

I think part of the answer also lies in the standard error. I was also scratching my head over one of the main effects becoming insignificant (while the interaction was also insignificant), but then I saw the standard error had almost doubled.

Jim Frost says

That might be due to multicollinearity. Fitting an interaction term increases that. Try centering the variables to see if that helps. Read my post about multicollinearity for more details about this method!
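A quick way to see why centering helps is to compare how strongly the interaction column correlates with its component variables before and after centering. The Python sketch below uses hypothetical temperature and pressure values (assumptions for illustration only), both measured far from zero, which is the situation where the raw product term tends to track its components closely.

```python
import numpy as np

# Hypothetical predictor values on scales far from zero.
temperature = np.array([95.0, 100.0, 105.0, 110.0, 115.0, 120.0])
pressure = np.array([62.0, 78.0, 66.0, 80.0, 60.0, 74.0])

def correlations_with_product(product):
    """Correlation of an interaction column with each of its component variables."""
    return {
        "temperature": np.corrcoef(temperature, product)[0, 1],
        "pressure": np.corrcoef(pressure, product)[0, 1],
    }

# Interaction term built from the raw variables.
raw = correlations_with_product(temperature * pressure)

# Interaction term built after subtracting each variable's mean.
centered = correlations_with_product(
    (temperature - temperature.mean()) * (pressure - pressure.mean())
)

print("raw:", {k: round(v, 2) for k, v in raw.items()})
print("centered:", {k: round(v, 2) for k, v in centered.items()})
```

In this made-up data, the centered product is much less correlated with temperature and pressure than the raw product, which is the reduction in multicollinearity that centering is meant to achieve.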

Saltanat says

Dear Jim,

Many thanks for your posts. I have a question regarding my results. The regression analysis shows that the coefficient of the interaction term is statistically significant. However, when I plot the lines, the graph shows no interaction effect. How should I interpret my result? Should I rely on the regression result or the graph?

Thank you

Saltanat

Jim Frost says

Hi Saltanat,

Typically, you’d expect to see either crossing lines or at least obviously non-parallel lines when you have a significant interaction effect. However, there are several possibilities for your case.

Keep in mind that the statistical test for an interaction effect evaluates whether the difference between the slopes is statistically significant. The fact that your interaction effect is significant indicates that the difference between the lines is significant whether you see that on the graph or not. Consider the following possibilities:

Perhaps your model has very high statistical power and can detect trivial differences between the lines. The lines are just barely non-parallel. This might happen if you have a large sample size and/or very low noise data. If this is true, your interaction effect might be statistically significant but probably isn’t practically significant.

Another possibility is the scaling for your graphs might be hiding a difference. That’s probably unlikely but at least worth considering. Perhaps the scaling is too zoomed out?

One thing to do is evaluate the predicted DV values for specific IVs for a model with and without the interaction effect. Do those predicted values change much? If they don’t change much, the interaction effect is not practically significant.

I hope that helps!
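A minimal sketch of that with-and-without comparison in Python, assuming hypothetical data with one continuous IV, one two-level IV, and a deliberately tiny interaction. The variable names and the simulated values are illustrative assumptions. The idea is to fit both models by least squares and see how far the predicted DV values move relative to the spread of the DV.

```python
import numpy as np

# Hypothetical data: continuous x, binary group g, and a very small true interaction.
x = np.tile(np.arange(10.0), 2)
g = np.repeat([0.0, 1.0], 10)
wiggle = 0.1 * np.sin(np.arange(20.0))  # deterministic stand-in for noise
y = 5 + 2 * x + 3 * g + 0.01 * x * g + wiggle

def fit_and_predict(columns):
    """Least-squares fit with an intercept; returns fitted values for the same rows."""
    X = np.column_stack([np.ones_like(y)] + columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef

pred_without = fit_and_predict([x, g])
pred_with = fit_and_predict([x, g, x * g])

# Largest shift in any prediction, also expressed as a share of the DV's range.
max_change = np.max(np.abs(pred_with - pred_without))
relative_change = max_change / (y.max() - y.min())

print(round(max_change, 3), round(relative_change, 4))
```

If the largest shift is a tiny fraction of the DV's range, as it is for this made-up data, the interaction is unlikely to matter in practice even when its p-value is significant.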

Candy says

Good evening Mr. Frost. Can an interaction effect be found if there are only two variables: tslfest and agegp3?

Jim Frost says

Hi Candy,

Yes, assuming those two variables are independent variables in a model, it’s entirely possible they have an interaction effect. Two is the minimum number of independent variables that can have an interaction effect. I don’t know about those two specific variables, but it is at least possible.

Omar says

Thanks for the article, really helpful. Just a question that I have in mind: when we run a repeated measures two-way ANOVA and find out there is an interaction effect, is it advisable to run a one-way ANOVA on each factor to find out the simple effects? In my experiment, I’m using ARTool to transform ordinal data so that I could use ANOVA on them https://depts.washington.edu/acelab/proj/art/. The tool documentation mentioned using IV1*IV2 for pairwise comparison. But what if I want to use the one-way ANOVA provided by the tool as well to find simple effects? The problem is each approach generates different results 🙂

Jim Frost says

Hi Omar,

I’ve only taken a quick look at this tool, but it looks like a good one!

In general, when you have significant interaction effects, you don’t want to focus too much on only the main effects. In this context, the main effects can lead you astray! I show an example of that in this blog post. You really need to understand the interaction effects because in some cases they drastically change your understanding of the main effects. For this reason, whenever I’ve had significant interactions and needed to do pairwise comparisons, I’ve always included the interaction in those comparisons as the tool’s documentation suggests. (I’ve never used the tool but the principle should be the same.) Again, comparing main effects *might* lead you astray.

If you graph the main and interaction effects as I show in this post, you should be able to see whether the interaction effects significantly change your understanding of the model’s results.

I hope that helps!

Elaine says

How about when all of the two main variables that provide the main effect become statistically insignificant only after including the interaction term but were statistically significant before including the interaction term? And two models have similar residual plots.

Jim Frost says

Hi Elaine,

That can be a tricky situation. The question becomes, which model is better? The one with or without the interaction terms? It sounds like you could justify either model statistically. Here are some tips to consider.

If your IVs are continuous, center them to reduce multicollinearity in the model with the interaction effect. Interaction terms jack up multicollinearity, which can reduce the power of the statistical tests. See if either of the main effects becomes significant after centering the variables and refitting the model. Click the link to learn more about that process.

If that doesn’t provide a clear answer (i.e., the main effects are still not significant), consider the following.

Is the R-squared and other goodness-of-fit measures notably better for one model or the other? While you don’t necessarily want to chase a high R-squared mindlessly, if one model does provide a better fit, that might help you decide.

Graph the interaction effect to see if it is strong. Perhaps it is statistically significant but practically not significant? I show what interaction plots look like in this post. If an interaction effect produces nearly parallel lines, it is fairly weak even if its p-value is significant.

Input values into the two models and see if it produces similar or different predicted values. It’s possible that despite the different forms of the model, they might be fairly equivalent.

Use subject-area knowledge to help you decide. Is the interaction effect theoretically justified? Evaluate other research and use your expertise to determine which model is better.

Simba Chidzambwa says

How do you interpret when one of the two main variables that provide the main effect become statistically insignificant only after including the interaction term but were statistically significant before including the interaction term?

Jim Frost says

Hi Simba,

When you include an interaction term in a regression model and observe that one of the main effects becomes statistically insignificant, it suggests that the effect of that variable is conditional on the level of the other variable. In other words, the effect of the main variable is not consistent across all levels of the other variable, and its significance is captured by the interaction term. That variable has no effect that is independent of the other variable.

However, there are several cautions about this interpretation. It’s possible that the main effect exists in the population but your regression analysis lacks sufficient power to detect it after including the interaction term. Remember, failing to reject the null hypothesis does not prove that the effect doesn’t exist. You can’t prove a negative. You just don’t have enough evidence to say that it does exist.

Also, you need to use your subject-area knowledge to theoretically evaluate the two models. Does it make more theoretical sense that the main effect or interaction effect exists? Or, perhaps theory suggests they both exist? Answering those questions will help you determine which model is correct. For more information on that topic, read Specifying the Correct Regression Model. Pay particular attention to the theory section near the end.

It’s always a good idea to plot the interaction to visualize and better understand the relationship.

I know that’s not a definitive answer but understanding those results and determining which model is best requires you to assess it theoretically. Also check those residual plots for each one!

quyen says

it truly is!

Hannah says

This article was SO helpful, thank you!

Jim Frost says

Hi Hannah! I’m so glad to hear that it was helpful! 🙂

Mohannas says

Hi Jim,

Thank you for your helpful link.

I have an issue regarding the quadratic and linear terms in a model. I have studied insect population in many places and plants species. I collected insects randomly every week.

So the model (mixed effects) is as follows:

– Fixed factors (place and plant species),

-Sampling date as quadratic and linear terms

– and plant (from which the insects were collected) as a random effect.

My questions are about the interactions:

1- is it enough to include only binary interaction among those effects: place, plant species and date (quadratic and linear effect)?

2- should I consider quadratic and linear terms in three way interaction (place:plant species:date + place:plant species:date^2)?

I appreciate your help.

Jim Frost says

Hi Mohannas,

It’s possible to go overboard with higher-order interactions. They become very difficult to interpret. However, if they dramatically improve your model’s fit and are theoretically sound, you should consider adding three-way interactions.

However, in practice, I’ve never seen a three-way interaction contribute much to the model. That’s not to say it can’t happen. But only add them if they really make sense theoretically and demonstrably improve your model’s fit.

Luke says

Hello Jim, what happens when we have to deal with a variable that only has a value if another condition is met? Suppose we run a regression and we want to assess the impact of years of marriage; however, this only applies to married people. Can we model the problem the following way?

y ~ if.married+if.married*years.marriage

If so, how can I interpret the results of this? What does it mean to have a significant main term (if.married) but an insignificant interaction term? If the interaction is insignificant, but the main term is significant, can we just maintain the main term?

Thank you

Isa says

Hi Jim,

Thank you for your help!

Honestly I didn’t fully understand the part of using post hoc tests.

If I add a categorical variable with several levels to a regression (I use R), the output already gives me the p-values and coefficients for each level, compared with the reference level. After that, I would run an ANOVA between a model with the variable and a model without it, to test whether the categorical variable is significant overall. But I got the idea that what you are suggesting is a test to evaluate the differences among each combination of levels, instead of just against the reference level (the output of a regression). Is that correct?

Jim Frost says

Hi Isa,

Read my post that I link to and learn about post hoc tests and the need to control the familywise error rate. Yes, you can get the p-values for the difference but with multiple groups there’s an increasing chance of a false positive (despite significant p-values).

There are different ways to assess the mean differences. One is the regression coefficients as you mention, which compare the means to the baseline level. Another is to compare each group’s mean to the other groups’ means to determine which groups are significantly different from each other. Because you were mentioning creating groups based on all the possible combinations, the all-pairwise approach makes more sense than comparing to a baseline level.

But, either way, you should control the familywise error rate. Read my post for information about why that is so important. In particular, you described creating multiple groups for all the combinations and that’s when the familywise error rate really starts to grow!

Isa says

Hello Jim,

Thanks for your article!

I have a question regarding interactions with binary variables. Considering the example you showed (Food+Condiment) is there any advantage in considering an interaction instead of modelling all possible combinations as a categorical variable?

My suggestion is: instead of doing Food*Condiment, we would create a categorical variable with the following levels: HotDog+Mustard; HotDog+Chocolate; IceCream+Chocolate; IceCream+Mustard. The results would show the difference between that certain level and the reference level. In my opinion this “all levels” approach seems easier to understand but is it correct?

Of course that for categorical variables with several levels (for example, several types of food and several condiments) this solution would be impractical, so I am just talking about two binary variables.

Jim Frost says

Hi Isa,

Conceptually, you’re correct that you could include in the model the groups based on the combinations of categorical levels. However, there are practical drawbacks to using that approach by itself. That method would require more observations because you need a minimum number of observations per group. Additionally, because you’re comparing multiple means, you’d need to use a post hoc multiple comparison method that controls the familywise error rate, which reduces statistical power.

By including an interaction term, you’re directly testing the idea that the relationship between one IV and the DV depends on the value of another IV. You kind of get that with the group approach, but you’re not directly testing the interaction effect. Also, by including the interaction term rather than the groups, you don’t have the sample size problem associated with including more groups nor do you have the need for controlling the familywise error rate. Finally, using interaction plots (or group means displayed in a table), the results are no harder to interpret than using your method.

In practice, I’ve seen both methods used together. Statisticians will often include the interaction term and see if it is significant. If it is and they need to know if the differences between those group means are statistically significant, they can then perform the pairwise multiple comparisons on them. But usually it is worthwhile knowing whether the interaction term is significant, and then possibly determining whether any of the differences between group means are significant.

I’d see the two methods as complementary, but usually starting with the interaction term. I hope that helps!
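For two binary factors, the link between the two approaches can be made concrete. The sketch below uses made-up enjoyment scores (hypothetical data, not the results from the post) and assumes a dummy-coded 2x2 design with hot dog and mustard as the reference levels. It suggests that the interaction-model coefficients are just a rearrangement of the four group means, so both approaches produce the same fitted values. The difference is that the interaction coefficient directly measures how much the condiment effect changes from one food to the other.

```python
from statistics import mean

# Hypothetical enjoyment scores, invented for illustration only.
scores = {
    ("hot dog", "mustard"): [88, 90, 92],
    ("hot dog", "chocolate"): [64, 66, 68],
    ("ice cream", "mustard"): [60, 62, 64],
    ("ice cream", "chocolate"): [91, 93, 95],
}

# "All combinations" approach: one mean per group.
group_means = {group: mean(values) for group, values in scores.items()}

# Interaction approach with dummy coding.
# Reference levels (assumed): food = hot dog (0), condiment = mustard (0).
b0 = group_means[("hot dog", "mustard")]
b_food = group_means[("ice cream", "mustard")] - b0
b_condiment = group_means[("hot dog", "chocolate")] - b0
# Condiment effect for ice cream minus condiment effect for hot dogs.
b_interaction = (
    group_means[("ice cream", "chocolate")] - group_means[("ice cream", "mustard")]
) - b_condiment


def predict(is_ice_cream, is_chocolate):
    """Fitted value from the interaction model for 0/1 dummy inputs."""
    return (
        b0
        + b_food * is_ice_cream
        + b_condiment * is_chocolate
        + b_interaction * is_ice_cream * is_chocolate
    )


print("Group means:", group_means)
print("Interaction coefficient:", b_interaction)
print("Fitted ice cream + chocolate:", predict(1, 1))
```

With the group approach you would still need pairwise comparisons to learn whether the condiment effect differs by food, whereas the interaction term tests that single question directly.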

zaid says

Hi Jim!

Thank you very much for explaining these concepts.

I have a question about interaction effect tests.

In my study, I had two groups: a smokers’ group with three categories (heavy, medium, and light smoking) and a non-smokers’ group (as a control), to evaluate the interaction effect of smoking on the dependent continuous variable (platelet count).

What is the best statistical test that can be applied to determine the effect of smoking on platelet count? What is the best statistical test that can be applied to determine the relationship between smoking and platelet count?

Jim Frost says

Hi Zaid,

To have an interaction effect, you must have at least two independent variables (IV). Then you can say that the relationship between IV1 and the dependent variable (DV) depends on the value of IV2.

It appears that you have only one IV (smoking level) with four levels (non, light, medium, heavy). You can use one-way ANOVA to find the main effect of smoking on platelet count. That test will tell you if the differences between the groups’ mean platelet counts are statistically significant. Just be aware that this is a main effect and not an interaction effect.
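To show what the one-way ANOVA computes, here is a minimal sketch of the F-statistic for four smoking groups. The platelet counts are invented for illustration and are not from any real study. The code only calculates the F-statistic and its degrees of freedom. To get the p-value, you would compare it to an F distribution in your statistical software.

```python
from statistics import mean

# Hypothetical platelet counts, invented for illustration only.
groups = {
    "non-smoker": [250, 260, 255, 245],
    "light": [262, 270, 258, 266],
    "medium": [275, 280, 270, 283],
    "heavy": [290, 300, 285, 295],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
)
# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum(
    (v - mean(values)) ** 2 for values in groups.values() for v in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F is the ratio of between-group to within-group mean squares.
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F-statistic indicates that the group means differ more than you would expect from the variation within the groups, which is the main effect of smoking level.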

Vittorio Napoli says

Dear Jim, I have an important question on a matter I have to understand for a piece of research I’m conducting myself. You said enjoyment might depend on the Food*Condiment interaction, ok. What I don’t get is a point at a deeper level. Let’s suppose you want to do a repeated measures ANOVA, because the same subject eats different types of foods with different condiments (which is the design of my own research). Let’s assume Food has three levels: sandwiches, pasta and meat, and Condiment has two: ketchup and mayonnaise. The interaction might mean two things, among others, that is:

1) WITHIN the “sandwich” level, people prefer it with mayonnaise (higher enjoyment rate for mayonnaise) and not with ketchup. In this case, mayonnaise would show enjoyment rates which are higher in a statistically relevant way than ketchup WITHIN one of the levels of the other factor.

OR:

2) WITHIN the “ketchup” level, people prefer to eat ketchup with pasta rather than with meat. In this other case, the comparison of the enjoyment means from the subjects is between food types and not condiment types.

I am stuck in this point, because the two things are different.

Thank you so much,

Vittorio

Muhammad Sajid Tahir says

Hi Jim, I gain a lot of guidance from your discussions on statistics. I am stuck on a concept in multiple regression. If you can guide me, it will be a great help. Following is my understanding of the background working of multiple regression: When we have two independent variables, first, we get the residuals of both independent variables by regressing each variable on the other. Then we perform simple regression using the residuals of each independent variable against the dependent variable, separately. This step gives us one beta for the residual of each variable, which is essentially considered the beta of each independent variable in multiple regression. Is this concept true? If so, I can understand that the residual of one independent variable is necessary in order to break the multicollinearity between both independent variables. However, is it fair to get and use the residual of the second independent variable as well? I tried my best to put the question in the right way. If you need me to elaborate, I can give it another try. Your answer will be highly appreciated.

Sajid

Jim Frost says

Hi Muhammad,

When you have two independent variables, you only fit the model once and that model contains both IVs. Hence, even when you have two IVs, you still obtain only one set of residuals. You don’t fit separate models for each IV.

In the future, try to place your comment/question in a more appropriate post. It makes reading through the comments more relevant for each post. To find a closer match, use the search box near the upper right-hand margin of my website. Thanks!

Eva says

Thank you very much Jim!

Emily says

Hi Jim

I am having trouble interpreting my own results of a two-way repeated ANOVA and was wondering if you could help me out.

DV: negative affect

IV:sleep quality (good or bad)

IV:gender

I found a significant main effect of sleep quality on negative affect but no significant main effect of gender on negative affect. However, I did find a significant interaction between sleep quality and gender. What could you conclude from this, or how would you interpret these results?

Symon says

Dear mr. Frost,

I have a question regarding the interpretation of the repeated measures ANOVA. Part of my study investigates the best way to identify the maximum voluntary contraction (MVC) in healthy subjects, comparing 2 different protocols to measure the MVC (ref) and 4 different methods of determining the MVC (mean of the mean, mean of maximum, maximum of the mean and maximum of maximum). Therefore, in my analysis, I have values for the 2 protocols and for the 4 conditions for each examined muscle.

I conducted a repeated measures ANOVA with a Bonferroni correction. For 1 particular muscle, the serratus anterior (SA), my results of the “Tests of Within-Subjects Effects” state that I have a significant interaction effect between the 4 conditions and the 2 protocols. So this means that I have a difference in values between the 2 protocols, but not for each condition, I guess?

I got stuck with the interpretation of the interaction effects between these 2 types of factors.

Would it be possible for you to help me interpret the results?

Thank you in advance.

Kind regards,

Symon

Eva says

Hi Jim!

Thank you so much for the quick reply.

When you say increasing the sample size by 10, is that per group or overall? E.g. if I have 20 participants and want to add the sex interaction term, if I have 10 males and 10 females, do I increase the sample size to 30 or 40?

Thank you!

Eva

Jim Frost says

Hi Eva,

You’re very welcome!

Ah, I’m glad you asked that follow-up question. The guideline for a minimum of 10 observations is for continuous independent variables. It didn’t click with me that gender is, of course, categorical. Unfortunately, it’s a bit more complicated for categorical variables because it depends on the number of groups and how the observations are split between the groups. Most requirements for categorical variables assume an even split. I’m forgetting some of the specific guidelines for categorical variables, but I’d guesstimate that you’d need an additional 20 observations to add the gender variable with half men (10)/half women (10). If you have unequal group sizes, you’d ensure that the smaller group exceeds the threshold.

In the other post I recommended, I include a reference that discusses this in more detail. I don’t have it on hand, but it might provide additional information about it. Also, given the complexity of your design, I’d be sure to run it by your statisticians to be sure.

Finally, these guidelines are for the minimums. And you’d rather not be right at the minimums if at all possible!

Eva says

Hi Jim!

Thank you for the straightforward blog posts explaining these concepts.

I have a question about interaction tests.

I am designing an experiment and deciding on which test to use. The study is essentially a biomechanical test pre and post a treatment, so for checking treatment effects on the outputs (which are continuous) I will use a paired t-test. However, I also want to check the effects of sex and menstrual cycle phase. For the menstruating females, they are tested at 3 phases in the cycle. Another group of oral birth control users is also tested at 3 times across the cycle.

Now, one statistician recommended just putting everything in a linear mixed effects model (sex, menstrual cycle phase, birth control). Another one suggested doing the sex comparison separate and testing the change due to treatment between the groups, and then to test menstrual cycle effects comparing change due to fatigue in pairwise tests between phases (1 and 2, 2 and 3 etc) and also comparing between birth control and non birth control users, which ends up being a lot of separate tests.

I was also thinking I could instead test sex effect on the treatment outcome with an ANOVA with interaction (sex x treatment), using the average values for the females (since they are tested 3 times), and then for the menstrual cycle effects check phase x treatment x birth control, or phase x treatment first, and then do a separate test to compare changes due to treatment between birth control users and non birth control users at each phase (or use a linear mixed effects model only for the females).

Regarding the interaction tests, someone raised the issue of increased sample size being needed (https://statmodeling.stat.columbia.edu/2018/03/15/need-16-times-sample-size-estimate-interaction-estimate-main-effect/).

But another person mentioned that if I would do all the separate tests between phases etc then this would be an issue with multiple testing and would need correction. Which I assume would also then require a higher sample size for the same power and effect size, correct?

I am very new to stats so any help is much appreciated, especially explaining pros and cons of these approaches. I am also not sure whether, regardless of the tests for sex and phase, I should do a separate t-test for the primary outcome (pre vs post treatment) first and whether I need to correct p values for this or only for the subsequent tests.

Thank you!

Jim Frost says

Hi Eva,

As for the general approach, it sounds like you have a fairly complex experiment. My sense is to include as much as you can in a single model and not perform separate tests.

Instead of a paired t-test, use the change in the outcome as your dependent variable. Then include all the various other IVs in the model. This approach improves upon a t-test because you’re still assessing the changes in the outcome, just like a paired t-test, but you’re also accounting for additional sources of variability in the model, which the paired t-test can’t do. If there’s a chance that the treatment works better in women or men, then you should include an interaction term.

Again, I would try to avoid separate tests and build them into the model. Based on the rough view of your experiment that I have, I don’t see the need for separate tests. You can perform the multiple comparison tests as needed to do the adjustments for the multiple tests, but I would have the main tests all be part of the model, and then perform the multiple comparison tests based on the model results.

You definitely need a larger sample size when you include interaction terms, or any other type of term (main effect, polynomial for curvature, etc.). I’m not sure that I agree with that link you share that you need 16 times the sample size though. (The author proposes a very specific set of assumptions about relative effect sizes and then generalizes from there. Not all interaction effects will fit his specific assumptions.)

Typically, you should increase your sample size by at least 10 for each additional term. Preferably more than that but 10 is a common guideline. And that’s an increase of 10, NOT 10X! Before collecting data, you should consider all the possible complexities that you might want to include in your model and figure out how many terms that entails. From there, you can develop a sense of the minimum sample size you should obtain.

As I show in this post, if interaction effects are present and you don’t account for them in the model, your results can be misleading! However, you don’t want to add them all willy nilly either.

I write about that in my post about overfitting your model, which includes a reference for sample size guidelines.

I probably didn’t answer all your questions, but it seems like a fairly complex experiment. I think many of the answers will depend on subject-area knowledge that I don’t have. But hopefully some of the tips are helpful!

Amy says

Hi Jim,

I am currently doing my dissertation and am doing a 3 (price difference: no/low/high, within subjects) X 2 (information level: low/high, between groups) mixed ANOVA to assess the effect of price and information on consumers’ sustainable shopping decisions. I have significant main effects of both price and information, but the interaction was not significant. When interpreting these results, what does the non-significant interaction tell me about the main effects?

Also is there any other possible reason for the non-significant interaction eg narrow sample type etc?

Thank you!

Amelia Grant-Alfieri says

Hi Jim,

Do you have R code by chance?

Amelia

Layan says

Hi Jim,

Great!! Thank you for the explanation. Very helpful and informative.

Layan says

Thank you for this, great explanation.

My question is, do we rely only on the p-value to indicate significance in interactions?

All thanks

Jim Frost says

Hi Layan,

In terms of saying whether the interaction is statistically significant or not, yes, you go only by the p-value. However, determining whether to include it or not in your model is a different matter.

You should never rely solely on p-values for fitting your model. They definitely play a role but shouldn’t be the sole decider. For more information about that, read my post, Choosing the Correct Regression Model.

That definitely applies to interaction terms. If you get a significant p-value and the interaction makes theoretical sense, leave it in. However, if it’s significant but runs counter to theory, you should consider excluding it despite the statistical significance. And if the interaction is not significant but theory and/or other studies strongly suggest it should be included, you can include it regardless of the p-value. Just be sure to explain your approach and reasoning in the write-up and include that actual p-value/significance (or not) in your discussion. For example, if it’s not significant but you need to include it for theoretical/other researcher reasons, you’d say something to the effect, “the A*B interaction was not significant in my model, but I’m including it because other research indicates it’s a relevant term.” Then some details about the other research or theory.

I’m not sure if you were asking about only determining significance/nonsignificance or the larger issue of whether to include it in your model, but that’s how to handle both of those questions!

Alice says

“The other main reason is that when you include the interaction term in the model, it accounts for enough of the variance that the main effect used to account for before you added the interaction, that the main effect is no longer significant. If that’s the case, it’s not a problem! It just means that for the main effect in question, that variable’s effect entirely depends on the value of the other variable in the interaction term. None of that variable’s effect is independent of that other variable. That happens sometimes!”

Would this be true if the interaction term was not significant?

Jim Frost says

Hi Alice,

It can be true. It’s possible that there’s not enough significance to go around, so to speak, leaving both insignificant.

Hania ElBanhawi says

Hi Jim,

Thanks for a great explanation!

What happens if, once you add the interaction term, the main effect is no longer statistically significant, but the interaction term is?

Thank you,

Hania

Jim Frost says

Hi Hania,

That can happen for a couple of reasons. One, if you have an interaction term that contains two continuous variables, it introduces multicollinearity into the model, which can reduce statistical significance. There’s an easy fix for this problem. Just center your continuous variables, which means you subtract each variable’s mean from all its observed values. I write about this approach and show an example in my post about multicollinearity.

The other main reason is that when you include the interaction term in the model, it accounts for enough of the variance that the main effect used to account for before you added the interaction, that the main effect is no longer significant. If that’s the case, it’s not a problem! It just means that for the main effect in question, that variable’s effect entirely depends on the value of the other variable in the interaction term. None of that variable’s effect is independent of that other variable. That happens sometimes!
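The centering fix for the first reason can be sketched with simulated data. The variable names and ranges below are hypothetical, and the two predictors are generated independently. The sketch compares the correlation between a predictor and the raw product term against the same correlation after both variables are centered.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


random.seed(1)
# Hypothetical, independently generated continuous predictors.
temperature = [random.uniform(50, 100) for _ in range(500)]
pressure = [random.uniform(10, 20) for _ in range(500)]

# Raw interaction term: tends to be strongly correlated with its components.
raw_product = [t * p for t, p in zip(temperature, pressure)]
r_raw = pearson(temperature, raw_product)

# Centered interaction term: subtract each variable's mean before multiplying.
t_mean, p_mean = mean(temperature), mean(pressure)
temp_c = [t - t_mean for t in temperature]
pres_c = [p - p_mean for p in pressure]
centered_product = [t * p for t, p in zip(temp_c, pres_c)]
r_centered = pearson(temp_c, centered_product)

print(f"Correlation with raw product:      {r_raw:.2f}")
print(f"Correlation with centered product: {r_centered:.2f}")
```

Centering does not change the model’s fit or the interaction coefficient. It changes what the main effect coefficients mean: each one becomes the effect of that variable when the other variable is at its mean.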

Tadele says

Thank you Jim, that is really helpful to me!

I am currently doing research, and in my research I have 3 independent variables (x, y, z) and one dependent variable. After conducting a 3-way ANOVA, I have seen that all 3 variables and their interaction are significant. I have no idea what to do next. Please help me with how to proceed 🙏🙏

Serap says

Dear Jim, thanks for your time and valuable info. In my analysis, the impact of one of the dimensions is statistically insignificant while the interaction effect is significant. I was told it is not right to interpret this. Now I see it could be. Can you please send me a source to refer to in my thesis? Thanks, and best regards

Jim Frost says

Hi Serap,

My go-to reference is Applied Linear Statistical Models by Neter et al. I haven’t looked to see if it discusses this aspect of interaction effects, but I’d imagine it would in its over 1000 pages!

Yes, when you have significant interaction effects, you need to understand them and incorporate them into your conclusions. Failure to do so can lead you to incorrect conclusions. This is true whether or not your main effects are significant.

Chelsea says

Thank you Jim. This is very helpful!

I have one question regarding interaction. Let’s say I have two dichotomous variables A and B. What is the difference between using an interaction term A*B in the model vs. creating a grouping variable that has four levels (A+B+; A+B-; A-B+; A-B-)?

Thanks!

Meghan Sevilla says

Hi,

In my paper, I have two independent variables (infusion time and amount of lemongrass) which have a significant interaction. I am unsure how to explain and support this in words.

Thank you

Diego Figueroa says

Hi Jim!!

Thanks for taking your time to answer our questions, I discovered your page today and it’s great!

If you have some time, I would like to ask you about interaction terms in a time series regression, such as an ARDL model. My questions are two.

i) Does an interaction term with two variables, let’s call these X1{t} and X2{t}, need lags in this type of model?

ii) If I have some interaction term like X3{t}=X1{t}*X2{t}, is it necessary to apply a unit root test to check for stationarity, right? In that case, what is the best way to reach it?

Thanks a lot !!!

Ângela de Carvalho Ribeiro says

Hi Jim! Thanks for your post!

How do I report the interaction in the text? Is it correct to say that the condiment has an effect on satisfaction only if it interacts with the type of food?

Linlin Zhang says

Hi Jim,

How can I automatically check whether there are interaction terms between independent variables by using a package in R? Is there a way to do this?

I have 5 categorical independent variables and 4 continuous independent variables, so it’s hard for me to check them one by one manually.

Jim Frost says

Hi Linlin,

I don’t know about R, but various software packages have routines that will try all sorts of combinations of variables. However, I do not recommend the use of such automated procedures. They increase your chances of finding chance correlations that are meaningless. Read my post about data mining for more information. Ideally, theory and subject-area knowledge should be guides when specifying your regression model.

Jefferson Ibhagui says

Hello Jim,

I got to know about the awesome work you are doing in the statistics field from someone I am following on YouTube and LinkedIn in the Data Analysis and Data Science space(s).

With respect to interactions effects, I have some questions:

(1) should the interaction terms be included in a multiple regression model at all times if they are statistically significant? Or is it the research/study question(s) that should determine their inclusion?

(2) How do you determine the interaction terms to include? For instance, in the second example in this blog of a manufacturing process, I see that you used Pressure and Temperature as the interaction terms.

Finally, I analyzed the data in (2) using Excel. While I obtained the same model output as in your example, the interaction plot I created had parallel lines for high and low pressure respectively, suggesting a lack of interaction. I used this formula

Strength = 0.1210*Temperature*Pressure

to derive fitted/predicted values for the dependent variable along with the high and low values for pressure and the range of values for temperature. Please is there something I did incorrectly?

I wanted to send you an email or post the image here but these have proven difficult.

Hoping to hear from you.

Thank you.

Jefferson

Jim Frost says

Hi Jefferson,

Sorry about the delay replying! Thanks for writing with the great questions.

In the best case scenario, you add the interaction terms based on theoretical reasons that support there being an interaction effect. You can go “fishing” for a significant interaction term to see if it is significant. But be aware that doing that too much increases your probability of finding chance effects. For more information about that, read my post about data mining. If you do add interaction terms “just to see” and you find a significant one, be sure that it makes logical/theoretical sense.

So, there’s not a hard and fast rule for knowing when to include interaction terms. It’s a bit of an art as well as a science. For more details about that aspect, read my post about Model Specification: Choosing the Correct Regression Model. But try to use theory and subject-area knowledge as guides and be sure that the model makes sense in the real world. You don’t want to be in a situation where the only reason you’re including variables and interactions is just because they’re significant. They have to make sense too.

I’m not sure why your Excel recreation turned out that way. I recreated the graph in Excel myself and replicated the graph in the post almost exactly. You need to use the entire equation and enter values for all the terms. For the Time variable, I entered the mean time. The Excel version I made is below. Additionally, I’ve added a link to the Excel dataset in the post itself. You can download that and see how I made it.

I hope that helps!
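As a rough sketch of what “use the entire equation” means, the code below computes fitted values for an interaction plot. All coefficients, pressure settings, and the mean time are hypothetical values invented for illustration. They are not the fitted values from the post’s dataset. The point is only the mechanics: include every term, hold Time at one fixed value, and compare the temperature slopes at low and high pressure.

```python
# Hypothetical coefficients, invented for illustration; they are NOT the
# fitted values from the post's dataset.
INTERCEPT = 5.0
B_TEMPERATURE = -0.10
B_PRESSURE = -9.0
B_TIME = 0.05
B_TEMP_X_PRESSURE = 0.12


def predicted_strength(temperature, pressure, time):
    """Use the entire equation: every main effect plus the interaction."""
    return (
        INTERCEPT
        + B_TEMPERATURE * temperature
        + B_PRESSURE * pressure
        + B_TIME * time
        + B_TEMP_X_PRESSURE * temperature * pressure
    )


mean_time = 30.0  # hypothetical; holds the variable not shown in the plot fixed
temperatures = [80, 90, 100, 110, 120]

slopes = {}
for pressure in (60, 80):  # hypothetical low and high pressure settings
    line = [predicted_strength(t, pressure, mean_time) for t in temperatures]
    # Slope of the fitted line across the temperature range.
    slopes[pressure] = (line[-1] - line[0]) / (temperatures[-1] - temperatures[0])
    print(f"pressure={pressure}: slope vs. temperature = {slopes[pressure]:.2f}")
```

Because the slope with respect to temperature equals the temperature coefficient plus the interaction coefficient times pressure, the two lines cannot be parallel unless the interaction coefficient is zero.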

Ly says

Hi Jim,

I am attempting to explain a three-way interaction.

I examine the effect of X on Y conditional on two moderators (W and Z; W and Z are positive values).

I expect that W reduces the negative effect of X on Y; Z strengthens the negative effect of X on Y; and I need to conclude which one has stronger effect on the relationship between X and Y.

Case 1: W and Z are continuous variables, the outcome is:

Y = 0.023 - X(0.941 - 0.009*W + 0.340*Z - 0.201*W*Z)

Case 2: W and Z are dummy variables (taking a value of 1 or 0), the outcome is

Y = 0.016 - X(0.967 - 0.092*1 + 0.145*1 - 0.253*1*1) (I substituted W=1 and Z=1)

How can I interpret the three-way effect in this case? Could you help me?

Thank you

Emily says

Thank you for your quick response! Really helpful 🙂

Becky says

Jim,

Thank you so much for your time explaining these concepts! I’m reading medical literature and trying to figure out the difference between a P-interaction and a p-value. This is a very important study with a P-interaction=0.57 so being that it is not <0.05 (statistically significant), I'm thinking P-interaction must be the 3-way interaction rather than the main effects.

Thanks so much!

Becky

Jim Frost says

Hi Becky,

I don’t know what a P-interaction is? Do you mean a p-value for an interaction term?

P-values for interaction terms work the same way as they do for main effects and I’ve never seen them given a distinct name. When they’re less than your significance level, they indicate that the interaction (or main effect) is statistically significant. More specifically, they’re all testing whether the coefficient for the term (whether it’s for a main effect or interaction effect) equals zero. When your p-value is less than the significance level, the difference between the coefficient and zero is statistically significant. You can reject the null hypothesis that the effect is zero (i.e., rule out no effect). All of that applies to main effects and interaction effects. It’s how you interpret the effect itself that varies.

I hope that helps!

Max says

Jim — this is so helpful. I think I get it now. We can imagine a scenario where sleep and study are not correlated. The good students, no matter how much they sleep, still find the time to study. The bad students, no matter how much they sleep, still don’t study much. So sleep hours and study hours are not correlated.

And yet, there can still be an interaction effect between sleep and study. For example, we could imagine that for the students who study a lot, getting extra sleep has a big, positive impact on their GPA. But for the students who study a little, getting extra sleep has only a tiny, positive impact on their GPA.

Thus, there was no correlation between sleep and study, but their interaction was still significant.

Did I get that right?

Jim Frost says

Hi Max,

Yes, that’s it exactly! There doesn’t need to be a correlation between the IVs for an interaction effect to exist.
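Here is a small sketch of that scenario. The design and the GPA equation are made up for illustration. Every sleep level is paired with every study level, so the two predictors are uncorrelated by construction, yet the effect of sleep on GPA still changes with study hours.

```python
from statistics import mean

# Fully crossed, hypothetical design: every sleep level appears with every
# study level, so the two predictors are uncorrelated by construction.
sleep_levels = [6, 7, 8]
study_levels = [1, 3, 5]
students = [(sleep, study) for sleep in sleep_levels for study in study_levels]


def gpa(sleep, study):
    """Made-up GPA equation with a sleep x study interaction (no noise)."""
    return 1.0 + 0.05 * sleep + 0.10 * study + 0.02 * sleep * study


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


sleep = [s for s, _ in students]
study = [h for _, h in students]
sleep_bar, study_bar = mean(sleep), mean(study)
# Numerator of the covariance between the predictors; zero means no correlation.
covariance_sum = sum(
    (s - sleep_bar) * (h - study_bar) for s, h in zip(sleep, study)
)

# Effect of an extra hour of sleep on GPA, separately for each study level.
sleep_effect = {
    level: slope(sleep_levels, [gpa(s, level) for s in sleep_levels])
    for level in study_levels
}
print("Covariance sum between sleep and study:", covariance_sum)
print("Sleep slope by study level:", sleep_effect)
```

The sleep slope grows as study hours increase, which is the interaction, even though sleep and study themselves are unrelated.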

Emily Boekhout says

Hi Jim, thank you for writing this article!

I was wondering if you could help me with this. I conducted a regression analysis including an interaction between two categorical variables. (Sequel*Book) [yes=1; no=0] on movie performance.

I found a positive, significant interaction effect. Can I now say that the effect of Book on movie performance depends on whether it is a sequel or not? And thus suggest that if sequel = 1, this positively affects performance?

Thank you!

Jim Frost says

Hi Emily,

Yes, if you have that type of interaction and it is statistically significant, you can say that the relationship between Book (Yes/No) and Movie Performance depends on whether it’s a sequel. Using an interaction plot should help bring the relationships to life. From what you write, it sounds like the interaction effect favors movies that are a sequel and based on a book. That combination of characteristics produces a higher performance than you’d expect based on main effects alone.

Dilum says

Hi Jim,

Thank you so much!! I really appreciate you taking the time to answer my questions! If I may, I have another couple of questions that arose when I ran the moderations:

First, I had only one three-way interaction (I*C*E) that was b = .002, p = .049, and I am not sure if I should keep it in. To give more context, R square change = .013, R^2 = .404, F = 3.910, p = .049.

Second, one of the lower order terms I have in my model (C*E) is a product of the two moderators and it was not significant in any of the 15 moderations and it does not contribute to my hypothesis. Should I still retain the C*E when I run this even though it’s not technically something I’m looking at? I was told I didn’t have to, but since it is a lower order term (and of course I, C, E and I*C and I*E are included) that contributes to the I*C*E interaction, I am conflicted on if I should keep it in there.

Also, given that I am running 15 moderations, would you happen to know if I should use Bonferroni to correct the alpha value? I do not want to p-hack my results.

Thank you again!

Dilum

Jim Frost says

Hi Dilum,

You’re very welcome!

If that’s the change in R-squared due to the three-way interaction term, that’s a very small increase! And, it appears to be borderline significant anyway. It might not be worth including. You can enter values into that equation and into the model without the 3-way interaction to see if the predictions change much. If they don’t, that’s a further argument against including the 3-way term, unless there are strong theoretical or subject-area reasons to include it.
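To put the numbers in Dilum's comment in context, here is a minimal sketch of the standard F-test for a change in R-squared when terms are added to a model. The function names are hypothetical, and the residual degrees of freedom are not stated in the comment, so the second function only backs out a rough value from the rounded figures that were reported (R-squared change = .013, R-squared = .404, F = 3.910).

```python
def f_change(r2_full, r2_change, n_added_terms, df_resid_full):
    """F statistic for the increase in R-squared from adding terms.

    r2_full       : R-squared of the model that includes the added terms
    r2_change     : increase in R-squared due to the added terms
    n_added_terms : number of terms added (1 for a single interaction term)
    df_resid_full : residual degrees of freedom of the full model
    """
    numerator = r2_change / n_added_terms
    denominator = (1.0 - r2_full) / df_resid_full
    return numerator / denominator


def implied_df_resid(f_stat, r2_full, r2_change, n_added_terms=1):
    """Rearrange the same formula to estimate the residual df.

    Only approximate here, because the reported inputs are rounded.
    """
    return f_stat * (1.0 - r2_full) * n_added_terms / r2_change


# Values reported in the comment; the residual df is inferred, not given.
df_guess = round(implied_df_resid(3.910, 0.404, 0.013))
print("implied residual df:", df_guess)
print("F for the change:", round(f_change(0.404, 0.013, 1, df_guess), 2))
```

The point of the arithmetic is that a term can pass a significance test while adding only about one percentage point of explained variance, which is why the practical size of the improvement matters as well as the p-value.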

Also, that’s a ton of interaction (moderation) terms. Are they all significant? How many observations do you have? I’m not sure if you said earlier. With so many terms in the model, you have to start worrying about overfitting your model.

Do you have theoretical reasons to include so many interaction terms? Or does it just improve the fit? I don’t think I’ve ever seen a model with so many interactions!

Liza says

Thank you.

My study is concerned with critical thinking skills measured by the health science reasoning test (hsrt) and the levels of academic performance (measured as A+, A, etc) . The critical thinking skills are divided into 5 subscales. I am after the impact of a single critical thinking skill or a combination of them to the levels of academic performance.

When I used interactions, I found that there are significant relationships. I used the main effects and the interactions among two and up to four critical thinking skills (A*B*C*D). Am I on the right track?

Thank you very much.

Jim Frost says

Hi Liza,

You’re very welcome!

One thing you should do is see what similar studies have done. When building a model, theory should be a guide. I write about that in my post about model specification. Typically, you don’t want to add terms only because they’re significant.

However, that’s not to say that you’re going down the wrong path. Just something to be aware of.

If you’re adding these terms, they’re significant, and they make theoretical sense, it sounds like you’re on the right track. Again, read my warning to Dilum about three-way interactions. That would apply to four-way and higher interactions too. They’re going to be very hard to interpret. And they often don’t improve the model by very much. In practice, I don’t typically see more than two-way interactions.

They might make complete sense for research. Just be sure they’re notably improving the fit of the model! Also, with so many interaction terms, you should be centering your continuous variables because you’ll be introducing a ton of multicollinearity in your model. Fortunately, centering the continuous variables is a simple fix for that type of multicollinearity. Read my post about multicollinearity, which also illustrates using centering for interaction terms.

Liza Hipolito says

Thank you so much for your reply! Is interaction of THREE IVs against one DV possible?

One more question, if you may. What is the difference between interaction and two-way, between interaction and three-way?

Thank you very much.

Jim Frost says

Hi Liza,

Yes, a three-way interaction is definitely possible. However, read my very recent reply to Dilum in the comments for this post with cautions about three-way interaction terms!

A three-way interaction is when the relationship between an IV and a DV depends on the values of two other IVs in the model. A two-way interaction is where the relationship between an IV and DV depends on just one other IV in the model.
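The distinction can be sketched numerically. In a model with all two-way terms and the three-way term, the slope for X1 is b1 + b12*X2 + b13*X3 + b123*X2*X3. The coefficient values below are hypothetical and chosen only to make the pattern visible.

```python
def slope_of_x1(coefs, x2, x3):
    """Slope of X1 at given values of X2 and X3.

    Assumes a model containing X1, X1*X2, X1*X3 and X1*X2*X3;
    terms that do not involve X1 do not affect this slope.
    """
    return (
        coefs["x1"]
        + coefs["x1:x2"] * x2
        + coefs["x1:x3"] * x3
        + coefs["x1:x2:x3"] * x2 * x3
    )


# Hypothetical coefficients: same model with and without a three-way term.
two_way_only = {"x1": 1.0, "x1:x2": 0.5, "x1:x3": 0.0, "x1:x2:x3": 0.0}
with_three_way = {"x1": 1.0, "x1:x2": 0.5, "x1:x3": 0.0, "x1:x2:x3": 0.25}

for x2 in (0, 1):
    for x3 in (0, 1):
        print(
            f"X2={x2} X3={x3}",
            "two-way slope:", slope_of_x1(two_way_only, x2, x3),
            "three-way slope:", slope_of_x1(with_three_way, x2, x3),
        )
```

With only the X1*X2 term, the slope for X1 changes with X2 but is the same at every value of X3. Once the three-way coefficient is nonzero, how much X2 changes the slope itself depends on X3.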

Liza says

Hi Jim.

Thank you for your explanation.

So sorry for posting my comment here, as I failed to find where to properly comment.

My query goes like this. I am finding out the impact of a single DV or a combination of several DVs and several IVs. Is it safe to assume that I can use GLM interaction?

Thank you very much for your time.

Jim Frost says

Hi Liza,

For most models, you’ll assess the relationship between one or more IVs and a single DV. There are exceptions but that’s typical.

In that scenario, if you have at least two IVs, yes, you can assess the main effects between each IV and the DV and the interaction effect between two or more IVs and the DV.

As long as you have at least two IVs, you can assess the interaction effect.

Dilum says

Hi Jim,

Thank you so much for this clear explanation. It cleared up a lot of things for me, but I have some questions that arose from my own research project that I am hoping you could answer.

I am running several moderations for a study, where I’m looking at different executive functions as measured by one test. So the DV(s) are the five executive functions, and I also have three IVs that are three symptoms of a disorder, and three moderators. My syntax for the model looks something like this, but multiplied by 15 (for the 5 DVs, for each of the three IVs):

IV1 M1 M2 M3 IV1*M1 IV1*M2 IV1*M3 IV1*M1*M3 IV1*M2*M3

My first question is: If I find that the three way interactions are not significant, but I find that one or more of the two way interactions ARE significant, do I drop the 3-way interaction terms from the model and rerun with my 2-way interactions and the predictors/moderators?

My second question is very similar to the above one: If neither the 2- or 3-way interactions are significant, do I drop them from the model and just run a linear regression with my 4 predictors?

My third question follows from both the above questions: Because I am testing 5 IVs from the same scale, and I have (technically) 15 models to run, if I drop any interaction term from any one of the models, do I have to drop them from the other 14? Is it okay to present some results where I only had 7 terms in the model and some where I had 4?

Thank you so much!

Jim Frost says

Hi Dilum,

Yes, analysts will typically remove interaction terms (and other terms) when they are not significant. The exception would be if you have strong theoretical reasons for leaving a term in even though it is not significant.

There is an exception involving interaction terms, but it works the opposite way from what you describe. If you have a significant two-way interaction (X1*X2) but one or both of the main effects (X1, X2) are not significant, you’d leave those insignificant effects in the model. The same goes for a three-way interaction. If that’s significant, you’d include the relevant main effects and two-way terms in the model even when they’re not significant. It allows the model to better estimate the interaction effects. The general rule is to include the lower-order terms when the higher-order terms are significant. Specifically, it’s the lower-order terms that comprise the significant higher-order term.

To summarize, if the high-order term is not significant, it’s usually fine to remove it. If a higher-order term is significant and one or more of the lower-order terms in it are not significant, leave them in.
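As a small illustration of that rule, the sketch below lists every lower-order term contained in a higher-order interaction. The helper name `lower_order_terms` is hypothetical, and the `*` separator simply mirrors the notation used in these comments.

```python
from itertools import combinations


def lower_order_terms(term, sep="*"):
    """List every lower-order term that makes up an interaction term."""
    variables = term.split(sep)
    terms = []
    # Sizes 1 .. k-1: main effects first, then two-way terms, and so on.
    for size in range(1, len(variables)):
        for combo in combinations(variables, size):
            terms.append(sep.join(combo))
    return terms


print(lower_order_terms("I*C*E"))
# ['I', 'C', 'E', 'I*C', 'I*E', 'C*E']
```

For a significant I*C*E term, the list includes C*E, so under this rule that two-way term would stay in the model even when it is not significant on its own.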

As for three-way interaction terms, if you include those, be sure they notably increase the model’s fit. And, I don’t mean just that they’re significant but that they make a real practical difference. Three-way interaction terms are notoriously hard to interpret. Even when they’re significant, they often aren’t improving the model by much. So, be sure that they’re really helping. If you really need them, include them in the model. I don’t see them used much in practice though. And, check to see what similar studies have done.

The answer to your last question really depends on the subject area and what theory/other studies say. Generally speaking, you don’t need to include the same IVs in different models. The results in one model don’t necessarily affect what you include in the other models. However, there might be concerns/issues specific to the subject area that state otherwise! While there’s no general rule that says the models must be the same, there could be reasons for your specific scenario. You’ll have to look into that.

Max says

Thanks so much for your thoughtful and quick response, Jim. I truly appreciate it.

I get what you mean about how the process is fictional and just meant to illustrate a point.

It’s interesting to me that a correlation between x and y is not necessary for x*y to be a significant interaction term. I guess my next question is…why not?

When I tried to explain interaction effects to someone the other day, I gave a different example:

GPA = a*Sleep + b*Study + c*Sleep*Study

I tried to say, “If the impact of study hours on GPA depends on sleep hours, then you have an interaction effect. For example, if study hours only boost GPA when sleep hours are greater than 6, then you have an interaction effect.” (I hope I explained that correctly! Let me know if not.)

The person responded, “Oh, so you mean that there’s a correlation between sleep and study?”

I can see why the person asked that question, and I’m not sure I have an intuitive explanation for why the answer to their question is, “No, not necessarily.”

I imagine these things are tricky to explain, and I hope I’m not taking us too far down a rabbit hole. Anyways, thanks again for your time!

Jim Frost says

Hi Max,

I have heard that confusion between interaction effects and the variables being correlated several times, so it’s definitely a common misconception!

First, let’s look at it conceptually. A two-way interaction means that the relationship between X and Y changes based on the value of Z. Y is the DV. X and Z are the IVs. There’s really no reason why a correlation between X and Z must be present to affect the relationship between X and Y. It’s just not a condition for it to happen.

Now, let’s look at this using your GPA example. Imagine that each person is essentially a relatively good or poor student and studies accordingly. Better students study more. Poor students study less. A bit of variation but that’s the tendency. Now, imagine that a good student happens to sleep longer than usual. Being a good student, they’ll still study a longer amount of time despite having less awake time to do it. Alternatively, imagine that a poor student happens to sleep less than usual. Despite having more awake hours in the day, they’re not going to study more. Hence, there’s no correlation between hours sleeping and hours studying. Despite this lack of correlation, the model is saying that the interaction effect is significant, which means that the relationship between Studying and GPA depends on Sleep (or the relationship between Sleep and GPA depends on Studying) even though Sleep and Study are not correlated.

Basically, the value of one IV affects the relationship between the other IV and the DV even when the IVs aren’t correlated. And, ideally, you don’t want IVs to be correlated. That’s known as multicollinearity and excessive amounts of it can cause problems!
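A quick simulation can make this concrete. The sketch below uses made-up data for the GPA example: Sleep and Study are drawn independently, so they are uncorrelated by construction, yet GPA is generated with a Sleep*Study term. All coefficients, ranges, and the 6.5-hour split are assumptions for illustration, not estimates from any real study.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


random.seed(1)
n = 2000
sleep = [random.uniform(4, 9) for _ in range(n)]   # hours of sleep
study = [random.uniform(0, 6) for _ in range(n)]   # drawn independently of sleep

# Hypothetical data-generating model with an interaction term.
gpa = [
    2.0 + 0.02 * sl - 0.10 * st + 0.03 * sl * st + random.gauss(0, 0.2)
    for sl, st in zip(sleep, study)
]

r_sleep_study = pearson(sleep, study)

# Estimate the Study slope separately for shorter and longer sleepers.
short = [(st, g) for sl, st, g in zip(sleep, study, gpa) if sl < 6.5]
longer = [(st, g) for sl, st, g in zip(sleep, study, gpa) if sl >= 6.5]
slope_short = ols_slope([st for st, _ in short], [g for _, g in short])
slope_long = ols_slope([st for st, _ in longer], [g for _, g in longer])

print("correlation between Sleep and Study:", round(r_sleep_study, 3))
print("Study slope, shorter sleepers:", round(slope_short, 3))
print("Study slope, longer sleepers:", round(slope_long, 3))
```

The correlation between the two IVs comes out near zero, while the Study slope is clearly steeper for the longer sleepers. The interaction is a property of how the IVs relate to the DV, not of how the IVs relate to each other.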

Max says

Hi Jim! Thanks for the helpful post. I have some thoughts and questions about interaction effects…

– What I really like about the condiments example is that it’s extremely intuitive.

– Might you have an example for continuous variables that is equally intuitive?

– Or maybe you could say a bit more about the pressure and temperature example. You described it as a “manufacturing process.” What might we be manufacturing there? (I know the math is the same no matter what, but if I have an intuitive understanding of the “real-world scenario,” it helps me grasp things better.)

And now, my big picture questions…

Is a correlation between “x” and “y” a necessary condition for an interaction effect of x*y?

So, in the current example, does the fact that there’s an interaction effect of pressure*temperature imply that there is a correlation between pressure and temperature?

Could you ever have an interaction effect of x*y without a correlation between “x” and “y”?

Thank you so much for your time and public service!

Jim Frost says

Hi Max,

The manufacturing process uses continuous variables in the interaction. The process is fictional, but you don’t really need to know it to understand what the interaction term is telling you. Pretend it’s a widget-making process. Temperature and pressure are variables in the production process. You can set these variables to higher/lower temperatures and pressures. Perhaps it’s the temperature of the widget plastic and the pressure at which it is injected into the mold. The idea is that these are variables the producer can set for their process.

The continuous interaction is telling you that the relationship between manufacturing temperature and the strength of the widgets changes depending on the pressure the process uses. If you use high pressure in the process, then increasing the temperature causes the mean strength to increase. However, if you use a lower pressure during manufacturing, then increasing the temperature causes the mean strength to drop.

A correlation is NOT necessary to have significant interaction terms. You can definitely have a significant interaction when the variables in it are not correlated with each other. Those are separate properties. In the example, temperature and pressure are not necessarily correlated. You can actually check by downloading the data and assessing their correlation. I haven’t checked that specifically so I don’t know offhand.

Maura K. says

Thank you, Jim. I appreciate the time you have taken to create these wonderful posts.

Jim Frost says

You’re very welcome, Maura. I’m so happy to hear that they’ve been helpful!

Ranil says

hello jim,

In my study to understand the effect of a therapy program on memory function, I got significant main effects (within subject) but insignificant interaction effects (Time*Therapy outcome). How can I interpret this?

Jim Frost says

Hi Ranil,

I’m not sure what your variables are. You indicate an interaction with time and the therapy outcome, but the interaction term will contain two independent variables and not an outcome variable.

However, in general, when you have significant main effects and the interaction effects are not significant, you know that there is a relationship between the IVs and the DV. Your data suggest those relationships exist (thanks to the significant main effects) and the relationships between each IV and the DV do not change based on the value of the other variable(s) in your interaction term because the interaction is not significant.

I hope that helps!

Erica Rodrigues says

Hi Jim,

Would you probe the simple slopes if an interaction in the regression model was not significant? I probed it anyways and saw that one of the simple slopes was significant. I’m not sure what this means.

Thanks so much!

Jim Frost says

Hi Erica,

If an interaction term is not significant, I’d consider removing it from the model and, as you say, assess the slopes for the main effects. The exception is if you have strong theoretical reasons to include an interaction term. Then you might include it despite the lack of significance. But, generally, you’d remove the interaction term.

Animesh Tulsyan says

Hi Jim,

Many thanks for the blog. I have recently purchased your Introduction to statistics book and also your regression book. Really like the way you explain difficult concepts. While I have understood the concept of “Interaction effects”, I am getting a different result while running the regression on the data for Interactions categorical. The methodology I have used is as follows :

1) Dependent Variable – Enjoyment.

2) 1st IV – Food (1 = Hot dog, 0 = Ice cream)

3) 2nd IV – Condiment (0 = Mustard, 1 = Chocolate Sauce)

4) 3rd IV – Food*Condiment (gives a value of 1 for Hot dog*Chocolate Sauce and 0 for others).

Then run the regression on excel.

Output

| Term | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 61.30891335 | 1.119532646 | 54.76295272 | 7.91E-63 |
| Food | 28.29677385 | 1.583258251 | 17.87249416 | 6.13E-29 |
| Condiment | 31.73918268 | 1.583258251 | 20.04675021 | 4.51E-32 |
| Food*Condiment | -56.02825797 | 2.239065292 | -25.0230568 | 1.95E-38 |

The output is quite different from your output. Could you help me understand the difference, or am I using an incorrect methodology?

Also, the interaction plot for enjoyment (Food*Condiment), where the lines cross each other, seems difficult to plot in Excel. It would be helpful if you could write a blog post on how such interaction plots can be done in Excel.

Thanks a lot !!

Animesh
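One way to read 0/1-coded output like the table in this comment is to convert the coefficients back into predicted means for the four Food and Condiment combinations. The sketch below does only that, using the coefficients and the coding exactly as Animesh reports them. It does not reproduce the article's output, and the function name is hypothetical.

```python
# Coefficients as reported in the comment
# (Food: 1 = hot dog, 0 = ice cream; Condiment: 1 = chocolate sauce, 0 = mustard).
coef = {
    "intercept": 61.30891335,
    "food": 28.29677385,
    "condiment": 31.73918268,
    "food:condiment": -56.02825797,
}


def predicted_enjoyment(food, condiment):
    """Predicted mean enjoyment for one combination of the 0/1 indicators."""
    return (
        coef["intercept"]
        + coef["food"] * food
        + coef["condiment"] * condiment
        + coef["food:condiment"] * food * condiment
    )


food_labels = {0: "ice cream", 1: "hot dog"}
condiment_labels = {0: "mustard", 1: "chocolate sauce"}

for food in (0, 1):
    for condiment in (0, 1):
        mean = predicted_enjoyment(food, condiment)
        print(f"{food_labels[food]} + {condiment_labels[condiment]}: {mean:.2f}")
```

Under this coding, the intercept is the mean for the combination where both indicators are 0, and the interaction coefficient is the extra adjustment applied only when both indicators are 1. A different coding scheme or reference level would produce different coefficients while implying the same four means, so comparing these means with an interaction plot is a more direct check than comparing coefficient tables.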

Tom says

Hi again Jim,

So far I have read your book about regression and it is a great source of practical knowledge. Highly recommended for everyone who needs to run multiple regression.

This is where I learnt about checking for interactions.

I need some help to interpret my findings. I centralised my predictors and not the dependent variable. While the dependent variable and one predictor are the totals from validated scales, the other predictor was treated as continuous data, but it is recorded on a 1-5 (strongly disagree-strongly agree) Likert scale that is not validated. When centralised, this predictor did not come out with a mean of exactly 0 but 0.0022. Is this something I should worry about?

Also, the regression model with the interaction is overall significant but the interaction coefficient is not significant, p=.31. How should I interpret it? Does it mean that there is no interaction?

I wondered if you could also help please with a practical question regarding results write up. Would you present correlations and descriptive statistics of the analytical sample or regression sample after the residual outlier was removed?

Thank you in advance.

Tom

Jim Frost says

Hi Tom,

I’m so glad the regression book was helpful!

That slight difference from zero isn’t anything to worry about. It sounds like your interaction is not significant. Unless you have strong theoretical reasons for including it, I’d consider removing it from the model.

For the write up, I’d discuss the significant regression coefficients and what they mean. Do the signs and magnitudes of the coefficients match theory? You should discuss the outlier, explain why you removed it, and whether removing it changed the results much.

I hope that helps!

Neha says

Hi Jim,

Hope you have been doing well! Thank you for such extremely informative and easy to comprehend posts! They truly have clarified many statistics concepts for me!

I had a quick question regarding the confidence intervals (CIs) for interaction terms. Let’s consider the following situation:

The interaction term is made of a binary variable (B) and a continuous variable (M). Each of these have their own 95% CIs. My model appears as such:

Y = 2.2 – 0.2M + 0.1B + 0.12(M*B), where B = 0 or B = 1.

I am interested in the beta coefficient in front of M and so, when I reduce this equation, I get:

Y = 2.2 -0.2M when B= 0 and

Y= 2.3 – 0.32 M when B= 1.

This was all well and good as now I can talk about how B being 0 or 1 can affect the outcome Y, but can I obtain the 95% CI for these two new equations’ beta coefficients? Simply using the lower limits once and then the upper limits of the 95% CIs of the beta coefficients for M, B and their interaction term and doing the same math both does and does not make sense to me. I hope you can guide me through this a bit.

I am using SAS, so it would be great if you could also guide with the code somehow (if there is one)!

Thanks!

Jim Frost says

Hi Neha,

The good news is that yes, it is entirely possible to get the CI for that coefficient. Unfortunately, I don’t use SAS and can’t make a recommendation for how to have it calculate the CI. The CI for the interaction coefficient will give you a range of likely values for the difference between the two slopes for M. Because your interaction effect is significant (I’ll assume at the 0.05 level), the 95% CI will exclude zero as a likely value.
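For the slope of M within each group, the usual approach is to treat the slope as a linear combination of coefficients and to use their covariance, rather than adding CI limits together. The sketch below is in Python rather than SAS and makes several assumptions: the variances and covariance are hypothetical stand-ins for entries of the coefficient covariance matrix that regression software reports, the function name is made up, and it uses a normal critical value where a t critical value with the residual degrees of freedom would be slightly wider. Note that with the coefficients as written in the question (−0.2 for M and +0.12 for M*B), the slope at B = 1 works out to −0.08; the −0.32 in the question would correspond to an interaction coefficient of −0.12.

```python
from statistics import NormalDist


def simple_slope_ci(b_m, b_mb, var_m, var_mb, cov_m_mb, b_value, level=0.95):
    """Slope of M at a given value of B, with an approximate CI.

    slope    = b_m + b_mb * B
    variance = Var(b_m) + B^2 * Var(b_mb) + 2 * B * Cov(b_m, b_mb)
    """
    slope = b_m + b_mb * b_value
    variance = var_m + (b_value ** 2) * var_mb + 2 * b_value * cov_m_mb
    se = variance ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation
    return slope, slope - z * se, slope + z * se


# Coefficients from the question; variance and covariance values are hypothetical.
for b_value in (0, 1):
    slope, lower, upper = simple_slope_ci(
        b_m=-0.2, b_mb=0.12,
        var_m=0.0016, var_mb=0.0025, cov_m_mb=-0.0012,
        b_value=b_value,
    )
    print(f"B={b_value}: slope={slope:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Combining the separate lower and upper limits of each coefficient ignores the covariance between the estimates, which is why that shortcut generally gives the wrong interval width.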

Patty Iyer says

Hi Jim, this was a great article. Thank you! I had a question. Are there some regression models in which effect measure modification may be present but the interaction term does not indicate presence of interaction?

Jim Frost says

Hi Patty,

I’m not entirely sure what you’re asking. But, if you have a significant interaction term in your model, then an interaction effect is present.

The only way that an interaction effect wouldn’t be present in that case is if your model is somehow biased/incorrect. Perhaps you’ve done one of the following: misspecified the model, omitted a confounding variable, or overfit the model. But then you can’t necessarily trust any part of your model.

However, if you specified the correct model and your interaction term is significant, you have an interaction effect.

Eunice AB says

Dear Jim,

Thank you for your precise and concise discussion on interaction terms in regression analysis. Please keep up the good work. I would like to know whether time series regression specifications can have interaction terms. I am trying to investigate the effect of interacting two macroeconomic variables in one country over a period of time. Thank you for your time.

Jim Frost says

Hi Eunice,

Thanks so much! Your kind words made my day!

Yes, you can definitely include interaction terms in time series regression.

Irena says

Thank you! I really appreciate this website. 🙂

Mike says

Jim,

In Chapter 5 of your ebook (great book by the way…worth every penny), you couch interaction effects in terms of variables (like A, B, and C above), which was very effective in conveying the concept. I did have a practical technical question, however. If A’s main effect with Y is described in terms of a quadratic (Y=const+A+A^2), how do you check the interaction effect on Y along with the second variable B? Is it still simply A*B or should you include the squared term as such A*A^2*B? As a practical example from p82 in your ebook, you were showing the relationship between %Fat and BMI where the relationship was described well by a quadratic (%Fat=const+BMI+BMI^2). To extend that example, let’s say that Age is also related to %Fat. How do you check the interaction effect between BMI and Age on %Fat?

Thanks in advance for your clarification,

Mike

Jim Frost says

Hi Mike,

I’m so glad to hear that my regression book has been helpful!!

That’s a great question. The answer is that the correct model depends on the nature of your data. If you have a quadratic term, A and A^2, you can certainly include one or both of the following interaction terms: A*B and A^2*B.

Again, it could be one, both, or neither. You have to see what works for your model while incorporating theory and other studies. Interpreting the coefficients becomes even more challenging! However, the interaction plots I highly recommend will show you the big picture. Logically, if a polynomial term includes an interaction, the lines on the interaction plot will be curvilinear rather than straight.

So, yes, it’s perfectly acceptable to add those types of interaction terms.
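To show what those terms do, here is a minimal sketch of a model containing A, A^2, B, A*B, and A^2*B. The coefficient values are hypothetical (think of A as BMI and B as Age from Mike's example, but the numbers are invented), and the function names are made up for illustration.

```python
def design_row(a, b):
    """Predictor values for one observation: 1, A, A^2, B, A*B, A^2*B."""
    return [1.0, a, a ** 2, b, a * b, (a ** 2) * b]


def predict(coefs, a, b):
    """Fitted value for given A and B."""
    return sum(c * x for c, x in zip(coefs, design_row(a, b)))


def slope_in_a(coefs, a, b):
    """Instantaneous slope of the fitted curve with respect to A.

    d/dA = b_A + 2*b_A2*A + b_AB*B + 2*b_A2B*A*B
    """
    _, b_a, b_a2, _, b_ab, b_a2b = coefs
    return b_a + 2 * b_a2 * a + b_ab * b + 2 * b_a2b * a * b


# Hypothetical coefficients: const, A, A^2, B, A*B, A^2*B
coefs = [-30.0, 3.0, -0.04, 0.1, 0.02, -0.0003]

for b in (30, 60):
    for a in (20, 30):
        print(f"A={a}, B={b}: fitted={predict(coefs, a, b):.2f}, "
              f"slope in A={slope_in_a(coefs, a, b):.3f}")
```

With only A*B in the model, changing B shifts the slope of the curve by the same amount at every value of A. Adding A^2*B lets the amount of curvature differ across values of B as well, which is what shows up as differently shaped curved lines on an interaction plot.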

Irena says

Hi Jim! Thanks for the great article. I have one question: I am doing multivariate logistic regression with 7 covariates. I would like to test if there is interaction between two of those variables. Should I check for interaction in the FULLY adjusted model? Or in a model that includes only the two variables and their interaction term? Thanks!

Jim Frost says

Hi Irena,

Typically, I’d include the interaction in the model with all the relevant variables rather than just the two variables. You want to estimate the interaction effect while accounting for the effects of all the other relevant variables. If you leave some variables out, it can bias the interaction effect.

Muhammad Sajid Tahir says

Hi Jim,

First, thank you very much for providing this kind of easily written information for people like me who are not statisticians.

My question is, I have 6 quantitative independent variables to do regression on a dependent quantitative variable. When I regress the dependent variable on each independent variable separately, I find every independent variable statistically very significant (p-values much less than 0.05; the max value is 0.004, the rest are 0.000). When I do a multiple linear regression including all independent variables together, I find two of the independent variables statistically insignificant. I know that this situation arises due to the interaction term and multicollinearity. Given the situation, should I drop the two non-significant independent variables from the multiple regression model, even though they were significant in the individual simple regression models? In the same context, I find support from the literature that these variables (the two variables that got insignificant p-values in the multiple regression) do affect the dependent variable. Your answer in terms of keeping or dropping these variables will be appreciated. Thank you

Jim Frost says

Hi Muhammad,

If your model has an interaction term, you’re correct that it creates some multicollinearity. Fortunately, there is a simple fix for that. Just center your continuous variables. Read my post about multicollinearity to see that in action! Multicollinearity can reduce your statistical power, potentially eliminating the significance of your variables.

Definitely check into that because that could be a factor. There’s even a chance that you could restore their significance. And, in general, you should assess multicollinearity even outside the interaction effect. The post I link to shows you how to do all of that.

However, there’s another possibility. Remember that in multiple regression, the significance of each variable is determined after accounting for all the other variables in the model. Does the variable in question explain a significant portion of the variance that is not explained by all the other variables? That could be another reason why a variable is significant in a model by itself but not with other variables. By itself, it explains a significant portion. But with other variables, it doesn’t explain a significant portion of the remaining variance.

In terms of keeping or dropping the insignificant variables, theory should be a strong guide here. In general, it’s worse to drop a variable incorrectly than it is to leave several in unnecessarily (although it is possible to go too far that way as well). So, err on the side of leaving additional variables in when you’re unsure. Because other research suggests that those variables are important, you actually have theoretical grounds to leave them in. What I’d do is see if the results change much if you remove those variables.

If removing the variables doesn’t change the results noticeably, then I’d strongly consider leaving them in the model. In your results, you could explain that they’re not significant but other research suggests that they belong in the model. You can also state that the model doesn’t change substantially whether they’re in the model or not.

However, if removing those variables does change the results substantially, then you have a bigger decision to make! You need to consider the different results and determine which set to use! Again, let theory be your guide, but there are other statistics you can check.

Also, read my post about specifying the correct model for tips on that whole process!

But, before getting to that point, assess and fix the multicollinearity. It might be a moot point.

I hope that helps!

Timo says

Hi Jim,

Your comment was the only piece of information I found on the interpretation of a non polynomial interaction when there’s a polynomial main effect.

This is of interest to me as I found a significant polynomial main effect, but only the lower-order interaction is significant. The quadratic interaction is not. And now I don’t know how to interpret that.

Do you know where I can find more information on that topic?

Best wishes

Timo

Jim Frost says

Hi Timo,

The recommendation I always offer is to create an interaction plot to understand your interaction effects. That way you can literally see what the interaction looks like! That helps even with more straightforward interaction terms. Yours aren’t straightforward, but an interaction plot will still make it clear!

Habtamu says

Does the presence of an interaction (Xi & Xj on Y) imply the presence of any type of relationship between Xi & Xj?

Jim Frost says

Hi Habtamu,

A significant interaction does NOT imply a relationship between the variables in the interaction. There might or might not be a relationship.

Lik says

Thanks, Jim!

Lik says

Hi Jim!

I want to ask a stupid question. For example, people of different genders have different developmental patterns with age. Why can’t we calculate the difference between people of the same age and different genders in pairs, so as to transform the problem into a common problem of correlation analysis or regression between the difference of Y and age?

I want to use supervised learning method to find the appropriate y value, but neither the interaction effect in ANOVA nor regression analysis can achieve this, so I want to turn this problem into regression or classification problem. Or do you know other solutions that can be implemented?

Thanks a lot !

Best,

Lik

Jim Frost says

Hi Lik,

It sounds like for your case, you’d want to include an age*gender interaction term in your model. Including that term allows you to assess those different developmental patterns between genders as they age. The reason you want to include this term and use all the data in one model is so the model can account for changes in your outcome variable while factoring in that effect and all the others in your model. Correlational analysis doesn’t control for other factors. And, I have a hard time imagining how you could do that while retaining all the benefits of a regression/ANOVA model with all your IVs and the interaction effect.

Based on what you write, a regression model that includes the interaction terms sounds like a good solution to me.

Eva says

Hi Jim!

I’m investigating the effect of two categorical independent variables on one continuous dependent variable. When I’m running a two-way ANOVA, there is a significant interaction effect in my data. When I look at the summary of my general linear model (t-tests), there is only a significant interaction effect between some levels of both factors and not all of them. This confuses me.

Do I interpret this as an overall interaction effect, following the two-way ANOVA or do I have to interpret this per level of the factor, as implied by the summary? Why?

Thanks a lot !

Best,

Eva

Anthony says

Thanks, Jim!

Malai Nhim says

Hello,

If we include foods (ice cream and hot dogs) and condiments (chocolate sauce and mustard) in our model and they are correlated, would there be multicollinearity in our model?

Jim Frost says

Hi Malai,

The presence of interaction effects doesn’t necessarily indicate that the predictors are correlated. There might or might not be multicollinearity between the predictors. For the food example, these are categorical variables. However, when looking at continuous variables, I usually recommend centering the continuous variables because that removes the structural multicollinearity. Structural multicollinearity occurs when you include an interaction term: each continuous predictor correlates with the interaction term because the interaction term includes the predictors. For more information about that, read my post about multicollinearity, where I show this solution in action.
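The effect of centering on structural multicollinearity can be sketched with simulated data. The two predictors below are generated independently, and their names, ranges, and sample size are arbitrary assumptions. The sketch compares how strongly a predictor correlates with the product term before and after subtracting the means.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


random.seed(0)
n = 1000
x = [random.uniform(50, 100) for _ in range(n)]  # hypothetical continuous predictor
z = [random.uniform(10, 30) for _ in range(n)]   # second predictor, independent of x

# Interaction term built from the raw values.
raw_product = [a * b for a, b in zip(x, z)]

# Interaction term built from centered values (mean subtracted first).
x_centered = [a - mean(x) for a in x]
z_centered = [b - mean(z) for b in z]
centered_product = [a * b for a, b in zip(x_centered, z_centered)]

r_raw = pearson(x, raw_product)
r_centered = pearson(x_centered, centered_product)

print("corr(x, x*z) with raw values:", round(r_raw, 3))
print("corr(x, x*z) after centering:", round(r_centered, 3))
```

With raw values the predictor is substantially correlated with its own product term, and after centering that correlation drops to near zero. Centering does not change the interaction coefficient or the fitted values, but it does change what the lower-order coefficients represent, since they become slopes evaluated at the mean of the other predictor.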

Anthony says

I have a question. I’m unsure if interaction effects is the main answer but it might be related.

What is the main difference between factorial experiments and regression analysis? When should I use one over the other?

Jim Frost says

Hi Anthony,

Factorial experiments are a type of experimental design whereas regression is a method you can use to analyze the data gathered in an experimental design (as well as other designs/scenarios). You use factorial design experiments to set up the factors in your experiment and how you collect your experimental data. You use regression to analyze those data to determine whether and how those factors relate to the outcome measure.

Joseph Tamale says

Hi Jim,

Many thanks for the great work that you are doing!

I have a question related to how to interpret interaction in a statistical model. First, a brief background about my work. I am investigating the effect of land use on soil greenhouse gas fluxes. I measured soil greenhouse gas fluxes from four land uses, monthly over a period of 14 months. Given the fact that my data set is as a result of repeated measures, I settled for Linear Mixed Effects Models. Linear mixed effects models handle the temporal pseudo replication arising out of repeated measures neatly hence safeguard against inflation of degree of freedom which would dramatically lower statistical power of the model. They also neatly deal with missing observations.

For the Model set up, I included land use and seasons as my fixed/main effects, and sampling days and plot numbers as my random effects. My final minimum adequate model has a significant interaction based on both the p-value for the interaction in the final model (significance was inferred if p <0.05) and the interaction plot. See below the fixed effects output of the final model

| Term | numDF | denDF | F-value | p-value |
| --- | --- | --- | --- | --- |
| (Intercept) | 1 | 204 | 75482.03 | <.0001 |
| landuse | 3 | 12 | 24.14 | <.0001 |
| season | 1 | 204 | 31.96 | <.0001 |
| landuse:season | 3 | 204 | 3.66 | 0.0133 |

I understand that once the interaction is significant, the whole interpretation changes; the focus shifts from the main effects to the interaction itself. However, I need to mention that I am a novice when it comes to interpreting models with significant interactions.

With that background, my specific question is: could you help me frame a conclusion about the effect of land use on soil greenhouse gas fluxes, given that land use interacts with season to cause the change in soil greenhouse gas fluxes?

Best regards!

Joseph Tamale

Terry Anderson says

Thank you very much for replying. I executed the steps exactly as you recommended, and I get the following message when the plots are generated “There are no valid interactions to plot. Interaction plots are displayed for continuous predictors only if they are specified in the model.” I think the reason is that the radio button for adding the interaction term (“Add”) isn’t highlighted, so it’s not allowing me to request the interaction terms. Do you know what might be causing this?

Thank you,

Terry

Jim Frost says

Hi Terry,

It’s funny, but I actually wrote that message quite some time ago!

It means that you haven’t added the interaction term to your model. When you go to the Model dialog box, are you multiselecting both terms? You need to use the CTRL key to select the second term. After you do that, the Add button next to Interactions through order should become active. That’s the only reason I can think of: you don’t have two terms selected. After you select two (or more) terms, the button should become active.

Terry Anderson says

Hi,

I’ve found your books and blog posts extremely helpful.

I’m trying to create the Interaction plot that’s shown in this post and in your Regression book that uses the data set “Interactions Continuous” in Minitab. I’ve spent several hours trying everything I can think of, and can’t figure out how to create the Interaction Plot. Can you let me know what steps I need to take in Minitab to create the Interaction Plot?

Thank you

Terry

Jim Frost says

Hi Terry,

Thanks so much for supporting my books. I really appreciate it! I’m so glad to hear that they’ve been helpful!

Here’s how to create the interaction plot in Minitab. Please note that I’m using Minitab 17. It might be different in other releases.

First, be sure that you have the data in the active datasheet in Minitab. After that, it’s a two-part process. First you fit the model. Then you create the interaction plot.

Fit the Regression Model

1. Navigate to Stat > Regression > Regression > Fit Regression Model.
2. In Responses, enter Strength.
3. In Continuous Predictors, enter Temperature Pressure Time.
4. Click Model.
5. Select both Temperature and Pressure. Use CTRL to multiselect.
6. Click Add next to Interactions through order 2.
7. Click OK in all dialog boxes.

That fits the regression model. Minitab saves the model information after you fit it. One crucial part was to include the interaction term, because the next part requires that!

Now, we need to create the interaction plot. Minitab has a Factorial Plots feature that creates both the main effects plots and the interaction plots using the stored model from before. Be sure that you keep the datasheet with the stored model active. Do the following:

Create the Factorial Plots

1. Navigate to Stat > Regression > Regression > Factorial Plots.
2. Under Variables to Include in Plots, ensure that you include all variables (Temperature, Pressure, Time) under Selected.
3. Click OK.

At this point, Minitab should display the main effect plots for all three variables and the interaction plot for the Temperature*Pressure interaction.

I hope that helps!

Yechezkal Gutfreund says

Well, I was hoping your next book would be titled something like “Intuitive Statistics for ML and Deep Learning” 🙂 since this is a field that really needs some of the probity and common sense in understanding the statistical methodology that can detect weak models. The same comments you make about Stepwise regression and Best subsets regression would apply. What I see is “The AUC and F1 values are really good, must be a great model. ” and little study of interaction and subject matter expertise.

Jim Frost says

Hi, yes, that might well be a possibility down the road! But, yes, in general, it’s critical that your model matches reality regardless of the methodology. And that requires a thorough understanding of the study area! It’s not enough to just run an algorithm and use those results. While algorithms can be extremely helpful, they don’t understand the underlying reality of the situation. Checking on that side of things is up to the analyst!

Edoardo Busetti says

Hi Jim,

I am thinking of testing the interaction between two dummy variables, for instance the variable “female” and “married”.

If I construct a linear model as follows: wage = b0 + b1*female + b2*married + b3*(married*female) + u

I can then say that:

The effect on wage given by the subject being female-married is: b1 + b2 + b3

The effect on wage given by the subject being female-non_married is: b1

If I want to test that there is no difference in wage between female-married and female-non_married, which of these two hypotheses should I test:

a) b1+b2+b3 = b1

b) b2 = 0, and b3 = 0

They might seem similar at first, but they give different results, since in case a) we allow the effects of b2 and b3 to offset each other, while in case b) we don’t.
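To make the difference between the two hypotheses concrete, here is a small Python sketch of the arithmetic. The coefficient values are hypothetical and chosen only so that b2 and b3 cancel. Subtracting b1 from both sides of a) leaves b2 + b3 = 0, a single linear restriction on the marriage gap among women. Hypothesis b) is a joint restriction that additionally sets the marriage gap among men (b2) to zero.

```python
def predicted_wage(b0, b1, b2, b3, female, married):
    """Mean wage implied by wage = b0 + b1*female + b2*married + b3*(married*female)."""
    return b0 + b1 * female + b2 * married + b3 * (married * female)

# Hypothetical coefficients chosen so that b2 and b3 cancel exactly.
b0, b1, b2, b3 = 20.0, -3.0, 2.0, -2.0

married_female = predicted_wage(b0, b1, b2, b3, female=1, married=1)
unmarried_female = predicted_wage(b0, b1, b2, b3, female=1, married=0)

# The gap between the two female groups equals b2 + b3,
# so hypothesis (a) reduces to b2 + b3 = 0.
gap_women = married_female - unmarried_female
hypothesis_a_holds = (b2 + b3 == 0)

# Hypothesis (b) is stricter: it also forces the marriage gap for men (b2) to be zero.
hypothesis_b_holds = (b2 == 0 and b3 == 0)
gap_men = (predicted_wage(b0, b1, b2, b3, female=0, married=1)
           - predicted_wage(b0, b1, b2, b3, female=0, married=0))

print("Marriage gap among women:", gap_women)
print("Marriage gap among men:  ", gap_men)
print("(a) holds:", hypothesis_a_holds, "| (b) holds:", hypothesis_b_holds)
```

With these made-up values, married and unmarried women have the same predicted wage even though b2 and b3 are both nonzero, so a) holds while b) does not. If the question is only about the two female groups, a) is the hypothesis that matches it.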

Yechezkal Gutfreund says

I am probably thinking more about boosted decision trees and deep learning neural networks. There it seems that the levels/layers, after the top layer, are actually computing weights for both the interaction effects and removing correlation between independent variables.

Then they apply gradient descent and backpropagation to automate the discovery of how significant the effects are.

Is this basically correct?

[Note: I am leaving aside whether you approve or are skeptical (as I certainly can be) about the “voodoo” of selecting the excitation function, and using automated means such as backpropagation to automate the discovery of interaction weightings. Maybe manual picking the interactions and using OLS is better, but I think that is a separate discussion, and I would like to focus just on whether boosted decision trees and deep learning do or do not automate the interaction discovery]

Jim Frost says

Hi, that’s getting outside my expertise. I don’t want to risk providing incorrect information, so I’ll hold back on answering. You’ll probably need to conduct additional research to determine which analysis is best for your study given the specific properties of your subject area. Sorry I can’t give you a more precise answer on this issue.

ramla says

Hi, Jim

Thank you for the fast reply. I used a 3-D line graph to represent the 3 factors. I have now completed a full linear regression for a 3-factor full factorial, and when I try to include the ABC interaction in the regression there seems to be an error. I read somewhere that linear regressions only involve two-way interactions, so AB, AC and BC. I did this and found that my results worked, as all the factors were significant (p-value less than 0.15), but I don’t know if I can just leave the three-way interaction (ABC) out. I regressed my response variable on the ABC interaction by itself and found that it was not significant. When the summary output was generated for A, B, C, AB, AC, BC vs. the response variable, the R-squared value was nearly 100. All the examples I’ve seen online are nowhere near this, so I don’t know if I’m doing something wrong or if there is perfect correlation.

Additionally, I wanted to ask if it is possible to do a 3-way ANOVA in Excel.

Apologies for the long question.

ramla says

Hi Jim,

I am doing a two level three factor full factorial design. I have to analyse both the main and interaction effects. I am using excel and I have found that there is interaction between factors as the lines cross. I wanted to ask, since I have only seen interaction effects done on two factors, is what I used fine, or do I have to use a different method because I am doing 3 factors instead of 2.

Jim Frost says

Hi Ramla,

Yes, it’s perfectly fine to use interaction plots using three factors. Although, if you’re fitting a three-way interaction, you won’t be able to graph that using two dimensions! But, you can certainly do what you describe.

As always, check the p-values for the interaction terms to be sure that they’re statistically significant.

Yechezkal Gutfreund says

Three part question:

1. When using non-linear machine learning models instead of linear regression, are interaction effects as important?

Perhaps this is a function of categories of ML algorithms, which of the below would be the most robust?

1. Decision Trees, Forests, etc

2. Deep learning networks (ANN, etc)

3. Naive Bayes

4. Knn

2. Are there ways other than the interaction graph to numerically measure interaction? I am thinking that I might have a mix of 100 categorical and continuous values. This might be tough to graph out.

3. Do you have other ways other than the product of two values for interaction (I am thinking this does not work well for 2 categorical independent variables).

Jim Frost says

Hi,

As a more general issue, if the phenomenon that you’re studying has a relationship that changes based on the value of a different variable, you’ll need to include it in your model. This isn’t a matter of whether or not it’s important to the form of analysis. If that exists in the real world, you’ll need to include that in your model. If you don’t, you might end up doing the equivalent of putting chocolate sauce on your hot dog!

It’s a matter of modeling the real world in all of its relevant details. If you don’t, your analysis will be inaccurate regardless of the methodology. It’s not a matter of being robust but rather accurately describing how reality works.

That does sound tough to graph! Interactions are notoriously difficult to graph! I always suggest checking to see what other researchers in the same area have done. But, even if you have a lot of graphs, you’d still need to understand the nature of each interaction. Ultimately, that is crucial. You can put each two-way interaction on a different graph and understand each interaction separately. If you have a model with that many variables, it’s just harder to fully understand whether there are interactions or not! It’s definitely understandable, it’ll just take more time to fully understand the role of each variable.

In statistics, interactions are the product of the variables in the interaction term. And, it works just fine for two categorical variables. In fact, I illustrate an example of that in this post!
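For readers wondering what “the product” looks like for two categorical variables, here is a minimal Python sketch using the Food and Condiment factors from the post. The choice of Hot Dog and Mustard as baseline levels is an assumption for the example; statistical software makes this choice for you and may pick differently.

```python
# Hypothetical observations from the taste test (Food, Condiment).
rows = [
    ("Hot Dog", "Mustard"),
    ("Hot Dog", "Chocolate Sauce"),
    ("Ice Cream", "Mustard"),
    ("Ice Cream", "Chocolate Sauce"),
]

design = []
for food, condiment in rows:
    # Indicator (1, 0) coding; Hot Dog and Mustard are the assumed baseline levels.
    food_ice_cream = 1 if food == "Ice Cream" else 0
    condiment_chocolate = 1 if condiment == "Chocolate Sauce" else 0
    # The interaction column is simply the product of the two indicators.
    interaction = food_ice_cream * condiment_chocolate
    design.append((food_ice_cream, condiment_chocolate, interaction))

for (food, condiment), columns in zip(rows, design):
    print(f"{food:10s} {condiment:16s} {columns}")
```

Under this coding, the product column equals 1 only for the Ice Cream with Chocolate Sauce combination, so its coefficient captures how that cell departs from what the two main effects alone would predict.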

Nietsrot says

Thanks for your answer! I’ll try to formulate my question in a different way:

Interaction effects in ANOVA and regression are interpreted the same way, but they need to be handled differently when analyzed in a statistical program (like JASP or SPSS). Why is that?

Best regards

Jim Frost says

Hi,

So, disregarding any purely UI differences, the main difference I see is that typically you’re assessing continuous interactions in regression and categorical interactions in ANOVA. In both cases, interactions involve multiplying the corresponding values of each variable. However, with categorical variables, the software must recode the raw data to be able to use it in the model. The common schemes are the (1, 0) indicator variable/binary coding and the (-1, 0, +1) effects coding scheme. I will disregard those differences here, although I’ll just note that the binary coding scheme is more common in regression while effects coding is more common in ANOVA. That varies by software but is one potential source of difference.

In binary coding, each categorical variable is converted into a series of indicator variables. One value of the categorical variable must be left out as the baseline level. Your choice for the baseline level can change the numeric results, but the overall picture it portrays remains the same. Because an interaction multiplies the values, the choice of baseline also affects the interaction term. That’s another potential source for differences.
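The point that the coding scheme changes the numbers but not the overall picture can be sketched in Python for a 2x2 design. The cell means below are hypothetical, and for two-level factors the effects coding reduces to (-1, +1). In a model with both main effects and the interaction, the coefficients under each scheme can be written directly from the cell means.

```python
# Hypothetical cell means for a 2x2 design; keys are (level of A, level of B), 0 = first level.
means = {(0, 0): 90.0, (0, 1): 62.0, (1, 0): 60.0, (1, 1): 92.0}
m00, m01, m10, m11 = means[0, 0], means[0, 1], means[1, 0], means[1, 1]

# Indicator (1, 0) coding with the first level of each factor as the baseline.
indicator = {
    "const": m00,
    "A": m10 - m00,
    "B": m01 - m00,
    "A*B": m11 - m10 - m01 + m00,
}

# Effects (-1, +1) coding: coefficients are deviations from the grand mean.
effects = {
    "const": (m00 + m01 + m10 + m11) / 4,
    "A": (m10 + m11 - m00 - m01) / 4,
    "B": (m01 + m11 - m00 - m10) / 4,
    "A*B": (m11 - m10 - m01 + m00) / 4,
}

def fitted(coef, a, b):
    """Fitted mean; the interaction column is the product of the two coded columns."""
    return coef["const"] + coef["A"] * a + coef["B"] * b + coef["A*B"] * (a * b)

for (a, b), mean in means.items():
    from_indicator = fitted(indicator, a, b)              # codes 0 / 1
    from_effects = fitted(effects, 2 * a - 1, 2 * b - 1)  # codes -1 / +1
    print((a, b), mean, from_indicator, from_effects)
```

The two sets of coefficients look quite different, yet both reproduce the same four cell means, which is why the interaction plot and the overall conclusions do not depend on the coding.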

So, I’m not totally sure about the “handling differently” you mention but those are potential sources of differences.

Nietsrot says

Hello Jim!

Now I really understand that interaction effects in ANOVA and regression are interpreted the same way. But I wonder: can I treat the interactions the same way when analyzing them in a statistics software? I hope you can help me. Best regards, Torstein 🙂

Jim Frost says

Hi Nietsrot,

I’m not sure that I understand your question. I’m using statistical software to assess interaction effects in this post. So, yes, this is how to interpret them using statistical software! But, I might not be understanding your question correctly. Please clarify. Thanks!

Avid Izadpanahjahromi says

Hi Jim,

Thanks for your valuable posts!

Regarding main and moderating effects within and across level of analysis, I’m struggling to understand why researchers are supposed to rule out the possibility of reverse interaction effect?

Could you please help me with that?

Ahmed Al Balushi says

Positive d means that ”Business environment that has woman is moderated positively by lower corruption and that will increase the profit of the firm”.

Jose Manuel Pereñíguez López says

Hi Jim,

Thanks for this wonderful post.

Can I ask you one doubt? I have the activity of an animal as my response variable, and I want to determine the effect of three different variables on its activity: the hour of the day (circular), moon_illumination and human presence. I think I have two different interactions here. One is the hour with the moon illumination (during the day the moon illumination won’t affect the animals’ activity since the sun is present) and the other is the hour with the humans’ presence (there are humans only during specific hours of the day).

Regarding this, my doubts are two:

1. How should I indicate this double interaction? I think that “Hour*Moon_illumination*Human_presence” is not correct, since then, I am mixing two variables not related (Moon_illumination and Human presence).

2. My other doubt is that there is no “Moon_illumination” at all hours (i.e., at 2 p.m there is never a moon in the sky) and there is no human presence at night hours, so, is there any problem with that? Is this kind of nested variables? I mean, not all the levels of one variable are presented in the other.

Could you give me any tips? Thanks in advance!

Loony says

Dear Jim,

I have just bought and read the interaction chapter of your book, but I still have several questions about interpreting interaction effects. I am running a binary logistic regression, and the main interest of my MA thesis is the effect of moderator W (or X2) on the relationship between X1 and Y. I hypothesized that the link between X1 and Y would become more strongly positive as X2 increases (the more X1 and the more X2, the more likely Y). I tried to highlight that only people who have both X1 and ‘X2’ will be highly likely to do Y (there are some mixed results on the effect of X1 on Y, so I tried to complement these results with this hypothesis).

I am using panel data set, and I am analyzing the results from the cross-national data set (including 8 countries with 9100 samples) and eight country-level data sets (each number of samples are roughly 1000)

As for the cross-national data set (the file integrating all countries), the independent variable (X1), the moderator (X2), and the interaction term (X1*X2) are all statistically significant. While the effects of the interaction term and the independent variable are positive (+), the effect of the moderator (X2) is negative (-).

1. The signs of the effects of X1, X2, and X1*X2 are respectively (+), (-), (+). How should I interpret this? It would be so easy if the signs were all (+), but since the effect of X2 is negative, it is very difficult to interpret. How should I interpret the ‘negative’ effect of the moderator even though the interaction term has a positive effect?

What I understand is that positive interaction effect means that the effect of X1 on Y increase as the level of moderator X2 increases. But since the moderator has a negative effect, it seems that moderator has a negative impact on Y. So this is really confusing…

2. For the country-level dataset, there are a lot of variations… only 3 countries out of 8 countries had statistically significant and positive interaction effects (which is good news), but the results of the effect of X1 and X2 are very different from each other.

For the sign of all variables in each country is as follows:

Country A : the effect of X1 :(NEGATIVE coefficient sign -) NOT significant

X2:(negative coefficient sign-), significant

X1*X2: (positive coefficient sign+) significant

Country B: the effect of X1: (positive coefficient sign+), NOT significant

X2: (negative coefficient sign), NOT significant

X1*X2:(positive coefficient sign+) significant

Country C: the effect of X1: (positive coefficient sign+), NOT significant

X2: (negative coefficient sign), significant

X1*X2:(positive coefficient sign+) significant

3. The coefficient sign of X1 in country A is negative and that of B and C is positive. The sign of X1*X2 in all three countries is positive, and the sign of X2 in all three countries is negative. How should I interpret them in comparison with each other?

4.And what if X1 or X2, or both are not significant while X1*X2 is significant?

it seems that, based on significance and the coefficient sign, I can compare:

(1) the result of country A with C (because both countries have significant effects of the moderator and the interaction term),

(2) the result of country A&C with B (because country B has only significant effect of interaction term), and

(3) the result of A with B&C (because only A has a negative sign for the effect of X1)

But I have no idea how exactly state the nuance among these countries…

5. I tried to graph these by using the SPSS PROCESS macro, but in all 8 countries the lines in each interaction plot do ‘cross’ even though the interaction effect is ‘insignificant.’ For the countries that do not have a significant interaction effect, shouldn’t the lines in their interaction plots not cross?

Thank you for reading this long comment…

I look forward to hearing from you

Henry says

Hi Jim,

I am glad I have found your website. My situation seems a bit intricate, but maybe you can help?

In my experiment, participants of both genders take a blue or a red pill before a test. Each test consists of two examinations that give a certain score. (I regard my DVs independently and skip on a multivariate analysis). Hence, I calculate two three-way mixed ANOVAs: 2 (Sex) x 2 (Condition: blue-red vs. red-blue) x 2 (Test: T1 vs. T2) with Sex and Condition as between-subject factors, and Test as a within subject factor.

Results:

DV1: significant three-way interaction between test, sex and condition. As a follow-up I have calculated simple two-way interactions for T1 and T2. Both are not significant.

DV2: no significant three way interaction but a significant two-way interaction between test and condition. As a follow-up I have calculated simple main effects of condition at T1 and T2. Both are not significant.

Problem:

I struggle to interpret the results. From my understanding, neither for DV1 nor for DV2 there is a significant difference of results in any of the two tests. However, for both DVs there is a significant difference between tests that depended on an interaction of sex*condition (DV1) or on condition (DV2), and looking at my figures I can see where this difference is and interpret it accordingly.

Option 1: That is correct. Though the differences themselves are not significant, the trend shows me a clear result that I can interpret with certainty, no matter what.

Option 2: That is not correct, and I cannot draw any conclusions whatsoever because the difference in T1 and T2 (or in either of the two) needs to be significantly different as well.

Option 3: something else is true.

Actually, I am most interested in how the blue pill and the red pill impacted the test results, and whether participants who took a blue pill scored higher in a test than those who took a red pill. (In my results, I can actually see that pattern.) A bit of a problem is that both tests are entirely identical and the difference might be a bit too small. But that as well could mean that my sample is not large enough given the results that I describe above? (FYI: N=79)

I would appreciate your help. Maybe you know the answer? Thanks a lot already.

I hope I was clear in the way I described my calculations and situations, otherwise I am happy to specify further.

Best,

Henry

Noa says

Hi Jim,

Thank you for the information.

I am doing a study with 2 independent variables (Label: organic/regular; Involvement: low/high) and 3 dependent variables (Taste..).

I ran ANOVAs with 2- and 3-way interactions for my statistical analysis.

I got an interaction with p = 0.02, and the means are below:

Regular chocolate/low involvement = 2.71

Regular chocolate/high involvement = 3.144

Organic chocolate/low involvement = 2.45

Organic chocolate/high involvement = 3.55

The scale it is 1 to 5.

BUT I need to interpret directionality, to know what is driving the interaction. Is it the label or the involvement? HOW CAN I INTERPRET THE DIRECTIONALITY OF THE EFFECT?

Looking forward to hearing from you,

Jim Frost says

Hi Noa,

When you have an interaction term, there’s no statistical way to determine what’s “driving” it. You just know that in, say, a two-way interaction that the relationship between each variable in the interaction and the DV depends on the value of the other variable in the interaction term. In some cases, a particular interpretation can make more sense but that depends on understanding both the subject area and the goals of the analysis.

I always recommend creating interaction plots because that makes it easier to interpret. So, I took the values in your comment and created a quick and dirty interaction plot in Excel. It’s just a line chart using the groupings and taste means you list. Typically, you’d want to use the fitted means rather than raw data means. So, one way to interpret the interaction is to say that as involvement goes from low to high, the taste increases more rapidly for organic chocolate than for regular chocolate.

Additionally, if your goal is to maximize taste, the means you list favor regular chocolate when involvement is low (2.71 versus 2.45). Conversely, when involvement is high, organic chocolate produces the higher taste (3.55 versus 3.144).

I don’t know the subject area or what involvement measures exactly, but those are the types of conclusions you can draw from your results.

If the p-value for the interaction term is statistically significant, then you know that the difference between the two slopes is significant. When the p-value is not significant, the observed difference in slopes might be caused by random sampling error.
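The arithmetic behind that reading can be sketched in a few lines of Python using the four raw means from the comment. This is only a descriptive calculation on raw means, not the fitted means or a significance test, and the variable names are made up for the illustration.

```python
# Taste means reported in the comment (scale 1 to 5).
means = {
    ("Regular", "Low"): 2.71,
    ("Regular", "High"): 3.144,
    ("Organic", "Low"): 2.45,
    ("Organic", "High"): 3.55,
}

# Simple effect of involvement (High minus Low) within each label.
slope_regular = means["Regular", "High"] - means["Regular", "Low"]
slope_organic = means["Organic", "High"] - means["Organic", "Low"]

# The interaction contrast is the difference between those two simple effects.
interaction_contrast = slope_organic - slope_regular

# Which label has the higher taste mean at each involvement level?
best_at_low = max(("Regular", "Organic"), key=lambda label: means[label, "Low"])
best_at_high = max(("Regular", "Organic"), key=lambda label: means[label, "High"])

print(f"Regular: {slope_regular:.3f}, Organic: {slope_organic:.3f}")
print(f"Difference in slopes: {interaction_contrast:.3f}")
print(f"Best at low involvement: {best_at_low}; best at high involvement: {best_at_high}")
```

Taste rises by roughly 0.43 points for regular chocolate and about 1.10 points for organic chocolate as involvement goes from low to high. The gap between those two changes is what the interaction term is testing.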

I hope that helps!

Michael says

Dear Jim,

Good day and thank you for the good work you are doing.

Please, I need your advice. I have a dataset of two treatments (pre- and post-intervention) for three locations, spanning 12 sampling periods. Periods 1-2 are classified as pre-intervention and periods 3-12 as post-intervention. Now, I would like to test whether the animals’ activities differ significantly across the sampling periods, i.e., did the activities of the animals decrease after the intervention or not?

In an attempt to test this, I used a negative binomial GLM in RStudio with an interaction between location and sampling period, but I’m not so comfortable with the result. It looks like I’m missing out on something; probably I’m using a wrong analysis or …. Please kindly advise me on how best to approach this hypothesis.

Your kind response will be highly appreciated.

Thanks,

Mike

Florentino Menéndez says

Hi, Jim! First of all, thanks for your work clarifying a lot of doubts, of us, statistical users. I have some confusion, so I’d appreciate very much your explanation.

I have a regression. Suppose Y is the dependent variable and A and B are the independent ones.

The interaction term is significant, but A and B are nonsignificant. Should I remove A and B from the regression or not? (I have read not). Why?

Thanks in advance, and congratulations on your books. I find them great!

KL says

Hi Jim! I am designing a study that looks at pre and post standardized reading comprehension measures (DV) across four treatment groups. I am controlling for the pretest as the covariate, but I also want to know if there is an interaction effect with gender and the DV. Being rather new to ANOVA/ANCOVA, I have a couple of questions. Do I have a one-way ANCOVA or a 2x4x2 (not even sure if this is right)? Also, when getting ready to analyze my data in SPSS, do I enter DV*gender to observe the interaction effect? Thank you so much for your help!

Lisa says

Hello Jim, thank you so much for your answer! Sorry if my question was a bit misleading, the post about comparing regression lines already helped me a lot.

I still have a problem with interpreting my results though: I did an experiment where I showed some of the participants a fair trade coffee and some participants just a normal coffee. My main independent variable was therefore a categorical variable for the experiment, so the IV indicated the treatment group = 1 (fair trade coffee) and the control group = 0 (no fair trade coffee). Then I included an interaction term (environmental awareness). When running the regression, the output shows you the effect of the IV on the DV, the interaction term (IV1*IV2), and the single direct effect of IV2 (so the moderator) on the DV. Here, only this single direct effect of IV2 was significant. But this single direct effect of IV2 is independent of my different groups, right? So in my example, this would mean that IV2 (which was environmental awareness) has a direct impact on the DV (which was purchase intention), meaning that, regardless of what product was shown, the higher the environmental awareness, the higher the purchase intention. This does not make sense to me.

I tried to compare the different regressions like you suggest in your post. I did it with two regressions and the if command, so I regress IV2 on DV only with the data from the treatment group and IV2 on DV only with the data from the control group. It does indeed only show significant results in the data from the treatment group. But I also have different N here (treatment group N=64; control group N=34). So the reason that the control group is not significant could also be because of the smaller N, right?

How can I interpret this? And how can I interpret it without interpreting it as an interaction term (which was not significant)? I find this difficult, because I obviously have to connect the interpretation to the fair trade coffee which was shown at the beginning. But the fair trade coffee was my main IV, which was not significant. But otherwise, can no connection be made between environmental awareness and the purchase intention?

Lisa says

Hey, I have a question: suppose you are doing an experiment where your independent variable is a categorical variable (so indicating treatment group = 1 and control group = 0), and you include an interaction term. How do you interpret the single interaction term? The direct effect of the categorical variable on the dependent variable was not significant, and neither was the interaction term. But is the single interaction term also for the treatment group? Or is the effect of the single interaction for all respondents regardless of which group they belonged to?

Thank you so much!!

Jim Frost says

Hi Lisa

There’s no such thing as a single interaction term. At the very least, it has to be a two-way interaction term, which indicates that the treatment group variable and one IV interact. The interpretation for that scenario is that the slope coefficient for that other IV changes based on the value of the treatment group variable. In other words, the coefficient for the IV is different for the control group and the treatment group. If the interaction term is statistically significant, then the difference between the two slopes is significant.

Did you by chance mean what if the treatment group variable itself is significant? That is, the term that doesn’t include another variable? That would be the main effect. In that case, the value of the coefficient represents the mean difference between the treatment and control groups. If that main effect is significant, it indicates that the difference between the two groups is significantly different from zero.

I illustrate the differences for this type of scenario for both the main effect and interaction effect in my post about comparing regression lines.
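As a rough Python sketch of that interpretation, the snippet below fits a separate line to each group and then reads off the coefficients of the interaction model y = b0 + b1*group + b2*x + b3*(group*x). The data are hypothetical and noise-free. The sketch relies on the fact that, for ordinary least squares point estimates, a model containing the group indicator, the IV, and their product reproduces the two separate per-group lines.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: x is a continuous IV, y is the DV, split by group.
x_control = [1, 2, 3, 4, 5]
y_control = [2.0, 2.5, 3.0, 3.5, 4.0]      # lies on the line 1.5 + 0.5x
x_treatment = [1, 2, 3, 4, 5]
y_treatment = [2.0, 3.5, 5.0, 6.5, 8.0]    # lies on the line 0.5 + 1.5x

a_control, slope_control = fit_line(x_control, y_control)
a_treatment, slope_treatment = fit_line(x_treatment, y_treatment)

# Coefficients of y = b0 + b1*group + b2*x + b3*(group*x), with group = 1 for treatment.
b0 = a_control
b1 = a_treatment - a_control          # gap between the groups when x = 0
b2 = slope_control                    # slope of x for the control group
b3 = slope_treatment - slope_control  # interaction: difference between the two slopes

print(f"b0={b0:.2f}, b1={b1:.2f}, b2={b2:.2f}, b3={b3:.2f}")
```

Here the interaction coefficient b3 is exactly the difference between the treatment and control slopes. Note that once the interaction is in the model, b1 describes the gap between the groups at x = 0 rather than an overall gap.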

avis says

Dear Jim,

Thank you very much for uploading all the information here; it is really helpful! In fact, I need some help and advice. I am struggling with experiment data analysis as I am not from a psychology background. My key research questions are 1. the impact of leader possible self (LPS) (X1) on intention for leadership development (Y1) through the mediation of motivation-to-lead (MTL) (M1), and 2. the role of hope and fear in LPS in impacting those two DVs. All of these are measured with scales, and I embedded the experiment in my online survey for data collection.

At T1, participants got an intervention to write narratives about their future work identity (LPS could be an element related work identity because they are all identities, but the leader / hope / fear elements are not activated). After the narrative writing, they completed the X1, M1 and Y1 measures for me to collect the baseline data

At T2, same participants were randomly assigned to one of the four groups. Group 1 is a control group that they did the same thing as in T1. Group 2 is the LPS hope & fear activation group, that they had to write about their envisioned leader future lives, their hopes and fears. After this, they went on completing the X1, M1 and Y1 measures. Group 3 is LPS hope group (rest of the setup is the same as group 2). Group 4 is LPS fear group (rest of the setup is the same as group 2)

In my previous survey studies, I used EFA and CFA for factor analysis, then went on to run regressions for testing the causal relationships. With the experiment setting, I am confused about what the steps should be. It seems to me that I have to conduct a one-way ANOVA. My key questions are: what about the factor structure analysis? What about the mediation analysis? By using one-way ANOVA, I can only see the differences between groups. How should I integrate the other steps in the process? What should be a standard process for analysing experiment data actually?

Your advice will be highly appreciated and I look forward to hearing from you soon!

Kind regards,

Avis

Zaeema says

Hi Jim,

This is very informative. I have a question regarding stratified analysis versus using an interaction term in a Cox PH regression model. What are the advantages and disadvantages of using one versus the other, and which one is superior? The data that I am analyzing was not collected with stratified analysis in mind. Thanks

Jim Frost says

Hi Zaeema,

I asked Alex Moreno, who is an expert in Cox regression, about your question. Unfortunately, he wasn’t able to reply directly so I’m relaying his answer to you. In fact, you’ll soon be seeing a guest post by Alex about Survival Analysis and it touches on Cox regression. At any rate, Alex’s answer follows:

“As far as I understand stratified Cox and interaction terms are not related. You use interaction terms in Cox models for the same reason that you use interaction terms in other regression models. You use stratification when the proportional hazards assumption is violated: that is, the covariates don’t have multiplicative effects, or the effect is time-varying. You then estimate a different baseline hazard for each level of the covariate.

The analogy for stratification for linear regression would be as follows. Say you have a covariate and you’re not sure whether holding other covariates fixed, this one has a linear effect on the response. You also aren’t sure what transformation is appropriate so that it will have a linear effect. One thing you could do is for every value (or several grouped values) of the covariate, estimate a different intercept for your model. I suppose doing this makes it more difficult to estimate interaction terms for this covariate, but other than that the ideas of interaction terms and ‘separate intercept for each value of a covariate’ aren’t really related.”

Here’s a link to a blog post that Alex has written about Cox Regression, which you might find helpful.

Mac Smith says

Thank you for the blog, it’s really helpful.

Can you please help me with my query? It is very similar to the blog post, except for the statistical significance of the total effect.

I am going to estimate the following model:

y=constant+b1(X)+b2(X)(Dummy)

We have daily data from 1990 to 2000. Dummy variable is equal to one for the daily data of year 2000, else zero. b1 is the main effect, and b2 is the marginal effect because of the dummy. The total effect in year 2000= b1+b2. How to estimate the statistical significance for this total effect (b1+b2), will we use F-stat or t-stat? Thank you very much.

Jim Frost says

Hi Mac,

It looks like you’re trying to determine whether the year 2000 has a different slope for X than all the other years, because your dummy appears only in the interaction term. I write about using indicator variables to test constants and indicator variables in interaction terms to test slopes in my post about comparing regression lines. You might be interested in that article.

On to your question! Because you want to assess the significance of more than one coefficient, you’d use the F-test. You’ll need to use an F-test based on a constrained model (excludes those two coefficients) and unconstrained model (full model) and determine whether the constrained model has a residual sum of squares (RSS) that is significantly greater than the unconstrained model. Note: If your model contains only those two coefficients and the constant, just use the overall F-test for this purpose, which is probably already included in your output.

If you have more than just those two coefficients and you need help comparing the constrained and unconstrained models, let me know!
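If the quantity of interest is specifically the sum b1 + b2 (the slope of X in the year 2000), that is a single linear combination of coefficients, and it can also be tested directly with a t-statistic built from the estimated coefficient covariance matrix. A sketch of the calculation, using hats for the estimates (notation assumed here, not taken from the question):

```latex
t = \frac{\hat{b}_1 + \hat{b}_2}
         {\sqrt{\widehat{\mathrm{Var}}(\hat{b}_1)
              + \widehat{\mathrm{Var}}(\hat{b}_2)
              + 2\,\widehat{\mathrm{Cov}}(\hat{b}_1, \hat{b}_2)}}
```

Under the usual OLS assumptions and the null hypothesis b1 + b2 = 0, this statistic follows a t distribution with the residual degrees of freedom, and its square equals the F-statistic for the same single restriction. Note that this is a different hypothesis from the joint test that both coefficients equal zero.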

Kehabtimer Shiferaw says

Hi Jim, I found this a very informative and useful explanation. However, I am unable to include an interaction effect using a stepwise method rather than the enter method in Stata version 14. My model is a binary logistic regression.

Jennifer says

Hi Jim,

I wanted to thank you for all of the time, effort and content you put into this website. I have just completed my graduate thesis (linear regression analysis with interaction effects) and used your website on countless occasions when I found myself lost and needing guidance/clarity. Thank you for the clear explanations you provide. I greatly appreciate(d) it!

Jennifer

Jim Frost says

You’re very welcome, Jennifer! I really appreciate you taking the time to write and let me know. It absolutely makes my day!! 🙂

And, congratulations on completing your graduate thesis! That’s a huge accomplishment.

Evelien says

Dear Jim,

Would it be okay if I asked you a question? I got stuck again.

I studied moderation, looking at the effect of sex (female/male) on the relationship between sleep duration and happiness, using a multiple regression. The interaction was significant, so I plotted the simple slopes.

It turned out that the simple effect of females is significant, but the men’s isn’t.

How could I interpret or articulate that?

Thank you in advance!

If you’re not able to answer; no problem! I’ll keep digging myself.

Evelien says

Dear Jim,

Thank you so much!

This immediately gave me insight while interpreting the moderation effect I found as I write my bachelor’s thesis. A relief!

Jim Frost says

Dear Evelien, you’re very welcome! I’m so glad this helped you!

William says

Hello ! Thank you a lot for all your explanations!

I just have a question. I did not get how to represent the hypothesis of an interaction effect in my research.

For example, let’s imagine an interaction effect of Hot dog (vs. Salad) with Choice (vs. Non-choice) on the perception of well-being.

First, how do I write the hypothesis for this interaction? For instance, is it “H5: crossing the Choice with Hot dog will increase the perception of well-being”? I really don’t know…

Then, how do I write it on the model? On both of the arrows (the one from Choice to perception of well-being and the one from Hot dog to perception of well-being)? If not, where do I put it? Because when you put it on only one arrow, it is for a main effect, isn’t it?

I think for all the other types of hypotheses I got how to do it (I hope).

Thank you in advance for your help !

Jim Frost says

Hi William,

There are two ways to write it but they both mean the same thing. One is mathematical and the other is conceptual. For both cases, I’ll write about it in the general sense but you always include your specific IVs.

Mathematically, it’s the same as the tests for main effect coefficients. The test simply assesses whether the coefficient equals zero.

Null: The coefficient for the interaction term equals zero.

Alternative: The coefficient for the interaction term does not equal zero.

In that sense, it’s just like the hypothesis test for any of the other coefficients where zero represents no effect. It’s just that in this case it’s for an interaction effect rather than a main effect.

We can also write it conceptually, which is based on understanding what a zero versus non-zero interaction term coefficient represents.

Null: The relationship between each independent variable in the interaction term and the dependent variable DOES NOT change based on the value of the other IVs in the interaction term.

Alternative: The relationship between each independent variable in the interaction term and the dependent variable DOES change based on the value of the other IVs in the interaction term.

I hope that helps!
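In symbols, for a two-way interaction between two hypothetical predictors X1 and X2 (the β labels are notation assumed here for illustration), the mathematical version of these hypotheses can be sketched as:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon

H_0 : \beta_3 = 0 \qquad \text{versus} \qquad H_A : \beta_3 \neq 0
```

For William’s example, X1 and X2 would be the indicators for Hot dog (vs. Salad) and Choice (vs. Non-choice), and Y would be the perception of well-being.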

sara says

Hi Jim,

I had a question regarding the simple main effects that we conduct after finding a significant interaction effect for a 2×3 mixed ANOVA. Only one simple main effect was significant and the other two were not, so do we still do pairwise comparisons? My between-subjects variable had 2 levels and had a significant effect, and my within-subjects variable had 3 levels, which was significant as well. I did a post hoc test for the within-subjects variable because it had 3 levels, but for the simple main effects of my interaction, only one was significant. I am sorry if this is very confusing.

Jim Frost says

Hi Sara, yes, anytime you want to determine whether specific pairs of groups have significant differences in ANOVA, you’ll need to use a post hoc test.

Laura says

Dear Jim,

Thanks for your post; it’s really helpful and indeed very intuitive. Quite a bit of work to get to the bottom of the page to leave you a message though! Now I haven’t read all the comments, so I’ll apologize in case I repeat a question.

I’m running a gls-model with variable A = ‘treatment’ (factor), variable B = ‘nr of pass’ (numerical), and their interaction (and a correlation between ‘wheelnr’ per ‘block’). It reveals a significant effect of ‘treatment’ (***), and the interaction (*, p-value 0.04902), but not for ‘nr of pass’ alone.

The reason for making this test is that I want to check if I can continue and work with an average per ‘treatment’ per ‘block’, hence we’d like to see no effect of ‘nr of pass’. But now I’m confused about the effect of the interaction, and on what to conclude on this analysis.

My supervisor seems to remember from long-ago statistics courses that if one of the main effects is not significant, one should not consider the interaction even if it is indicated as significant. Or in other words, that the significance of the interaction comes from the significance of the ‘treatment’ main effect, and that I can “ignore” the interaction in the sense that it should be alright to average the response over the different number of passes (per treatment per block).

Do you have any clarifying thoughts on this?

Thanks for your time in advance.

Jim Frost says

Hi Laura,

If the interaction term is significant, it’s crucial to consider it when interpreting your results even when one of the main effects is not significant. You don’t want to ignore/average it because it might drastically change your interpretation!

Actually, the food example in this post is a good illustration of that. I have the two main effects of Food and Condiment. Food is not significant while Condiment and the interaction term are significant. Now, if I were to ignore the interaction term, the model would tell me to have chocolate sauce on hot dogs! By including the interaction term, I get the right interpretation for the different combinations of food and condiment. By excluding the interaction term, your model might tell you to do the wrong thing.

I don’t exactly understand your model, but I’d recommend creating an interaction plot like I do in this post because they make it really clear what an interaction means.

Gerrit says

Dear Jim

I am writing my thesis on the relationship between board characteristics and company performance. I am using binary logistic regression with performance as the binary dependent variable.

From the literature, I found that some of the characteristics as an absolute number, e.g., the number of females on a board, do not really have an impact on company performance per se; however, the percentage of females on the board is expected to have an impact. In other words, the impact of the number of females is dependent on the size of the board. My question is: if I use the % females, can that be described as an interaction term in my statistical model? I also include board size as a variable but not the number of females because, as I said, the literature found that it does not have an impact per se.

Thanks

Gerrit

Jim Frost says

Hi Gerrit,

I’d phrase the impact of females a bit differently. The impact on board performance depends on the percentage of board members who are females.

That’s not an interaction effect. That’s a main effect. Your model is showing that as the percentage of females increases, the probability of the event occurring increases (or decreases depending on the coefficient).

Here’s a hypothetical interaction effect so you can see the difference. Interaction effects always incorporate at least two variables that are in your model. So, let’s say the following two-way interaction is significant: female%*board size. That would indicate that the relationship between female% and performance changes depending on the value of board size. Perhaps with small boards, there is a positive relationship between female% and board performance. However, with large boards it’s a negative relationship between female% and board performance.

That’s just a hypothetical interaction so you can see how it differs from what you’re describing. Female% by itself is a main effect.

Best of luck with your thesis!

Prafulla Nath says

Hi Amira,

Your finding seems very interesting. In many cases, the interaction effect may be quite opposite to the nature of the independent variables. To interpret it, you must find literature to support it. In your case, you may refer to the link below for literature. There is some explanation there, which may help you.

All the best

Prafulla

*** link : https://www.researchgate.net/publication/317949972_Corruption_and_entrepreneurship_does_gender_matter

Mike says

Presumably |c| > d. If so, then it means the profits decrease at a slower rate as corruption increases when women (are in charge?). If d > |c|, then profits increase with women as corruption increases.

Tanya Tan says

Hi Jim,

Hope you are well. I had a question in terms of determining which statistical test would be best to use for my research! I am looking at whether 3 techniques (a, b, c) on a Psychotherapy scale effects treatment outcome (pre – post treatment score) in a within group subject (n=31).

Thus am I right to say that my DV is: pre – post treatment score and my IV would be the 3 techniques? So would the best idea would be to use a Repeated ANOVA test to test for interaction? Or would it be better to do a t-test/correlation? Getting a little confused as to which test I should choose. Your help would be greatly appreciated!

Many thanks,

Tanya

Ahmed Moosa Al Balushi says

It means the following: a man in a low-corruption environment is expected to have higher profits than a woman in a low-corruption environment.

As you have a dummy variable, the interpretation should be easy and convenient.

You may also check the result by recoding corruption as a dummy variable, 1 = below median, 0 otherwise.

Nevertheless, your current model is OK.

Amira El-Shal says

Hi Jim. Thank you for this useful post. Just to double check, I am running the model below:

Profits = a + b.Woman + c.Corruption + d.(Woman*Corruption) + Error

Woman: Dummy variable=1 for women

Corruption: Continuous variable

When I ran the model, both b and c were negative (as expected), but d was positive and significant. How can d be interpreted?

Thank you in advance.

Best regards,

Amira

Barbara says

Hi Jim,

I tested a three-way interaction among job demand, job control, and locus of control in the prediction of burnout and found no significant interaction terms when testing different types of job demands (interpersonal conflict, role conflict, and organizational politics). I’m trying to interpret the findings (writing Chapter 5 of my dissertation) and see that the performance of locus of control (correlations) was really not as expected. For example, locus of control had a negative correlation with job control, which should have been positive. Also, it had a positive correlation with job demands and burnout (totally unexpected). In my explanation, I noted that these relations may have been unique to the sample (n = 204 respondents from diverse occupational fields) and likely affected the performance of locus of control as the secondary moderator of the job demand-control model. I also mentioned that measurement error may have been an issue because the reliability of the locus of control scale was low (.55), which contributed to the reliability of the three-way product term. Finally, although the sample was large enough for the statistical tests (as per power analysis), it seems that I didn’t get enough people in the group combinations, which may have affected the results as well. For example, I had far more individuals with the high locus of control and low job control combination (15%) than with the high locus of control and high job control combination (9%), with the latter being of most interest. Does that mean that a larger sample may have been better? I found a possible explanation for this group combination: it relates to the distribution of the predictor variables, which tend to cluster in the middle of the joint distribution of X and Z, and thus more cases are needed to detect interactions. Could you please let me know if I’m on the right track with the interpretations I’m making?

I especially struggle with the last point related to the locus of control and job control combinations and how it relates to the null results. I greatly appreciate your help.

Barbara

Alison says

Hi Jim

Thanks for your constructive comments. Hope you are well.

Can you please explain how to interpret a situation where (a) the coefficients of two independent variables are negative but the interaction is positive? (b) the coefficients of two independent variables are negative and the interaction is negative too?

I look forward to hearing from you.

Many thanks

Alison

Jim Frost says

Hi Alison,

The exact interpretation depends on the types of variables: whether they’re categorical and/or continuous. However, in general, as the values of the IVs increase, the individual main effect of each one has a negative relationship with the DV. So, as they increase, you’d expect the DV to decrease. However, the interaction term provides a positive effect that offsets the main effects. The size of the offsetting positive effect depends on the values of both IVs. That assumes you’re dealing with positive values for the IVs, of course.

I always suggest using interaction plots to really see the nature of the interaction, as I do in this post. Really brings it to life! You can also input specific values to the equation using real data values to see what each part of the equation (main effects and interaction effects) provides to the predicted DV value. However, the graphs do that for you using lots of inputs.
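To make the “input specific values to the equation” suggestion concrete, here is a minimal sketch with two negative main effects and a positive interaction. The coefficient values and the `decompose` helper are hypothetical, chosen only to show how the offsetting works for two continuous IVs.

```python
# Hypothetical coefficients: negative main effects, positive interaction.
B0, B1, B2, B3 = 50.0, -2.0, -3.0, 0.5

def decompose(x1, x2):
    """Split the predicted DV into main-effect and interaction contributions."""
    main = B1 * x1 + B2 * x2
    interaction = B3 * x1 * x2
    return {
        "main": main,
        "interaction": interaction,
        "predicted": B0 + main + interaction,
    }

# The interaction's positive contribution grows with both IVs,
# so it offsets more and more of the negative main effects.
for x1, x2 in [(1, 1), (4, 4), (10, 10)]:
    print(x1, x2, decompose(x1, x2))
```

With these made-up numbers, the interaction offsets only a small part of the main effects at low IV values but cancels them entirely at (10, 10), which is the kind of pattern an interaction plot shows at a glance.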

Connor Armstrong says

You may not have enough degrees of freedom for error. You can fix this by checking your factor effects and removing the least significant ones. This will give you enough degrees of freedom for error to perform your analysis.

omeera says

Hello Jim!

If there is no interaction between the factors, e.g., if the critical difference for A*B*C is N/A, what does it mean?

Jim Frost says

Hi Omeera,

If an interaction effect is not significant, then your sample data do not provide enough evidence to conclude that the relationship between an IV and the DV varies based on the value of the other IVs in the interaction term.

Sara A says

Very helpful, thank you so much!

david kiganga says

Hey Prof Jim,

I have performed ANOVA for an interaction, but the results did not include F and p-values.

Jim Frost says

Hi David,

I’m not sure why your statistical software would not provide you with the test statistics and p-values. It might be a setting in your software? But, I really don’t know. Your software should be able to provide those. Really, it’s the p-value you need the most.

Greg says

Hello Jim!

Thank you so much for your detailed and well explained articles.

I had a few questions that I hoped you could answer though. I am running a multiple regression model, and wish to look at the moderating effects of age on several predictors.

– I have dummy coded age (in two separate categories; millennial = 1 and non millennial = 0). Given I am standardising all my variables, in order to create the interaction term, should I first standardise my IVs and DVs, then multiply? Or rather should I multiply my IVs and DVs, then standardise the new interaction term? (Am using R)

– When running my model, should age also be a predictor on its own? Or does this not matter?

– If I find age has an effect after I ran the model, do I need to split my sample in the two age groups to investigate more precisely what the effect of age is on predictors (as in, this factor affects more millennials than non-millennials)? Or is there a way I can interpret this directly from the first moderated model (i.e. without splitting the sample).

Thank you so much Jim; this would be incredibly helpful!!

Ahmed says

Dear Jim,

I got confused.

I need your help please.

My model:

Net Loss in USD = – 0.05 – 0.2 Management quality – 0.1 ChristmasDummy + 0.09 Employees absenceDummy + 0.02 Management quality*ChristmasDummy – 0.03 Management quality*Employees absenceDummy

How should I interpret my two interactions effects on Net loss, please?

Christmas-Dummy; 1 = yes, 0 otherwise

Employees absence; 1 = yes, 0 otherwise

Ahmed,

vilma says

Hi Jim,

Thank you for your interaction.

I have some questions:

1) How do I tell whether there is an ordinal or disordinal interaction just by looking at the statistics (coefficients and p-values) in an ANOVA or regression model (I’m using R)? That being said, suppose that the coefficients of two independent variables are negative but the interaction is positive. That would lead to a disordinal interaction, right? What happens when one of these variables is not significant? How does it change the interpretation?

2) What would be the interpretation of the interaction and the main effects if there are two independent variables and only one significant main effect but also a significant interaction?

3) Does using dummy or effect coding change the interpretation of the interaction, and how?

thank you very much!

Jim Frost says

Hi Vilma,

The best way to distinguish between ordinal and disordinal interactions is simply to create an interaction plot, which I show in this post. Ordinal interactions have lines with different slopes but they don’t cross. In other words, one group always has a higher mean but the differences between means change. For disordinal interactions, the lines cross. One group’s mean will be higher at some points but lower at other points.

For your second question, you interpret things the same way as usual. However, as always, be wary of interpreting main effects when you have a significant interaction effect!
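The same check can be sketched numerically when the interaction involves a continuous IV: find where the two groups’ fitted lines intersect and see whether that point falls inside the range of the data. The `classify_interaction` helper and the example lines below are hypothetical, and the sketch assumes each group is described by a straight line given as (intercept, slope).

```python
def classify_interaction(line_a, line_b, x_min, x_max):
    """Classify two fitted lines, each given as (intercept, slope).

    Returns "none" for parallel lines, "disordinal" when the lines cross
    strictly inside (x_min, x_max), and "ordinal" otherwise.
    """
    (a0, a1), (b0, b1) = line_a, line_b
    if a1 == b1:
        return "none"  # equal slopes: no interaction
    # Solve a0 + a1 * x = b0 + b1 * x for x.
    x_cross = (b0 - a0) / (a1 - b1)
    return "disordinal" if x_min < x_cross < x_max else "ordinal"

group_a = (2.0, 1.0)   # intercept 2, slope 1
group_b = (4.0, 0.5)   # intercept 4, slope 0.5; the lines meet at x = 4

print(classify_interaction(group_a, group_b, 0, 3))   # crossing lies outside the data range
print(classify_interaction(group_a, group_b, 0, 10))  # crossing lies inside the data range
```

The point of the design is that the label depends on the observed range of the IV, not only on the coefficients, which is why a plot over the actual data is usually the quickest way to tell the two apart.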

If you use dummy coding, you’re comparing group means to a baseline level. In effects coding, you’re comparing group means to the overall mean.
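As a small illustration of that difference, the sketch below computes what a one-factor model would estimate under each coding scheme directly from the group means. The three groups and their values are made up, and the “overall mean” here is the unweighted mean of the group means (which matches the grand mean when group sizes are equal).

```python
from statistics import mean

# Hypothetical data for one categorical IV with three levels.
groups = {"A": [4.0, 6.0], "B": [7.0, 9.0], "C": [10.0, 12.0]}
group_means = {name: mean(values) for name, values in groups.items()}

# Dummy coding with A as the baseline level:
# the constant is the baseline mean, each coefficient is (group mean - baseline mean).
dummy_constant = group_means["A"]
dummy_coefs = {g: m - dummy_constant for g, m in group_means.items() if g != "A"}

# Effects coding:
# the constant is the overall mean, each effect is (group mean - overall mean).
effects_constant = mean(list(group_means.values()))
effects_coefs = {g: m - effects_constant for g, m in group_means.items()}

print("dummy coding:  ", dummy_constant, dummy_coefs)
print("effects coding:", effects_constant, effects_coefs)
```

The fitted group means are identical under both schemes; only the reference point for the coefficients changes.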

I hope this helps!

Emma says

Hi Jim,

Thank you so much for taking the time to reply in such detail. This is all very useful information. I have only centred the two IVs that were included in my interaction terms; one was a categorical variable. Should I have centred all my continuous IVs regardless of whether they were included in an interaction term? Also, have I made the mistake of centring the categorical (8 level) variable? (This IV was not dummy coded for centring of course, but was dummy coded to enter into the regression analysis as individual IVs.)

I am looking forward to reading more of your book by the way, looks great so far!

Thanks again,

Emma

Jim Frost says

Hi Emma,

Thanks for supporting my ebook, I really appreciate that!

You can only center continuous variables. You can’t legitimately center categorical variables because a center does not exist for categorical data. Even if you represent your categorical data by numbers, you shouldn’t center them. And, the columns for the dummy coded representation of the categorical variable shouldn’t be centered either for the same reason. If your variable is ordinal, then you’ve got a decision to make about whether to use it as a continuous or categorical variable–which I cover in the book.

If you center your continuous IVs, you should center all of them. Centering only a portion of the IVs isn’t a problem for the coefficients, but it does muddle the constant. If you don’t center any continuous IVs, the constant represents the mean DV when all IVs = zero. However, when you center all the continuous IVs, the constant represents the mean DV when all the IVs are at their mean values. If you center some IVs but not the others, the constant is neither of those! Note: I do urge caution when interpreting the constant anyway.
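Here is a minimal sketch of what centering does to the constant, using a single continuous IV and made-up data (the `fit_line` helper is hypothetical). The slope is unchanged, while the constant moves from the predicted DV at IV = 0 to the mean DV.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data for one continuous IV.
x = [10.0, 12.0, 15.0, 18.0, 20.0]
y = [31.0, 36.0, 42.0, 47.0, 54.0]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
x_centered = [value - mean_x for value in x]

const_raw, slope_raw = fit_line(x, y)
const_centered, slope_centered = fit_line(x_centered, y)

print(f"raw:      constant={const_raw:.3f}, slope={slope_raw:.3f}")
print(f"centered: constant={const_centered:.3f}, slope={slope_centered:.3f}")
print(f"mean of y: {mean_y:.3f}")
```

With several IVs, the same logic applies to each centered variable, which is why centering only some of them leaves the constant describing a mix of “at zero” and “at the mean”.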

Harshitha says

hello Jim

Thank you for the useful blog about “Understanding interaction terms in statistics”

I have two doubts

1) In my multiple regression model 1, only my interaction term a*b is significant and has a negative coefficient, but the main effects of a and b are not significant, where both a and b are dummy variables. How do I interpret this result? Does that mean that a and b only have a negative effect on the dependent variable when they appear together?

It is the same case with model 2, where “a” is a continuous independent variable and b is a dummy independent variable.

2) Is there any difference between directly entering the multiplied values a*b as a variable c in the regression equation lm(DV~A+B+C) or is there a difference when it is lm(DV~A+B+AB)

Thanks in advance

Jim Frost says

Hi Harshitha,

I’m glad that you found this blog post to be helpful! Now, to answer your questions:

1) I’ve written about this case before where the main effects are not significant but the interaction effect is significant. Please read this comment that I wrote. That explains the situation in terms of the main effects and interaction effect.

As for the negative coefficient in model 1, you have to know what the dummy coding represents. And, yes, you are correct that only when both characteristics are present do they have a negative impact on your DV. This negative effect is a constant value. However, when neither or only one characteristic is present, there is no effect on the DV.

For model 2, I’ll assume that everything else is the same as model 1, including the fact that the main effects are not significant, except now A is a continuous variable and B is a dummy variable. In this case, B must be present for there to be an effect on the DV. When B is present, and A doesn’t equal zero, then there will be some negative effect on the DV. Unlike for model 1, this negative effect will vary based on the value of A.

For your second question, interaction terms simply multiply the values of the variables that are in the interaction term. Often, statistical software will do that behind the scenes. However, if you create a new column which is the product of the relevant variables, there will be no difference in the results. For your example, there is no difference.

I hope this helps!

Emma says

Hi Jim, thank you! I will purchase your book and have a good read. However, am I right in saying that I should be looking at outliers and all other regression assumption tests for my interaction terms? Thanks again

Jim Frost says

Hi Emma,

Typically, assumptions apply to the dependent variable rather than the independent variables and other terms, such as the interaction terms. You might want to look at the distribution of values for the interaction terms to find outliers. Although, typically determining whether the underlying IVs have outliers will be sufficient. I don’t usually see analysts assessing the distribution of interaction values directly. I suppose it’s possible that you could have two continuous variables where an observation has values that aren’t quite outliers but then when you multiply them together can create a very unusual value.

One thing I write about more in the book is the importance of understanding whether the underlying values are good or not. So, even in the case I describe immediately above where you have an unusual value for the interaction term, if the underlying values for the observation are valid, you’d probably still leave it in your dataset.

The key point is to understand the directly observed values in your dataset and determine whether that observation is good (that’s a whole other issue!). If it is, the value of the interaction term for that observation is probably not an important aspect to consider. So, you probably don’t need to assess outliers for the interaction terms, although it’s not a bad thing if you do. The priority should be looking at the observed values for the IVs and assessing those. Determine if there’s an identifiable problem with that observation that warrants exclusion. After you do that, the value of the interaction plays little to no role.

Be wary of removing an observation solely because its value for an interaction term is unusual. Additionally, never remove an outlier solely because of some statistical assessment.

One other thing, when you include interaction terms, you should center your continuous IVs to reduce multicollinearity. That’ll also help with the outlier issue for the interaction term.
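A quick way to see the multicollinearity point is to compare how strongly an IV correlates with its own interaction term before and after centering. The sketch below uses made-up values for two continuous IVs, x and z, and a hand-written `pearson_r` helper.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical values for two continuous IVs.
x = [10.0, 11.0, 12.0, 13.0, 14.0]
z = [5.0, 7.0, 6.0, 9.0, 8.0]

# Interaction term built from the raw values.
xz_raw = [xi * zi for xi, zi in zip(x, z)]
r_raw = pearson_r(x, xz_raw)

# Interaction term built from the centered values.
x_c = [xi - sum(x) / len(x) for xi in x]
z_c = [zi - sum(z) / len(z) for zi in z]
xz_centered = [xi * zi for xi, zi in zip(x_c, z_c)]
r_centered = pearson_r(x_c, xz_centered)

print(f"correlation of x with x*z, raw:      {r_raw:.3f}")
print(f"correlation of x with x*z, centered: {r_centered:.3f}")
```

For these values, the raw product is almost collinear with x, while the centered product is only weakly correlated with it. How much centering helps in practice depends on the data.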

Emma says

Hi Jim,

This post was really helpful and I am really keen on purchasing your book because there are so many questions I have left unanswered regarding regression analysis after I studied undergrad statistics at university.

I previously conducted a study based on multiple regression, however now I want to add possible confounding variables to my analysis so I will be conducting a hierarchical regression which includes my confounding variables: categorical (dummy coded) demographic variables and two interaction terms. I have found that since adding these confounding variables my Mahalanobis distance statistics are flagging 6% of my cases as multivariate outliers. Upon investigation (histograms, scatter plots, box plots, trimmed mean statistics) I have many outliers now on some of my demographic variables and particularly on my interaction variables. I wonder what I should do about these outliers; do a few outliers in my age and education variables, for example, really make a difference to the regression model?; am I meant to be screening (and possibly handling) my interaction terms in regards to outliers? I wonder if you could give me a brief understanding on what to do in my situation. Also, are these types of questions covered in your eBook on regression?

Thank you and best regards,

Jim Frost says

Hi Emma,

I do cover outliers in detail in my regression ebook. In fact, I cover them much more extensively in the ebook than online, where I don’t have a post to direct you to; otherwise I would. Outliers are definitely more complex in regression analysis. An observation can contain many different facets (all the various IVs), and any of those facets can be an outlier. In some cases, outliers don’t even affect the results much. In other cases, the method by which you’re detecting outliers will essentially guarantee that a certain percentage of your observations will classify as an outlier. And, there are definitely cases where a few outliers, or even one, can dramatically affect your results.

The ebook walks you through the different types of outliers, how to detect them, how to determine whether they’re impacting the results, and provides guidelines about how to determine whether you should remove them. There are many considerations–too many to address in the comments section. But, I do write quite a bit on the topic in my ebook.

I hope that helps!

Sophie says

Thank you so much for taking the time to respond this has helped me a lot!

Sophie says

Hi. I was wondering if you could explain what a higher-order interaction is and what a lower-order interaction is? Thanks.

Jim Frost says

Hi Sophie,

We talk about interactions in terms of two-way interactions, three-way interactions, and so on. The number simply represents the number of variables in the interaction term. So, A*B would be a two-way interaction while A*B*C is a three-way interaction. Three-way would be a higher-order interaction than two-way simply because it involves more variables.

A two-way interaction (A*B) indicates the relationship between one of the variables in the term and the dependent variable (say between A and Y) changes based on the value of the other variable in the interaction term (B). Conversely, you could say that the relationship between B and Y depends on the value of A. In a three-way interaction (A*B*C), the relationship between A and Y changes based on the value of both B and C. Interpreting higher-order (i.e., more than two-way) interactions gets really complicated quickly! Fortunately, in practice, two-way interactions are often sufficient!
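To make the idea of interaction "order" concrete, here is a minimal Python sketch that lists the interaction terms of a given order. The factor names A, B, and C and the helper `interaction_terms` are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def interaction_terms(factors, order):
    """Return every interaction term that combines exactly `order` factors."""
    return ["*".join(combo) for combo in combinations(factors, order)]

factors = ["A", "B", "C"]  # hypothetical factor names

two_way = interaction_terms(factors, 2)    # ['A*B', 'A*C', 'B*C']
three_way = interaction_terms(factors, 3)  # ['A*B*C']

print(two_way)
print(three_way)
```

With three factors there are three possible two-way terms but only one three-way term, and the three-way term is the higher-order one because it involves more variables.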

Abhishek Mani Tripathi says

Thank you for your reply. How to identify the main effect?

Jim Frost says

Hi Abhishek,

I’m not sure if you mean how do you identify a main effect in statistical output or in the broader sense of how do you identify main effects for including in the model? I’ll take a quick stab at both!

In statistical output, a main effect is simply the variable name, such as X or Food. An interaction effect is the product of two (or more) variables, such as X1*X2 or Food*Condiment.

In terms of identifying which main effects to include in a model, read my post about how to specify the correct model. That’ll give you a bunch of pointers about that!

Abhishek Mani Tripathi says

Hi Jim,

I am following your blog. Thank you for your post and interaction with everyone. Actually, I am working on an interaction analysis. I have a complex data set that is from several plantation sites. My main objective is to see the effect of sites (3 sites), clones ( 4 clones) and spacing ( 3 spacing) on biomass, volume, DBH and nutrients supply in soil and elemental concentrations in leaves. I am working in R and tried to understand the interactions for example:

Biomass/Volume/DBH/Nutrients ~ Sites*Clones*Spacing (Factors). However, I am not able to understand which one has a main effect (with all interactions together I have .99 R-squared and adjusted R-squared). On the other hand, I want to develop an allometric relation/model (individual and general) with the same factors (Site, Clone and Spacing) for response variable Biomass/Volume with DBH. However, I am not familiar with a test to see the difference between/among the slopes. Is it okay to use a partial F-test? I can also share my R script and can discuss more if you have time. Thank you.

Jim Frost says

Hi Abhishek,

I’m not 100% sure that I understand what you’re asking. However, if you want to know whether the slopes are significantly different, assess the p-values for the interaction terms. Ultimately, that’s what they’re telling you. If the interaction term is statistically significant, then the differences between the slopes for the variables included in the interaction term are statistically significant.
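As a rough illustration of what the interaction coefficient measures, the Python sketch below fits a separate least-squares slope for each of two groups and takes the difference. The DBH and biomass values are invented for the example, and `ols_slope` is a hypothetical helper. In a model that contains the continuous predictor, a group indicator, and their product, the interaction coefficient equals this difference in slopes; its p-value then tells you whether the difference is statistically significant, which this sketch does not compute.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x for a simple one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical data: DBH (x) and biomass (y) measured at two sites.
dbh = [10, 12, 14, 16, 18]
biomass_site1 = [21, 25, 29, 33, 37]  # rises 2 units per unit of DBH
biomass_site2 = [18, 25, 32, 39, 46]  # rises 3.5 units per unit of DBH

slope_site1 = ols_slope(dbh, biomass_site1)
slope_site2 = ols_slope(dbh, biomass_site2)

# The difference in slopes is what the DBH*site interaction term estimates.
slope_difference = slope_site2 - slope_site1
print(slope_site1, slope_site2, slope_difference)
```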

With an R-squared that high, be sure that you’re not overfitting your model. That can happen when you include too many terms for the sample size. For more information, read my post about overfitting.

Best of luck with your analysis!

Antariksha says

Dear Jim,

I am studying the effect of my intervention (IV) on some dependent variable (DV). For this I have used pretest – posttest design on two groups viz. Experimental & Control. I have used ANCOVA to account for the covariate which is the pretest scores of my variable.

Problem is I want to check the effect of my intervention at different levels of some other variable i.e. moderator variable say Intelligence(Above Avg, Avg. & Below Avg.). My EG & CG is a real world set up and the sample size in each is 32. How to do this? Can you please help me.

Sharanga says

Hi Jim,

When doing bivariate descriptives between my two IVs in multiple regression, I obtained a significant p-value in my pwcorr table. Does this suggest an interaction effect or multicollinearity? Or both? However, I understand that an interaction effect implies that I must consider this later.

So following this I did my regression of both IVs together, then “vif” then “beta.” The outputs for vif were below 10 and 1/vif was above 0.10 as required. Does this mean that there is no multicollinearity or simply a low level of multicollinearity? Also, is vif enough to “consider” the interaction effect, or is there something else I must do? I’m really confused and would really appreciate your help.

Thanking you in advance!

Jim Frost says

Hi Sharanga,

Correlated independent variables indicate the presence of multicollinearity and it is irrelevant in terms of an interaction effect. You might or might not have an interaction effect, but the correlated IVs don’t supply any information about that question. Furthermore, not all multicollinearity is severe enough to cause problems.

However, if you include an interaction term, it will create multicollinearity. As I discuss in my post about multicollinearity, you can standardize your variables to reduce this type of multicollinearity. Additionally, typically VIFs greater than 5 are considered problematic. So, you’d need to know the precise VIFs.

VIFs are irrelevant in determining whether you have a significant interaction effect. If you include an interaction term, you will have severe multicollinearity (if you don’t standardize the variables) regardless of whether the interaction effect is significant. To determine whether you have a significant interaction effect, you need to assess the p-value for the interaction term, as I describe in this post. Don’t use correlations or VIFs to assess interaction effects.
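The Python sketch below illustrates why an interaction term tends to be correlated with its component variables, and how centering (the part of standardizing that changes this correlation, since rescaling alone does not) reduces it. The data are made up, and `pearson_r` and `center` are hypothetical helpers written for this example.

```python
def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical predictor values.
x = [10, 11, 12, 13, 14, 15, 16, 17]
z = [5, 7, 6, 8, 5, 7, 6, 8]

# Interaction term built from the raw variables.
raw_product = [xi * zi for xi, zi in zip(x, z)]
r_raw = pearson_r(x, raw_product)

# Interaction term built from the centered variables.
xc, zc = center(x), center(z)
centered_product = [xi * zi for xi, zi in zip(xc, zc)]
r_centered = pearson_r(xc, centered_product)

print(round(r_raw, 3), round(abs(r_centered), 3))
```

In this invented dataset, the raw interaction term correlates strongly with x (above 0.8), while the centered version is essentially uncorrelated with it. Real data will rarely be this clean, and none of this says anything about whether the interaction itself is significant.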

I hope this helps!

Mira says

Hi, I noticed that there are differences between interaction and moderator. But confused with their differences. They both have the same interaction models. What are their differences in hypothesis and interpretation and concept?

Jim Frost says

Hi Mira,

Interactions and moderators are the same things with different names. I’ve noticed that the social sciences tend to refer to them as “moderators” while statisticians and physical sciences will tend to use the term “interactions.”

Statistics are heavily used across all scientific fields and each will often develop its own terminology.

Anupriya says

Dear Jim,

Thanks for the tutorial. However, I sincerely request you to clarify my doubts.

I have two fixed effect (FE) models, each with a dependent var (DV), a continuous independent var (IV), and a dummy variable D (developed vs. developing economy), and additional control variables. Now, I am looking at the interaction between IV and D (Developed=0). These are the situations:

Model 1:

1. When I have: DV= IV + D + IV*D +rest ; here, all become insignificant.

2. When I have: DV= IV + IV*D + rest; here, all are significant including the interaction effect. Please let me know if it is correct not to include dummy in the FE regression model. I found research that did not include dummy D, and reported the IV and interaction effects only.

Model 2:

1. When I have: DV= IV + D + IV*D +rest ; here, all become insignificant.

2. When I have: DV= IV + IV*D + rest; here, the interaction effect is significant. But IV goes insignificant. How to interpret the main effects?

Will appreciate if you kindly share your interpretation of results.

Jim Frost says

Hi Anupriya,

Sometimes choosing the correct or best model can be difficult. There’s not always a clear answer. To start, please read my post about choosing the correct regression model. Pay particular attention to the importance of using theory and subject-area knowledge to help guide you. Sometimes the statistical measures point in different directions!

As for your models. A question. Are you saying that for Model 1, that the only difference between 1 and 2 is the inclusion of D in 1 and not 2? Or, are there any other differences? In other words, you take 1 and just remove D, and the rest becomes significant?

Same question for Model 2. Is the only difference between 1 and 2 the removal of D?

Let me know the answers to those questions and I can offer more insight.

In the meantime, I can share some relevant, general statistical rules. Typically, when you fit an interaction term, such as IV*D, you include all the lower-order terms that comprise the interaction even when they’re not significant. So, in this case, the typical advice would be to include both IV and D because you’re including the interaction term IV*D. Often you’d remove insignificant terms but generally that’s not done for the lower-order terms of a significant interaction.

And, it’s difficult to interpret main effects when you have significant interaction effects. In fact, you should not interpret main effects without considering the interaction effects because that can lead you astray, as I demonstrate in this post! What you can say is that when you look at the total effect of the IV on the DV, some of that total effect does not depend on the value of D (the main effect) but a part of it does depend on the value of D (the interaction effect). However, there’s not much you can say about the main effect by itself though. You need to understand how the interaction effect influences the relationships.
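A small Python sketch may help show how the main effect and the interaction effect combine into a total effect. It assumes a model of the form DV = b0 + b1*IV + b2*D + b3*IV*D with a 0/1 dummy D. The coefficient values are hypothetical, and `effect_of_iv` is an illustrative helper, not output from any real model.

```python
def effect_of_iv(b_iv, b_interaction, d):
    """Slope of DV on IV at a given value of the dummy D (0 or 1)."""
    return b_iv + b_interaction * d

# Hypothetical coefficients, for illustration only.
b_iv = 0.5            # coefficient on IV (the part that does not depend on D)
b_interaction = -0.25 # coefficient on IV*D (the part that depends on D)

slope_when_d0 = effect_of_iv(b_iv, b_interaction, d=0)  # group coded D = 0
slope_when_d1 = effect_of_iv(b_iv, b_interaction, d=1)  # group coded D = 1

print(slope_when_d0, slope_when_d1)  # 0.5 0.25
```

Under these assumed values, the IV coefficient alone describes the relationship only for the group coded 0. For the other group you have to add the interaction coefficient, which is why the main effect can’t be interpreted in isolation.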

Panos says

If we have two significant main effects and a significant interaction (moderation) should we mention both the main effects and the interaction or just the moderation?

Jim Frost says

I would report all the significant effects, both main and interaction. Also explain that because of the interaction effects, you can’t understand the study results by assessing only the main effects.

Kyle says

Thank you, Jim.

Yes, that’s what I figured too. The part I’m struggling with is whether I should interpret the main effect in the model with or without the interaction effect (given that I couldn’t find a significant interaction effect). Some literature termed the main effect in the interaction model as simple effect (as the interaction effect is included and treated as covariate in the model). Would you reckon to base my argument in theoretical understanding rather than stats findings?

Thanks!

Jim Frost says

Hi Kyle,

I’m not sure that I understand your concern. If the interaction is not statistically significant, typically you don’t include it in the model and you can interpret just the main effects. However, if you have theoretical/subject-area literature reasons that indicate an interaction effect should exist, you can still include it in the model. When you discuss your model and the results, you’d need to discuss why you’re including it even though it is not statistically significant.

If I’m missing something about your concerns, please let me know!

Kyle Tan says

Hi Jim!

Love going through your guides, as they are very informative.

I have a question here in regards to interpretation of main effects.

Do I interpret the main effects of independent variables in the regression model with the interaction or without the interaction? I have two separate models for two dependent variables, one found significant interaction effect, and one didn’t. For the latter, I wasn’t sure which model I should use to interpret the main effects.

Thank you in advance!

Jim Frost says

Hi Kyle,

When you have significant interaction effect, you really must consider the interaction effects to understand what is happening. If you try to interpret only the main effects without considering the interaction effect, you might end up thinking that chocolate sauce is best to put on your hot dogs.

For the model without significant interaction effects, you can focus solely on the main effects.

Thomas E. Antonaccio says

Thanks, Jim…as always….very helpful/insightful!

Thomas E. Antonaccio says

Thanks, Jim. This helps a great deal. And I will review your other posts. It does seem like both the R2 change table and the coefficients table are relevant, even if the interaction term does not explain any additional variance in DV.

The only difference in what you mention above is that, for model 1, only one of the predictors was significant; the other was not. And that one predictor was still significant when I added the interaction term in model 2. So it sounds like the non significant predictor may need to be removed or I need to come up with a better composite to operationalize that predictor…at least that is my flavor from the literature….

Thanks again for your timely response…Thomas

Jim Frost says

Given the additional information, it doesn’t seem like you have any statistical reason to include the 2nd predictor or the interaction term. You might only need the one significant predictor. However, check those residual plots and consider theory/other studies. And, that’s great that you’re considering the literature for how to operationalize that other predictor. It sounds like you’re doing the right things!

Thomas E. Antonaccio says

Hi Jim: I have 2 tables – one shows the R2 changes. It has two models: one is the two predictors only (statistically significant change of .573); one includes the interaction term and is not statistically significant. So addition of interaction term does not indicate a significant change beyond main effect.

The other table, which contains the coefficients, includes 2 models. Model 2 includes predictor 1, predictor 2, and the interaction term. The interaction term is not statistically significant; predictor 1 is also not statistically significant; however, predictor 2 remains statistically significant when the interaction term is included in the model.

Does this help?

Jim Frost says

Hi Thomas,

Thanks for the additional information. It does help, although it’s not exactly clear which predictors are significant in model 1? When discussing things like this, it’s good to understand the significance of each term, not just things like significant changes in the R-squared because that doesn’t tell you about specific variables.

So, I’m going to assume the following:

Model 1: Both predictors are significant. (Let me know if that’s not the case.)

Model 2: One predictor is significant. The other predictor is not significant. The interaction term is not significant.

In this scenario, you basically have a choice between a less complex and more complex model that explain the same amount of the variability in your dependent variable/response. Given that choice, you’d often want to favor the simpler model. However, that only assesses the statistical measures. You also need to incorporate theory. If you have theoretical reasons to believe that both predictors are relevant and theory also suggests that an interaction is relevant, then you should favor the more complex model with those three terms.

You should read my post about choosing the correct regression model. In that post, I say that you should never pick the model solely on statistical measures but need to balance that with theory. I think that post will be helpful for you. Also, check the residual plots. If you see problems in the residual plots for one of the models, it’ll help you rule that out and possibly suggest changes you need to make.

So, I can’t definitively say which model is the correct model (assuming one of them is, in fact, the correct model). I’d lean towards the simpler model with just the two main effects if its residual plots look good. That’s particularly true because the two terms are both statistically significant. The more complex model also contains two insignificant terms. But, do balance those statistical considerations with theory and other studies. If you stick with the simpler model, then you just have two main effects to consider. For this model, the main effects collectively explain the total effect.

In terms of understanding which predictor is more important, that opens several statistical and practical considerations about how you define more important. Explaining more of the variance is one method. I write about this in a post about identifying the most important predictor.

If you have more questions after digesting that, please don’t hesitate to post again under the relevant post(s). I hope that helps!

Thomas E. Antonaccio says

Hi Jim: I pay close attention to these posts on interaction effects, given my research. However, something that is still not clear (or maybe I am reading too much into it)…in my study, I am testing whether the relationship between Lean Participation and Workforce Engagement is moderated by workgroup psychological safety. The interaction term is a combination of Lean Participation * Workgroup Psychological Safety.

Model 1, Lean Participation and Workgroup Psychological Safety (Main effect), is statistically significant. Does that mean that the two predictors independently (or collectively) explain the variance in the DV? Would you say that Lean Participation and workgroup psychological safety together explain the variance? In other words, can we not know which predictor explains more of the variance?

Model 2, same two predictor variables plus the interaction term, is not statistically significant. At this point, is the coefficients table essentially meaningless? When I look at the coefficients for model 1, only one of the two predictor variables is statistically significant; the other is not. Also, in the presence of the interaction term (model 2), the same predictor stays significant though its B is a bit smaller.

Am I making this all harder than it needs to be?

Many thanks

Thomas

Jim Frost says

Quick question before I answer. In your model 2, are all the terms (two predictor variables and the interaction term) not significant? Or do you mean just the interaction term or other subset of variables? I’m not totally clear on exactly what is and is not significant. Thanks!

Miss says

Hi Jim, I have a very urgent question and I really hope you can help me with it! My IV (X) and DV (Y) did not show a significant relationship, using a univariate regression. Adding a moderator showed one significant main effect between the moderator (M) (Age) and X; however, the interaction effect was insignificant. The overall model turned out to be significant. How can I interpret this? Is it still useful to look at the simple slopes? I don’t know how to interpret the main effects in light of the insignificant interaction. Please help me out! Thank you so much in advance!

Jim Frost says

Hi,

A moderating variable is one that has an interaction effect with one or more of the independent variables. Because of the interaction effect, the relationship between X and Y changes based on the value of M.

However, a main effect is different. Main effects for one variable don’t depend on the value of any other variables in the model. It’s fixed.

In your case, because the interaction effect with M is not significant, it’s not a moderating variable. So, what’s going on? You added this variable (M) which has a significant main effect. Your model describes the relationship between X and Y and M and Y. Both of these relationships do not change based on the value of the other variable. (Actually, you didn’t state whether the X-Y relationship was significant after adding M to the model.) For your model, you just consider the main effects of X (if X is significant) and M. The relationship between X and Y does NOT change based on the value of M.

I talk about interpreting main effects in my post about regression coefficients. I think that post will help you out. It’s actually easier to interpret when you don’t have to worry about an interaction effect.

Colton says

The reason for the different results is clear: individual comparisons have higher statistical power.

Ahmed says

Dear Jim,

I have these results with no terms dropped now. I think there was a command error in STATA.

1. However, is it correct that none of my industry categorical variables has been dropped? What is the explanation for this?

2. I have a fourth industry ”Other”. I included it as a dummy but I did not include its interaction term (Total sales * Other industry) because its firms have heterogeneous characteristics where it is not rational to interpret its results. Further, once I add it (Total sales * Other industry) in the model, all interactions become not significant as well as Total sales.

2.a Is my method in excluding interaction term (Total sales * Other industry) correct? If yes, can I interpret the results across industries now?

2.b Why once I add it (Total sales * Other industry) in the model, all interactions become not significant as well as Total sales?

| Variable | Coef. | P value |
| --- | --- | --- |
| Total sales | 0.09 | 0.002 |
| Hi-Technology industry | 0.08 | 0.011 |
| Manufacturing industry | 0.05 | 0.002 |
| Consumer industry | 0.15 | 0.18 |
| Other industry | 0.02 | 0.39 |
| Total sales * Hi-Technology industry | -0.27 | 0.011 |
| Total sales * Manufacturing industry | -0.028 | 0.002 |
| Total sales * Consumer | -0.15 | 0.18 |
| Constant | 0.1 | 0.11 |

Ahmed,

Jim Frost says

Hi Ahmed, please check your email for a message from me. –Jim

Colton Thurgood says

Jim,

Thank you for your reply, it was helpful. I read the post hoc post as well and it too was helpful. Thank you again!

Jim Frost says

You’re very welcome. I hope the reason for the different results is clear now!

Colton Thurgood says

I couldn’t find a related comment section, but I have another question. I am comparing multiple treatments to a control using Dunnett’s procedure. When comparing all treatments at the same time in my software (SAS) it shows no significance, but when I compare one at a time in the same software using the same test, some individual treatments show significance. Why is this? I thought Dunnett’s procedure was supposed to control for family error rate even when comparing multiple treatments to a single control at once. Thanks.

Jim Frost says

Hi Colton,

Based on your description, I believe this is occurring because in one case you’re making all the comparisons simultaneously versus in the other case you’re making them one at a time. The family error rate is based on the individual error rate and the number of comparisons in the “family.” When you have one comparison, the family error rate equals the individual error rate. To hold the family error rate at your significance level across several comparisons, the procedure must apply a stricter standard to each individual comparison, which reduces statistical power. When the family contains just one comparison, that adjustment disappears and your statistical power increases, which explains why some of the comparisons become significant when you make the comparisons individually. In other words, your “family” size changes, which changes the statistical significance of some of the comparisons. You should use the results where you make all the comparisons simultaneously because that is when the procedure is correctly controlling the error rate for all comparisons.
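To show how the family error rate grows with the number of comparisons, here is a minimal Python sketch. It assumes the comparisons are independent, which is a simplification: comparisons in Dunnett’s procedure share a control group, so its actual adjustment differs. The function name `family_error_rate` is hypothetical.

```python
def family_error_rate(individual_alpha, n_comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - individual_alpha) ** n_comparisons

alpha = 0.05  # individual error rate for each comparison
for k in (1, 3, 5):
    print(k, round(family_error_rate(alpha, k), 4))
# 1 0.05
# 3 0.1426
# 5 0.2262
```

With a single comparison, the family rate equals the individual rate. As comparisons are added, the unadjusted family rate climbs, so a procedure that controls it has to demand stronger evidence from each comparison.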

I have a post about post hoc tests that explains how this works regarding the number of groups, family error rate, and statistical power. I think that post will be helpful!

Monia says

Dear Jim,

Thank you so much for your clear explanation!

During my analysis, I did not find a significant main effect, but found a significant interaction effect. I have a categorical independent variable and a categorical moderator. However, I am a bit confused how to discuss this in the discussion section.

My first hypothesis is X will increase Y, and another hypothesis is: W will strengthen the positive influence of X on Y. If I understand it correctly, I have to reject my first hypothesis and I can accept my second hypothesis. My question is how I should handle this in the discussion? Can I say that I did not find evidence for X on Y and that this is probably because other factors are influencing this relationship? Can I then give the example of the interaction effect I found? Or should I completely separate those hypotheses and discuss them separately?

Thanks in advance!

Jim Frost says

Hi Monia,

The best way of thinking about this is realizing that an independent variable has a total effect on the dependent variable. That total effect can be comprised of two portions. The main effect portion is the effect that is independent of all other variables in the model–only the value of the IV itself matters. The interaction effect is the portion that does depend on the values of the other variable(s) in the interaction term. Together, the main effect and interaction effect sum to the total effect.

In your case, the main effect is not significant but the interaction effect is significant. This condition indicates that the total effect of X on Y depends on the value of W. There’s no portion of X’s effect that does not depend on W. I’ve found that the best way to explain interaction effects in general (regardless of whether the main effect is significant or not) is to display the results using interaction plots as I have in this post.

You can state that there is a relationship between X and Y. However, the nature of that relationship depends entirely on the value of W.

I hope that helps!

Ahmed says

Dear Jim,

Thanks for your reply.

Once I implemented what you suggested, one interaction for total sales*industry of Consumer Industry dropped from the regression.

The statistics output:

| Variable | Coef. |
| --- | --- |
| Total sales | 0.002 |
| Hi-Technology industry | 0.011 |
| Manufacturing industry | 0.018 |
| Total sales * Hi-Technology industry | -0.039 |
| Total sales * Manufacturing industry | -0.036 |
| Constant | 0.1 |

Now please,

(Q 1) how can I interpret these results in light that interaction (Total sales *consumer) not shown in the statistics output?

(Q 2) Why interaction (Total sales *consumer) has been dropped from the regression?

Your support is highly appreciated, Jim.

With thanks & kind regards,

Ahmed,

Jim Frost says

Hi Ahmed,

You’ll need to include the p-values for all the variables and the interaction term in the model. Specifically, the p-value for the Total sales, Industry categorical variable, and the interaction between total sales*industry.

Did you drop total sales*industry because it was not significant? What do you mean it “has been dropped”? Did you remove it? Please fit the model with all the terms I asked for, then tell me the coefficients/p-values. Don’t remove terms. Thanks.

Ria says

Yes, it is. CTI and trial type are my within-subjects factors.

Jim Frost says

Hi Ria,

Sorry, it’s just not clear from your description how you’re conducting your study. You didn’t mention before having two within-subjects factors (and there’s no between subjects factor?) nor did you mention the lack of pretest and posttest, which means my previous explanation was wasted. Context is everything when interpreting statistics, and I don’t have that context.

I suggest contacting a statistical consultant at your organization where you can describe your experiment’s methodology in detail so they can provide the correct interpretation.

ria says

Hello,

I don’t have a pretest and posttest.

Jim Frost says

Is trial type your within-subjects factor?

ria says

Thank you so much, Jim! This really helped me to clarify my doubts. Just to make sure, my task is about the effects of cue-target interval and switch costs on reaction times. So my 2 IVs are CTI (long and short) and trial type (switch and repeat). So I had to see how reaction times are affected for repeat trials and switch trials when there is a long and short CTI.

Jim Frost says

Ah, ok, so that changes things a bit, I think. I’m not familiar with the subject area and the difference between switch and repeat trials. That’s your within-subjects factor? Is there anything that’s similar to pretest and posttest?

How you interpret the results depends on the nature of the observations. What I described was when you have pretest and posttest. If you have something else, it might change the interpretation. But, I don’t fully understand your experiment.

Ria says

Hello,

My design is a 2×2 repeated measures ANOVA . I have 2 independent variables and each has 2 levels . I found a main effect for both the variables but I did not find an interaction, so how can I explain these results in relation to my hypothesis ? Since it’s based on interaction effects in the dependent variable ?

Jim Frost says

Hi Ria,

I’m assuming that one variable (the within-subjects factor) is time (maybe pretest and posttest) and the other factor is something like treatment and control. If both of these effects are significant, then you know that the scores changed across time and between the treatment and control groups. However, the interaction effect is not significant, which is crucial in this case.

If you were to create an interaction plot, as I do in this post, imagine that DV value is on the Y-axis and time is on the X-axis. You’d have two lines on this graph, one represents the control group and the other represents the treatment group. Because the main effect for treatment/control is significant, the lines will be separated on the graph and that difference is statistically significant. Because the interaction effect is not significant, the lines will be parallel or close to parallel. The difference in slopes is not statistically significant. So, while experimental group factor is statistically significant, the lack of significant interaction suggests that the same effect size exists in both the pretest and posttest. What you want to see is the difference change from the pretest to the posttest, which is why you want a significant interaction effect.
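The parallel-lines idea can be expressed as a simple calculation on the four cell means of a 2×2 design. The Python sketch below uses invented means and a hypothetical helper `group_gap`; it shows only what the interaction contrast measures, not whether it is statistically significant.

```python
# Hypothetical cell means for a 2x2 design (group x time).
means = {
    ("control", "pretest"): 50.0,
    ("control", "posttest"): 55.0,
    ("treatment", "pretest"): 60.0,
    ("treatment", "posttest"): 65.0,
}

def group_gap(time):
    """Treatment minus control at one time point."""
    return means[("treatment", time)] - means[("control", time)]

gap_pre = group_gap("pretest")
gap_post = group_gap("posttest")

# Zero means the two lines on the interaction plot are parallel.
interaction_contrast = gap_post - gap_pre
print(gap_pre, gap_post, interaction_contrast)  # 10.0 10.0 0.0
```

Here both groups improve over time and the treatment group scores higher throughout, so both main effects are present, yet the gap between the groups never changes. That unchanged gap is what a non-significant interaction looks like.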

Your results suggest that a statistically significant effect existed in the pretest. Because it exists in the pretest, it was not caused by the treatment itself and existed before the experiment. That same initial difference continues to exist unchanged in the posttest. Because the interaction effect is not significant, it suggests that the treatment did not change that difference between experimental conditions from the pretest to posttest. In other words, there’s no evidence to suggest that the treatment affected outcomes over time as you move from the pretest to the posttest.

I hope that helps!

Ahmed says

Dear Jim,

Thanks. Your prompt responses are highly appreciated.

I collect these valuable comments. But sometimes the comments are not arranged by date, so I hope they can be put in order.

With Regards

Ahmed Ebieda

Jim Frost says

Hi Ahmed,

Comments should appear in reverse chronological order so that the most recent comments appear first. I’ll double-check the settings but that’s how they should appear. I’m glad you find them to be valuable! I always ask people to post their questions in the comments (rather than by email) so that everyone can benefit.

Thanks for writing!

Ahmed says

Dear Jim,

I have a statistical inquiry on my analysis. May you help me, please?

The situation as follows:

My sample is small; 143 firms.

My example research question: Do Different Industries Affect the Relationship between Total Sales and Net Income?

I have this model:

Net income = β0 + β1 Total sales + ε

I need to run this model across three industries; (1) Consumer, (2) Hi-Technology, and (3) Manufacturing, to examine which industries have a significant effect of total sales on net income. Thus, I will run three regressions in total, one for each industry.

Before running the regression, I add interaction variable (β2 Total sales*industry) to the above model, where total sales is continuous variable in USD and industry is a dichotomous variable where industry = 1 for consumer, 0 otherwise (regression 1), 1 for Hi-Technology, 0 otherwise (regression 2), 1 for Manufacturing, 0 otherwise (regression 3).

The final model is:

Net income = β0 + β1 Total sales + β2 Total sales*industry + ε

My questions:

1. Is my method correct in adding (Total sales*industry) and considering industry is a dichotomous variable where industry = 1 for consumer, 0 otherwise (regression 1), 1 for Hi-Technology, 0 otherwise (regression 2), 1 for Manufacturing, 0 otherwise (regression 3)?

2. How can I compare the significance of the difference in coefficients of (Total sales*industry) across the three regressions?

I am thinking of using the formula used by Josep Bisbe & Ricardo Malagueño (2015, footnote no. 11).

Is this formula valid in my situation?

Is there another way/formula to compare the significance of the difference in coefficients across more than two regressions?

This is a tricky situation for me where I need an expert in statistics to assist me on it.

Your prompt response is highly appreciated.

If you need further explanations, please let me know.

With thanks & kind regards,

Ahmed

Jim Frost says

Hi Ahmed,

Yes, you’re on the right track, but I’d make a few modifications. First, you can answer all your questions using one regression model. If your data for the three industries aren’t already in one dataset, then combine them together. While combining the data, be sure to add a categorical variable that captures the industry. When you specify the model, include the total sales variable, the industry categorical variable, and the interaction term for total sales*industry. The categorical variable tells you whether the differences between the intercepts for the three industries are statistically significant. If the interaction is not significant, the categorical variable also indicates the mean net income differences between the industries.

If the interaction is significant, it indicates that the relationship between total sales and net income varies by industry. In other words, the differences between that relationship for the three industries are statistically significant. You could then use interaction plots to display those relationships. If the interaction term is not significant, your data do *not* suggest that the relationship between total sales and net income varies by industry. Those differences are not statistically significant.

Fortunately, I’ve written a post all about what you want to do. For more information, please read my post about comparing regression equations.
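To make the single-model approach concrete, here is a minimal Python sketch. The coefficient values and the names `coefs` and `industry_line` are hypothetical placeholders, not estimates from any real data. The sketch assumes the model was fit with Consumer as the reference level and indicator variables for the other two industries. It shows how one fitted equation yields a separate intercept and slope for each industry.

```python
# Hypothetical coefficients from ONE regression of net income on total sales,
# industry (reference level: Consumer), and the sales-by-industry interaction.
# These numbers are made up purely for illustration.
coefs = {
    "intercept": 2.0,               # intercept for the reference industry
    "sales": 0.10,                  # slope for the reference industry
    "HiTech": 1.5,                  # intercept shift for Hi-Technology
    "Manufacturing": -0.5,          # intercept shift for Manufacturing
    "sales:HiTech": 0.05,           # slope shift for Hi-Technology
    "sales:Manufacturing": -0.02,   # slope shift for Manufacturing
}


def industry_line(coefs, industry, reference="Consumer"):
    """Return (intercept, slope) of the net income vs. total sales line."""
    intercept = coefs["intercept"]
    slope = coefs["sales"]
    if industry != reference:
        # Non-reference industries add their shifts to the reference line.
        intercept += coefs[industry]
        slope += coefs["sales:" + industry]
    return intercept, slope


for name in ("Consumer", "HiTech", "Manufacturing"):
    b0, b1 = industry_line(coefs, name)
    print(f"{name}: net income = {b0:.2f} + {b1:.2f} * total sales")
```

Under this coding, each interaction coefficient is the difference between that industry's slope and the reference industry's slope, so its test asks whether those two slopes differ.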

Aina Jacob Kola says

In ANOVA, the between-subjects test is not significant (Sig. = 0.281). I am considering gender and academic performance. How do I interpret this? Also, the within-subjects test is not significant (Sig. = 0.112).

Mahnoor Sohail says

Hi Jim,

Would

1. y = age + age^2 + gender + gender*age + gender*age^2

or

2. y = age + age^2 + gender + gender*age

or both would work fine?

In other words, should you interact the gender dummy with the higher-order polynomial terms? Is it necessary, or would the second model also work fine?

Jim Frost says

Hi Mahnoor,

You can use the interaction term with a polynomial. Use this form when you think the shape of the curvature changes based on gender. If you don’t include the polynomial with the interaction, then the model assumes there is curvature but the shape of that curvature doesn’t change between genders. However, the angle or overall slope of the curvature on a plot might be different (i.e., you rotated the same curvature).

Of course, which form you should use depends on your data. As always, use a combination of theory/subject area knowledge, statistical significance, and checking the residual plots to help you decide. So, I can’t tell you which one is best for your data specifically, but I can say there’s no statistical reason why you couldn’t use either model.
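Here is a small Python sketch of the difference between the two forms. The coefficients in `b` are hypothetical values chosen only for illustration, and gender is assumed to be coded as a 0/1 dummy. The sketch measures the curvature of each fitted curve with a second difference, which for a quadratic equals twice the coefficient on age^2.

```python
def predict(age, gender, b, include_gender_age2):
    """Predicted y for a quadratic in age with a 0/1 gender dummy."""
    y = b["const"] + b["age"] * age + b["age2"] * age**2
    y += b["gender"] * gender + b["gender_age"] * gender * age
    if include_gender_age2:
        # Only model 1 lets the squared term differ by gender.
        y += b["gender_age2"] * gender * age**2
    return y


def curvature(gender, b, include_gender_age2):
    """Second difference of the fitted curve: 2 * (total coefficient on age^2)."""
    def f(a):
        return predict(a, gender, b, include_gender_age2)
    return f(2) - 2 * f(1) + f(0)


# Hypothetical coefficients, not estimates from real data.
b = {"const": 10.0, "age": 2.0, "age2": -0.5,
     "gender": 1.0, "gender_age": 0.5, "gender_age2": 0.25}

for label, flag in (("Model 1 (with gender*age^2)", True),
                    ("Model 2 (without gender*age^2)", False)):
    print(label, [curvature(g, b, flag) for g in (0, 1)])
```

In model 2 both genders share the same curvature and differ only in intercept and linear slope. In model 1 the curvature itself can differ between genders.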

Ivan Sysoev says

Hi Jim! I’ve just got your book – was looking for something like that for a long time. Thank you very much for explaining these concepts in simple terms!

I wanted to ask three related questions regarding the condiment example.

(1) If I understand it correctly, one way to interpret the significant p-value of the interaction term is that adding chocolate sauce to a hot dog doesn’t produce the same increase in enjoyment as in the default case (the main effect of chocolate sauce). But is there a way to show that it leads to a significant *decrease* in enjoyment? Of course, on the interaction plot, the corresponding line points downwards, but how do I show that this downward direction is statistically significant, and not just a fluke?

(2) How do I interpret the main effect in the presence of an interaction? As far as I understand, this is the trend observed in the “default” case, when the value of the *hot dog* variable is equal to zero. So, am I correct that in this case, the main effect is nothing but the trend observed for ice cream?

(3) If I’m correct with (2), does it mean that I can answer (1) by making *hot dog* the “default” case, having an *ice cream* variable that gets either 0 or 1, and looking at the main effect in this case?

Thank you!

Ivan

Jim Frost says

Hi Ivan,

Thanks so much for supporting my ebook! I really appreciate it. I’m happy to hear that it was helpful! 🙂

On to your questions.

A significant p-value for an interaction term indicates the relationship between an independent variable and a dependent variable changes based on the value of at least one other IV in the model. There’s really no default case in a statistical sense. Maybe from a subject-area context there’s a default case. Like in the manufacturing example, there might be a default method the manufacturer uses and they’re considering an alternative. But, that’s imposed and determined by the subject area.

In the hot dog/ice cream example, I wouldn’t say there’s a default case. It’s just that the relationship between the variables changes depending on whether you’re assessing a hot dog or an ice cream sundae. Or you can talk about it with equal validity from the standpoint of condiments. If you have a DV of Y and IVs of X1 and X2, a significant p-value for the interaction term indicates that the relationship between Y and X1 depends on the value of X2. Or, you can state it the other way: the relationship between Y and X2 depends on X1. The interaction plot displays those differing relationships with non-parallel lines. A significant p-value indicates your sample evidence is strong enough to suggest that the observed differences (non-parallel lines) exist in the population and are not merely random sampling error (i.e., not a fluke). I think that answers your first question.

Your second question, I address in more detail in the book than in the post. So read that chapter towards the end for more detail. But, I’ll give the nutshell version here. The total effect for IVs can be comprised of both main effects and interaction effects. If both types of effects are significant, then you know that a portion of the total effect does not depend on the value of the other variables (that’s the main effects) but another portion does depend on the other variables (the interaction effect). A significant main effect tells you that a portion of the total effect does not depend on other variables. It’s tricky though because while that knowledge might seem helpful in theory, you still have to consider the interaction effect if you want to optimize the outcome. So, if you want to maximize your taste satisfaction, you can see that condiment has a significant main effect. In this case, it means you prefer one condiment overall regardless of the food you’re eating. It’s an overall preference. In the example, that’s chocolate sauce. You just like it more than mustard overall. I suppose that’s nice to know in general. However, despite that significance, if someone asks you which condiment you want, you still need to know which food you’ll be eating because you’re just not going to like chocolate sauce on a hot dog! That’s what the significant interaction tells you. When you have a significant interaction, you ignore it at your own peril! So, to answer your second question, the main effect is the portion of the total effect that doesn’t change based on the value of the other variables.

For the third question, again, I wouldn’t think of it in terms of default cases. Rather think of it in terms of proportions of the total effect. If you have a significant main effect, then you know that some of the total effect doesn’t depend on other variables. Overall, you prefer chocolate sauce more than mustard. But the interaction tells you that for hot dogs specifically you don’t want chocolate sauce! If the interaction term wasn’t significant, interpretation is easier because you could just say that chocolate sauce is better for everything. It doesn’t depend on the food you’re eating. That type of main effect only relationship is valid in some areas. But, common sense tells you it’s not true for condiments and food. And, the interaction term is how you factor that into a regression or ANOVA model!

Felix says

Hi Jim,

Given a model Y = b0 + b1X1 + b2X2 + b3X1X2

I was wondering how to interpret the effect of X1 on Y for given values of X2. Say we are given two different values of X2, case 1 and case 2. If the p-value is higher than the significance level in the first case, then it is not significant. How is this interpreted? In the other case, the p-value is lower, so we have significance. How is it interpreted then? I find it a bit weird that the effect is significant for some values of the interaction term and not for others.

Jim Frost says

Hi Felix,

I don’t understand your scenario. Are you saying that X2 is a categorical variable with the two values of “Case 1” and “Case 2”? Or, that you have two models that have the same general form but in one case the interaction term is significant and in the other case it isn’t?

If it’s the former, you’d only have one p-value for the interaction term, not two. It’s either significant or not significant.

If it’s the latter, if you’re fitting different models with different datasets, it’s not surprising to obtain different results.

I think I need more details to understand what you’re asking exactly. Thanks!

Jessica says

Hi Jim!

I was wondering whether a significant interaction effect (within the 2-way ANOVA) implies that there is a moderation effect? My lecturer only discusses ‘moderation’ as part of the regression. So I am a bit confused there.

Jim Frost says

Hi Jessica,

That’s a great question. Moderation effect is the same thing as an interaction effect. I think different fields tend to use different terms for the same things. I believe psychology and the social sciences tend to use moderation effect. I probably should add that in the blog post itself for clarity!

Fran says

Hi Jim, I have run a 3×2 ANOVA, which generated interesting significant effects. However, there were no interaction effects. Does that suggest some sort of problem in operationalising the experiment? Is that some sort of anomaly? Thank you in advance for your opinion.

Jim Frost says

Hi Fran,

No, that’s not necessarily a problem. An interaction effect indicates that at least a portion of a factor’s effect depends on the value of other factors. If there are no significant interaction effects, your model suggests that the effects of all factors in the model do not depend on other factors. And, that might be the correct answer.

However, you need to incorporate your subject area knowledge into all statistical analyses. If you think that significant interaction effects should be present based on your subject-area expertise, it possibly indicates a problem. Those problems include a fluky random sample, operationalizing problems, and a sample that is too small to detect it. Again, it’s not necessarily a problem.

Mina says

Thanks for sharing, Jim. This article is quite helpful for a beginner like me. I’ll stay subscribed to your blog.

safa says

Hello Jim,

I ran an experiment to investigate the evaporation rate from a still water surface. I have two questions.

I have five factors that are supposed to affect the evaporation rate. The first question is: what is the best method to estimate the main effects of these factors on the response (evaporation rate), along with the interactions between the factors and how those interactions affect the evaporation rate, and then to obtain an equation that can be used to predict the evaporation rate? The second question is: can I use multiple regression analysis to get the main effects and then use nonlinear regression analysis to get the equation (because the relationship between the factors and the response should be nonlinear)?

Jim Frost says

Hi Safa,

I’m not completely sure what you’re asking. If you are asking about how do you specify the correct regression model including main effects and interaction effects, I suggest you read my post about specifying the correct regression model.

For help on interpretation of the regression equation, read my posts about regression coefficients and the constant. The coefficients post deals with the main effects while this post deals with the interaction effects.

You might also need to fit curvature in your data. To accomplish that, you can use linear regression to fit curvature. Confusingly, linear refers to the form of the model rather than whether it can fit curves, as I describe in this post about linear versus nonlinear regression. Read my post about curve-fitting to see how to fit curves and whether you should use linear or nonlinear regression. I always suggest starting with linear regression and only going to nonlinear regression when linear regression doesn’t give you a good fit.

If you need even more information about regression analysis, I highly recommend my intuitive guide to regression ebook, which takes you from a complete novice to effective practitioner.

Best of luck with your analysis!

Ruoxi says

Hello Jim,

Thank you so much for your post. It is very helpful. I am having some troubles interpreting the following results:

Model 1:

CEO turnover dummy = CEO performance x prior CEO’s performance + CEO performance + prior CEO’s performance + controls + year FE

Model 2:

CEO turnover dummy = CEO performance + prior CEO’s performance + controls + year FE

In Model 1, I find a negative coefficient on the interaction term, which shows that when the prior guy was performing really well, the current CEO’s turnover-performance sensitivity is stronger (or more negative), suggesting the current guy has bigger shoes to fill.

However, in Model 2, I find a negative coefficient on the prior CEO’s performance. This means, holding the current CEO’s performance constant, the better the previous guy is, the less likely the current CEO is gonna get fired.

These results together seem to suggest completely different directions. I am wondering if I interpreted correctly…Would you like to give me some suggestions?

Jim Frost says

Hi Ruoxi,

You know that the interaction effect is significant. When you fit the model without the interaction effect, it’s forced to average together that interaction effect into the terms available in the smaller model. It’s biasing the coefficients because it simply can’t model correctly the underlying situation. It’s a form of omitted variable bias.

Given this bias and the omitted significant interaction effect, you should not even attempt interpreting Model 2. It’s invalid. Stick with Model 1.

Best of luck with your analysis!

Gurbir says

Hello Sir,

How do I interpret a moderation effect when the correlation between the IV and DV is not significant?

All three variables are continuous, the relationship between IV and DV is positive and the moderation effect is positive.

What can we say about the relationship between IV and DV?

Jim Frost says

Hi,

It’s difficult to understand interaction effects by just observing the equation. And, with the limited information you provide, I don’t have a good picture of the situation. But, here’s some general information.

You say the correlation between the IV and DV is not significant. That’s not too surprising because when you perform a pairwise comparison like that you’re not factoring in the other IVs that play a role. It’s a form of omitted variable bias. In your case, your model suggests that the nature of the relationship between the IV and DV changes based on the values of other IVs in your model. The significant interaction term (you don’t state it’s significant but I’m assuming it is) captures this additional information. To understand how the relationship changes and discuss the interaction term in greater detail, I recommend creating interaction plots as I show in this blog post.
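As a rough illustration of how a pairwise correlation can miss a relationship that depends on another variable, here is a short Python sketch. The data are made up and noise-free, and they are a deliberately extreme case with a two-valued moderator rather than a continuous one. The helper `pearson` is written out by hand so the block needs only the standard library.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: y depends on x1 only through its interaction with x2.
rows = [(x1, x2, x1 * x2) for x1 in (1, 2, 3, 4, 5) for x2 in (-1, 1)]

# Correlation between x1 and y, ignoring x2 entirely.
r_all = pearson([r[0] for r in rows], [r[2] for r in rows])

# The same correlation within each level of x2.
r_pos = pearson([r[0] for r in rows if r[1] == 1],
                [r[2] for r in rows if r[1] == 1])
r_neg = pearson([r[0] for r in rows if r[1] == -1],
                [r[2] for r in rows if r[1] == -1])

print(r_all, r_pos, r_neg)
```

Ignoring x2, the opposite slopes cancel and the overall correlation vanishes, even though x1 is strongly related to y at each level of x2.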

I hope this helps!

Binda says

Hi Jim,

Thanks for your explanations. Kindly let me know if it is possible to conduct a single ANOVA where there are three independent variables and two or more dependent variables.

Thanks

Jim Frost says

Hi Binda,

It sounds like you need to use multivariate ANOVA (MANOVA). Read my post about MANOVA for details!

Rini says

Thank you so much, sir! Your guidance and support were a great help to me! I look forward to more such guidance!

RINI says

Sir,

My sample size is 342, so I guess the model has enough statistical power to determine that a small difference between the slopes is statistically significant.

This means there is a very slight difference in slopes which isn’t visible in the plot, and the lines would probably cross only on extending the graph. I got your point!

I read the post about practical significance. It was such an eye opener! Thank you so much, sir, for guiding me down the right path!

As far as practical significance is concerned, if we look at the mean values mentioned in the previous comment, then the instructional strategy used in the experimental group was more beneficial for the females than the males. Thus there is practical significance, I guess. The males had almost similar mean scores for both experimental and control groups. Am I correct in interpreting the practical significance?

Since this difference or effect is small but not meaningless in a practical sense, should I “NOT REJECT THE NULL HYPOTHESIS” in that case?

Jim Frost says

Hi Rini,

Yes! It sounds like you are getting it! 🙂 You do have a large sample size, which can detect small differences.

You’re welcome about the other post.

I can’t tell what a practically significant effect size is for your study. However, I could see the case being made that it’s only worthwhile using that instructional strategy with females. You’d need to determine that the effect size for males is not practically significant AND that it is practically significant for females to draw that conclusion. Bonus points for creating CIs around the effect sizes so you can determine the margin of error.

For your last question, no, you’d still reject the null hypothesis. The null hypothesis states that the slopes are equal and your model specifically suggests that is not true. That’s a purely statistical determination.

However, the practical, real-world importance of that difference is an entirely different matter. In your results, you can state that the interaction term is statistically significant and then go on to describe whether the difference is important in a practical sense based on subject-area knowledge and other research.

In a nutshell, if you have statistical significance, you can then assess practical significance. Separate stages of the process. If you don’t have statistical significance, don’t bother assessing practical significance because even if the effect looks large, it wouldn’t be surprising if it was actually zero thanks to random error!

RINI says

Sir,

Thank you so much for your prompt response!!

The plot which i have obtained using SPSS is ABSOLUTELY two parallel lines, which clearly can not meet even if the graph is extended.

It does not show ORDINAL INTERACTION and there is no difference in the slopes at all !! that is why i am all the more confused. SO STILL SHALL I ASSUME THAT THEY WILL CROSS AT SOME POINT?

As suggested by you to interpret the interaction in the plot, Here are the mean values obtained:

Female: Experimental grp-12.478

Control grp-10.822

Male: Experimental grp-10.904

Control grp-10.447

Clearly in both the groups the females have performed better. For both males and females the mean score is better for the experimental group only.

Do these values indicate there is no interaction between Instructional strategy and gender?

Jim Frost says

Hi Rini,

Based on the numbers you include, those lines are not at all parallel. If the interaction term for these data is significant and assuming higher values are better, you can conclude that the experimental group has a larger effect on females than males. The effect in males is relatively small compared to the effect in females. In other words the relationship between instructional strategy and the outcome (achievement?) depends on gender. It has a larger effect on females.
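A quick way to see the non-parallel lines is to compute the effect of the instructional strategy within each gender and then take the difference between those two effects. The Python sketch below uses only the four means from the comment above. The names `means` and `strategy_effect` are just for illustration. Whether the difference is statistically significant comes from the model's interaction test, not from this arithmetic.

```python
# Group means as reported in the comment above.
means = {
    ("Female", "Experimental"): 12.478,
    ("Female", "Control"): 10.822,
    ("Male", "Experimental"): 10.904,
    ("Male", "Control"): 10.447,
}


def strategy_effect(gender):
    """Experimental minus control mean for one gender."""
    return means[(gender, "Experimental")] - means[(gender, "Control")]


female_effect = strategy_effect("Female")   # about 1.656
male_effect = strategy_effect("Male")       # about 0.457

# Difference of differences: zero would correspond to parallel lines.
interaction_contrast = female_effect - male_effect   # about 1.199

print(round(female_effect, 3), round(male_effect, 3),
      round(interaction_contrast, 3))
```

The effect for females is more than three times the effect for males, which is why the lines on the interaction plot cannot be parallel.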

RINI says

Hello Sir,

Your blogs were really helpful! I look forward to your guidance on the following problem that I am facing in my analysis.

In one of the objectives in my study

Independent variables – 2 levels of Instructional strategy

Dependent variable – Achievement

Moderator variable – Gender

Covariate- Pre-Achievement

For data analysis in SPSS I used ANCOVA to study the effect of Instructional Strategy, Gender and their interaction on Achievement.

The significance values were as follows:

Inst strategy: p=0.000

Gender: p=0.000

(Inst strategy* Gender): p=0.014

This clearly shows that the main effect as well as the interaction effect is significant.

However in the plot obtained in SPSS, the two lines DO NOT INTERSECT but are PARALLEL LINES

How should I interpret this result? Shall i consider it significant by ignoring the plots?

(since you mentioned in one of the replies above that “Sometimes the graphs show a difference that is nothing more than random noise caused by random sampling”)

Also for this objective the Assumption of homogeneity of regression slopes was not met. What should be done then?

Sir, kindly provide the necessary guidance. Thank you.

Jim Frost says

Hi Rini,

We often think that a significant interaction effect indicates that the lines cross. However, strictly speaking, a significant interaction effect simply indicates that the difference between the slopes is statistically significant. In other words, your data provide strong enough evidence to reject the null hypothesis that the slopes are the same. Unless there is an error in your graph, your lines will exhibit different slopes. Those lines might not cross on the graph, but if you extend the graph out far enough, they’ll eventually cross at some point. Of course, you don’t want to extend the graph outside the range of the data anyway.

There’s nothing inherently wrong with a significant interaction where the lines don’t cross on an interaction plot. Your model is telling you that the slopes of the lines are different. In other words, the nature of the relationship between variable A and B depends on the value of variable C (in a two-way interaction between B*C). In your case, the relationship between Instructional Strategy and Achievement depends on Gender. Use your graph to interpret the interaction more precisely. For example, it’s possible that one of your instructional strategies is even more effective for one gender than the other. Again, use your graph to make the determination.

There’s also one other consideration. The interaction is statistically significant, but is it practically significant? You can only answer that using subject-area knowledge rather than statistics. If you have a large sample size, the model has enough statistical power to determine that a small difference between slopes is statistically significant. However, that small difference might not be meaningful in the real world. I write about this in my post about statistical significance vs. practical significance.

I hope this helps!

Dan says

Hi Jim,

In your graph showing a continuous on continuous interaction, how would one go about determining the value of x1 where the predicted value of y is the same for values of x2? My question is about determining the x1 coordinate where the two regression lines for pressure cross each other. It seems like there should be a way to solve for the value of x1 without plugging a series of apparent x1 values into the equation and seeing where the difference in y is 0. Thanks.

Jim Frost says

Hi Dan,

You’d need to find the equation of each line, set the two equations equal to each other, and then use algebra to solve for X. In the example in this post, X is Temperature. I don’t have the two equations at hand, but, based on the graph, it should be about 95.
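For a model of the form Y = b0 + b1*X1 + b2*X2 + b3*X1*X2, setting the fitted lines for two values of X2 equal gives a closed form, so no trial and error is needed. The Python sketch below works through that algebra. The coefficient values are hypothetical and are not the estimates from the example in this post, and the function names are only for illustration.

```python
def crossing_x1(b2, b3):
    """X1 at which predicted Y is identical for every value of X2.

    For Y = b0 + b1*X1 + b2*X2 + b3*X1*X2, the gap between the fitted lines
    at two X2 values is (x2_a - x2_b) * (b2 + b3*X1), which is zero when
    X1 = -b2 / b3.
    """
    if b3 == 0:
        raise ValueError("No interaction: the lines are parallel and never cross.")
    return -b2 / b3


def predict(x1, x2, b0, b1, b2, b3):
    """Fitted value from the two-predictor model with an interaction term."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


# Hypothetical coefficients, NOT estimates from real data.
b0, b1, b2, b3 = 10.0, 0.5, -4.0, 0.05

x_cross = crossing_x1(b2, b3)

# Check: two different X2 settings give the same prediction at the crossing.
low = predict(x_cross, 1.0, b0, b1, b2, b3)
high = predict(x_cross, 3.0, b0, b1, b2, b3)
print(x_cross, low, high)
```

Note that the crossing point depends only on b2 and b3, so it is the same for any pair of X2 values you choose to plot.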

However, when I see interaction terms used in practical applications, understanding where the lines cross is usually not the main concern. Typically, analysts want to determine how to find the optimal settings and they look for the combination of settings that produce the best outcomes rather than different combinations that produce the same middling outcome. I can’t think of cases where the point at which the lines cross is meaningful. That’s not to say it couldn’t be–just that I haven’t seen that case.

I hope this helps!

Nada says

Thank you so much Jim for a brief and to the point introduction to interaction terms.

I hope you would save me by answering the following questions:

I have 5 independent variables and I’m interested in checking the interaction of one particular variable with each of the other 4 variables.

I did that in Stata and found one or two of the interactions to be significant. The thing is, including these interaction terms totally changes the values of the main effect coefficients. It can even change them from positive to negative. Why do you think this occurs, and how do I interpret the new main effects when including interaction terms?

Also, do I test the significance of each interaction term alone, or do I test the 4 together in the same model?

Thank you and I hope you could help me with this and appreciate it.

Chris says

Hi Jim,

Thank you for these very helpful posts!

I still need something cleared up if you could help?

How would we interpret the ‘economic’ values of the coefficients? Using the continuous variables as an example: does a 1% increase (decrease) in pressure, a 1% increase (decrease) in temperature lead to a X% increase (decrease) in strength?

Thomas Antonaccio says

Hi Jim: I ended up switching to SPSS to test interaction effect via moderated multiple regression. I continue to read up on interpreting the outcomes, but the one thing that I’m still confused about is: let’s say the main effect (in my case Lean participation and psychological safety) explain about 35% of the DV (workforce engagement), and this main effect is statistically significant. Is this saying that Lean participation and psychological safety, together, account for 35% or does each one variable separately account for 35%? The interaction effect was not significant, indicating that the relationship is not dependent on the high/low levels of psychological safety. However, it seems that the significant main effect of 35% is important in and of itself, indicating that psychological safety does play a role, though just not in combination with Lean participation. Is this making sense? Am I off the mark here? thanks again for the very informative blog posts….tom

Jim Frost says

Hi Tom,

To answer this question, I need to verify your model. Do you have two IVs, Lean Participation and Psychological Safety, and they’re both significant?

If so, and the 35% comes from the R-squared, then your model suggests that the two main effects together explain 35% of the variation in the DV around its mean. Because the interaction effect is not significant, the relationship between LP and WE does not change based on PS. Equivalently, the relationship between PS and WE does not change based on LP.

In a nutshell, the two main effects add up to 35%. If Psychological Safety is significant, then your data provide sufficient evidence to conclude that changes in PS relate to changes in WE. Same for LP.

It sounds like you’re on the right track!

Rotimi says

Hello Jim:

Please do you have any of your books that I can buy online which will offer more guidance for me for this analysis?

Best,

Rotimi says

Thanks a lot Jim. Much appreciated!

Rotimi says

Hello Jim:

Thank you for your insightful post. My study involves one continuous dependent variable (poverty status) and 5 categorical independent variables (financial services, electricity, healthcare, water and education).I am interested in both the main effects between each of the independent variable on the dependent variable as well as any interaction effect between the independents.

I am planning to use factorial ANOVA for this. There are two groups for the independents – access or no access.

Kindly advise if factorial ANOVA is appropriate for this analysis and if not which one would you recommend?

Thank you

Rotimi

Jim Frost says

Hi Rotimi,

Yes, factorial ANOVA sounds like the right choice. Factorial ANOVA is like one-way ANOVA but it can have more than one independent variable. It’s actually the same analysis with the same math behind it, just with a different name to represent the number of IVs.

Best of luck with your analysis!

Thet Lynn says

Hi Jim,

Your post is awesome and appreciate your kindness in sharing such invaluable knowledge with an open access.

I have two questions for your kind response/validation, if it is no bother.

—

Referring to a basic equation of multiple regression with an interaction term;

Y = intercept + β1*X1 + β2*X2 + β3*X1*X2,

– Question 1

If I have to interpret, using the coefficients, the interaction effect of X2 on the relationship between X1 and Y, the correct and basic way of thinking is;

Y = (β1+β3*X2)*X1

[please correct me if I am wrong.]

I know β3 has to be significant to infer the presence of modification.

However, does β1 need to have a significant P-value to interpret the interaction effect in such a way as above?

If it does, is the way of interpretation changed in case it was not significant in the model?

– Question 2

I have been finding it difficult to calculate the confidence interval of the interaction effect.

Do you have any idea of calculating it using R, for instance?

Or should I do it manually as some sources mentioned?

Best Regards,

Thet

mmusakaleem says

Hi Jim, Great explanations..

Thomas says

Hi Jim: I enjoy and am learning a great deal from reading your posts and look forward to reading your book on regression analysis….I do have a question in the interim….in reviewing the likes of Laerd Statistics for moderated regression, the focus is on testing the interaction term for significance and reporting findings…and that is essentially it….but if you see that the main effect is significant, would that not be meaningful or is that trumped by the insignificant finding from the interaction analysis? In my case, the hypothesis concerns the interaction, which is not significant, but I am wondering if the two independent variables independently and significantly explaining, say, 35% of the variance is still relevant and worth mentioning in my write up. Thoughts? Many thanks…Thomas

Jim Frost says

Hi Thomas,

First of all, thank you so much for buying my book! I really appreciate that!

In the book, I go into much more depth about interaction effects than I do on my blog. I talk about the question that you’re asking. In the book, the section on interaction effects starts on page 113. I’ll give a preview of what you’ll read when you get to that section.

Yes, it’s meaningful knowing that the main effects are significant whether or not the interaction effects are significant. When you look at the total effect of an independent variable, a significant main effect indicates that some portion of the IVs effect does not depend on the value of the other IVs, just its own value. If the IV’s interaction effect is also significant, then you know that another portion of its effect does depend on the values of other variables.

In your case, because the interaction effects are not significant, you can make interpretations based solely on the main effects. In fact, the main effects are particularly useful for interpretation when the interaction effects are not significant. Yes, those significant main effects are very important and worth discussing in your write-up!
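To make the split between the two portions concrete, here is a minimal sketch for a 2x2 design. The function name `decompose_2x2` and the cell means are hypothetical values made up for illustration, not data from any study, and the sketch only does arithmetic on means. It says nothing about statistical significance.

```python
def decompose_2x2(cell_means):
    """Split 2x2 cell means into main effects and an interaction contrast.

    cell_means maps (a, b) -> mean outcome, with a and b each coded 0 or 1.
    """
    m00, m01, m10, m11 = (cell_means[(a, b)] for a in (0, 1) for b in (0, 1))
    # Main effect of A: change from a=0 to a=1, averaged over both levels of B.
    main_a = (m10 + m11) / 2 - (m00 + m01) / 2
    # Main effect of B: change from b=0 to b=1, averaged over both levels of A.
    main_b = (m01 + m11) / 2 - (m00 + m10) / 2
    # Interaction: how much the effect of A differs between b=1 and b=0.
    interaction = (m11 - m01) - (m10 - m00)
    return {"main_a": main_a, "main_b": main_b, "interaction": interaction}


# Hypothetical cell means (illustration only).
additive = {(0, 0): 10.0, (0, 1): 14.0, (1, 0): 16.0, (1, 1): 20.0}
mixed = {(0, 0): 10.0, (0, 1): 14.0, (1, 0): 16.0, (1, 1): 26.0}

print(decompose_2x2(additive))  # interaction contrast is 0: main effects tell the whole story
print(decompose_2x2(mixed))     # nonzero interaction: part of A's effect depends on B
```

In the additive case, the effect of A is the same at both levels of B, so the main effects alone describe the pattern. In the mixed case, the main effects still describe the average change, while the interaction contrast captures the part that "depends."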

Best of luck with your analysis!

sofea says

Hi Jim, thank you for the valuable explanation. I am a little curious about my result. I used a two-way ANOVA to check the interaction effect between x (language style matching) and y (shopping intention). Language style was significant, but gender was not. No significant difference was found either, and I then rejected the null hypothesis. However, when I ran my data through a serial multiple mediator model (macro) with M1 (benevolence) and M2 (integrity), there was a significant difference for the direct effect (x to y).

My questions are:

1) Can I use the serial multiple mediator model to assess the significance of the direct effect and ignore the ANOVA result?

2) Or can I just use the serial multiple mediator model to see all the effects?

I really need your advice on this matter.

Iseul says

Hi Jim,

I wanted to check your website (I actually subscribe to your posts, which have been super helpful!) because I ran into an issue similar to the one mentioned above: the coefficient of the interaction term X1*X2 is significant, but the coefficient of the independent variable X1 is no longer significant when the interaction term is added. I was wondering how to interpret the significance of the effects of X1 and X2 overall, and now I have found your answer here. Thanks a lot!

Iseul

ymk says

Hi Jim, could you please help answer some questions regarding interactions?

I am doing a dissertation on survival analysis and found a significant interaction between two terms, call them A (subgroups 1, 2, 3) and B (subcategories 1, 2), among other variables, say C, D, and E.

On the Kaplan-Meier curves, the log-rank test shows a significant difference across the levels of factor A (1, 2, 3) when I stratify by B1 and B2 (significant in B1 but not in B2). So I suspect an interaction.

I only found a significant interaction (p < 0.05) between A2 and B, but not between A3 and B. Does it still count as significant?

Also, when I proceeded with a multivariate Cox regression using C, D, E and 6 dummy variables for the (A*B) interactions, I wasn't able to find any significance at all in the 6 dummy variables. Is that possible? How should I interpret it in the discussion?

Jim Frost says

Hi,

Correct me if I’m wrong, but I think the heart of your question is that your main effects are not significant but the interaction effects are significant. If so, that’s not a problem.

I write about this in more detail in my ebook about regression analysis. In a nutshell, it just means that for each variable, the effect depends on the value of the other variables. None of the variables have an effect that is independent of the other variables. There’s nothing wrong with that. It’s apparently just the nature of your effects.
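As a rough illustration of how main effects can vanish while an interaction is strong, here is a small sketch using made-up cell means (hypothetical numbers, not from your study). The helper `simple_effects` is a name I'm introducing here just for this example.

```python
def simple_effects(cell_means, levels_a, levels_b):
    """Effect of moving A from levels_a[0] to levels_a[1], at each level of B."""
    low, high = levels_a
    return {b: cell_means[(high, b)] - cell_means[(low, b)] for b in levels_b}


# Hypothetical crossover pattern (illustration only).
means = {
    ("a1", "b1"): 10.0, ("a1", "b2"): 20.0,
    ("a2", "b1"): 20.0, ("a2", "b2"): 10.0,
}

effects = simple_effects(means, ("a1", "a2"), ("b1", "b2"))
average_effect = sum(effects.values()) / len(effects)

print(effects)         # the effect of A has opposite signs at b1 and b2
print(average_effect)  # averaged over B, the effect of A cancels out
```

Here the effect of A is positive at one level of B and negative at the other, so the average (the main effect) is zero even though A clearly matters at each level of B.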

You should get a p-value for the overall A*B interaction. If that overall interaction isn’t significant but certain combinations of factor levels are, you can state that the difference between that specific combination of levels and the baseline is statistically significant. That’s not quite as strong a result as a significant overall A*B interaction, but it might be just fine. Its validity depends on the nature of your subject area, so I can’t comment on that. Perhaps the effect exists only for that specific combination and not for others. Use your subject-area knowledge and theory to help you determine whether that makes sense. Are those results consistent with theory and other studies?
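One common way software arrives at an overall interaction p-value is a likelihood-ratio test that compares the model with the interaction terms to the model without them. The sketch below assumes that approach, which may differ from what your package reports (some use a Wald test instead). The log-likelihood values are hypothetical placeholders, and the function names are mine. Under reference (dummy) coding, a 3-level by 2-level interaction contributes (3 - 1) x (2 - 1) = 2 terms, so the test has 2 degrees of freedom. The chi-square tail formula used here is exact only for even degrees of freedom.

```python
import math


def interaction_df(levels_a, levels_b):
    """Number of interaction terms under reference (dummy) coding."""
    return (levels_a - 1) * (levels_b - 1)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    # For df = 2k: P(X > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Compare nested models: reduced (no interaction) vs. full (with interaction)."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf_even_df(statistic, df)


df = interaction_df(3, 2)
# Hypothetical (partial) log-likelihoods, for illustration only.
statistic, p_value = likelihood_ratio_test(-105.0, -101.0, df)
print(df, statistic, round(p_value, 4))
```

The point of the joint test is that it evaluates all the A*B terms together, which is why it can disagree with the p-values of the individual dummy variables.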

Best of luck with your study!

George says

Hi there,

I’m reading a paper where they have a treatment*replicate interaction and I just want to make sure I understand what that means and how to avoid it. So in the study they have multiple pathogen strains they are testing on multiple varieties of a crop and they do two replications. They say there is a strain*replicate interaction so they can’t merge the replicates together.

Does this mean a variable not accounted for had a significant effect on the effect of the strain for only one of the replicates?

Could adding more replicates eliminate the interaction?

Thank you for any insight into the matter of replicate interactions.