Use two-way ANOVA to assess differences between the group means that are defined by two categorical factors. In this post, we’ll work through two-way ANOVA using Excel. Even if Excel isn’t your main statistical package, this post is an excellent introduction to two-way ANOVA. Excel refers to this analysis as two factor ANOVA.

Two-way ANOVA is a hypothesis test that allows you to compare group means. Like all hypothesis tests, two-way ANOVA uses sample data to infer the properties of an entire population.

In this post, I provide step-by-step instructions for using Excel to perform two factor ANOVA and then interpret the results. Importantly, I also include links to many additional resources I’ve written that you won’t find in Excel’s documentation.

Before proceeding, ensure that Excel’s Data Analysis ToolPak is installed for Excel. Click Data in Excel’s menu along the top and look for Data Analysis in the Analyze section.

If you don’t see Data Analysis, you need to install it. My post about using Excel to perform t-tests shows how to install the Data Analysis ToolPak. It’s free!

**Related post**: Hypothesis Testing Overview

## Two-Way ANOVA in Excel

To perform this analysis, you’ll need two categorical variables, which analysts also refer to as factors. These factors are your independent variables. The number of factors in your analysis determines the name of the ANOVA analysis. One-way ANOVA uses one factor. Two-way ANOVA has two. And, so on. Each factor has a finite number of possible values, which are known as levels. For example, gender is a categorical factor that has the two levels of male and female.

You also need one continuous outcome variable, which is the dependent variable. The various combinations of values for the two categorical variables divide the continuous data into groups. Two-way ANOVA determines whether the mean differences between these groups are statistically significant.

Additionally, two-way ANOVA determines whether the interaction effect between the two factors is statistically significant. When significant interaction effects are present, it’s crucial to interpret them correctly. Because they can be tricky to understand, I’ll spend a little extra time covering interaction effects.

I’ll run through two examples to explain the differences between cases where the interaction effect is and is not significant. Download the CSV dataset for both examples: Two-Way ANOVA. These data are fictional.

To learn more about ANOVA tests, including the more complex forms, read my ANOVA Overview.

## Choose Two Factor ANOVA in Excel

To perform two-way ANOVA in Excel, choose one of the two factor ANOVA options, as shown below. You can choose analyses for designs that either have replications or do not have replications.

Replication in this context refers to collecting multiple observations for each group. For instance, in the first analysis example in this post, we’ll look at gender and college major to determine whether these factors correspond to statistically significant income differences. If we collect only one observation per group (e.g., one female who majors in statistics), we’d use the “without replication” analysis. However, that produces a tiny sample size. Therefore, we’ll use the “with replication” option so we can have multiple observations per group. For both of the examples, we’ll have 20 observations per group.

Excel can only perform balanced two-way ANOVA. In balanced designs, all groups have an equal number of observations.

**Related post**: How to do One-Way ANOVA in Excel

## Example Two-Way ANOVA without a Significant Interaction

Imagine that we are assessing annual salaries, which is our continuous dependent variable. Our two categorical factors are gender and college major. For this analysis, we’ll use the three majors of statistics, psychology, and political science. The combination of these two factors (2 genders X 3 majors) produces the following six groups. Each group contains 20 observations.

Male / Statistics: $77,743 | Female / Statistics: $74,074 |

Male / Psychology: $69,766 | Female / Psychology: $65,320 |

Male / Political Science: $62,015 | Female / Political Science: $55,195 |

The dollar amount indicates the average income for each group. Two-way ANOVA determines whether the observed differences between means provide strong enough evidence to conclude that the population means are different. Let’s perform the analysis!

In Excel, do the following steps:

- Click
*Data Analysis*on the Data tab. - From the Data Analysis popup, choose
*Anova: Two-Factor With Replication*. - Under
*Input*, select the ranges for all columns of data. - In
*Rows per sample*, enter*20*. This represents the number of observations per group. - Excel uses a default Alpha value of 0.05, which is usually a good value. Alpha is the significance level. Change this value only when you have a specific reason for doing so.
- Click OK.

For this example, the popup should look like this:

### Interpreting Excel’s Two-Way ANOVA Results

First, look in the P-value column in the ANOVA Source of Variation table at the bottom of the output. Because the p-values for both Gender and Major are less than our significance level, these factors are statistically significant. These are the main effects in the model. Note that these p-values are so low that Excel uses scientific notation to represent them. Learn about How to Use Scientific Notation.

On the other hand, the interaction effect is not significant because its p-value (0.151) is greater than our significance level. Because the interaction effect is not significant, we can focus on only the main effects.

The main effects are the portion of the relationship between an independent variable and the dependent variable that does not change based on the values of the other variables in the model. For example, gender’s main effect on average income does not change from one major to another. It’s a consistent effect across majors. Males have a higher average income, and that effect is consistent (plus or minus random error) across majors.

### Summary Table Information

The Summary table in the top portion displays mean incomes for our six groups. These means range from a low of $55,191 for female political science majors to a high of $77,743 for male statistics majors. Our sample means are different. Importantly, the significant p-values indicate that our sample data provide strong enough evidence to conclude that the population means of these six groups are not equal.

Overall, males tend to have higher average incomes across the majors. Additionally, statistics majors tend to have the largest average salary.

If we had obtained insignificant p-values for Gender and Major, the sample differences between group means might represent random sampling error rather than differences between the population means.

For more information about the other statistics in the output, click the links for F-value and F critical value, degrees of freedom (df), average (mean), variance, p-values, and significance level.

## Two-Way ANOVA Example 2 with a Significant Interaction

In the previous example, the interaction effect was not significant. Consequently, it was a simple matter to interpret the main effects. How do you understand interaction effects? To emphasize how these effects work, I’ll use an intentionally silly example to make it more intuitive.

Imagine that we’re performing a taste test, and the dependent variable is Enjoyment. Our two factors are Food and Condiment. We want to determine which condiment produces the most enjoyment. To keep things simple, we’ll include only two foods (ice cream and hot dogs) and two condiments (chocolate sauce and mustard) in our analysis.

I think of interaction effects as an “it depends” effect.

Interaction effects indicate that another variable influences the relationship between an independent and dependent variable. This type of effect makes the model more complex, but if it accurately reflects the real world, it is critical to know how to interpret it. For example, the relationship between condiments and enjoyment probably depends on the type of food.

In Excel, do the following steps:

- Click
*Data Analysis*on the Data tab. - From the Data Analysis popup, choose
*Anova: Two-Factor With Replication*. - Under
*Input*, select the ranges for all columns of data. - In
*Rows per sample*, enter*20*. This represents the number of observations per group. - Excel uses a default Alpha value of 0.05, which is usually a good value. Alpha is the significance level. Change this value only when you have a specific reason for doing so.
- Click OK.

For this example, the popup should look like this:

### Interpreting Excel’s Two-Way ANOVA Results

First, look in the P-value column in the ANOVA Source of Variation table at the bottom of the output. The p-values indicate that Food is not significant (p = 0.801) , while Condiment is statistically significant (p = 0.001). These are the main effects. The significance of Condiment suggests that a portion of the relationship between Condiment and Enjoyment does not change based on the value of the other independent variable.

However, the extremely low p-value for the interaction effect is also statistically significant. In general, interaction effects indicate that the relationship between an independent variable and a dependent variable changes based on the value of another variable.

For our ANOVA model, the significant interaction indicates that the relationship between Condiment and Enjoyment *depends* on the type of food. That makes sense when you think about it. You might like chocolate sauce quite a bit, but I bet you’d find it disgusting on hot dogs!

### Interaction Effects in Depth

If only the main effects are significant, you’d be able to state that either chocolate sauce or mustard always increased your satisfaction the most and by a consistent amount regardless of the food. But that’s not how it works with food and condiments!

When you have statistically significant interaction effects, you can’t interpret the main effects without considering the interactions. In the previous example, you can’t answer the question about which condiment is better without knowing the type of food. Sometimes mustard is better while other times chocolate sauce is better. Again, “it depends.”

It’s often easiest to interpret an interaction effect using specialized plots known as interaction plots. Unfortunately, Excel can’t create these plots. However, you can look at the means for each group to get an idea of how it works. The hot dog/mustard group (89.6) and the ice cream/chocolate sauce group (93.0) each have relatively high enjoyment scores. Conversely, hot dog/chocolate sauce group (65.3) and ice cream/mustard group (61.3) have lower scores. That makes sense intuitively. However, to explain the changing relationship between Condiment and Enjoyment statistically, you need to include the interaction effect.

To learn more about interaction effects in more detail and with graphs, see my post Understanding Interaction Effects.

## What Excel Does Not Include

While it’s fantastic that Excel allows you to perform two-way ANOVA, it doesn’t have all the features that a full-blown statistical package provides. As I mentioned, the easiest way to understand interaction effects is by using interaction plots. Unfortunately, Excel can’t create them.

Excel can analyze only balanced designs. If your groups have an unequal number of observations, you’ll need to use other software.

If you need to control for a covariate, you’ll need to perform an Analysis of Covariance (ANCOVA) instead. Unfortunately, that’s not an option in Excel.

Additionally, significant ANOVA results only indicate that not all group means are equal. If you want to determine whether the mean differences between specific pairs of groups are statistically significant, you’ll need to use a post hoc analysis.

It’s always good practice to check the residual plots to be sure you’re not violating an assumption. Unfortunately, Excel can’t create them.

Brandon says

Hi Jim, in this example the gender factor is significant, yet we say there is no variation in the mean income given gender. While new to interpretation of ANOVA, I would think that with a significant result for the gender factor, we would reject the null and accept the alternative hypothesis that says the mean incomes in the gender factor are different. ??

Jim Frost says

Hi Brandon,

Yes, that is correct that gender is significant. For these data, males have a consistently higher mean income across the majors.

I think what you’re referring to is that the interaction effect between gender and major is not significant. So, mean male income is consistently higher across majors, indicating that the mean difference in genders does not change by major. If the interaction between gender and major had been significant, then the mean difference between male and female incomes would depend (or vary) based on major. But it doesn’t for these data.

I hope that makes that distinction clear. The main effects of gender and major are significant but the interaction effect between them is not significant.

Michael Bobic says

Jim This post would be a lot more useful if you gave us the sample data set up. For example, you have three categories of degrees, but your data input has four columns. Is one of those columns a category for gender? It just seems like you’re missing some information for us.

Jim Frost says

Michael, it appears that you need to read the article more carefully. I’ve included a link to download the dataset and the article tells you

exactlywhat you need to do in Excel, including how to fill in the popup boxes. I have both written instructionsandscreenshots showing what you need to do. I don’t what more you need.Dila Elbi says

Hi I am trying to use the two-way ANOVA for my research project where I am trying to asses if the interaction of both student personalit and teacher personality have an effect on student GPA. So I have two categorical values that affect quantitative values. I have 2 groups for teacher personality, introversion and extroversion, and 3 groups for student personality; so, sx, sp. I followed the exact instruction on this website and also other websites to carry out the tests but I am just not sure if my results are accurate and there is no one to help me asses the accuracy of these results or how to interprete it even if they are accurate. While doing the test I first put the teacher categories to the first column then the student categories to the first row and put down the GPA’s of students that fit into these categories that I have collected with a form but Excel game me #NUM ! for sample and interaction. I then ran the tests by putting the teachers in first column, then the students in the second column, and gpas in the third column but when I did that Excel didn’t let me put words into the second column so I performed it by assigning numbers to the student personalities that gave me the most accurate looking table (to my opinion, because I only look at the p-values and if there has been any error or not because I don’t know what any of the data mean) by many variations of how to write these data. So in short, I don’t if the problem is how I write my data, how I run the test, if my data is even appropriate for ANOVA or something else. I’ll appreciate any type of feedback, thanks already.

Adrian says

Hi Jim,

Regarding the the Food and Condiment example, how were the enjoyment scores calculated? Was there a scale included (i.e., how were the responses coded)?

Jim Frost says

Hi Adrian,

The data are entirely fictional. In my head, I imagined some sort of survey. I knew someone who did statistical studies for food manufacturers, and they did taste tests with surveys and created their own scales for those purposes. I don’t know the details about that, but the food industry uses measurements like that.

So, that’s what I was thinking about when writing that example. But, in reality, I don’t think you would need to ask people about how much they liked mustard on sundaes and chocolate sauce on hot dogs! If you did, they would probably give you a strange look and probably refuse!

I just wanted to find some visceral examples that would drive the point home. You can love mustard and you can love chocolate sauce, but if someone asks you which one you’d want on your next bit of food? You’d have to say it depends on the food!

I just figured the repugnant combinations would make the idea of an interaction effect easier to understand before getting into a more scientific example.

I’m so glad to hear that you’re enjoying my book! That makes my day!

Michael Furlan says

I can not find Data Analysis in my Excel. Where is it?

Jim Frost says

Hi Michael,

I show you where to find it at the top of this post. If it’s not there, find the directions for how to add it in my post about using Excel with T-Tests, where I cover that near the beginning.

Weam Alghabban says

Hi Jim.

I have one question please.

I have scale dependent variable and 2 independent variables, one with 2 categories and the other with 3 categories (between subjects).

Data is not normally distributed and there are outliers. So, in this case, I have to use a nonparametric test equivelant to 2 way ANOVA.

What is the nonparametric test equivelant to 2-way ANOVA?

Thank you very much

v.c.Jayaramaiah says

To

Jim Frost

Sir. as you have described two way anova, I tried and tried to analyse my researach data two Factor-with replication I am getting thereply Input range contains non numeric data with this dialogue my repeated attempts have gone waste please help to solve this error?

Aneeza says

It helped me a alot thank much sir

Melanie says

Hi Jim!

Thanks so much for the useful website! Its a big help now when I’m working on my thesis, since its been many years since I studied statistics! I’ve gotten a nice big data set and I feel its a shame that I don’t really know how to best analyze it. I’m trying to decide what kind of tests are best to use. Did I understand correctly that the results you get for the individual “main effects” are different from what you’d get if you did, for example, a t-test to individually test gender vs. income in example 1? Would you normally do both a t-test and an ANOVA, or is the ANOVA enough to see if there is an effect of gender on income?

Thanks!

Melanie

Christine says

Great post – very clear! Many thanks

Francis Asiedu says

Jim, where does Standard Error of the Means come in for the Two-way ANOVA with replications?

Mohammed Atari says

My name is Mohammed Atari, I bought copies of your books Introduction to Statistics and Hypothesis Testing. Very interesting books.

I have a question about comparing a number of concentration-time profiles to check if they are equivalent. Each profile has a number of readings at specific time points. For two profiles, I did a paired-sample t-test, however, I am now at a stage where I would like to compare three or more profiles (different conditions). I understand that ANOVA is the way to do so; this method averages data of all subjects within each condition, then compares their means. However, I would like to look at the differences of individual subjects between the conditions, and compare those between subjects.

Please advise and best wishes,

Mohammed.

Jim Frost says

Hi Mohammed,

Thank you so much for supporting my ebooks. I really appreciate it! And, I’m thrilled to hear that you found them to be helpful!

To do what you’d like it to do, you’ll need to use a mixed model (both fixed and random factors). This is a more complex model than in any of my books (yet). You’ll need use an ANOVA model and set the profile and time period as fixed factors. Then, be sure that each subject as a unique identifier, and include subject as a random factor.

I detail some of this process in my post about repeated measures ANOVA. In particular, see the example near the end.

It’s a more complicated model, but a good one to use when needed!

I hope this helps!

Julie Carignan says

Hi Prof Jim,

As a Business Analyst who has a goal of morphing into Business Intelligence Analyst, I find myself wanting to refresh myself on statistics I learned in my Six Sigma training and biostatistics class I took in grad school, I’m very happy to come upon this site! Thank you for sharing your knowledge!

On this ANOVA topic, I notice that it goes right into how to do these types of analyses. I haven’t been able to find the basic explanation of WHY an ANOVA is used and WHEN I’d resort to it. Could you write a topic just providing that basic context to help orient ourselves?

Jim Frost says

Hi Julie,

In a nutshell, you use ANOVA to determine whether the means of three or more groups are not equal. In one-way ANOVA, you have one factor that divides the data into groups, such as experimental group. In two-way ANOVA, as shown in this post, there are two factors that divide the data into groups, such college major and gender. Basically, use ANOVA when you want to compare group means. If you have just two group means, you can use a t-test. I’ve actually just released an ebook all about hypothesis testing that goes into the whys and whens for all types of hypothesis tests, include ANOVA. It sounds like it’s exactly what you’re looking for.

Swateja Deshmukh says

Thanks Prof Jim

This was really a wonderfull explanation for Two way ANOVA in a very simple format.

MALLICKA says

Hi Jim! your explanations are very nice.

As i am not very pro to do analysis, still learning. My very basic question is how to select put range in that input box?

Thanks and Regards

Mallicka

Wagner Fontes says

Thank you very much for making it crystal clear!

I will study in more detail this aspect (factorial design) of this project to better understand if the most relevant factors are confounded or not by the currently lacking information.

Jim Frost says

You’re very welcome, Wagner! Best of luck with your analysis!

Reshme says

Hi Professor,

Would love to get some doubts cleared from you regarding method of analysis in excel for a 3 Factorial Randomized Block Design consisting of FACTOR A( 2 levels), FACTOR B( 8 levels) and FACTOR C (2 levels) with 3 replications.

Thanks in advance.

Wagner Fontes says

Sorry I was not clear…

The following table better describes which combinations of the independent variables I have and the dashes represent what I don’t have:

gravity Y – Y Y N – N N

training Y – Y N Y – Y N

flow restriction Y – N N Y – N N

so, I don’t have:

gravity Y N

training N N

flow restriction Y Y

I was thinking about your suggestion of nested factors, as I don’t have a group of conditions for flow restriction that is specific for each condition of training. I just don’t have any condition for flow restriction when the training was not applied. I’m rather confused, but I think that would not fit nested ANOVA as well…

Thanks,

Wagner.

Jim Frost says

Hi Wagner,

I think formatting issues in the comments and my lack of subject area knowledge are hindering us here. It’s still not clear to me which combinations you have and are missing. I see that your design has 8 possible combinations that you can put in the table like below. While I’m not totally clear on what you have and don’t have, I think I have enough information to answer your question broadly.

A full factorial design is where you have measured all possible combinations of factor levels. You have a fractional factorial design, which means that you haven’t measured at all possible levels. It’s contains a subset of the combination of the full factorial design. You can still analyze these designs but they confound some of the main effects and 2-way interactions and they cannot be separated from the effects of other higher-order interactions. That’s often OK because researchers frequently assume the higher-order interactions have a negligible effect.

However, determining the specific limitations of your design is beyond what I can determine here. If you analyze it using software that can handle factorial designs, it should tell you which effects are confounded.

I hope that helps!

Wagner Fontes says

Hi Jim,

Thank you for another nice post.

If I may ask, I have a question about an experimental design:

I have data from an experiment measuring the amount of one protein in three factors (gravity, training, flow restriction), each presenting two levels (yes/no). Up to here, everything seems fine to use three-way ANOVA. I would have one dependent variable (protein amount) and three independent variables.

My problem is that I don’t have data for one of the combinations: The flow restriction is a modification of the training, therefore it only exists when training exists (the combinations below do not exist)

gravity = yes gravity = no

training = no training = no

flow restr. = yes flow restr. = yes

In that case, can I consider the variable “flow restriction” as independent? As it depends on the variable “training” to exist.

How should I analyze this? Is three-way ANOVA appropriate?

I also considered using two-way ANOVA (the variable training would have three levels – “yes with restriction”, “yes without restriction” and “no”), but then I would not be able to evaluate the interaction between flow restriction and the other variables.

Thanks in advance!

Jim Frost says

Hi Wagner,

Just to clarify, you don’t have data for one condition, which means you don’t have data for three combinations of conditions? It’s not really clear from your table. You have 6 treatment combinations. How many of those do you have data for?

Would flow restriction be a factor nested within training? Maybe you have a nested design? Three factors but one nested with another.

Dr M. SANKARAPANDIAN says

If my data involves Experimental and control as well as repeated measure(pre,middle and post) s. Can we use One way ANOVA with repeated measures else what method to be adopted

Tessa Davis says

My data involves 2 independent variables (one of which is categorical (drug doses) and one of which is ordinal (time)), and 1 dependent variable (protein expression). I was able to arrange my data in the format for the 2-way ANOVA with repetition following your example.

My groups are ‘drug dose A/Time 1’ (8 measurements), ‘drug dose B/Time 1’ (8 measurements)… and so on, for a total of 3 drug doses and 18 time points. The p-value indicates that the effect of drug dose on protein expression does not depend on the time point.

However, I want to see if the effect on protein expression is significant between drug doses at each time point.

How would I arrange my data to see this? Would I need to use a different test (if so, which one)?

Thank you for your help (and I really like your explanation above)!

Jim Frost says

Hi Tessa,

It sounds like you need to perform post hoc analysis to determine whether differences between different factor levels are significant. You wouldn’t need to necessarily rearrange your data, but you’d need to perform some tests that, unfortunately, Excel does not include. I’m not sure if you’re using Excel, but all dedicated statistical software will include these tests. Read my post of them to learn more!

biruk says

nice

AndrÃ© SERVILLE says

Hi Jim!

I appreciate it.

Very nice

Dr. d p singh says

Thanks Prof Jim

Very good exposition to Two way anova