What is Factor Analysis?
Factor analysis uses the correlation structure amongst observed variables to model a smaller number of unobserved, latent variables known as factors. Researchers use this statistical method when subject-area knowledge suggests that latent factors cause observable variables to covary. Use factor analysis to identify the hidden variables.
Analysts often refer to the observed variables as indicators because they literally indicate information about the factor. Factor analysis treats these indicators as linear combinations of the factors in the analysis plus an error term. The procedure assesses how much of the variance each factor explains within the indicators. The idea is that the latent factors create commonalities in some of the observed variables.
For example, socioeconomic status (SES) is a factor you can’t measure directly. However, you can assess occupation, income, and education levels. These variables all relate to socioeconomic status. People with a particular socioeconomic status tend to have similar values for the observable variables. If the factor (SES) has a strong relationship with these indicators, then it accounts for a large portion of the variance in the indicators.
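To make this concrete, here's a minimal simulation sketch of the SES example in Python (the variable names, loadings, and error scales are illustrative assumptions, not estimates from real data):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000

# The unobserved latent factor: we never "measure" it directly.
ses = rng.normal(size=n)

# Each indicator is a linear combination of the factor plus an error term.
occupation = 0.8 * ses + rng.normal(scale=0.6, size=n)
income     = 0.7 * ses + rng.normal(scale=0.7, size=n)
education  = 0.9 * ses + rng.normal(scale=0.4, size=n)

# The factor leaves its fingerprint: the indicators correlate with each other.
print(np.corrcoef([occupation, income, education]).round(2))
```

Even though `ses` never appears among the observed columns, the indicators covary because they share it as a common cause, which is exactly the correlation structure factor analysis exploits.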
The illustration below shows how the four hidden factors in blue drive the measurable values in the yellow indicator tags.
Researchers frequently use factor analysis in psychology, sociology, marketing, and machine learning.
Let’s dig deeper into the goals of factor analysis, critical methodology choices, and an example. This guide provides practical advice for performing factor analysis.
Analysis Goals
Factor analysis simplifies a complex dataset by taking a larger number of observed variables and reducing them to a smaller set of unobserved factors. Anytime you simplify something, you're trading exactness for ease of understanding. Ideally, you obtain a result where the simplification helps you better understand the underlying reality of the subject area. However, this process involves several methodological and interpretative judgment calls. Indeed, while the analysis identifies factors, it's up to the researchers to name them! Consequently, analysts debate factor analysis results more often than those of other statistical analyses.
While all factor analysis aims to find latent factors, researchers use it for two primary goals. They either want to explore and discover the structure within a dataset or confirm the validity of existing hypotheses and measurement instruments.
Exploratory Factor Analysis (EFA)
Researchers use exploratory factor analysis (EFA) when they do not already have a good understanding of the factors present in a dataset. In this scenario, they use factor analysis to find the factors within a dataset containing many variables. Use this approach before forming hypotheses about the patterns in your dataset. In exploratory factor analysis, researchers are likely to use statistical output and graphs to help determine the number of factors to extract.
Exploratory factor analysis is most effective when multiple variables are related to each factor. During EFA, the researchers must decide how to conduct the analysis (e.g., number of factors, extraction method, and rotation) because there are no hypotheses or assessment instruments to guide them. Use the methodology that makes sense for your research.
For example, researchers can use EFA to create a scale, a set of questions measuring one factor. Exploratory factor analysis can find the survey items that load on certain constructs.
Confirmatory Factor Analysis (CFA)
Confirmatory factor analysis (CFA) is a more rigid process than EFA. Using this method, the researchers seek to confirm existing hypotheses developed by themselves or others. This process aims to confirm previous ideas, research, and measurement and assessment instruments. Consequently, the nature of what they want to verify will impose constraints on the analysis.
Before the factor analysis, the researchers must state their methodology including extraction method, number of factors, and type of rotation. They base these decisions on the nature of what they’re confirming. Afterwards, the researchers will determine whether the model’s goodness-of-fit and pattern of factor loadings match those predicted by the theory or assessment instruments.
In this vein, confirmatory factor analysis can help assess construct validity. The underlying constructs are the latent factors, while the items in the assessment instrument are the indicators. Similarly, it can also evaluate the validity of measurement systems. Does the tool measure the construct it claims to measure?
For example, researchers might want to confirm factors underlying the items in a personality inventory. Matching the inventory and its theories will impose methodological choices on the researchers, such as the number of factors.
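For a sense of what that looks like in practice, here's a sketch of a two-factor CFA using the third-party semopy package in Python. The item names, factor labels, and file name are hypothetical:

```python
import pandas as pd
import semopy

# lavaan-style model syntax: each latent factor "=~" its indicators.
desc = """
Extraversion =~ item1 + item2 + item3
Neuroticism  =~ item4 + item5 + item6
"""

df = pd.read_csv("personality_items.csv")  # hypothetical dataset

model = semopy.Model(desc)
model.fit(df)

print(model.inspect())           # factor loadings and other estimates
print(semopy.calc_stats(model))  # goodness-of-fit measures (CFI, RMSEA, etc.)
```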
We’ll get to an example factor analysis in short order, but first, let’s cover some key concepts and methodology choices you’ll need to know for the example.
Learn more about Validity and Construct Validity.
Factors
In this context, factors are broader concepts or constructs that researchers can’t measure directly. These deeper factors drive other observable variables. Consequently, researchers infer the properties of unobserved factors by measuring variables that correlate with the factor. In this manner, factor analysis lets researchers identify factors they can’t evaluate directly.
Psychologists frequently use factor analysis because many of their factors are inherently unobservable; they exist inside the human brain.
For example, depression is a condition inside the mind that researchers can’t directly observe. However, they can ask questions and make observations about different behaviors and attitudes. Depression is an invisible driver that affects many outcomes we can measure. Consequently, people with depression will tend to have more similar responses to those outcomes than those who are not depressed.
For similar reasons, factor analysis in psychology often identifies and evaluates other mental characteristics, such as intelligence, perseverance, and self-esteem. The researchers can see how a set of measurements load on these factors and others.
Method of Factor Extraction
The first methodology choice for factor analysis is the mathematical approach for extracting the factors from your dataset. The most common choices are maximum likelihood (ML), principal axis factoring (PAF), and principal components analysis (PCA).
You should use either ML or PAF most of the time.
Use ML when your data follow a normal distribution. In addition to extracting factor loadings, it can also perform hypothesis tests, construct confidence intervals, and calculate goodness-of-fit statistics.
Use PAF when your data violate multivariate normality. PAF doesn't assume that your data follow any distribution, so you could also use it when they are normally distributed. However, this method can't provide all the statistical measures that ML can.
PCA is the default method for factor analysis in some statistical software packages, but it isn’t a factor extraction method. It is a data reduction technique to find components. There are technical differences, but in a nutshell, factor analysis aims to reveal latent factors while PCA is only for data reduction. While calculating the components, PCA doesn’t assess the underlying commonalities that unobserved factors cause.
PCA gained popularity because it was a faster algorithm when computers were slower and more expensive. If you're using PCA for factor analysis, do some research to be sure it's the correct method for your study. Learn more about PCA in Principal Component Analysis Guide and Example.
There are other methods of factor extraction, but the factor analysis literature has not strongly shown that any of them are better than maximum likelihood or principal axis factoring.
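For example, with the third-party factor_analyzer package in Python, the extraction method is a single argument. This is a sketch only: the file name is an assumption, and the package's 'minres' option is a distribution-free common-factor method that behaves much like PAF:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("FactorAnalysis.csv")  # assumed file name for this post's data

# Maximum likelihood: appropriate when the indicators are roughly normal.
fa_ml = FactorAnalyzer(n_factors=5, method="ml", rotation=None)
fa_ml.fit(df)

# MINRES: a distribution-free alternative when normality is doubtful.
fa_minres = FactorAnalyzer(n_factors=5, method="minres", rotation=None)
fa_minres.fit(df)
```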
Number of Factors to Extract
You need to specify the number of factors to extract from your data except when using principal components analysis. The method for determining that number depends on whether you're performing exploratory or confirmatory factor analysis.
Exploratory Factor Analysis
In EFA, researchers must specify the number of factors to retain. The maximum number of factors you can extract equals the number of variables in your dataset. However, you typically want to reduce the number of factors as much as possible while maximizing the total amount of variance the factors explain.
That’s the notion of a parsimonious model in statistics. When adding factors, there are diminishing returns. At some point, you’ll find that an additional factor doesn’t substantially increase the explained variance. That’s when adding factors needlessly complicates the model. Go with the simplest model that explains most of the variance.
Fortunately, a simple statistical tool known as a scree plot helps you manage this tradeoff.
Use your statistical software to produce a scree plot. Then look for the bend in the data where the curve flattens. The number of points before the bend is often the correct number of factors to extract.
The scree plot below relates to the factor analysis example later in this post. The graph displays the eigenvalues by the number of factors. Eigenvalues relate to the amount of explained variance.
The scree plot shows the bend in the curve occurring at factor 6. Consequently, we need to extract five factors. Those five explain most of the variance. Additional factors do not explain much more.
Some analysts and software use eigenvalues greater than 1 (the Kaiser criterion) to retain a factor. However, simulation studies have found that this tends to extract too many factors and that the scree plot method is better (Costello & Osborne, 2005).
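If you want to produce the scree plot yourself, here's a minimal sketch using matplotlib and the factor_analyzer package (the CSV file name is an assumption):

```python
import matplotlib.pyplot as plt
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("FactorAnalysis.csv")  # assumed file name

fa = FactorAnalyzer(rotation=None)
fa.fit(df)
eigenvalues, _ = fa.get_eigenvalues()  # eigenvalues of the correlation matrix

plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
plt.axhline(1, linestyle="--", label="Eigenvalue = 1 (Kaiser criterion)")
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree Plot")
plt.legend()
plt.show()
```

Look for the bend in the resulting curve, as described above, rather than relying solely on the eigenvalue-greater-than-1 line.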
Of course, as you explore your data and evaluate the results, you can use theory and subject-area knowledge to adjust the number of factors. The factors and their interpretations must fit the context of your study.
Confirmatory Factor Analysis
In CFA, researchers specify the number of factors to retain using existing theory or measurement instruments before performing the analysis. For example, if a measurement instrument purports to assess three constructs, then the factor analysis should extract three factors so you can see whether the results match the theory.
Factor Loadings
In factor analysis, the loadings describe the relationships between the factors and the observed variables. By evaluating the factor loadings, you can understand the strength of the relationship between each variable and the factor. Additionally, you can identify the observed variables corresponding to a specific factor.
Interpret loadings like correlation coefficients. Values range from -1 to +1. The sign indicates the direction of the relations (positive or negative), while the absolute value indicates the strength. Stronger relationships have factor loadings closer to -1 and +1. Weaker relationships are close to zero.
Stronger relationships in the factor analysis context indicate that the factors explain much of the variance in the observed variables.
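In practice, analysts often suppress small loadings so the strong relationships stand out. Here's a tiny pandas sketch using made-up numbers and a common (but arbitrary) cutoff of 0.4:

```python
import numpy as np
import pandas as pd

# In a real analysis these would come from a fitted model (e.g., fa.loadings_);
# the numbers below are invented for illustration.
loadings = pd.DataFrame(
    np.array([[0.82, 0.05],
              [0.76, -0.12],
              [0.08, 0.91]]),
    index=["income", "education", "sociability"],
    columns=["Factor1", "Factor2"],
)

# Blank out weak loadings so each variable's dominant factor is obvious.
print(loadings.where(loadings.abs() > 0.4))
```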
Related post: Correlation Coefficients
Factor Rotations
In factor analysis, the initial set of loadings is only one of an infinite number of possible solutions that describe the data equally well. Unfortunately, the initial answer is frequently difficult to interpret because each factor can contain middling loadings for many indicators. That makes it hard to label them. You want to say that particular variables correlate strongly with a factor while most others do not correlate at all. A sharp contrast between high and low loadings makes that easier.
Rotating the factors addresses this problem by maximizing and minimizing the entire set of factor loadings. The goal is to produce a limited number of high loadings and many low loadings for each factor.
This combination lets you identify the relatively few indicators that strongly correlate with a factor and the larger number of variables that do not correlate with it. You can more easily determine what relates to a factor and what does not. This condition is what statisticians mean by simplifying factor analysis results and making them easier to interpret.
Graphical illustration
Let me show you how factor rotations work graphically using scatterplots.
Factor analysis starts by calculating the pattern of factor loadings. However, it picks an arbitrary set of axes by which to report them. Rotating the axes while leaving the data points unaltered keeps the original model and data pattern in place while producing more interpretable results.
To make this graphable in two dimensions, we’ll use two factors represented by the X and Y axes. On the scatterplot below, the six data points represent the observed variables, and the X and Y coordinates indicate their loadings for the two factors. Ideally, the dots fall right on an axis because that shows a high loading for that factor and a zero loading for the other.
For the initial factor analysis solution on the scatterplot, the points contain a mixture of both X and Y coordinates and aren't close to a factor's axis. That makes the results difficult to interpret because the variables have middling loadings on all the factors. Visually, they're not clumped near the axes, making it difficult to assign the variables to one.
Rotating the axes around the scatterplot increases or decreases the X and Y values while retaining the original pattern of data points. At the blue rotation on the graph below, you maximize one factor loading while minimizing the other for all data points. The result is that each variable's loading is high on one factor but low on the other.
On the graph, all data points cluster close to one of the two factors on the blue rotated axes, making it easy to associate the observed variables with one factor.
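Numerically, an orthogonal rotation is just a matrix multiplication. The sketch below uses made-up loadings for two factors and a 45° rotation to show how middling loadings become a clean high/low pattern while each variable's communality is unchanged:

```python
import numpy as np

# Invented initial loadings: middling on both factors.
loadings = np.array([
    [0.6,  0.6],
    [0.7,  0.5],
    [0.5, -0.7],
    [0.6, -0.6],
])

theta = np.deg2rad(45)  # rotate the axes by 45 degrees
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

rotated = loadings @ rotation
print(rotated.round(2))  # now high on one factor, near zero on the other

# Communalities (row sums of squared loadings) survive the rotation intact.
print(np.allclose((loadings**2).sum(axis=1), (rotated**2).sum(axis=1)))  # True
```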
Types of Rotations
Throughout these rotations, you work with the same data points and factor analysis model. The model with the rotated loadings fits the data as well as the model with the initial loadings, but the rotated loadings are easier to interpret. You're using a different coordinate system to gain a different perspective on the same pattern of points.
There are two fundamental types of rotation in factor analysis, oblique and orthogonal.
Oblique rotations allow correlation amongst the factors, while orthogonal rotations assume they are entirely uncorrelated.
Graphically, orthogonal rotations enforce a 90° separation between axes, as shown in the example above, where the rotated axes form right angles.
Oblique rotations are not required to have axes forming right angles, as shown below for a different dataset.
Notice how the freedom for each axis to take any orientation allows them to fit the data more closely than when enforcing the 90° constraint. Consequently, oblique rotations can produce simpler structures than orthogonal rotations in some cases. However, these results can contain correlated factors.
| Oblique Rotations | Orthogonal Rotations |
|---|---|
| Promax | Varimax |
| Oblimin | Equimax |
| Direct Quartimin | Quartimax |
In practice, oblique rotations produce results similar to orthogonal rotations when the factors are uncorrelated in the real world. However, if you impose an orthogonal rotation on genuinely correlated factors, it can adversely affect the results. Despite the benefits of oblique rotations, analysts tend to use orthogonal rotations more frequently, which might be a mistake in some cases.
When choosing a rotation method in factor analysis, be sure it matches your underlying assumptions and subject-area knowledge about whether the factors are correlated.
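As a sketch, comparing the two types is straightforward with the factor_analyzer package (file name assumed). An oblique promax solution also reports the factor correlation matrix, which helps you judge whether an orthogonality assumption was reasonable:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("FactorAnalysis.csv")  # assumed file name

fa_varimax = FactorAnalyzer(n_factors=5, rotation="varimax").fit(df)
fa_promax = FactorAnalyzer(n_factors=5, rotation="promax").fit(df)

print(pd.DataFrame(fa_varimax.loadings_, index=df.columns).round(2))
print(pd.DataFrame(fa_promax.loadings_, index=df.columns).round(2))

# Factor correlations from the oblique solution: values near zero suggest
# an orthogonal rotation would tell a similar story.
print(fa_promax.phi_.round(2))
```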
Factor Analysis Example
Imagine that we are human resources researchers who want to understand the underlying factors for job candidates. We measured 12 variables and will perform factor analysis to identify the latent factors. Download the CSV dataset: FactorAnalysis
The first step is to determine the number of factors to extract. Earlier in this post, I displayed the scree plot, which indicated we should extract five factors. If necessary, we can perform the analysis with a different number of factors later.
For the factor analysis, we’ll assume normality and use Maximum Likelihood to extract the factors. I’d prefer to use an oblique rotation, but my software only has orthogonal rotations. So, we’ll use Varimax. Let’s perform the analysis!
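Here's a sketch of that analysis in Python, assuming the third-party factor_analyzer package and that the downloaded data are saved as FactorAnalysis.csv:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("FactorAnalysis.csv")  # assumed file name

fa = FactorAnalyzer(n_factors=5, method="ml", rotation="varimax")
fa.fit(df)

# Factor loadings for each of the 12 variables.
loadings = pd.DataFrame(fa.loadings_, index=df.columns,
                        columns=[f"Factor{i + 1}" for i in range(5)])
print(loadings.round(3))

# Communalities: proportion of each variable's variance the factors explain.
print(pd.Series(fa.get_communalities(), index=df.columns).round(3))

# Variance explained per factor: SS loadings, proportion, and cumulative.
variance = pd.DataFrame(fa.get_factor_variance(),
                        index=["SS Loadings", "Proportion Var", "Cumulative Var"],
                        columns=loadings.columns)
print(variance.round(3))
```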
Interpreting the Results
In the bottom right of the output, we see that the five factors account for 81.8% of the variance. The %Var row along the bottom shows how much of the variance each factor explains. The five factors are roughly equal, explaining between 13.5% and 19% of the variance. Learn about Variance.
The Communality column displays the proportion of the variance the five factors explain for each variable. Values closer to 1 are better. The five factors explain the most variance for Resume (0.989) and the least for Appearance (0.643).
In the factor analysis output, the circled loadings show which variables have high loadings for each factor. As shown in the table below, we can assign labels encompassing the properties of the highly loading variables for each factor.
| Factor | Label | High Loading Variables |
|---|---|---|
| 1 | Relevant Background | Academic record, Potential, Experience |
| 2 | Personal Characteristics | Confidence, Likeability, Appearance |
| 3 | General Work Skills | Organization, Communication |
| 4 | Writing Skills | Letter, Resume |
| 5 | Overall Fit | Company Fit, Job Fit |
In summary, these five factors explain a large proportion of the variance, and we can devise reasonable labels for each. These five latent factors drive the values of the 12 variables we measured.
References
Abdi, Hervé (2003), "Factor Rotations in Factor Analyses," in Lewis-Beck, M., Bryman, A., and Futing, T. (Eds.), Encyclopedia of Social Science Research Methods. Thousand Oaks, CA: Sage.
Browne, Michael W. (2001), "An Overview of Analytic Rotation in Exploratory Factor Analysis," Multivariate Behavioral Research, 36(1), 111-150.
Costello, Anna B. and Osborne, Jason (2005), "Best Practices in Exploratory Factor Analysis: Four Recommendations for Getting the Most from Your Analysis," Practical Assessment, Research, and Evaluation, Vol. 10, Article 7.
What would be the eigenvalue in EFA?
Hi Jim, this is an excellent yet succinct article on the topic.
A very basic question, though: the dataset contains ordinal data. Is this ok? I'm a student in a Multivariate Statistics course, and as far as I'm aware, both PCA and common factor analysis require metric data. Or is it assumed that since the ordinal data have been coded into a range of 0-10, the data are considered numeric and PCA or common factor analysis can be applied?
Sorry for the dumb question, and thank you.
Hi Hendra,
That’s a great question.
For the example in this post, we're dealing with data on a 10-point scale where the differences between all points are equal. Consequently, we can treat the discrete data as continuous.
Now, to your question about ordinal data. You can use ordinal data with factor analysis; however, you might need to use specific methods.
For ordinal data, it’s often recommended to use polychoric correlations instead of Pearson correlations. Polychoric correlations estimate the correlation between two latent continuous variables that underlie the observed ordinal variables. This provides a more accurate correlation matrix for factor analysis of ordinal data.
I've also heard about categorical PCA and nonlinear factor analysis, which use a monotonic transformation of ordinal data.
I hope that helps clarify it for you!
Once we've identified how much variability the factors contribute, what steps could we take from here to make predictions about variables?
Hi Brittany,
Thanks for the great question! And thanks for your kind words in your other comment! 🙂
What you can do is calculate all the factor scores for each observation. Some software will do this for you as an option. Or, you can input values into the regression equations for the factor scores that are included in the output.
Then use these scores as the independent variables in regression analysis. From there, you can use the regression model to make predictions.
Ideally, you'd evaluate the regression model before making predictions and use cross-validation to be sure that the model works for observations outside the dataset you used to fit the model.
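Here's a rough sketch of that workflow, assuming the factor_analyzer package, the example file name from this post, and a hypothetical outcome variable:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("FactorAnalysis.csv")          # assumed file name
y = pd.read_csv("outcomes.csv")["performance"]  # hypothetical outcome data

fa = FactorAnalyzer(n_factors=5, method="ml", rotation="varimax")
scores = fa.fit_transform(df)  # factor scores, one row per observation

# Cross-validate first, then fit on the full data and predict.
reg = LinearRegression()
print(cross_val_score(reg, scores, y, cv=5))
reg.fit(scores, y)
```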
I hope that helps!
Wow! This was really helpful and structured very well for interpretation. Thank you!
I can imagine that Prof will have further explanations on this down the line at some point in future. I’m waiting… Thanks Prof Jim for your usual intuitive manner of explaining concepts. Funsho
Thanks for a very comprehensive guide. I learnt a lot.
In PCA, we usually extract the components and use them for predictive modeling.
Is this the case with Factor Analysis as well? Can we use factors as predictors?
Hi Howie,
I have not used factors as predictors, but I think it would be possible. However, PCA's goal is to maximize data reduction. This process is particularly valuable when you have many variables, a low sample size, and/or collinearity between the predictors. Factor analysis also reduces the data, but that's not its primary goal. Consequently, my sense is that PCA is better for predictive modeling while factor analysis is better when you're trying to understand the underlying factors (which you aren't with PCA). But, again, I haven't tried using factors in that way, nor have I compared the results to PCA. So, take that with a grain of salt!