Back in 2015, House Speaker John Boehner resigned, and then Kevin McCarthy withdrew from the race for Speaker of the House before the vote. The Republicans’ search for a new Speaker ultimately led to Paul Ryan. Simultaneously, the Republican Freedom Caucus was making the news with a potential government shutdown that was controversial even among some Republicans.
During the Republican Presidential nomination process, there was a prominent split between candidates who were pro-establishment and anti-establishment. Of course, the end result of the dramatic 2016 U.S. Presidential election was the inauguration of a complete political outsider.
Change was in the air. Were these events related? Statistical analyses can help us identify the underlying variables. I’ll use binary logistic regression to determine whether the establishment split in the Republican nomination process is also evident in the membership of the Freedom Caucus.
How Does the Freedom Caucus Fit In?
The Freedom Caucus is a faction in the U.S. House of Representatives and contains about 40 Republicans. The Freedom Caucus is also known as the “Hell No” caucus and has been known to be disruptive. Depending on your political views, these disruptions are either positive or negative events!
The Freedom Caucus tends to be described as very conservative. Based on my research, this appears to be the central property of this group. However, I’ll use statistical analyses to determine whether Freedom Caucus membership is predicted by an anti-establishment outlook.
If the analysis bears this out, the disruptions caused by the Freedom Caucus and the upheaval in the Republican nomination process share a common theme: an anti-establishment viewpoint. I also want to statistically assess whether the choice of Paul Ryan as the Speaker of the House fits this pattern.
Data for these Analyses
The House of Representatives data come from voteview.com. This website analyzes the votes by House members to determine a politician’s conservativeness and establishmentarianism. I also used this Wikipedia article to determine which Republican members of the House belonged to the Freedom Caucus at that point in time.
Here’s how you interpret these scores:
- Conservativeness: Higher scores represent more conservative viewpoints.
- Establishmentarianism: Higher scores represent viewpoints that favor the establishment.
Graphing the House Republican Data
I’ll start by graphing the data so we can see a quick picture of the situation. In the scatterplot, each point represents a Republican House member, positioned by their two scores. More conservative members are farther to the right, while those who are more opposed to the establishment are farther down. Freedom Caucus members are indicated with red squares.
It turns out that not all politicians in the Freedom Caucus are very conservative. However, they are all at least halfway to the right on the graph. Members of the Freedom Caucus also tend to fall in the bottom half of the graph, which indicates a tendency towards anti-establishment viewpoints. It appears that both conservativeness and anti-establishment viewpoints are factors in Freedom Caucus membership.
Binary Logistic Regression Model of Freedom Caucus Membership
I’ll use binary logistic regression to test these two predictors statistically. The response data are binary because Freedom Caucus membership can be only Yes or No. The table below tells us that in 2015 there were 36 Freedom Caucus members in a total of 247 House Republicans.
This table displays the p-values for both of the predictors in our analysis. The very low p-values indicate that both predictors are statistically significant. There is sufficient evidence to conclude that changes in the predictors are related to changes in the probability of Freedom Caucus membership. In other words, both conservativeness and anti-establishment viewpoints play a role.
I did not include the interaction term because it is not statistically significant.
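For readers who want to see the mechanics outside of a statistics package, here is a minimal sketch of fitting a two-predictor binary logistic regression by Newton-Raphson, using only the Python standard library. It is not the analysis from this post: the data are synthetic, and the generating coefficients (`TRUE_BETA`), the score ranges, and the random seed are all assumptions made purely for illustration. Only the sample size of 247 echoes the real dataset.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic (inverse logit) function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_logistic(X, y, iterations=25):
    """Maximum likelihood fit; each row of X is [1, x1, x2]."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * row[a]
                for c in range(k):
                    hess[a][c] += w * row[a] * row[c]
        step = solve(hess, grad)  # Newton step: H^-1 * gradient
        beta = [b + s for b, s in zip(beta, step)]
    return beta

# Synthetic data only: these are NOT the voteview.com scores.
random.seed(1)
TRUE_BETA = [-4.0, 5.0, -2.0]  # hypothetical intercept, conservativeness, establishment
X, y = [], []
for _ in range(247):
    cons = random.uniform(0.0, 1.0)    # assumed score range
    estab = random.uniform(-1.0, 1.0)  # assumed score range
    row = [1.0, cons, estab]
    p = sigmoid(sum(b * v for b, v in zip(TRUE_BETA, row)))
    X.append(row)
    y.append(1 if random.random() < p else 0)

beta = fit_logistic(X, y)
# At the maximum likelihood estimate, observed minus fitted values sum to ~0.
score_residual = sum(
    yi - sigmoid(sum(b * v for b, v in zip(beta, row))) for row, yi in zip(X, y)
)
print("coefficients:", [round(b, 3) for b in beta])
print("odds ratio per 0.1 (conservativeness):", round(math.exp(0.1 * beta[1]), 3))
print("odds ratio per 0.1 (establishment):", round(math.exp(0.1 * beta[2]), 3))
```

In practice, a statistics package also reports standard errors and p-values, which this sketch leaves out.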
Odds Ratios for Binary Logistic Regression
Because binary logistic regression models the outcome through a link function, the raw coefficients are on a transformed scale and are difficult to interpret directly. However, if you use the logit link function, you can exponentiate the coefficients to obtain odds ratios, which do provide helpful information about the relationships between the independent and dependent variables. Here are the odds ratios for my model.
The odds ratio for conservativeness indicates that for every 0.1 increase (the unit of change) in this score, the odds that a House member belongs to the Freedom Caucus are multiplied by ~2.7, holding the establishment score constant.
Conversely, the odds ratio for establishmentarianism indicates that for every 0.1 increase in that score, the odds of membership are only ~73% as large, holding conservativeness constant.
Taking both results together, House members who are more conservative and less favorable towards the establishment make up the Freedom Caucus.
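An odds ratio multiplies the odds of membership, not the probability, so the change in probability depends on where you start. The short sketch below makes that arithmetic concrete. The odds ratios (~2.7 and ~0.73) are the rounded values quoted above. The starting probability is an assumption chosen for illustration: the overall membership rate of 36 out of 247.

```python
def apply_odds_ratio(prob, odds_ratio):
    """Convert a probability to odds, scale by the odds ratio, convert back."""
    odds = prob / (1.0 - prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# Hypothetical baseline: the overall membership rate, 36 of 247 House Republicans.
baseline = 36 / 247

# Rounded odds ratios quoted in the text, each for a 0.1 increase in the score.
p_more_conservative = apply_odds_ratio(baseline, 2.7)
p_more_establishment = apply_odds_ratio(baseline, 0.73)

print(f"baseline probability:            {baseline:.3f}")
print(f"after +0.1 conservativeness:     {p_more_conservative:.3f}")
print(f"after +0.1 establishmentarianism: {p_more_establishment:.3f}")
print(f"probability ratio (conservativeness): {p_more_conservative / baseline:.2f}")
```

From this baseline, the probability rises from roughly 0.146 to roughly 0.315, which is a bit more than double rather than 2.7 times. The odds, however, are multiplied by exactly 2.7.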
Learn more about odds and odds ratios.
Graphing the Results
The table tells us that both variables are important. But we don’t know the nature of the relationships between the two predictors and membership. The most intuitive way to understand these relationships is by using several graphs.
The two graphs below are based on the binary logistic regression model and plot the relationships using fitted values. This is important because regression models allow you to change the values of one predictor while holding the other predictors constant. In this manner, you can isolate the role of each variable in relation to the outcome.
The contour plot shows the values of our two predictors and the corresponding fitted probabilities. The highest probabilities are in the bottom-right corner. This indicates that the probability of belonging to the Freedom Caucus increases as the politician becomes more conservative and more anti-establishment.
Related post: Contour Plots: Using, Examples, and Interpreting
The main effects plot highlights how the regression model estimates each effect while holding the other predictor constant. A main effects plot is a special type of line chart.
The conservativeness panel shows that when politicians are more conservative, they are more likely to belong to the Freedom Caucus. The establishment panel indicates that when politicians have stronger anti-establishment opinions, they are more likely to be Freedom Caucus members.
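To illustrate how fitted values like these are produced, here is a rough sketch that evaluates a logistic model over a small grid and then varies one predictor while holding the other fixed. The slopes are derived from the rounded odds ratios quoted earlier (~2.7 and ~0.73 per 0.1 unit). The intercept, the grid values, and the values at which each predictor is held constant are hypothetical, because the post does not report them, so the probabilities will not match the actual plots.

```python
import math

# Slopes per full unit, derived from the rounded odds ratios per 0.1 unit.
B_CONS = math.log(2.7) / 0.1
B_ESTAB = math.log(0.73) / 0.1
INTERCEPT = -6.0  # hypothetical: the post does not report the constant

def fitted_probability(cons, estab):
    """Fitted probability of membership from the logit model."""
    logit = INTERCEPT + B_CONS * cons + B_ESTAB * estab
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical grid of scores, like the axes of a contour plot.
cons_grid = [0.2, 0.4, 0.6, 0.8]
estab_grid = [-0.5, 0.0, 0.5]
surface = {(c, e): fitted_probability(c, e) for c in cons_grid for e in estab_grid}

# Main effects: vary one predictor while holding the other at an assumed value.
main_effect_cons = [fitted_probability(c, 0.0) for c in cons_grid]
main_effect_estab = [fitted_probability(0.6, e) for e in estab_grid]

for e in estab_grid:
    row = "  ".join(f"{surface[(c, e)]:.3f}" for c in cons_grid)
    print(f"establishment {e:+.1f}: {row}")
```

Each printed row is one establishment score, and the columns run from less to more conservative. The largest probability sits at the most conservative, most anti-establishment corner of the grid.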
Freedom Caucus membership is more complex than simply being particularly conservative. It is the combination of conservative and anti-establishment positions that predicts membership.
Here’s one more point to drive this home. Kevin McCarthy declined to run for Speaker and many saw Paul Ryan as a perfect candidate. Let’s see how these two compare by looking at their conservativeness and establishment scores. I’ll standardize each variable in order to account for scale differences. Accordingly, the table displays their z-scores.
Representative | Conservatism | Establishmentarianism |
McCarthy | -0.169 | 0.549 |
Ryan | 0.496 | -1.180 |
Compared to McCarthy, Ryan is moderately more conservative, but he is notably more anti-establishment. This shows which way the political winds are blowing!
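Standardizing means subtracting a variable's mean and dividing by its standard deviation, so that each score is expressed as a number of standard deviations from the group average. Here is a minimal sketch of that calculation. The raw scores are made up for illustration and are not the actual House data, and using the sample standard deviation is my assumption about the convention.

```python
import statistics

def z_scores(values):
    """Standardize values: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical raw scores for six members (not the actual House data).
conservativeness = [0.25, 0.38, 0.45, 0.52, 0.61, 0.74]
establishment = [-0.40, -0.15, 0.05, 0.10, 0.30, 0.55]

z_cons = z_scores(conservativeness)
z_estab = z_scores(establishment)

for zc, ze in zip(z_cons, z_estab):
    print(f"{zc:+.3f}  {ze:+.3f}")
```

After standardizing, both columns have a mean of 0 and a standard deviation of 1, which is what makes values on the two scales directly comparable.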
Collectively, I believe these results demonstrate a multifaceted divide in a changing Republican Party. This divide helps clarify why it was hard to maintain a unified caucus, why it was hard to choose a Speaker, and why the 2016 Presidential election was so unusual.
Jessica Wade says
Hello Jim – Can you confirm if you are using R here? I am most familiar with using Python, but if you suggest using R, what packages are you using?
Jim Frost says
Hi Jessica,
I’m using Minitab statistical software for this example.
Sarah says
Hi Jim,
Thanks for this example!
I was wondering if looking at the p value in this case was appropriate. I thought p values were only relevant when looking at sample data whereas we seem to be looking at a population in this case. Thanks for your help in clearing this up! Much appreciated.
Jim Frost says
Hi Sarah,
That’s a great question!
In a way, you are correct. We’re looking at a specific group and if we’re not trying to extrapolate to a larger population, no hypothesis testing and p-values are necessary.
However, we are in a way extrapolating to a larger population. We can think of this group as a subpopulation of all Republican politicians. We can also think of this group as the product of a political process. In manufacturing, analysts will often analyze a sample from a production process and then extrapolate from the sample to the population, which is defined as the entire production output from the process over time. And that’s sort of what we’re doing here, but it’s a political process that produces some members who join the Freedom Caucus. If there were new members going through the political process, we could predict whether they’d become members of the Freedom Caucus even though they weren’t in the original sample. That’s where hypothesis testing and p-values come in. Of course, it’s imperfect because it’s not a random sample. Then again, you often can’t get a perfectly random sample anyway.
It really depends on the goal. If you want to understand just this group of politicians and not draw inferences about a larger population, you’re correct that p-values are not necessary. Just assess the relationships in the data and describe them. However, if you want to extrapolate the results to a larger population, such as other Republicans or new members joining the Caucus, then you need p-values because you’re using the sample to draw inferences about a larger population.