Historians rank the U.S. Presidents from best to worst using all the historical knowledge at their disposal. Groups such as C-SPAN frequently ask these historians to rank the Presidents and then average the results to help reduce bias. The idea is to produce a set of rankings that incorporates a broad range of historians, a vast array of information, and a historical perspective. These rankings include informed assessments of each President’s effectiveness, leadership, moral authority, administrative skills, economic management, vision, and so on.

Sounds complicated!

Fortunately, regression analysis can explain the rankings of the U.S. presidents. A friend of mine had done this and challenged me to improve upon his model. I also wanted to see if I could improve upon a model by Nate Silver. This post is the result of that challenge!

However, my goal isn’t merely to predict the eventual ranking for any President. Instead, I’m much more interested in a fascinating question that regression models can answer. Is the public’s contemporary assessment of the president consistent with the historical perspective, or do they differ?

With this in mind, I’ve collected two additional types of data that provide contemporaneous assessments of the Presidents and the social mood: presidential approval ratings and the Dow Jones Industrial Average.

Along the way, I’ll highlight the problems of overanalyzing small datasets and how to tell whether you are doing it!

**Related post**: When Should I Use Regression Analysis?

## Gallup Presidential Approval Ratings

The Gallup organization has tracked the approval rating of the president since the days of Franklin Roosevelt. Gallup uses consistent wording to facilitate comparisons over time. I’ll use fitted line plots for a preliminary investigation into whether this variable is worthy of consideration. I’ll run it three times to see how historians’ Presidential rankings correspond to the highest approval, average approval, and lowest approval for each President.

Looking at the three plots, it’s interesting to note that the highest approval rating produces an R-squared of 0.7%! The fitted line is essentially flat. If you want an exemplar of what no relationship looks like, this is it!
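To make the idea of "no relationship" concrete, here is a minimal sketch of how R-squared is computed for a fitted line. The numbers are synthetic placeholders that I made up for illustration, not the Gallup data or the historians' ranks, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y):
    """Share of the variance in y explained by the fitted line."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic placeholder values (assumption): NOT the data used in this post.
approval = [30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
flat_rank = [20.0, 10.0, 20.0, 10.0, 20.0, 10.0]   # no real trend
trend_rank = [35.0, 30.0, 24.0, 21.0, 14.0, 11.0]  # clear downward trend

print(f"flat:  R-squared = {r_squared(approval, flat_rank):.3f}")
print(f"trend: R-squared = {r_squared(approval, trend_rank):.3f}")
```

When the ranks bounce around without following the predictor, the fitted line is nearly flat and R-squared sits close to zero, which is the pattern the highest approval rating shows.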

**Related post**: Interpreting R-squared

### Interpreting the Approval Ratings Fitted Line Plots

The picture is more interesting in the average and low approval ratings plots. The low approval rating plot provides a better fit with an R-squared of 34.7%. Collectively, these plots suggest that it’s more important to know how *low* the approval has gone for each President than how high! Even though there are only 12 data points, the lowest approval rating is statistically significant with a p-value of 0.044.

It seems that history remembers the worst of a President, rather than the best!

Eleven data points follow the general trend. However, the one data point in the bottom-left of the plots is an outlier. That data point is Harry Truman (pictured at top). Truman doesn’t fit the model because he had very low approval ratings while he was president, but historians now give him a relatively good rank of #6.

It’s tempting to remove this data point because the model then yields an R-squared of 67%. However, there is no reason to question that data point, and I think it would be a mistake to remove it. It’s not good practice to remove data points only to produce a better fitting model.

You may be wondering, can we add other variables into this model to improve it? Unfortunately, that’s not possible because of the limited amount of data available. In regression, a good rule of thumb is that you should have at least 10 data points per predictor. We’re right at the limit and can’t legitimately add more predictors.
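The rule of thumb is simple arithmetic, and a tiny sketch shows why a second predictor is off the table here. The function names are hypothetical, and the threshold of 10 is the guideline quoted above rather than a hard statistical law.

```python
def observations_per_term(n_observations, n_terms):
    """Average number of data points available for each model term."""
    return n_observations / n_terms


def meets_rule_of_thumb(n_observations, n_terms, minimum_per_term=10):
    """True when the model has at least `minimum_per_term` points per term."""
    return observations_per_term(n_observations, n_terms) >= minimum_per_term


# The approval rating model in this post has 12 data points.
for n_terms in (1, 2):
    ok = meets_rule_of_thumb(12, n_terms)
    print(f"12 points, {n_terms} predictor(s): "
          f"{observations_per_term(12, n_terms):.1f} per predictor, ok={ok}")
```

With 12 data points, one predictor leaves 12 observations per predictor, while two predictors would drop that to 6.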

Instead, let’s look at a new variable that allows us to use more data points!

**Related post**: Guidelines for Removing and Handling Outliers

## Presidents and the Dow Jones Industrial Average

Previously, I assessed a model by Prechter et al. that claims to predict whether an incumbent president would be re-elected using just the Dow Jones. The theory states that the stock market is a proxy variable for social mood, not that it directly affects voting. The stock market is a good measure of social mood because if society feels positive enough to invest more money in the stock market, they are presumably happy with the status quo, which favors the incumbent.

The researchers find a positive, significant relationship between several outcomes for presidential elections that have an incumbent and the percentage change in the Dow Jones over three years. The researchers also include the traditional big three predictors of Presidential elections: economic growth, inflation, and unemployment. The study concludes that the three-year change in the DJIA is the best predictor. Further, when the study includes the DJIA predictor, the other “Big Three” predictors become insignificant.

I concluded that their model was statistically valid and used it to accurately predict the outcome of a previous election.

**Related post**: Understanding Proxy Variables

## Historian Rankings of U.S. Presidents and the Dow Jones

Because the Dow Jones Industrial Average is such an essential predictor for re-election, can it also predict how well historians view past presidents?

I gathered the Dow Jones (DJ) data for the beginning and end of each president’s time in office and calculated the percentage change. The Dow Jones began in 1896. For terms before 1896, I used the Foundation for the Study of Cycles data set, which I also used for my election prediction post. This data set uses market data from earlier indices to create a longer DJIA.

The initial exploration looks promising when I graph it in the fitted line plot.

You can see the overall negative slope. In the upper-left corner, the negative Dow Jones changes correspond with worse ranks. In the bottom right, the larger Dow Jones changes occur with better Presidential ranks. The relationship appears to be curvilinear. This curvature makes sense because there is no limit to how much the Dow Jones can improve, but the rankings cannot be better than #1! Consequently, the downward slope has to flatten out as the DJ increases. We’ll incorporate the curvature in our regression models.
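One common way to incorporate curvature like this is to add a squared term, so the model becomes rank = b0 + b1·DJ + b2·DJ². Below is a minimal sketch of that fit using the normal equations in plain Python. The data are synthetic values generated from a known curve purely for illustration (they are not the Dow Jones or ranking data from this post), and the helper names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(x, y):
    """Least-squares coefficients [b0, b1, b2] for y = b0 + b1*x + b2*x**2."""
    rows = [[1.0, xi, xi * xi] for xi in x]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic illustration (assumption): change expressed as a fraction,
# so 0.5 means +50%. Ranks follow 25 - 16*x + 4*x**2 exactly.
dow_change = [-0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
rank = [34.0, 25.0, 18.0, 13.0, 10.0, 9.0]

b0, b1, b2 = fit_quadratic(dow_change, rank)
print(f"rank = {b0:.2f} + {b1:.2f}*DJ + {b2:.2f}*DJ^2")
```

A negative linear coefficient combined with a positive squared coefficient produces the shape described above: ranks improve as the Dow Jones change grows, but the improvement flattens out.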

My approach will be to add the Dow Jones data to both Nate Silver’s and my friend’s models to see if it increases their explanatory power.

**Related post**: Fitting Curves in Regression Models

## Nate Silver’s Model of Presidential Rankings

Nate Silver’s model below uses the percentage of the electoral vote a president receives for his second term to predict the historians’ ranking. Click to read Nate Silver’s Original Story.

It’s an elegant model because it requires only one easy-to-collect variable per president. The model yields an R-squared of 38.6%, which is close to the 34.7% from the lowest approval rating model. Silver’s model only applies to presidents who run for a second term. That gives us 29 data points, which is just enough to include the quadratic form of the Dow Jones data.

In the output, we can see that the Electoral College and Dow Jones predictors are all significant, and the R-squared is 56.7%. The adjusted R-squared also increased from Silver’s original model, suggesting that adding the additional predictor is valid. The coefficients are all as expected given the previous analyses.
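To see why the adjusted R-squared is the right thing to check, here is a small sketch using the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The R-squared values (38.6% and 56.7%) and the sample size of 29 come from this post. The term counts are my assumption about how the models are specified (one term for Silver's model, three once the linear and squared Dow Jones terms are added), so the results may not match the software output exactly.

```python
def adjusted_r_squared(r_squared, n_observations, n_terms):
    """Standard adjusted R-squared; penalizes each additional model term."""
    return 1.0 - (1.0 - r_squared) * (n_observations - 1) / (
        n_observations - n_terms - 1
    )


# n = 29 and the R-squared values are from the post.
# The term counts (1 and 3) are assumptions about the model specification.
silver_only = adjusted_r_squared(0.386, 29, 1)
silver_plus_dow = adjusted_r_squared(0.567, 29, 3)

print(f"Silver's model:        adjusted R-squared = {silver_only:.3f}")
print(f"With Dow Jones terms:  adjusted R-squared = {silver_plus_dow:.3f}")
```

Under these assumptions, the adjusted value rises from roughly 36.3% to roughly 51.5%. Because the adjustment penalizes extra terms, an increase suggests that the new terms add more than chance alone would.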

Winning a higher percentage of the Electoral College and a positive Dow Jones both improve a president’s ranking by historians.

The Electoral College variable reflects the voters’ assessment of an incumbent president. The Dow Jones variable represents the social mood of the time, which has been shown to influence elections. These two variables represent an entirely contemporaneous assessment of both the president and the times and together explain just over half the variability of the historians’ ranking.

Given the number of data points, it wouldn’t be wise to add more predictors to this model. So, we’ll move on to my friend’s model.

**Related post**: Interpreting Adjusted R-squared and Predicted R-squared

## My Friend’s Great Presidents Model

My friend’s original model includes these variables: years in office, assassination attempt, and war. Collectively, these variables explain 56.66% of the variance. Let’s add in the Dow Jones data and see what we get.

All of the variables are significant, and all three R-squared values (R-squared, adjusted R-squared, and predicted R-squared) have increased. This model accounts for 63.42% of the variance, or nearly two-thirds.

More years in office, a war, an assassination attempt, and a positive Dow Jones all improve a President’s ranking by historians.

When interpreting the regression coefficients, remember that higher numbered rankings are worse while lower numbered rankings are better. Consequently, the positive coefficients for no assassination attempt and no war indicate that the rankings are worse (higher numeric values) under these conditions. The negative coefficient for Years in Office indicates that additional years correspond to better rankings with lower numeric values.

With five predictors, the model is pushing these 41 data points to their limit. However, I think the model is good. The two main risks of including too many predictors in a model are the following:

- Insufficient power to obtain significance due to imprecise estimates.
- Overfitting the model, which is when the model starts to fit the random noise. The R-squared increases but, because you can’t predict the random noise for new data, the predicted R-squared decreases.

Fortunately, all the predictors are significant, so power isn’t a problem. Further, the predicted R-squared has increased, so we probably aren’t overfitting the model.
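Predicted R-squared is usually based on leave-one-out prediction: remove each data point in turn, refit the model, predict the removed point, and sum the squared prediction errors (the PRESS statistic). Here is a minimal sketch of that idea for a single predictor. The model in this post has several predictors, so this is a simplified illustration with synthetic numbers and hypothetical function names, not a reproduction of the analysis.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """Ordinary R-squared: every point helps fit the line it is judged by."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def predicted_r_squared(x, y):
    """1 - PRESS / SS_total, where each point is predicted without itself."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        press += (y[i] - (intercept + slope * x[i])) ** 2
    return 1.0 - press / ss_tot


# Synthetic illustration (assumption): NOT the data used in this post.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

print(f"R-squared:           {r_squared(x, y):.4f}")
print(f"Predicted R-squared: {predicted_r_squared(x, y):.4f}")
```

Predicted R-squared can never exceed the ordinary R-squared. When an added predictor only fits noise, the ordinary value still creeps up while the predicted value falls, which is the warning sign for overfitting.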

**Related post**: Overfitting Regression Models

## The Contemporaneous vs. the Historical Perspective

Is the historical perspective different from the contemporary perspective? How much can you divine from the present about the ultimate assessment by historians? These are fascinating questions. Our best model suggests that contemporary data account for two-thirds of the variance in the rankings by historians.

What about the other third? We can’t say for sure. It’s possible that if we could include more variables or better variables, concurrent data could account for even more of the variance. It’s also likely that the historical perspective does account for some of it. After all, history is complex. With hindsight, additional knowledge, etc., the perspective provided by time could revise the contemporary conclusions somewhat.

However, it’s pretty clear that it’s easy to account for half the variance with a simple model that contains only two contemporaneous variables, and it’s not too difficult to get up to two-thirds! This result reaffirms why I love statistics: You can observe and record the data around you and have a good assessment of reality that withstands the test of time. The historical perspective definitely has its place, but if you find the right data and use the correct analyses, you can gain good insights *right now*!

Rexa Miles says

Thanks for sharing! That sounds like a great way to incorporate a variety of information in your model.

Owen Hewes says

Hi Jim,

What software did you use above to produce the regression tables?

Jim Frost says

Hi Owen, I used Minitab statistical software.

Tony says

Hi Jim,

I ran a regression analysis which has the intercept and the x coefficient . . .

Thanks,

Tony

Jim Frost says

Hi Tony, you posted this question in a totally unrelated topic. Please post your comment in one of my regression posts, such as about the constant or regression coefficients. I’ll answer it there. The topic doesn’t really fit here. Thanks!

Wen says

Hi Jim, please provide links to the data files you used in this post. Thanks.

Jim Frost says

Hi Wen,

I first posted this article a while ago elsewhere. After the intervening years, I was unable to find the datasets I used back then. If I can find them, I’ll include a link to them.

Jeremy says

What is the predicted R-squared?

Jim Frost says

Hi Jeremy, there’s a link to an article where I discuss predicted R-squared (and adjusted R-squared). That will answer your questions!