Forecast Accuracy: 2015 BBWAA Hall-of-Fame Voting



Last week, I wrote an article with my predictions of how this year's Hall-of-Fame voting would go. The results were announced earlier this week, with four well-deserving players being elected: Craig Biggio, Randy Johnson, Pedro Martinez, and John Smoltz.

So, how did I do at predicting the final results?

The table below answers that question. The first two data columns show my prediction, the middle two show the actual results, and the last two show the difference (my prediction minus the actual vote).

                        My Prediction     Actual Vote       My Error
Player                  Votes    Pct      Votes    Pct      Votes    Pct
------------------------------------------------------------------------
Randy Johnson             515   94.8%       534   97.3%       -19   -2.4%
Pedro Martinez            493   90.8%       500   91.1%        -7   -0.3%
Craig Biggio              456   84.0%       454   82.7%         2    1.3%
John Smoltz               393   72.4%       455   82.9%       -62  -10.5%
Mike Piazza               379   69.8%       384   69.9%        -5   -0.1%
Jeff Bagwell              310   57.1%       306   55.7%         4    1.4%
Tim Raines                296   54.5%       302   55.0%        -6   -0.5%
Curt Schilling            179   33.0%       215   39.2%       -36   -6.2%
Roger Clemens             177   32.6%       206   37.5%       -29   -4.9%
Barry Bonds               174   32.0%       202   36.8%       -28   -4.7%
Edgar Martinez            166   30.6%       148   27.0%        18    3.6%
Mike Mussina              161   29.7%       135   24.6%        26    5.1%
Lee Smith                 153   28.2%       166   30.2%       -13   -2.1%
Alan Trammell             111   20.4%       138   25.1%       -27   -4.7%
Jeff Kent                 109   20.1%        77   14.0%        32    6.0%
Fred McGriff               70   12.9%        71   12.9%         1    0.0%
Mark McGwire               69   12.7%        55   10.0%        14    2.7%
Larry Walker               66   12.2%        65   11.8%         1    0.3%
Don Mattingly              39    7.2%        50    9.1%       -11   -1.9%
Gary Sheffield             39    7.2%        64   11.7%       -25   -4.5%
Sammy Sosa                 39    7.2%        36    6.6%         3    0.6%
Carlos Delgado             13    2.4%        21    3.8%        -8   -1.4%
Nomar Garciaparra          13    2.4%        30    5.5%       -17   -3.1%
Brian Giles                 8    1.5%         0    0.0%         8    1.5%
Tony Clark                  2    0.4%         0    0.0%         2    0.4%
Tom Gordon                  2    0.4%         2    0.4%         0    0.0%
Troy Percival               2    0.4%         4    0.7%        -2   -0.4%
Jermaine Dye                1    0.2%         0    0.0%         1    0.2%
Darin Erstad                1    0.2%         1    0.2%         0    0.0%
Cliff Floyd                 1    0.2%         0    0.0%         1    0.2%
Eddie Guardado              1    0.2%         0    0.0%         1    0.2%
Jason Schmidt               1    0.2%         0    0.0%         1    0.2%
Rich Aurilia                0    0.0%         0    0.0%         0    0.0%
Aaron Boone                 0    0.0%         2    0.4%        -2   -0.4%
------------------------------------------------------------------------
Total Ballots             543               549                -6   -1.1%
Votes per Ballot         8.17              8.42             -0.25   -3.0%
Average Error                                               -5.4    -0.7%
Average Absolute Error:
  All 34 players                                            12.1     2.1%
  Top 23                                                    17.1     3.0%


I will try to be objective in analyzing my results. But I will probably fail, because, outside of one inexcusably bad miss, I'm actually damn impressed with myself.

On average, I underestimated players' vote totals by 5.4 votes per player, or 0.7% of the vote. Looking at absolute errors - so that -5% and +5% both count as 5% rather than canceling out - my average absolute error was 12.1 votes and 2.1% of the vote. Focusing only on the top 23 players (through Garciaparra in the above table), my average absolute error was 17.1 votes, or 3.0% of the vote. One can also calculate a root mean-squared error (RMSE) - the square root of the average squared error - which likewise treats -5% and +5% the same but penalizes larger errors more harshly. The RMSE of my percentage predictions for the top 23 vote-getters was 3.9%.
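
For anyone who wants to replicate these calculations, here is a minimal Python sketch of the three error metrics. The handful of rows is just a sample from the table above, so the printed figures will not match the full-ballot numbers.

    import math

    # A few (player, predicted %, actual %) rows from the table above.
    # The article's summary figures use all 34 players (or the top 23).
    rows = [
        ("Randy Johnson",  94.8, 97.3),
        ("Pedro Martinez", 90.8, 91.1),
        ("John Smoltz",    72.4, 82.9),
        ("Mike Piazza",    69.8, 69.9),
    ]

    errors = [pred - actual for _, pred, actual in rows]

    avg_error = sum(errors) / len(errors)                       # signed; misses cancel out
    mae = sum(abs(e) for e in errors) / len(errors)             # -5 and +5 both count as 5
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))  # big misses weigh more

    print(f"average error {avg_error:+.1f}, MAE {mae:.1f}, RMSE {rmse:.1f} (pct. points)")
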
Size of Electorate
The Hall of Fame introduced a new registration process for voters this year. I predicted that this would reduce the electorate from 571 ballots last year to 543 this year. In fact, 549 ballots were cast this year. I have to say, for basically having pulled 543 out of my ass, I'm pretty pleased with that one.

On the other hand, I predicted that the size of the average ballot would decline from 8.39 last year to 8.17 this year. In fact, the average ballot size barely changed at all; it actually rose slightly, to 8.42.

Players Elected
The BBWAA elected four players. I correctly predicted three of them, and the four players elected were the four I had predicted to receive the most votes. But the fourth player elected, John Smoltz - who actually finished third in the voting, one vote ahead of Craig Biggio - was my worst prediction. I underestimated Smoltz's final vote total by 62 votes and 10.5%.

I did hedge my bets on Smoltz in my original prediction article, stating that "if I absolutely had to pick one of my predictions and guess the direction of my error, it would be that I'm under-estimating Smoltz's final vote total." At the time I wrote my original article, public ballots were showing nearly 90% support for Smoltz. In my defense, that fell to 86.3% by the time the results were announced, and even that overstated Smoltz's actual percentage by 3.4%. So I was right to be somewhat conservative on Smoltz - just not nearly as conservative as I actually was. That one was a clear - and inexcusable - miss. Removing Smoltz from the calculations lowers the root mean-squared error (RMSE) of my predictions from 3.9% to 3.3%.
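
To see mechanically how much a single large miss can move the RMSE, here is a small follow-up to the sketch above. The signed errors are again a sample rather than the full top-23 set, so the outputs illustrate the effect rather than reproduce the 3.9%-to-3.3% drop.

    import math

    def rmse(errors):
        return math.sqrt(sum(e * e for e in errors) / len(errors))

    # Sample signed errors in percentage points, including the -10.5 Smoltz miss.
    sample = [-2.4, -0.3, 1.3, -10.5, -0.1, 1.4, -0.5]

    print(f"RMSE with the big miss:    {rmse(sample):.1f}")
    print(f"RMSE without the big miss: {rmse([e for e in sample if e != -10.5]):.1f}")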

My Best Successes
Overall, my predicted vote percentage was within 1.5% of the final result for 20 of the 34 players on the ballot. Of course, that includes players such as Rich Aurilia, who I correctly predicted would receive zero votes. Limiting the focus to the top 23 vote-getters, I came within 1.5% of the final vote for 9 of the 23 and within 2.5% for 12 of the 23. My median absolute error was 2.4% (Randy Johnson).

My predictions of which I am most proud would probably be Pedro Martinez (predicted, 90.8%; actual, 91.1%), Mike Piazza (predicted, 69.8%; actual, 69.9%), Fred McGriff (predicted, 12.9%; actual, 12.9%), Larry Walker (predicted, 12.2%; actual, 11.8%), and Tom Gordon, who I correctly predicted would receive exactly two votes.

My Biggest Misses
My biggest miss was Smoltz. By percentages, my next biggest misses were Curt Schilling (predicted, 33.0%; actual, 39.2%), Jeff Kent (predicted, 20.1%; actual, 14.0%), Mike Mussina (predicted, 29.7%; actual, 24.6%), Roger Clemens (predicted, 32.6%; actual, 37.5%), Barry Bonds (predicted, 32.0%; actual, 36.8%), Alan Trammell (predicted, 20.4%; actual, 25.1%), and Gary Sheffield (predicted, 7.2%; actual, 11.7%).

Systematic Errors?
So, looking at the above table, were there any systematic errors? At first glance, nothing obvious jumped out at me. But looking a bit more closely, a few themes emerged.

First, I tended to underestimate vote percentages for first-year players. The most egregious of these, of course, was John Smoltz. But I also underestimated Randy Johnson and Pedro Martinez, although only by 2.4% and 0.3%, respectively. More significantly, I underestimated Gary Sheffield by 4.5%, Nomar Garciaparra by 3.1%, and Carlos Delgado by 1.4%.

In the latter two cases, I lowered my initial guess based on extremely low totals among publicly revealed ballots (2.0% and 1.5%, respectively). In retrospect, this was a mistake, although I'm not sure that there are any general conclusions to be drawn from that.

On the other hand, I tended to overestimate vote percentages for second-year players. Of course, there were only two second-year players on the ballot, so it's hard to generalize. But I overestimated Mussina's and Kent's vote percentages by 5.1% and 6.0%, respectively, which were my two worst errors in that direction.

Interestingly, the errors for Mussina and Kent mirrored opposite errors of similar magnitude for the players who were perhaps their closest matches in terms of the type of player and the level of support received.

Mike Mussina, a starting pitcher, is probably most similar to Curt Schilling among the players on this ballot. I overestimated Mussina's support by 5.1% and underestimated Schilling's by 6.2%.

Jeff Kent, a middle infielder with middling support, is perhaps most similar to Alan Trammell. I overestimated Kent's support by 6.0% and underestimated Trammell's by 4.7%.

It could be that some of the support I anticipated flowing to Mussina and Kent from newly freed-up ballot slots went instead to similar players who had been on the ballot longer. Given a choice between two similar players for a newly available tenth ballot slot, it might have made sense for voters to lean toward the one who had been waiting longer. Or I could be finding tendencies that are not really there.

In general, the "steroid" players tended to out-perform my predictions, although I actually over-estimated Mark McGwire's support (predicted, 12.7%; actual, 10.0%) and got Sosa's total almost exactly right (predicted, 7.2%; actual, 6.6%). The somewhat better-than-I-predicted results for Bonds and Clemens, however, could suggest some softening of anti-steroid sentiment, with the overly-crowded ballot keeping McGwire and Sosa from benefiting. On the other hand, Bonds's and Clemens's actual vote percentages (36.8%, 37.5%) are essentially identical to their 2013 percentages (36.2%, 37.6%, respectively), so it's not so much that anti-steroid sentiment is softening as that it's not growing as I predicted.

Final Thoughts
Overall, I'm generally pleased with my results. I definitely should have predicted Smoltz's election. But outside of Smoltz, my biggest misses seemed (a) reasonably small, all things considered, and (b) somewhat random.

Next week, I will probably follow up this analysis of the 2015 Hall-of-Fame results with a reprise of last year's look at the impact of the ballot cap, based on this year's public ballots. At that time, I will also probably take a first look at the 2016 Hall-of-Fame ballot and maybe make some preliminary forecasts. I'm holding off until then in the hope that Ryan Thibs can add a few more actual votes to his wonderful ballot-tracking spreadsheet. So, check back later if, like me, you can't get enough talk about the Hall of Fame.


