Forecast Accuracy: 2015 BBWAA Hall-of-Fame Voting



Last week, I wrote an article with my predictions of how this year's Hall-of-Fame voting would go. The results were announced earlier this week, with four well-deserving players being elected: Craig Biggio, Randy Johnson, Pedro Martinez, and John Smoltz.

So, how did I do at predicting the final results?

The table below answers that question. The first two data columns show my prediction, the middle two show the actual results, and the last two show the difference (my prediction minus the actual vote).

                        My Prediction     Actual Vote       My Error
Player                  Votes    Pct      Votes    Pct      Votes    Pct
------------------------------------------------------------------------
Randy Johnson             515   94.8%       534   97.3%       -19   -2.4%
Pedro Martinez            493   90.8%       500   91.1%        -7   -0.3%
Craig Biggio              456   84.0%       454   82.7%         2    1.3%
John Smoltz               393   72.4%       455   82.9%       -62  -10.5%
Mike Piazza               379   69.8%       384   69.9%        -5   -0.1%
Jeff Bagwell              310   57.1%       306   55.7%         4    1.4%
Tim Raines                296   54.5%       302   55.0%        -6   -0.5%
Curt Schilling            179   33.0%       215   39.2%       -36   -6.2%
Roger Clemens             177   32.6%       206   37.5%       -29   -4.9%
Barry Bonds               174   32.0%       202   36.8%       -28   -4.7%
Edgar Martinez            166   30.6%       148   27.0%        18    3.6%
Mike Mussina              161   29.7%       135   24.6%        26    5.1%
Lee Smith                 153   28.2%       166   30.2%       -13   -2.1%
Alan Trammell             111   20.4%       138   25.1%       -27   -4.7%
Jeff Kent                 109   20.1%        77   14.0%        32    6.0%
Fred McGriff               70   12.9%        71   12.9%         1    0.0%
Mark McGwire               69   12.7%        55   10.0%        14    2.7%
Larry Walker               66   12.2%        65   11.8%         1    0.3%
Don Mattingly              39    7.2%        50    9.1%       -11   -1.9%
Gary Sheffield             39    7.2%        64   11.7%       -25   -4.5%
Sammy Sosa                 39    7.2%        36    6.6%         3    0.6%
Carlos Delgado             13    2.4%        21    3.8%        -8   -1.4%
Nomar Garciaparra          13    2.4%        30    5.5%       -17   -3.1%
Brian Giles                 8    1.5%         0    0.0%         8    1.5%
Tony Clark                  2    0.4%         0    0.0%         2    0.4%
Tom Gordon                  2    0.4%         2    0.4%         0    0.0%
Troy Percival               2    0.4%         4    0.7%        -2   -0.4%
Jermaine Dye                1    0.2%         0    0.0%         1    0.2%
Darin Erstad                1    0.2%         1    0.2%         0    0.0%
Cliff Floyd                 1    0.2%         0    0.0%         1    0.2%
Eddie Guardado              1    0.2%         0    0.0%         1    0.2%
Jason Schmidt               1    0.2%         0    0.0%         1    0.2%
Rich Aurilia                0    0.0%         0    0.0%         0    0.0%
Aaron Boone                 0    0.0%         2    0.4%        -2   -0.4%
------------------------------------------------------------------------
Total Ballots             543               549                -6   -1.1%
Votes per Ballot         8.17              8.42             -0.25   -3.0%
Average Error                                               -5.4    -0.7%
Average Absolute Error:
  All 34 players                                            12.1     2.1%
  Top 23                                                    17.1     3.0%


I will try to be objective in analyzing my results. But I will probably fail, because, outside of one inexcusably bad miss, I'm actually damn impressed with myself.

On average, I underestimated players' vote totals by 5.4 votes per player, or 0.7% of the vote. Looking at absolute errors - so that -5% and +5% both count as 5% rather than canceling out - my average absolute error was 12.1 votes and 2.1% of the vote. Focusing only on the top 23 players (through Garciaparra in the above table), my average absolute error was 17.1 votes, or 3.0% of the vote. One can also calculate a root mean-squared error (RMSE) - the square root of the average squared error - which likewise treats -5% and +5% the same but penalizes larger errors more harshly. The RMSE of my percentage predictions for the top 23 vote-getters was 3.9%.
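
For anyone who wants to replicate these calculations, here is a minimal Python sketch of the three error metrics. The handful of rows is just a sample from the table above, so the printed figures will not match the full-ballot numbers.

    import math

    # A few (player, predicted %, actual %) rows from the table above.
    # The article's summary figures use all 34 players (or the top 23).
    rows = [
        ("Randy Johnson",  94.8, 97.3),
        ("Pedro Martinez", 90.8, 91.1),
        ("John Smoltz",    72.4, 82.9),
        ("Mike Piazza",    69.8, 69.9),
    ]

    errors = [pred - actual for _, pred, actual in rows]

    avg_error = sum(errors) / len(errors)                       # signed; misses cancel out
    mae = sum(abs(e) for e in errors) / len(errors)             # -5 and +5 both count as 5
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))  # big misses weigh more

    print(f"average error {avg_error:+.1f}, MAE {mae:.1f}, RMSE {rmse:.1f} (pct. points)")
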
Size of Electorate
The Hall of Fame introduced a new registration process for voters this year. I predicted that this would reduce the electorate from 571 ballots last year to 543 this year. In fact, 549 ballots were cast this year. I have to say, for basically having pulled 543 out of my ass, I'm pretty pleased with that one.

On the other hand, I predicted that the size of the average ballot would decline from 8.39 last year to 8.17 this year. In fact, the average ballot size barely changed at all; it actually rose slightly, to 8.42.

Players Elected
The BBWAA elected four players. I correctly predicted three of them, and the four players elected were the four I had predicted to receive the most votes. But the fourth player elected, John Smoltz - who actually finished third in the voting, one vote ahead of Craig Biggio - was my worst prediction. I underestimated Smoltz's final vote total by 62 votes and 10.5%.

I did hedge my bets on Smoltz in my original prediction article, stating that "if I absolutely had to pick one of my predictions and guess the direction of my error, it would be that I'm under-estimating Smoltz's final vote total." At the time I wrote my original article, public ballots were showing nearly 90% support for Smoltz. In my defense, that fell to 86.3% by the time the results were announced, and even that overstated Smoltz's actual percentage by 3.4%. So I was right to be somewhat conservative on Smoltz - just not nearly as conservative as I actually was. That one was a clear - and inexcusable - miss. Removing Smoltz from the calculations lowers the root mean-squared error (RMSE) of my predictions from 3.9% to 3.3%.
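
To see mechanically how much a single large miss can move the RMSE, here is a small follow-up to the sketch above. The signed errors are again a sample rather than the full top-23 set, so the outputs illustrate the effect rather than reproduce the 3.9%-to-3.3% drop.

    import math

    def rmse(errors):
        return math.sqrt(sum(e * e for e in errors) / len(errors))

    # Sample signed errors in percentage points, including the -10.5 Smoltz miss.
    sample = [-2.4, -0.3, 1.3, -10.5, -0.1, 1.4, -0.5]

    print(f"RMSE with the big miss:    {rmse(sample):.1f}")
    print(f"RMSE without the big miss: {rmse([e for e in sample if e != -10.5]):.1f}")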

My Best Successes
Overall, my predicted vote percentage was within 1.5% of the final result for 20 of the 34 players on the ballot. Of course, that includes players such as Rich Aurilia, who I correctly predicted would receive zero votes. Limiting the focus to the top 23 vote-getters, I came within 1.5% of the final vote for 9 of the 23 and within 2.5% for 12 of the 23. My median absolute error was 2.4% (Randy Johnson).

My predictions of which I am most proud would probably be Pedro Martinez (predicted, 90.8%; actual, 91.1%), Mike Piazza (predicted, 69.8%; actual, 69.9%), Fred McGriff (predicted, 12.9%; actual, 12.9%), Larry Walker (predicted, 12.2%; actual, 11.8%), and Tom Gordon, who I correctly predicted would receive exactly two votes.

My Biggest Misses
My biggest miss was Smoltz. By percentages, my next biggest misses were Curt Schilling (predicted, 33.0%; actual, 39.2%), Jeff Kent (predicted, 20.1%; actual, 14.0%), Mike Mussina (predicted, 29.7%; actual, 24.6%), Roger Clemens (predicted, 32.6%; actual, 37.5%), Barry Bonds (predicted, 32.0%; actual, 36.8%), Alan Trammell (predicted, 20.4%; actual, 25.1%), and Gary Sheffield (predicted, 7.2%; actual, 11.7%).

Systematic Errors?
So, looking at the above table, were there any systematic errors? At first glance, nothing obvious jumped out at me. But looking a bit more closely, a few themes emerged.

First, I tended to underestimate vote percentages for first-year players. The most egregious of these, of course, was John Smoltz. But I also underestimated Randy Johnson and Pedro Martinez, although only by 2.4% and 0.3%, respectively. More significantly, I underestimated Gary Sheffield by 4.5%, Nomar Garciaparra by 3.1%, and Carlos Delgado by 1.4%.

In the latter two cases, I lowered my initial guess based on extremely low totals among publicly revealed ballots (2.0% and 1.5%, respectively). In retrospect, this was a mistake, although I'm not sure that there are any general conclusions to be drawn from that.

On the other hand, I tended to overestimate vote percentages for second-year players. Of course, there were only two second-year players on the ballot, so it's hard to generalize. But I overestimated Mussina's and Kent's vote percentages by 5.1% and 6.0%, respectively, which were my two worst errors in that direction.

Interestingly, the errors for Mussina and Kent mirrored opposite errors of similar magnitude for the players who were perhaps their closest matches in terms of the type of player and the level of support received.

Mike Mussina, a starting pitcher, is probably most similar to Curt Schilling among the players on this ballot. I overestimated Mussina's support by 5.1% and underestimated Schilling's by 6.2%.

Jeff Kent, a middle infielder with middling support, is perhaps most similar to Alan Trammell. I overestimated Kent's support by 6.0% and underestimated Trammell's by 4.7%.

It could be that some of the support I anticipated flowing to Mussina and Kent from newly freed-up ballot slots went instead to similar players who had been on the ballot longer. Given a choice between two similar players for a newly available tenth ballot slot, it might have made sense for voters to lean toward the one who had been waiting longer. Or I could be finding tendencies that are not really there.

In general, the "steroid" players tended to out-perform my predictions, although I actually over-estimated Mark McGwire's support (predicted, 12.7%; actual, 10.0%) and got Sosa's total almost exactly right (predicted, 7.2%; actual, 6.6%). The somewhat better-than-I-predicted results for Bonds and Clemens, however, could suggest some softening of anti-steroid sentiment, with the overly-crowded ballot keeping McGwire and Sosa from benefiting. On the other hand, Bonds's and Clemens's actual vote percentages (36.8%, 37.5%) are essentially identical to their 2013 percentages (36.2%, 37.6%, respectively), so it's not so much that anti-steroid sentiment is softening as that it's not growing as I predicted.

Final Thoughts
Overall, I'm generally pleased with my results. I definitely should have predicted Smoltz's election. But outside of Smoltz, my biggest misses seemed (a) reasonably small, all things considered, and (b) somewhat random.

Next week, I will probably follow up this analysis of the 2015 Hall-of-Fame results with a reprise of last year's look at the impact of the ballot cap, based on this year's public ballots. At that time, I will also probably take a first look at the 2016 Hall-of-Fame ballot and maybe make some preliminary forecasts. I'm holding off until then in the hope that Ryan Thibs can add a few more actual votes to his wonderful ballot-tracking spreadsheet. So, check back later if, like me, you can't get enough talk about the Hall of Fame.


