In Response to Bill James:
What Player Won-Lost Records Get Right
On Friday (November 17, 2017), Bill James wrote a provocative article entitled "Judge and Altuve". The specific focus of the article was this year's American League MVP race, but the broader focus was what James views as a failing of WAR.
"We reach, then, the key question in this debate: is it appropriate, in assigning the individual player credit for wins, to do so based on the usual and normal relationship of runs to wins, or based on the actual and specific relationship for this player and this team?
...
The logic for applying the normal and usual relationship is that deviations from the normal and usual relationship should be attributed to luck....
... that argument is just dead wrong."
James goes on to lay out "five reasons why it is wrong." You should read his article. Bill James is a great writer and a great thinker about baseball.
On Saturday (November 18, 2017), Joe Posnanski wrote about James's article, and, more generally, Bill James's apparent years-long "problem with WAR". You should also read this article. Posnanski is also a great writer and thinker about baseball. Posnanski basically agreed with James.
"Is a team winning or losing more games than expectation 'chance?' I've always thought that's mostly true, but I will just say: It's a copout to just stop there."
Ironically, on the same day that Posnanski was posting his article, I was giving a presentation to the Chicago chapter of SABR, discussing my Player won-lost records (and my book: Player Won-Lost Records in Baseball: Measuring Performance in Context). This was ironic (I think - I'm more a math guy than an English guy; I could be using the word "ironic" wrong) because the very premise of my talk was that Player won-lost records are an improvement over current sabermetric statistics (including WAR) precisely because Player won-lost records are built up from actual team wins (and losses).
My son videotaped my presentation. The video runs about an hour (my presentation was 20 minutes; the follow-up Q-and-A was 40 minutes) and can be seen on YouTube.
The final slide of my presentation (you may want to zoom in on the screen during my presentation to better read the slides) highlighted the reasons "[w]hy ... we need Player wins and losses".
- [Player won-lost records] fill a niche: Player wins tied to team wins by game
- Most baseball statistics start with [theoretical] runs and convert from [theoretical] runs to theoretical wins
- Starting from a different (better) place - actual wins - reveals a host of fascinating new insights
My data source is Retrosheet. Retrosheet is not going to officially release 2017 play-by-play data until later this week. I was, however, able to get a preview of the 2017 results. So, I have calculated a preliminary set of 2017 Player won-lost records. These data are preliminary and will be updated as part of a full update of my data some time over the next two months. Hopefully, however, the results will not change significantly.
So, let's begin where Bill James began: Jose Altuve vs. Aaron Judge. The next table compares Altuve and Judge in Player won-lost records.
Player won-lost records are calculated two ways:
(1) pWins and pLosses tie to team wins, with the players on a team earning 2 pWins and 1 pLoss in every team win and 1 pWin and 2 pLosses in every team loss;
(2) eWins and eLosses control for context and are not tied to specific team wins and losses.
The last two columns measure wins over positional average (WOPA) and wins over replacement level (WORL).
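The pWin accounting rule described above implies some simple team-level arithmetic, sketched here in Python. The function name and the 90-72 team are hypothetical, chosen only for illustration; how pWins are divided among individual players is not covered by this sketch.

```python
def team_player_decisions(team_wins: int, team_losses: int) -> tuple[int, int]:
    """Total pWins and pLosses shared by a team's players over a season.

    Applies the rule stated in the text: 2 pWins and 1 pLoss per team win,
    1 pWin and 2 pLosses per team loss.
    """
    p_wins = 2 * team_wins + 1 * team_losses
    p_losses = 1 * team_wins + 2 * team_losses
    return p_wins, p_losses


# Hypothetical team that finished 90-72 (not taken from the article).
p_wins, p_losses = team_player_decisions(90, 72)
print(p_wins, p_losses)    # 252 234
print(p_wins - p_losses)   # 18, the same as team wins minus team losses
```

Under this rule a team's players always share three decisions per game, and the team's pWins minus pLosses equals its wins minus its losses.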
So, James is correct in his specific contention: Jose Altuve contributed more actual team victories for the Houston Astros than Aaron Judge did for the New York Yankees, although Judge produced more expected wins (by a larger margin than WAR shows).
So, does that mean that Altuve deserved the MVP award over Judge?
In my opinion, yes, it does. But there's no reason why my opinion has to prevail. Other people are entitled to their own opinions. Which is precisely why I calculate Player won-lost records two ways and precisely why I calculate wins over both positional average and replacement level. There's even a page on my website where you can apply your own weights to rank players however you'd like.
But even though I think that one could reasonably prefer eWins to pWins and, hence, believe that Aaron Judge should have been the American League MVP in 2017, I think that Bill James is correct. The failure to link WAR to actual wins at any point in the process is a flaw in the construction of WAR.
Joe Posnanski perhaps lays out the issue best:
"Look: Baseball Reference WAR and Fangraphs WAR go to great care figuring out how many runs a player is worth. They calculate (in different ways) what a positional player's value is as a hitter, as a base runner, as a fielder. They make a positional adjustment .... They make a league-wide adjustment, based on the run-scoring atmosphere of the league ...
This all takes a great deal of calculation and thought and bold viewpoints. WAR is a wonderful formula in so many ways. And when the calculations are done, we are left with a number of runs a player/pitcher is worth, a number that can then be compared with the run value of a replacement player.
And after all this very intense math, how do they get from RAR (Runs Above Replacement) to WAR (Wins Above Replacement)?
They basically just divide the total by 10."
And here is where WAR goes astray. The assumption is that all theoretical runs are the same. But all theoretical runs are not the same.
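The conversion Posnanski describes can be written out as a rough sketch. The function name is hypothetical and the fixed 10 runs per win is his "basically" approximation, not the exact divisor used by Baseball Reference or Fangraphs.

```python
def rar_to_war(runs_above_replacement: float, runs_per_win: float = 10.0) -> float:
    """Convert runs above replacement to wins using one fixed exchange rate.

    Note what is missing: nothing about when the runs were produced or
    whether the team actually won is passed in.
    """
    return runs_above_replacement / runs_per_win


# Two hypothetical players with the same theoretical run total get the same
# WAR, no matter how those runs related to their teams' actual wins.
print(rar_to_war(50.0))  # 5.0
```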
Player won-lost records, though, start from wins. And they start from actual wins. Expected wins are the second set of numbers I calculate, and they're calculated such that the total number of expected wins matches actual wins by component and sub-component. That is, for a given season, the expected value of home runs is set so that the sum of the expected wins from home runs equals the actual wins from home runs. And ditto for doubles, and walks, and ground outs to the third baseman. And to quote myself, "[s]tarting from a different (better) place - actual wins - reveals a host of fascinating new insights."
For a full discussion of the fascinating new insights revealed by Player won-lost records, you should buy and read my book (well, technically, the insights come from reading it, not buying it). Or at least watch the hour-long video of my presentation to SABR Chicago. But to summarize a few of my findings:
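One simple way to satisfy the constraint just described is to give every event of a type the average actual win value of that type, so the totals match by construction. The sketch below assumes that approach purely for illustration; the article does not spell out the actual calculation, and the function name and all play values are made up.

```python
import math
from collections import defaultdict


def expected_values_by_event(plays):
    """Return one context-neutral win value per event type.

    `plays` is a list of (event_type, actual_win_value) pairs. Each event
    type's expected value is the mean of its actual values, so expected
    wins summed over that type equal actual wins summed over that type.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, value in plays:
        totals[event] += value
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}


# Hypothetical context-dependent win values for five plays.
plays = [("HR", 0.20), ("HR", 0.05), ("HR", 0.11), ("BB", 0.04), ("BB", 0.02)]
expected = expected_values_by_event(plays)

for event in expected:
    actual_total = sum(v for e, v in plays if e == event)
    expected_total = sum(expected[e] for e, _ in plays if e == event)
    # The two totals agree for every event type, up to rounding error.
    print(event, math.isclose(actual_total, expected_total))
```

Individual plays can differ sharply between the two measures (the 0.20 home run versus the 0.05 home run), while the season totals for each event type stay tied to actual wins.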
- Context measures derived from raw win probability (WPA, Leverage) undervalue the early innings of games. This has particular implications for the valuation of starting pitchers (who primarily pitch those early innings).
- WAR weights fielding too heavily relative to pitching (and offense).
- Events that produce actual runs are more valuable than similar events that do not. Home runs are guaranteed to produce runs, so a linear weights framework undervalues them.
So, does one have to take context into account when voting for MVP? No, I don't think so, although I would personally be inclined to do so. But if one is purporting to measure player contributions to wins, then I do think that one ought to begin one's analysis with actual wins. And I believe that my Player won-lost records do this better than any other statistic.
All articles are written so that they pull data directly from the most recent version of the Player won-lost database. Hence, any numbers cited within these articles should automatically incorporate the most recent update to Player won-lost records. In some cases, however, the accompanying text may have been written based on previous versions of Player won-lost records. I apologize if this results in nonsensical text in any cases.