Baseball Player Won-Loss Records

Updates to Baseball Player Won-Lost Records: January 2019

I have recently done some thinking about some of the tools by which I analyze Player won-lost records and have made a couple of changes to the numbers that are presented on the website. These will be explored in more detail in an upcoming project which I hope to have ready in early March. But I wanted to put these out here now.

The core statistics of Baseball Player won-lost records are pWins and pLosses. These are calculated play by play such that the total player decisions by a team are exactly three per game: 2 pWins and 1 pLoss for a winning team, 1 pWin and 2 pLosses for a losing team (teams earn 1.5 pWins and 1.5 pLosses in a tie game). Because the numbers of pWins and pLosses are known for certain at the team level, by construction, pWins and pLosses are an objective calculation. The exact pWins that I credit to an individual player may not be precisely correct, but any misallocation of pWins (and/or pLosses) will be confined to teammates within a specific game.
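
The team-level constraint described above can be sketched in a few lines of Python. This is only an illustration of the three-decisions-per-game rule; the function name and the sample scores are hypothetical, and the play-by-play allocation of those decisions to individual players is not shown.

```python
def team_player_decisions(runs_for, runs_against):
    """Return (pWins, pLosses) shared by a team's players in one game.

    Hypothetical helper: it only encodes the team-level totals stated in
    the text, not how they are divided among players.
    """
    if runs_for > runs_against:
        return 2.0, 1.0   # winning team
    if runs_for < runs_against:
        return 1.0, 2.0   # losing team
    return 1.5, 1.5       # tie game


# Made-up scores: one win, one loss, one tie.
games = [(5, 3), (2, 4), (3, 3)]
decisions = [team_player_decisions(rf, ra) for rf, ra in games]

total_p_wins = sum(w for w, _ in decisions)
total_p_losses = sum(l for _, l in decisions)
print(total_p_wins, total_p_losses)  # 4.5 4.5
```

Whatever the result, every game contributes exactly three player decisions to a team, which is why errors in allocation can only shift credit among teammates.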

Given a set of pWins and pLosses, I then calculate a set of context-neutral Player won-lost records, eWins and eLosses. The precise calculations of pWins and eWins are described in some detail here and in much greater detail in my first book (Player Won-Lost Records in Baseball: Measuring Performance in Context, McFarland, 2017).

I have not changed my calculations of these basic statistics. I have, however, changed the calculations for two measures which I use to compare players: positional averages and expected context.

Positional Averages
Because of the way in which Player won-lost records are calculated, in order to compare players across different positions (and, to perhaps a somewhat lesser extent, over different time periods), I find it desirable to calculate a set of Positional Averages from which I calculate Player wins over positional average (pWOPA). I then use these positional averages to calculate replacement levels (one standard deviation below positional average) which are used to calculate Player wins over replacement level (pWORL).

For non-pitchers, I have not changed the way in which I calculate positional averages except for correcting an error I discovered in my treatment of positional averages for pinch hitters and pinch runners in DH vs. non-DH leagues within a single season.

For pitchers, however, I have changed my thinking somewhat, and, consequently, changed how I calculate positional averages.

I believe that there are three possible ways to calculate positional averages for pitchers.

(1) Pitching is pitching: the positional average is 0.500 for all pitchers for all seasons by construction.
(2) Starting pitchers and relief pitchers should have different positional averages, which should be calculated empirically each season. That is, in 2018, the overall winning percentage for starting pitchers (excluding their offense) was 0.497; the overall winning percentage for relief pitchers (again, only on defense) was 0.504. Those are your positional averages for 2018.
(3) Starting pitchers and relief pitchers should, indeed, have different positional averages, but option (2) assumes that the pool of starting pitchers and the pool of relief pitchers are equal. What we should do, instead, is focus on pitchers who pitched as both starters and relievers. Doing this produces positional averages for 2018 of 0.487 for starting pitchers and 0.518 for relief pitchers.
In the past, I have chosen option (3). Last summer, I had an e-mail conversation with a SABR member named Bob Sawyer who objected to my choice here. He raised some good points, which prompted me to re-think this choice.

First, let's think about pWins. The point here is to explain team wins at the game level. From the team's perspective, what is the difference between starting pitchers and relief pitchers? Ultimately, nothing. In this game, the Cubs led 3 - 0 with two outs and runners on first and second base in the top of the sixth inning. The Cubs replaced starting pitcher Jose Quintana with relief pitcher Steve Cishek. The next batter, Aaron Altherr, hit a three-run home run, tying the score at 3 - 3. Would anything have been any different if Quintana had stayed in the game and given up the home run to Altherr? No. But if the positional average for starting pitchers differs from the positional average for relief pitchers, doesn't that imply that the expectations are different for a starting pitcher than for a relief pitcher? Specifically, if the positional average for starting pitchers is lower than for relief pitchers (as is the case with either option (2) or (3) in 2018), that implies, at some level, that the home run would have been less costly if Quintana had given it up instead of Cishek. That doesn't make sense.

The 2018 season has also introduced a new wrinkle here: the opener. Why is the positional average lower for starting pitchers? In the case of option (3), it is because pitchers perform better when they pitch in relief than when they pitch as starters. Why? Because relief pitchers don't have to pace themselves; they don't face the same batters more than once in a game; they likely have the platoon advantage more often. But now, with the "opener", the starting pitcher - i.e., the first pitcher of the game - gets these advantages and it's the second pitcher of the game who is then expected to work through the lineup multiple times and pitch multiple innings.

Shouldn't openers be treated like "relief" pitchers? But how do you distinguish between a starting pitcher who pitched one inning because that was the plan versus a starting pitcher who pitched one inning because he was pulled from the game early because he was ineffective or injured? I don't know that you can.

There is also, I think, another problem. As the use of relief pitchers has changed so, too, has the use of starting pitchers. Teams are making more and more effort to avoid having starting pitchers face batters a third time in the same game. This is likely making the job of a starting pitcher easier over time.

I have decided, therefore, to change how I calculate positional averages for pitchers. Specifically, I have decided to switch to what I identified above as option (1). That is, I now believe the appropriate way to calculate positional averages for pitchers is to use a positional average of 0.500 for all pitchers in all seasons.

The impact of this is to lower the positional average for relief pitchers - thereby increasing relief pitchers' wins over positional average (WOPA) and replacement level (WORL) - and increase the positional average for starting pitchers - thereby decreasing starting pitchers' WOPA and WORL. The old positional averages (option (3)) were trending away from 0.500 over time for both starting and relief pitchers, so the shift to 0.500 as a positional average for all pitchers has a somewhat greater impact on more recent pitchers.
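
The direction of these changes can be illustrated with a rough sketch. It assumes, as a simplification, that wins over positional average equal wins minus the positional average times total decisions; the actual calculation also involves components (such as pitcher offense) that are not modeled here. The function name and the two pitchers' records are hypothetical; the positional averages are the 2018 option (3) values quoted above.

```python
def wins_over_positional_average(wins, losses, positional_average):
    # Assumed simplification: wins earned minus the wins an average
    # player at the position would earn in the same number of decisions.
    return wins - positional_average * (wins + losses)


# Hypothetical 2018 pitching records (not real players).
reliever = (6.0, 4.0)
starter = (12.0, 10.0)

# Option (3) averages for 2018 versus the new flat 0.500 (option (1)).
reliever_change = (wins_over_positional_average(*reliever, 0.500)
                   - wins_over_positional_average(*reliever, 0.518))
starter_change = (wins_over_positional_average(*starter, 0.500)
                  - wins_over_positional_average(*starter, 0.487))

print(round(reliever_change, 3), round(starter_change, 3))
```

Under this assumption the change is simply the shift in positional average times the pitcher's decisions, so it accumulates over a long career.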

The next two tables show the top 10 pitchers with the most extreme increase and decrease in career pWOPA due to this change. As you would expect, the top 10 pitchers in increased pWOPA are long-career relief pitchers. The top 10 pitchers in decreased pWOPA are long-career starting pitchers.

Top Increases in Career pWOPA

                                     pWins over Positional Average
Rank  Player               pWins  pLosses  Original  Revised  Change
   1  Lee Smith            111.3     78.2      13.3     16.9    3.56
   2  Goose Gossage        131.2    101.8      11.7     15.2    3.48
   3  Mariano Rivera       126.6     60.8      29.6     32.9    3.36
   4  John Franco          103.7     74.3      11.6     14.9    3.35
   5  Rollie Fingers       121.0     97.2       9.8     13.1    3.25
   6  Jeff Reardon          91.3      71.
   7  Hoyt Wilhelm         138.6    116.0      10.4     13.5    3.10
   8  Frankie Rodriguez     90.0     56.9      13.6     16.6    2.94
   9  Trevor Hoffman       100.7     62.3      16.5     19.5    2.93
  10  Gene Garber            88.

Top Decreases in Career pWOPA

                                     pWins over Positional Average
Rank  Player               pWins  pLosses  Original  Revised  Change
   1  Nolan Ryan           356.8    328.0      24.6     19.7   -4.88
   2  Greg Maddux          328.5    271.3      44.8     39.9   -4.84
   3  Roger Clemens        318.1    228.1      51.1     46.3   -4.84
   4  Tom Glavine          279.4    249.5      29.0     24.8   -4.26
   5  Randy Johnson        281.2    220.9      38.5     34.2   -4.26
   6  Bartolo Colon        212.8    205.4       9.9      5.6   -4.22
   7  Steve Carlton        337.9    303.9      31.7     27.6   -4.16
   8  Bert Blyleven        291.2    260.0      22.2     18.0   -4.16
   9  C.C. Sabathia        213.6    179.2      22.2     18.1   -4.13
  10  Don Sutton           320.8    295.3      24.6     20.5   -4.08

Expected Context
Given a set of pWins and pLosses, I then calculate a set of context-neutral Player won-lost records. Oversimplifying, context-neutral wins for a given event are calculated by taking the average value of the pWins associated with the event over the course of the season. Player eWins are not simply the sum of these context-neutral wins, however. Rather, eWins are expected wins and, as such, are adjusted to incorporate expected context and expected win adjustments.

Win adjustments are the difference in winning percentage between a player's pWins and his context-neutral wins. For example, in his career, Hank Aaron had a winning percentage, measured via pWins, of 0.569. Adding up Hank Aaron's eWins - with no adjustments - produces a winning percentage of 0.564. Hence, Hank Aaron's career win adjustment was 0.005. The relationship between player wins and team wins is not precisely linear, but is, in fact, somewhat multiplicative. That is, players who are somewhat better than average will produce a team record that is more above average. The implication of this is that players' expected win adjustments are positively correlated with the difference between a player's record and 0.500. So players who are below average will, on average, have slightly negative win adjustments and players who are above average, like Hank Aaron, will, on average, have slightly positive win adjustments - exactly as Hank Aaron had over his career. Aaron's expected career win adjustment was 0.004, leading to an eWin percentage of 0.568, very similar to his career pWin percentage of 0.569.
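
The Hank Aaron arithmetic above can be restated as a short sketch. The variable names are my own, and treating the expected win adjustment as something simply added to the context-neutral winning percentage is inferred from the numbers quoted here, not a full statement of the method described in the book.

```python
# Hank Aaron's career figures as quoted in the text.
p_win_pct = 0.569            # winning percentage measured via pWins
context_neutral_pct = 0.564  # unadjusted context-neutral winning percentage
expected_adjustment = 0.004  # Aaron's expected career win adjustment

# Actual win adjustment: pWin percentage minus context-neutral percentage.
actual_adjustment = round(p_win_pct - context_neutral_pct, 3)

# eWin percentage, assuming the expected adjustment is additive.
e_win_pct = round(context_neutral_pct + expected_adjustment, 3)

print(actual_adjustment, e_win_pct)  # 0.005 0.568
```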

I explained expected win adjustments in some detail in my first book and have made no changes in their calculation here.

Context refers to the difference in the number of player decisions a player earns when measured by pWins versus eWins. For example, Hank Aaron earned 863.9 pDecisions (pWins plus pLosses) in his career. Adding up Hank Aaron's eWins - again, with no adjustments - yields a total of 842.2 eDecisions. Hence, for his career, Hank Aaron's overall context was 1.026 - he earned about 2.6% more pDecisions than eDecisions in his career. Generally speaking, context does not matter too much for position players. There's virtually no way to control the context in which one fields. There are, however, some identifiable impacts of context on hitting. Pinch hitters tend to have a higher-than-average context. Better hitters also tend to have a higher-than-average context, although the impact of this is fairly modest.

Differences in context are much more significant for pitchers. For example, Mariano Rivera amassed 123.1 career (regular-season) eDecisions (prior to these adjustments) but 187.4 career pDecisions. That works out to a context of 1.522 - Rivera earned 52.2% more pDecisions than eDecisions. This difference is, of course, because of the timing of when Mariano Rivera pitched. Pitching in the ninth inning of a game which your team leads by one or two runs is a higher-context situation than, say, pitching in the fifth or sixth inning of a game when your team trails by five runs.
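
As a quick check on the two examples above, context is just the ratio of pDecisions to unadjusted eDecisions. The function name below is hypothetical; the inputs are the career totals quoted for Aaron and Rivera.

```python
def context(p_decisions, e_decisions):
    """Ratio of context-dependent to context-neutral player decisions."""
    return p_decisions / e_decisions


aaron_context = context(863.9, 842.2)   # Hank Aaron, career
rivera_context = context(187.4, 123.1)  # Mariano Rivera, career regular season

print(f"{aaron_context:.3f} {rivera_context:.3f}")  # 1.026 1.522
```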

In my original formulation of eWins, as described in my first book, I calculated expected context for five positions: starting pitcher, relief pitcher, pinch hitter, pinch runner, and all others. The problem with doing this can be seen by looking at Mariano Rivera. As seen above, his actual context over the course of his career was 1.522, which, as I said, is not a terribly surprising number: I think that's about what we should have expected. But, in my earlier formulation, Mariano Rivera's expected context was not 1.52 or anything close to 1.52. It was 0.860. Why? Because, on average, relief pitchers pitch in lower-context situations than starting pitchers. But does that make sense as an expected context for Mariano Rivera? Not really.

Now, one could argue: the context in which a pitcher appears is a function of how his manager uses him and he shouldn't get credit (or debit) for that. Okay, fine, that may be a reasonable argument for setting everybody's expected context equal to 1.0; but 0.86? I would also counter that, in fact, part of Mariano Rivera's value to his teams in his career was the ability of his managers to leverage his performance by concentrating it in higher-context performance. That is, I think it is completely reasonable to take account of the context in which Rivera performed in evaluating Rivera's career.

But, if not 0.86, and if not 1.0, what makes sense as an expected context for Rivera? We could, perhaps, look at the average context of closers. So how would we define the "closers" who help us to form our expectations of the context in which Rivera performed? Such a definition would almost have to include looking at these players' context. Which seems like a rather circular way to calculate expected context: we would expect Mariano Rivera's context to be similar to players who had similar actual context. Huh?

So, having thought about it, I have decided to use players' actual context in calculating their eWins. Admittedly, there may be some players whose actual context was unusual, i.e., unexpected. But the fact is, in many, probably most, cases, a player's context will be a function of the player's expected value. For example, Hank Aaron's career context, 1.026, is consistent with a hitter the quality of Aaron. He tended to hit in lineup positions which have somewhat greater-than-expected context and, of course, Hank Aaron would never have been pinch-hit for, as sometimes happens to lesser hitters in high-context situations.

One advantage of using actual context in calculating eWins is that it makes it easier to identify and understand differences between a player's pWins and eWins. Consider, for example, Barry Bonds. Using pWins, Bonds had a career won-lost record of 462.0 - 314.9. Using the old expected context formulation, Bonds's record in eWins was 456.6 - 310.0. Was Barry Bonds better in pWins or eWins? The answer is not immediately obvious. He earned 5.4 more pWins than eWins but he also had 4.9 more pLosses than eLosses. Using Bonds's actual context (1.011), his expected record becomes 462.6 - 314.3. This is more readily compared, then, to Bonds's pWin-based record. He earned 0.6 fewer pWins than eWins and 0.6 more pLosses than eLosses. Of course, over a 22-year career that produced almost 800 player decisions, I think it's safe to say that a difference between pWins and eWins of 0.6 is essentially rounding error.
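
The Bonds comparison can be sketched as follows. The helper name is hypothetical and the records are the ones quoted above; the point is only that, once the eWins record uses actual context, the total decisions match and the win and loss gaps become mirror images.

```python
def record_gap(p_record, e_record):
    """Return (pWins - eWins, pLosses - eLosses), rounded to one decimal."""
    (p_wins, p_losses), (e_wins, e_losses) = p_record, e_record
    return round(p_wins - e_wins, 1), round(p_losses - e_losses, 1)


bonds_p = (462.0, 314.9)      # pWins - pLosses
bonds_e_old = (456.6, 310.0)  # eWins record, old expected context
bonds_e_new = (462.6, 314.3)  # eWins record, actual context

print(record_gap(bonds_p, bonds_e_old))  # (5.4, 4.9)
print(record_gap(bonds_p, bonds_e_new))  # (-0.6, 0.6)

# Total decisions: the old formulation differs, the new one matches.
print(round(sum(bonds_p), 1), round(sum(bonds_e_old), 1), round(sum(bonds_e_new), 1))
```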

These adjustments to positional averages and expected context are reflected on the player pages on the website and, because most of the articles on my website are set up to calculate any player statistics on the fly, these updated values will also be reflected in the articles on my site (which may lead to some discrepancies between some of the data in some articles and the accompanying text).

All articles are written so that they pull data directly from the most recent version of the Player won-lost database. Hence, any numbers cited within these articles should automatically incorporate the most recent update to Player won-lost records. In some cases, however, the accompanying text may have been written based on previous versions of Player won-lost records. I apologize if this results in nonsensical text in any cases.
