Basics


Baseball Player Wins and Losses



The job of a Major League Baseball player is to help his team win games, for the ultimate purpose of making the playoffs and winning the World Series. Since the early history of Major League Baseball, pitchers have been credited with Wins and Losses as official measures of the effectiveness of their pitching. Of course, Pitcher Wins are a fairly crude measure of how well a pitcher did his job, as wins are the product of the performance of the entire team – batters, baserunners, and fielders, in addition to pitchers.

Although the implementation of Pitcher Wins as a measure of pitcher effectiveness is less than ideal, the concept is perfectly sound. The ultimate measure of a player’s contribution – be he a pitcher, a hitter, a baserunner, or a fielder – is how much he contributes to his team's wins.

Using play-by-play data, I have constructed a set of Player won-lost records that attempt to quantify the precise extent to which individual players contribute directly, on the field, to wins and losses in Major League Baseball.

The purpose of this article is to provide a general overview of Player wins. First, I look at career leaders in total Player wins as well as Player wins over average and replacement level. I then explain how I construct Player wins, both with and without context, touching briefly on the contextual factors and individual components underlying Player won-lost records.

I then present an example of the Player won-lost record for one player in one season, Dontrelle Willis in 2005, as well as an example of how Player won-lost records can be used to compare two players, in this case David Ortiz and Alex Rodriguez as they matched up in the race for MVP in the 2005 American League.

Finally, I conclude with a few words on the difference between value and talent and exactly what I am trying to measure with Player won-lost records. I hope you enjoy reading about and playing with Player won-lost records as much as I enjoyed creating them.

Player Wins
Player wins are calculated such that the players on a team earn 3 Player decisions per game. I calculate two sets of Player wins: pWins are tied to team wins - the players on a winning team earn 2 pWins and 1 pLoss, the players on a losing team earn 1 pWin and 2 pLosses - while eWins attempt to control for the context in which they were earned, as well as controlling for the abilities of a player's teammates.

Player wins end up being on a similar scale to traditional pitcher wins: 20 wins is a very good season total, 300 wins is an excellent career total.
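
As a rough illustration of the allocation rule above, the following Python sketch totals a team's pWins and pLosses over a season. The function name and the 90-72 record are illustrative assumptions, and ties are ignored.

```python
def season_player_decisions(team_wins, team_losses):
    """Total pWins and pLosses earned by a team's players over a season.

    Assumes the allocation described in the text: 2 pWins and 1 pLoss per
    team win, 1 pWin and 2 pLosses per team loss (ties are ignored here).
    """
    p_wins = 2 * team_wins + 1 * team_losses
    p_losses = 1 * team_wins + 2 * team_losses
    return p_wins, p_losses


# Hypothetical 90-72 team: 3 player decisions per game, 486 in total.
p_wins, p_losses = season_player_decisions(90, 72)
print(p_wins, p_losses, round(p_wins / (p_wins + p_losses), 3))
```

Because every game adds exactly three decisions, the players' combined winning percentage is always pulled toward 0.500 relative to the team's own record.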

There are a total of 62 major-league players who have accumulated 300 or more pWins over games for which Retrosheet has released play-by-play data (1938 - 2013). They are shown below.

300 pGame Winners of the Retrosheet Era
Player pWins pLosses pWin Pct. pWOPA pWORL
Hank Aaron 493.3 373.8 0.569 41.7 90.1
Barry Bonds 462.0 316.8 0.593 59.3 97.0
Willie Mays 458.4 330.7 0.581 50.7 95.2
Pete Rose Sr. 437.9 391.6 0.528 12.3 55.2
Rickey Henderson 428.2 353.6 0.548 29.1 66.1
Carl Yastrzemski 427.4 362.0 0.541 18.9 60.8
Frank Robinson 396.5 308.1 0.563 28.8 68.2
Dave Winfield 394.0 344.4 0.534 14.7 49.1
Al Kaline 380.1 301.9 0.557 25.6 63.9
Cal Ripken 377.9 348.4 0.520 26.7 61.0
Reggie Jackson 367.7 298.4 0.552 25.5 58.6
Joe L. Morgan 364.5 283.2 0.563 46.2 79.3
Nolan Ryan 359.1 332.0 0.520 23.6 56.6
Robin Yount 357.7 335.4 0.516 18.8 50.9
Alex Rodriguez 353.8 278.6 0.559 42.2 73.3
Craig Biggio 353.2 319.9 0.525 17.3 50.0
Roberto Clemente 352.6 298.4 0.542 13.5 50.4
Mickey Mantle 352.0 229.6 0.605 50.6 83.3
Andre Dawson 349.0 315.9 0.525 7.0 37.9
Derek Jeter 348.3 297.9 0.539 34.6 66.3
Eddie Murray 346.4 286.5 0.547 18.3 47.8
Lou Brock 346.1 330.3 0.512 -6.0 30.3
Brooks Robinson 344.7 305.5 0.530 14.6 50.5
Gary Sheffield 344.3 290.8 0.542 18.0 49.0
Steve Carlton 343.4 309.1 0.526 32.0 62.9
Warren Spahn 343.0 286.0 0.545 41.5 73.3
Chipper Jones 337.2 259.3 0.565 33.2 62.6
Ken Griffey Jr. 337.1 297.9 0.531 16.6 47.4
Phil Niekro 335.9 326.3 0.507 17.6 49.4
Mike Schmidt 332.9 255.0 0.566 32.5 59.9
Stan Musial 332.2 251.0 0.570 29.3 62.5
Billy Williams 331.2 282.3 0.540 11.3 45.3
Greg Maddux 330.8 272.6 0.548 45.5 77.4
Dwight Evans 328.5 275.2 0.544 20.3 48.3
George Brett 327.7 272.7 0.546 23.7 51.6
Ozzie Smith 327.2 302.9 0.519 20.2 49.5
Gaylord Perry 325.7 297.7 0.522 23.5 53.8
Rusty Staub 324.9 307.8 0.513 -1.8 31.4
Don Sutton 324.8 300.2 0.520 24.1 53.9
Tony Gwynn Sr. 322.1 289.7 0.526 5.0 33.9
Vada Pinson 321.8 298.6 0.519 1.4 36.1
Roger Clemens 319.4 228.4 0.583 51.6 80.5
Sammy Sosa 318.8 285.0 0.528 6.5 36.2
Luis 'Gonzo' Gonzalez 318.7 295.1 0.519 2.0 32.2
Manny Ramirez 318.1 252.5 0.557 26.0 54.0
Omar Vizquel 316.8 327.0 0.492 3.5 34.9
Eddie Mathews 315.9 236.3 0.572 34.2 65.3
Paul Molitor 315.7 271.2 0.538 19.7 47.1
Luis Aparicio 315.0 309.1 0.505 12.6 47.8
Tim Raines Sr. 314.3 274.1 0.534 12.1 39.7
Dave Parker 314.2 276.5 0.532 8.8 36.2
Tom Seaver 313.1 259.9 0.546 38.6 65.8
Roberto Alomar 309.4 275.7 0.529 20.8 49.2
Rafael Palmeiro 308.8 267.0 0.536 9.7 37.6
Steve Finley 308.4 289.3 0.516 4.8 34.0
Tony Perez 305.5 254.4 0.546 14.8 43.6
Bobby Abreu 305.1 263.3 0.537 12.3 40.2
Willie Davis 304.4 275.5 0.525 5.8 37.4
Ernie Banks 304.1 269.6 0.530 13.4 45.9
Graig Nettles 302.5 267.8 0.530 14.5 42.4
Darrell Evans 301.1 258.6 0.538 14.2 40.7
Harmon Killebrew 300.1 235.1 0.561 21.2 51.3


Accumulating 300 pWins is certainly an accomplishment. But it's fairly clear looking at the above list that the list of the top players in pWins is not necessarily a list of the best players, period. Just to pick out two examples: Omar Vizquel actually has a career winning percentage under 0.500 and Rusty Staub was (slightly) below average over the course of his career.

Don't get me wrong: Omar Vizquel and Rusty Staub both had fine, noteworthy major-league careers. But did they have better careers than, say, 5-time Cy Young winner Randy Johnson, who "only" amassed 282.2 pWins in his illustrious career?
Wins over Positional Average
In constructing Player wins and losses, all events are measured against expected, or average, results across the event. Because of this, fielding Player Won-Lost records are constructed such that aggregate winning percentages are 0.500 for all fielding positions. Hence, one can say that a shortstop with a defensive winning percentage of 0.475 was a below-average defensive shortstop and a first baseman with a defensive winning percentage of 0.510 was an above-average defensive first baseman, but there is no basis for determining which of these two players was a better fielder – the below-average fielder at the more difficult position or the above-average fielder at the easier position.

From an offensive perspective, batting Player Won-Lost records are constructed by comparing across all batters, not simply batters who share the same fielding position. In the National League, this means that offensive comparisons include pitcher hitting, so that, on average, non-pitcher hitters will be slightly above average in the National League, while, of course, because of the DH rule, the average non-pitcher hitter will define the average in the American League.

In order to compare players across positions, it is therefore necessary to normalize players' records relative to an average player at the position(s) a player played. Doing this, we can see, in the above table, that while Rusty Staub amassed a better career winning percentage (0.513) than Omar Vizquel (0.492), Vizquel was above-average for the position(s) he played (mostly shortstop), with 3.5 wins above "positional average", while Staub was below-average for the position(s) he played (mostly RF), with -1.8 wins "above" positional average. I describe the calculation of positional averages in more detail in a separate article.

The top 50 players in career pWOPA over the Retrosheet Era (1938 - 2013) are shown in the table below.

Top 50 Players in Wins over Positional Average
Player pWins pLosses pWin Pct. pWOPA pWORL
Barry Bonds 462.0 316.8 0.593 59.3 97.0
Roger Clemens 319.4 228.4 0.583 51.6 80.5
Willie Mays 458.4 330.7 0.581 50.7 95.2
Mickey Mantle 352.0 229.6 0.605 50.6 83.3
Joe L. Morgan 364.5 283.2 0.563 46.2 79.3
Greg Maddux 330.8 272.6 0.548 45.5 77.4
Alex Rodriguez 353.8 278.6 0.559 42.2 73.3
Hank Aaron 493.3 373.8 0.569 41.7 90.1
Warren Spahn 343.0 286.0 0.545 41.5 73.3
Tom Seaver 313.1 259.9 0.546 38.6 65.8
Randy 'Big Unit' Johnson 282.2 222.2 0.559 38.5 65.5
Derek Jeter 348.3 297.9 0.539 34.6 66.3
Eddie Mathews 315.9 236.3 0.572 34.2 65.3
Bob Gibson 267.5 222.4 0.546 33.9 58.4
Pedro J. Martinez 194.4 138.6 0.584 33.7 52.1
Jim Palmer 246.4 189.8 0.565 33.5 54.4
Chipper Jones 337.2 259.3 0.565 33.2 62.6
Pee Wee Reese 285.0 226.4 0.557 33.1 63.0
Juan Marichal 236.3 191.0 0.553 32.5 53.9
Mike Schmidt 332.9 255.0 0.566 32.5 59.9
Whitey Ford 218.0 169.3 0.563 32.4 52.0
Steve Carlton 343.4 309.1 0.526 32.0 62.9
Yogi Berra 238.2 177.1 0.574 31.8 55.1
Albert Pujols 270.9 187.9 0.591 31.3 53.7
Mariano Rivera 127.0 61.2 0.675 29.5 42.3
John Smoltz 240.2 201.1 0.544 29.3 53.2
Stan Musial 332.2 251.0 0.570 29.3 62.5
Mike Mussina 224.4 174.3 0.563 29.2 51.0
Rickey Henderson 428.2 353.6 0.548 29.1 66.1
Tom Glavine 281.4 251.6 0.528 29.1 57.3
Lou Whitaker 298.9 253.1 0.541 28.8 54.5
Frank Robinson 396.5 308.1 0.563 28.8 68.2
Ted Williams 272.6 195.6 0.582 28.3 55.2
Johnny Bench 250.6 197.1 0.560 27.1 49.8
Cal Ripken 377.9 348.4 0.520 26.7 61.0
Barry Larkin 288.4 245.1 0.541 26.7 52.4
Jackie Robinson 192.1 135.9 0.586 26.4 44.8
Fergie Jenkins 290.4 256.1 0.531 26.1 52.2
Manny Ramirez 318.1 252.5 0.557 26.0 54.0
Duke Snider 269.0 201.1 0.572 25.9 52.3
J. Kevin Brown 205.4 166.7 0.552 25.8 45.8
Joe DiMaggio 175.5 115.3 0.604 25.6 43.0
Al Kaline 380.1 301.9 0.557 25.6 63.9
Curt Schilling 207.6 172.8 0.546 25.5 46.4
Reggie Jackson 367.7 298.4 0.552 25.5 58.6
Bob Lemon 200.7 164.2 0.550 25.2 44.1
Robin Roberts 297.3 269.0 0.525 25.2 53.7
Tommy John 286.6 253.9 0.530 25.0 51.0
Don Sutton 324.8 300.2 0.520 24.1 53.9
Tim Hudson 181.5 147.4 0.552 24.1 41.7


Focusing on players' wins above average helps to highlight players who had relatively short but brilliant careers, players like Pedro Martinez, whose 194.4 career pWins rank a fairly low 349th in the Retrosheet Era, while his 33.7 pWOPA rank a much more impressive 15th, or Mariano Rivera, whose 127.0 pWins rank even lower than Pedro's (889th) but who ranks 25th in career pWOPA with 29.5.

Wins over Replacement Level
Replacement Level is the level of performance which a team should be able to get from a player who they can find easily on short notice – such as a minor-league call-up or a veteran waiver-wire pickup. The theory here is that major league baseball players only have value to a team above and beyond what the team could get from basically pulling players off the street. That is, there’s no real marginal value to having a third baseman make routine plays that anybody who’s capable of playing third base at the high school or college level could make, since if a major-league team were to lose its starting third baseman, they would fill the position with somebody and that somebody would, in fact, make at least those routine plays at third base. This is similar to the economic concept of Opportunity Cost.

For my work, I define Replacement Level as equal to a winning percentage one weighted standard deviation below Positional Average, with separate standard deviations calculated for pitchers and non-pitchers. Unique standard deviations are calculated in this way for each year. These standard deviations are then applied to the unique Positional Averages of each individual player. Overall, this works out to an average Replacement Level of about 0.448 (0.454 for non-pitchers, and 0.437 for pitchers). A team of 0.448 players would have an expected winning percentage of 0.343 (56 - 106 over a 162-game season). I describe the calculation of replacement levels and wins over replacement level (WORL) in more detail in a separate article.

The top 50 players in career pWORL over the Retrosheet Era (1938 - 2013) are shown in the table below.

Top 50 Players in Wins over Replacement Level
Player pWins pLosses pWin Pct. pWOPA pWORL
Barry Bonds 462.0 316.8 0.593 59.3 97.0
Willie Mays 458.4 330.7 0.581 50.7 95.2
Hank Aaron 493.3 373.8 0.569 41.7 90.1
Mickey Mantle 352.0 229.6 0.605 50.6 83.3
Roger Clemens 319.4 228.4 0.583 51.6 80.5
Joe L. Morgan 364.5 283.2 0.563 46.2 79.3
Greg Maddux 330.8 272.6 0.548 45.5 77.4
Warren Spahn 343.0 286.0 0.545 41.5 73.3
Alex Rodriguez 353.8 278.6 0.559 42.2 73.3
Frank Robinson 396.5 308.1 0.563 28.8 68.2
Derek Jeter 348.3 297.9 0.539 34.6 66.3
Rickey Henderson 428.2 353.6 0.548 29.1 66.1
Tom Seaver 313.1 259.9 0.546 38.6 65.8
Randy 'Big Unit' Johnson 282.2 222.2 0.559 38.5 65.5
Eddie Mathews 315.9 236.3 0.572 34.2 65.3
Al Kaline 380.1 301.9 0.557 25.6 63.9
Pee Wee Reese 285.0 226.4 0.557 33.1 63.0
Steve Carlton 343.4 309.1 0.526 32.0 62.9
Chipper Jones 337.2 259.3 0.565 33.2 62.6
Stan Musial 332.2 251.0 0.570 29.3 62.5
Cal Ripken 377.9 348.4 0.520 26.7 61.0
Carl Yastrzemski 427.4 362.0 0.541 18.9 60.8
Mike Schmidt 332.9 255.0 0.566 32.5 59.9
Reggie Jackson 367.7 298.4 0.552 25.5 58.6
Bob Gibson 267.5 222.4 0.546 33.9 58.4
Tom Glavine 281.4 251.6 0.528 29.1 57.3
Nolan Ryan 359.1 332.0 0.520 23.6 56.6
Ted Williams 272.6 195.6 0.582 28.3 55.2
Pete Rose Sr. 437.9 391.6 0.528 12.3 55.2
Yogi Berra 238.2 177.1 0.574 31.8 55.1
Lou Whitaker 298.9 253.1 0.541 28.8 54.5
Jim Palmer 246.4 189.8 0.565 33.5 54.4
Manny Ramirez 318.1 252.5 0.557 26.0 54.0
Don Sutton 324.8 300.2 0.520 24.1 53.9
Juan Marichal 236.3 191.0 0.553 32.5 53.9
Gaylord Perry 325.7 297.7 0.522 23.5 53.8
Albert Pujols 270.9 187.9 0.591 31.3 53.7
Robin Roberts 297.3 269.0 0.525 25.2 53.7
John Smoltz 240.2 201.1 0.544 29.3 53.2
Barry Larkin 288.4 245.1 0.541 26.7 52.4
Duke Snider 269.0 201.1 0.572 25.9 52.3
Fergie Jenkins 290.4 256.1 0.531 26.1 52.2
Pedro J. Martinez 194.4 138.6 0.584 33.7 52.1
Whitey Ford 218.0 169.3 0.563 32.4 52.0
George Brett 327.7 272.7 0.546 23.7 51.6
Harmon Killebrew 300.1 235.1 0.561 21.2 51.3
Willie McCovey 297.6 223.5 0.571 22.9 51.1
Mike Mussina 224.4 174.3 0.563 29.2 51.0
Tommy John 286.6 253.9 0.530 25.0 51.0
Robin Yount 357.7 335.4 0.516 18.8 50.9


Measuring against replacement level instead of average helps to weed out pure compilers such as Rusty Staub while showing a mix of short excellent careers (e.g., Pedro Martinez) together with long, more modestly above-average careers, such as Brooks Robinson.

Wins vs. WOPA vs. WORL
The choice between wins, WOPA, and WORL will likely depend on exactly what one is looking for. And there's no reason to limit oneself to just one of these three. To help with this, I've created a page that allows one to create a customized statistic using whatever weights one would like. I have also written a separate article which looks at various factors that one might consider in constructing such a statistic.

I look briefly next at a few possible sets of weights that might have some appeal.

Replicating WAR
Probably the most popular "uber-stat" for measuring baseball players' value is Wins above Replacement (WAR). Measures of WAR are presented on player pages at both Baseball-Reference.com and Fangraphs.com. I compare my eWORL to Baseball-Reference's version of WAR in a separate article.

WAR are built up from net wins while WORL are built up from wins over 0.500. I discuss the difference between these two concepts in a separate article. Basically, net wins are double the magnitude of wins over 0.500. WAR and WORL agree, however, in their calculation of the number of wins between average and replacement - what Baseball-Reference calls Rrep (replacement runs). Putting it into formula form,

WAR = (Net Wins over Average) + Wrep
WORL = WOPA + Wrep

where Wrep is Rrep, converted to wins (replacement wins).

(Net Wins over Average) is basically two times WOPA (which is basically wins over 0.500): e.g., a 20-10 record is 10 net wins (20 minus 10) but 5 wins over 0.500 (15-15). From the second equation, we get that Wrep is equal to (WORL - WOPA). Plugging these two facts into the first equation, then, we can express WAR as a function of WOPA and WORL as follows:

WAR = 2*WOPA + (WORL - WOPA) = WOPA + WORL

Hence, one can calculate the pWin or eWin-based equivalent of WAR by adding WOPA plus WORL.
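
A small Python sketch of this identity, using Barry Bonds's career pWOPA and pWORL from the tables above. The function name is an illustrative assumption, and the result is only the pWin-based analogue of WAR, not a figure published by Baseball-Reference or Fangraphs.

```python
def war_equivalent(wopa, worl):
    """WAR-scale value implied by the identity WAR = WOPA + WORL."""
    wrep = worl - wopa            # replacement wins, from WORL = WOPA + Wrep
    net_wins_over_average = 2 * wopa
    return net_wins_over_average + wrep


# Career pWOPA and pWORL for Barry Bonds, taken from the tables above.
bonds_war_equivalent = war_equivalent(59.3, 97.0)
print(round(bonds_war_equivalent, 1))
```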
Changing Replacement Level
It may be the case that somebody doesn't like my choice for replacement level. One could approximate an alternate replacement level by weighting Wins, WOPA, and/or WORL.

As I mentioned above (and explain in a separate article), my replacement level works out to around 0.450. Wins over positional average (WOPA) works out to 0.500 on average. Knowing this, we can make two general statements:

(1) WORL - WOPA = 0.050*(Player Decisions)
(2) Wins - WORL = 0.450*(Player Decisions)


Suppose you wanted to set replacement level at 0.480. You want to add 0.020*(Player Decisions) to WOPA, or, from (1) above: 0.4*(WORL - WOPA), i.e.,

Wins over 0.480 = WOPA + 0.4*(WORL - WOPA) = 0.6*WOPA + 0.4*WORL



Suppose you wanted to set replacement level at 0.350. You want to add 0.100*(Player Decisions) to WORL, or, from (2) above: (.1/.45)*(Wins - WORL), i.e.,

Wins over 0.350 = WORL + (2/9)*(Wins - WORL) = (7/9)*WORL + (2/9)*Wins
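
The two examples above follow one pattern, which the Python sketch below generalizes. It assumes the approximate baselines quoted in the text (0.500 for WOPA, 0.450 for WORL). The function names and the 20-10 record are hypothetical, and real WOPA and WORL use player-specific baselines, so the blend is only an approximation.

```python
def wins_over_level(wins, losses, level):
    """Wins above a given winning-percentage baseline."""
    return wins - level * (wins + losses)


def blended_weights(target, average=0.500, replacement=0.450):
    """Weights (on Wins, WOPA, WORL) that approximate wins over `target`.

    Assumes WOPA is measured against `average` and WORL against
    `replacement`, the approximate values quoted in the text.
    """
    if target >= replacement:
        # Interpolate between WOPA and WORL, as in the 0.480 example.
        w_worl = (average - target) / (average - replacement)
        return 0.0, 1.0 - w_worl, w_worl
    # Below replacement level: blend WORL with raw Wins, as in the 0.350 example.
    w_wins = (replacement - target) / replacement
    return w_wins, 0.0, 1.0 - w_wins


# Hypothetical player with a 20-10 record.
wins, losses = 20.0, 10.0
wopa = wins_over_level(wins, losses, 0.500)
worl = wins_over_level(wins, losses, 0.450)
for target in (0.480, 0.350):
    w_wins, w_wopa, w_worl = blended_weights(target)
    blended = w_wins * wins + w_wopa * wopa + w_worl * worl
    print(target, round(blended, 3), round(wins_over_level(wins, losses, target), 3))
```

For each target, the blended value and the directly computed value agree, which is the point of statements (1) and (2).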



The next part of this article explains how I calculate Player won-lost records.

Calculating Player Won-Lost Records: pWins and pLosses

The starting point for constructing pWins and pLosses is Win Probabilities. The concept of Win Probability was first developed by Eldon and Harlan Mills in 1969 and published in their book, Player Win Averages.

The basic concept underlying win probability systems is elegantly simple. At any point in time, the situation in a baseball game can be uniquely described by considering the inning, the number and location of any baserunners, the number of outs, and the difference in score between the two teams. Given these four things, one can calculate a probability of each team winning the game. Hence, at the start of a batter’s plate appearance, one can calculate the probability of the batting team winning the game. After the completion of the batter’s plate appearance, one can once again calculate the probability of the batting team winning the game. The difference between these two probabilities, typically called the Win Probability Advancement or something similar, is the value added by the offensive team during that particular plate appearance (where such value could, of course, be negative).

If we assume that the two teams are evenly matched, then the initial probability of winning is 50% for each team. At the end of the game, the probability of one team winning will be 100%, while the probability of the other team winning will be 0%. The sum of the Win Probability advancements for a particular team will add up to exactly 50% for a winning team (100% minus 50%) and exactly -50% for a losing team (0% minus 50%). Hence, Win Probability Advancement is a perfect accounting structure for allocating credit for team wins and losses to individual players.
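
The accounting property can be checked with a short Python sketch. The win probabilities below are made-up values for illustration, not output from any real win-probability model.

```python
# Hypothetical home-team win probabilities: the first entry is the 50% starting
# point and each later entry is the value after one more plate appearance.
home_win_prob = [0.50, 0.46, 0.55, 0.52, 0.70, 0.64, 0.88, 1.00]

# Win Probability Advancement for each plate appearance, from the home
# team's point of view (the away team's advancements are the negatives).
advancements = [after - before
                for before, after in zip(home_win_prob, home_win_prob[1:])]

home_total = sum(advancements)
away_total = -home_total
print(round(home_total, 3), round(away_total, 3))
```

The individual changes telescope, so the total depends only on the starting and ending probabilities, whatever happens in between.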

Changes in Win Probabilities are credited to the individual players responsible for these changes. These contributions are called Player Game Points here. Positive changes in Win Probabilities are credited as Positive Player Game Points, while negative changes in Win Probabilities are credited as Negative Player Game Points.

Player Game Points are assigned to both offensive and defensive players on each individual play. Anything which increases the probability of the offensive team winning is credited as Positive Points to the offensive player(s) involved and as Negative Points to the defensive player(s) involved. Anything which increases the probability of the defensive team winning is credited as Positive Points to the defensive player(s) involved and as Negative Points to the offensive player(s) involved. Within any individual game, the number of Positive Player Game Points by offensive players on one team will be exactly equal to the number of Negative Player Game Points by defensive players on the other team and vice versa. Similarly, the number of Positive Player Game Points collected by members of the winning team will exactly equal the number of Negative Player Game Points accumulated by the losing team (and, again, vice versa).

Player Game Points assigned in this way provide a perfect accounting structure for assigning 100% of the credit for all changes in Win Probability to players on both teams involved in a game. The sum of the Positive Player Game Points minus the sum of the Negative Player Game Points for one team in one game will always be the same for any team win (0.5) or loss (-0.5). Most Win Probability systems which I have seen focus on a single number, which is (more or less) the difference between Positive Player Game Points and Negative Player Game Points, and define this number as something like Win Probability Advancements or Game State Wins, or the like. For such systems, this is where the process stops, with final results being expressed as some measure of net Win Probability added, although Fangraphs and Baseball-Reference do show what they call "WPA+" and "WPA-" separately.
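
A minimal Python sketch of this bookkeeping follows. For brevity it tallies points by team rather than by individual player, and the plays and their win-probability changes are hypothetical.

```python
from collections import defaultdict

# Hypothetical plays: (batting team, change in the batting team's win probability).
plays = [("Away", -0.02), ("Away", +0.10), ("Home", +0.15), ("Away", -0.08),
         ("Home", +0.20), ("Away", -0.05), ("Home", +0.10)]


def other(team):
    return "Home" if team == "Away" else "Away"


positive = defaultdict(float)  # Positive Player Game Points by team
negative = defaultdict(float)  # Negative Player Game Points by team

for batting, delta in plays:
    fielding = other(batting)
    if delta >= 0:
        # Offense helped itself; the defense is charged the same amount.
        positive[batting] += delta
        negative[fielding] += delta
    else:
        # Defense helped itself; the offense is charged the same amount.
        positive[fielding] += -delta
        negative[batting] += -delta

for team in ("Home", "Away"):
    net = positive[team] - negative[team]
    print(team, round(positive[team], 2), round(negative[team], 2), round(net, 2))
```

In this made-up game the home team wins, so its positive points exceed its negative points by 0.5 and the away team's totals are the mirror image.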

Personally, I find such a construction unsatisfactory. To my mind, net Win Probabilities added don’t reveal the full context in which a player’s performance took place. From my perspective, 9 wins and 2 losses is a different performance than 15 wins and 8 losses, and that difference needs to be maintained. Moreover, expressing Win Probability Added (WPA) as a single number does not enable one to isolate the specific contextual factors underlying that performance, thereby assessing the extent to which a player’s performance was influenced by the performances of his teammates and the specific timing of his performance.

Hence, I convert these Player Game Points into Context-Dependent Player Wins and Losses, which I call pWins and pLosses. I simultaneously construct Context-Neutral Player Wins and Losses, which I call eWins and eLosses. These can be compared to Context-Dependent Player Wins and Losses to identify the contextual factors affecting players’ performances and how those contextual factors affect the translation of Player Wins and Losses into team wins and losses.

For both Context-Dependent and Context-Neutral Player Games, two adjustments are made to these results to move from initial Player Game Points to Player Won-Lost records.

1.    Normalizing Component Won-Lost Records to 0.500
A key implicit assumption underlying my Player Won-Lost records is that Major League Baseball players will have a combined winning percentage of 0.500. While this is trivially true at the aggregate level, almost regardless of what you do, it should also be true at finer levels of detail.

So, for example, if Player Won-Lost records are calculated correctly, the total number of wins accumulated by baserunners on third base for advancing on wild pitches and passed balls should be exactly equal to the total number of losses accumulated by baserunners on third base for failing to advance on wild pitches or passed balls. Likewise, the total number of wins accumulated by second basemen for turning double plays on ground balls in double-play situations should be exactly equal to the total number of losses accumulated by second basemen for failing to turn double plays on ground balls in double-play situations.

To ensure this symmetry, therefore, I normalize Player Game Points to ensure that the total number of Positive Player Game Points is exactly equal to the number of Negative Player Game Points for every Component of Player Game Points as well as by sub-component, at the finest level of detail which makes logical sense in each case.
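
The text does not say how this normalization is carried out. One simple possibility, sketched below in Python purely as an assumption, is to rescale the positive and negative points within a component so that both totals meet at their midpoint. The player labels and point values are hypothetical.

```python
def normalize_component(positive_points, negative_points):
    """Rescale one component so total positive equals total negative points.

    Assumption: both sides are scaled to the midpoint of the two raw totals.
    The article does not state the scaling rule, so this is only one option.
    """
    pos_total = sum(positive_points.values())
    neg_total = sum(negative_points.values())
    target = (pos_total + neg_total) / 2
    scaled_pos = {p: v * target / pos_total for p, v in positive_points.items()}
    scaled_neg = {p: v * target / neg_total for p, v in negative_points.items()}
    return scaled_pos, scaled_neg


# Hypothetical league-wide totals for one component (e.g. advancing on wild pitches).
raw_pos = {"Runner A": 1.2, "Runner B": 0.8}
raw_neg = {"Runner A": 0.9, "Runner B": 0.7}
pos, neg = normalize_component(raw_pos, raw_neg)
print(round(sum(pos.values()), 3), round(sum(neg.values()), 3))
```

Proportional scaling keeps each player's share of the component unchanged. The sketch does not guard against a component with no positive or no negative points.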

2.    Normalizing Player Game Points by Game
The total number of Player Game Points accumulated in an average Major League Baseball game is around 3.3 per team. This number varies tremendously game-to-game, however, with some teams earning 2 wins in some team victories while some other teams may earn 6 wins in team losses. At the end of the day (or season), however, all wins are equal. Hence, in my work, I have chosen to assign each team one Player Win and one Player Loss for each team game. In addition, the winning team earns a second full Win, while the losing team earns a second full Loss. Ties are allocated as 1.5 Wins and 1.5 Losses for both teams. Context-neutral player decisions (eWins and eLosses) are also normalized to average three Player decisions per game. For eWins and eLosses, this normalization is done at the season level, rather than the game level, so that different numbers of context-neutral player decisions will be earned in different games.
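
A hedged Python sketch of the game-level normalization for pWins and pLosses follows. It assumes each player's positive and negative points are scaled proportionally to hit the team totals, a detail the text does not spell out. The player labels and point values are hypothetical.

```python
def normalize_game(points, team_won):
    """Scale one team's Player Game Points to 3 player decisions for a game.

    `points` maps player -> (positive points, negative points).
    Assumption: positive and negative points are each scaled proportionally,
    which the article does not spell out.
    """
    if team_won is None:          # tie
        win_total, loss_total = 1.5, 1.5
    elif team_won:
        win_total, loss_total = 2.0, 1.0
    else:
        win_total, loss_total = 1.0, 2.0
    pos_sum = sum(pos for pos, _ in points.values())
    neg_sum = sum(neg for _, neg in points.values())
    return {player: (pos * win_total / pos_sum, neg * loss_total / neg_sum)
            for player, (pos, neg) in points.items()}


# Hypothetical winning team: 1.9 positive and 1.4 negative points in total.
raw = {"Batter": (0.9, 0.5), "Pitcher": (0.7, 0.6), "Fielder": (0.3, 0.3)}
decisions = normalize_game(raw, team_won=True)
total_wins = sum(w for w, _ in decisions.values())
total_losses = sum(l for _, l in decisions.values())
print(round(total_wins, 3), round(total_losses, 3))
```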

Why 3 Player Decisions per Game?

The choice of three Player Decisions per game here is largely arbitrary. I chose three because the resulting Player Won-Lost records end up being on a similar scale to traditional pitcher won-lost records, with which most baseball fans are quite familiar. For example, expressed in this way, Jayson Werth led the major leagues in 2010 with 23.5 (Context-Dependent) Player Wins, while Ichiro Suzuki led the majors with 21.7 losses.

In comparison, C.C. Sabathia and Roy Halladay led all major league pitchers in 2010 with 21 wins (Sabathia amassed 16.2 Player Wins, while Halladay had 17.1), while Joe Saunders (14.4 Player losses) led the major leagues with 17 losses. The relationship between team-dependent wins and traditional Pitcher Wins is explored elsewhere. Over the entire Retrosheet Era, the most pWins accumulated by a player in a single season was 31.0 by Babe Ruth in 1927 (against 15.1 pLosses). The most single-season pLosses were accumulated by Vladimir Guerrero in 2001 with 23.2 pLosses (and 25.5 pWins).

This normalization process has no effect on the relative ordering of players – if pWins and pLosses were normalized to be equal to 6 per game, Jayson Werth would still have led the major leagues in wins in 2010; he simply would have had twice as many of them. Nor does it affect player winning percentages, as pWins and pLosses are scaled proportionally.

One consequence of my choice of three Player Decisions per team game is that, as a result of this normalization process, total Player Wins for a league as a whole will be equal to total Win Shares as constructed by Bill James. Hence, one might think of Player Won-Lost records as calculated here as measuring “true” win shares.

Because the players on a team receive only two team-dependent wins for each team win, however, the total number of team-dependent wins will be less than the total number of Win Shares for teams with winning records. On the other hand, because the players on a team receive one team-dependent win for each team loss, the total number of team-dependent wins will be greater than the total number of Win Shares for teams with losing records. The comparison between team-dependent wins and Win Shares is explored elsewhere.
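
The comparison can be made concrete with a short Python sketch. It relies on the fact that Win Shares awards three shares per team win; the two-team league and its records are hypothetical.

```python
def player_wins(team_wins, team_losses):
    # 2 pWins per team win plus 1 pWin per team loss, as described in the text.
    return 2 * team_wins + team_losses


def win_shares(team_wins):
    # Win Shares awards 3 shares per team win.
    return 3 * team_wins


# Hypothetical two-team league in which one team goes 90-72 and the other 72-90.
records = [(90, 72), (72, 90)]
for wins, losses in records:
    print(wins, losses, player_wins(wins, losses), win_shares(wins))

league_pwins = sum(player_wins(w, l) for w, l in records)
league_shares = sum(win_shares(w) for w, _ in records)
print(league_pwins, league_shares)
```

The winning team has fewer pWins than Win Shares and the losing team has more, while the league totals match.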

Why Do Players Get Wins in Games Their Team Loses?

If one is interested in assigning credit to players for team wins or blame to players for team losses, one might think that it would make sense to only credit a player with Player wins in games which his team won and only credit Player losses in games which his team lost. I have chosen instead to give players some wins even in team losses and some losses even in team wins. I do this for a couple of reasons.

Most simply put, baseball players do tons of positive things in team losses and baseball players do tons of negative things in team wins. Throwing away all of those things based solely on the final score of the game leads, in my opinion, to too much valuable data simply being lost. It makes the results too dependent on context.

As I noted above, in the average major-league baseball game of the Retrosheet Era (1938 - 2013), the average team amasses 3.3 Player Game Points. The win probability for the winning team goes from 50% at the start of the game to 100% at the end, so that the winning team will amass exactly 0.5 more positive Player Game Points than negative Player Game Points by construction. This means that the players on an average winning team will amass a combined record of something like 1.9 - 1.4 in an average game. That works out to a 0.576 winning percentage, or about 93 wins in a 162-game schedule (93-69). Put another way, more than 40% of all Player Game Points (1 - 0.576) would be zeroed out in a system that credited no Player wins in team losses (or Player losses in team wins). That's simply too much for me to be comfortable making such an adjustment.
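
The arithmetic in this paragraph can be reproduced in a few lines of Python. The 3.3 and 0.5 figures come from the text; the variable names are illustrative.

```python
points_per_team = 3.3   # average Player Game Points per team per game (from the text)
net_for_winner = 0.5    # positive minus negative points for the winning team

# Solve positive + negative = 3.3 and positive - negative = 0.5.
positive = (points_per_team + net_for_winner) / 2
negative = (points_per_team - net_for_winner) / 2

raw_win_pct = positive / points_per_team
wins_per_162 = raw_win_pct * 162
share_discarded = 1 - raw_win_pct

print(round(positive, 1), round(negative, 1))
print(round(raw_win_pct, 3), round(wins_per_162), round(share_discarded, 3))
```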

There are two reasons why such a large percentage of plays do not contribute to victory. First, it is indicative, I think, of the fairly high level of competitive balance within Major-League Baseball. Put simply, even very bad Major-League Baseball teams are not that much worse than very good Major-League Baseball teams.

But the other reason why such a large percentage of plays do not contribute to victory, and why I assign player wins even in team losses and vice versa, is the rules of baseball. Because there is no clock in baseball, the only way for a game to end is for even the winning team to do some things that reduce its chances of winning: it has to make 3 outs per inning for at least 8 innings (not counting rain-shortened games). Likewise, a losing team is guaranteed to do some things that increase its chance of winning: it must get the other team out 3 times per inning.

My system still rewards players who do positive things that contribute to wins more favorably than players who do positive things that lead to losses. As I noted above, an average team will amass a player winning percentage of approximately 0.576 in team wins (and 0.424 in team losses). By assigning 2 wins and only 1 loss in team wins, however, players will amass a 0.667 player winning percentage in team wins (and 0.333 in team losses). So, player wins that lead to team wins will still be more valuable than player wins that happen in team losses. The latter are simply not worthless.

Relationship of Player Decisions to Team Decisions

Under my system, to move from players’ team-dependent won-lost records (pWins and pLosses) to a team won-lost record, one subtracts out what I call “background wins” and “background losses.” One-third of a player’s decisions are background wins and one-third of a player’s decisions are background losses. Mathematically, then, if the sum of the team-dependent won-lost records of the players on a team is W wins and L losses, then the team’s won-lost record will be as follows:

Team Wins = W – (W + L) / 3;     Team Losses = L – (W + L) / 3
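
As a quick check of these formulas, the Python sketch below recovers a team's record from its players' combined pWins and pLosses. The function name is an illustrative assumption, and the 56-106 record is used only as a worked example.

```python
def team_record(player_wins, player_losses):
    """Team wins and losses implied by the players' combined pWins and pLosses."""
    background = (player_wins + player_losses) / 3   # background wins = background losses
    return player_wins - background, player_losses - background


# A 56-106 team's players earn 2*56 + 106 = 218 pWins and 56 + 2*106 = 268 pLosses.
p_wins, p_losses = 2 * 56 + 106, 56 + 2 * 106
print(team_record(p_wins, p_losses))
print(round(p_wins / (p_wins + p_losses), 3))
```

Subtracting the background decisions returns the original 56-106 record, even though the players' combined winning percentage is about 0.449.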

What this means is that a team of, say, .450 players will not play .450 ball, but will instead play something closer to .350 ball. Consider, for example, the 2011 Houston Astros. The 2011 Astros finished with a record of 56-106 (a .346 winning percentage). I view them as having been essentially a replacement-level team. The players on the 2011 Astros earned a total of -0.3 pWORL. The combined winning percentage of the Astros' players, however, was not the .346 winning percentage the Astros amassed, but, instead, was 0.449.

The relationship between player and team wins, expressed as wins in a 162-game season, is shown in the table below.

Team Wins Team Win Pct. Player Wins Player Win Pct.
50 0.309 212 0.436
60 0.370 222 0.457
70 0.432 232 0.477
75 0.463 237 0.488
80 0.494 242 0.498
81 0.500 243 0.500
82 0.506 244 0.502
85 0.525 247 0.508
90 0.556 252 0.519
95 0.586 257 0.529
100 0.617 262 0.539
110 0.679 272 0.560


There are two implications to this relationship between player wins and team wins. First, the range of winning percentages for players is narrower than the range of team winning percentages. This is important in evaluating the concept of replacement level. As noted above, in my work, team-level replacement level is a winning percentage between 0.340 and 0.350. But, player-level replacement level is closer to 0.450.

The second implication is that player wins and losses do not have a purely additive effect on team wins and losses; instead, the effect is somewhat more multiplicative. In an average game, the players on the winning team will amass a (context-neutral) winning percentage of approximately 0.576 - not all that much above 0.500. Having players who are a little bit better than average will translate into a team that is a lot better than average. In fact, as the above table shows, a team of 0.576 players would win well over 100 games in a 162-game season. The reverse is true of below-average players. A team of slightly below-average players will lose far more often than they win: as noted earlier, the players on the 2011 Houston Astros amassed a pWinning percentage of 0.449. In fact, that number has already been adjusted to reflect the Astros' team record of 56-106, and hence understates the raw context-neutral performance of the Astros' players. In terms of raw context-neutral numbers, with no adjustments, the combined performance of the players on the 2011 Houston Astros was a player winning percentage of 0.482. In other words, in this case, a team of 0.482 players became a 0.346 team.
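
Dividing the team-wins formula by the number of games gives a compact form of this effect: with three player decisions per game, a team's winning percentage is three times the players' combined winning percentage, minus one. The Python sketch below applies that identity; the function name is an illustrative assumption.

```python
def team_win_pct(player_win_pct):
    """Team winning percentage implied by a combined player winning percentage.

    Follows from Team Wins = W - (W + L) / 3 with 3 player decisions per game.
    """
    return 3 * player_win_pct - 1


for p in (0.449, 0.500, 0.576):
    pct = team_win_pct(p)
    print(p, round(pct, 3), round(pct * 162))
```

Any gap between a player winning percentage and 0.500 is tripled at the team level, which is why 0.449 players correspond to roughly a 56-win team and 0.576 players to a team that wins well over 100 games.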

Players' final won-lost records will be pushed away from 0.500 depending on exactly how their performance translates into team wins and losses. So, the final Player records of the Astros' players fell from 0.482 to 0.449 because the players' losses contributed more to team losses than the players' wins were able to contribute to team victories. Because they are tied to team wins and losses, pWins and pLosses for a player will depend on the context in which they take place. Part of that context is the quality of a player's teammates.

But even beyond the actual context of pWins and pLosses, this tendency of player records to push away from 0.500 also affects eWins and eLosses for a player. We can expect players with context-neutral won-lost records over 0.500 to have their record translate into (slightly) more wins than might be implied by their raw record, and players with context-neutral won-lost records below 0.500 to have their record translate into (slightly) more losses than their raw record. This expected team win adjustment to context-neutral player won-lost records (eWins and eLosses) is discussed later in this article.

pWins vs. Win Probability Advancements (WPA)
As noted above, the central building block in calculating pWins and pLosses is the concept of Win Probability.

Win Probabilities have achieved some recent popularity through a statistic called WPA (Win Probability Advancements). Player won-lost records, pWins and pLosses, are built from the same building blocks as Win Probability Advancements. In effect, the base for pWins is what Baseball-Reference (and others) calls WPA+, "positive Win Probability Added", and the base for pLosses is WPA-, "negative Win Probability Added". I do not stop there, however, so pWins are not simply equal to either WPA or WPA+ (nor are pLosses equal to WPA-).

As I explain above, in calculating pWins and pLosses, I normalize Win Probability Advancements (what I call "Player Game Points") by Game. The total number of Player Game Points accumulated in an average Major League Baseball game is around 3.3 per team. This number varies tremendously game-to-game, however, with some teams earning 2 wins in some team victories while some other teams may earn 6 wins in team losses. At the end of the day (or season), however, all wins are equal. Hence, in my work, I have chosen to assign each team one Player Win and one Player Loss for each team game. In addition, the winning team earns a second full Win, while the losing team earns a second full Loss. Ties are allocated as 1.5 Wins and 1.5 Losses for both teams. Context-neutral player decisions (eWins/eLosses) are also normalized to average three Player decisions per game. For eWins and eLosses, this normalization is done at the season level, rather than the game level, so that different numbers of context-neutral player decisions will be earned in different games.
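
To make the game-level normalization concrete, here is a rough Python sketch. It assumes the simplest possible rule, namely that every player's positive points are rescaled by one common factor and every player's negative points by another, so that the team totals hit the fixed targets described above. That proportional rule, the function name, and the sample numbers are all assumptions made for illustration; the actual procedure may differ in its details.

```python
def normalize_game(points, team_result):
    """Scale one team's Player Game Points to fixed per-game totals.

    points: dict mapping player -> (positive_points, negative_points)
    team_result: "win", "loss", or "tie"
    Assumption: wins and losses are each rescaled proportionally.
    """
    targets = {"win": (2.0, 1.0), "loss": (1.0, 2.0), "tie": (1.5, 1.5)}
    target_wins, target_losses = targets[team_result]
    total_pos = sum(pos for pos, _ in points.values())
    total_neg = sum(neg for _, neg in points.values())
    return {
        player: (pos * target_wins / total_pos, neg * target_losses / total_neg)
        for player, (pos, neg) in points.items()
    }


# Hypothetical winning team with 1.9 positive and 1.4 negative points in total.
raw = {"A": (0.9, 0.3), "B": (0.6, 0.5), "C": (0.4, 0.6)}
normalized = normalize_game(raw, "win")
for player, (wins, losses) in normalized.items():
    print(player, round(wins, 3), round(losses, 3))
```

Whatever the raw totals were, the team's players end the game with exactly 2 wins and 1 loss between them.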

A few examples may help to clarify differences between WPA and pWins.

pWins vs. WPA, Example 1: How Many Wins can One Player Get in One Game?
Win Probability Advancements are structured such that net Win Probability Advancements (WPA+ minus WPA-) sum to exactly 0.5 for every team win and exactly -0.5 for every team loss. Hence, team wins are equal to exactly two times WPA. In Game 6 of the 2011 World Series, David Freese and Lance Berkman accumulated a combined WPA of 1.8 (according to Baseball-Reference). In other words, Freese and Berkman combined to "win" 3.6 games that night - and their teammates (mostly their bullpen) combined to "lose" 2.6 games. But, of course, when Game 6 of the World Series was over, the Cardinals had only won 1 game, not 3.6, and they had certainly not lost 2.6 games.

Normalizing the Win Probability Advancements of the St. Louis Cardinals in that game, the players on the Cardinals earned a combined 2 pWins and a combined 1 pLoss by construction. And how many net batting pWins did Freese and Berkman combine for? It turns out that David Freese and Lance Berkman earned a combined 0.75 net pWins (for their hitting).

Sure, they were more responsible for that win than any of their teammates, but at the end of the day, it was still just one win - a very important, very dramatic win, but just one win.
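
The WPA arithmetic in this example is easy to check. The short Python sketch below uses only the figures quoted above (a combined WPA of 1.8 and a team net WPA of 0.5 in a win); the variable and function names are illustrative.

```python
TEAM_NET_WPA_IN_WIN = 0.5  # net WPA of a winning team, by construction

freese_berkman_wpa = 1.8   # combined WPA cited in the text
teammates_wpa = TEAM_NET_WPA_IN_WIN - freese_berkman_wpa


def wpa_to_games(wpa):
    """Convert net WPA to 'games won' on the scale where a team win equals 1."""
    return 2 * wpa


print(round(wpa_to_games(freese_berkman_wpa), 1))  # games "won" by Freese and Berkman
print(round(wpa_to_games(teammates_wpa), 1))       # negative: games "lost" by everyone else
```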

pWins vs. WPA, Example 2: Solo Home Runs in 1-0 Games
In a separate article, I look at how context can affect value as I measure it via Player won-lost records by comparing two games during the 2002 season that the Los Angeles Dodgers won at Dodger Stadium by a score of 1-0, with the only run of the game scoring on a solo home run.

A comparison of these two games gives a nice example of how my Player wins differ from a straight application of Win Probability Advancements.

On August 28, 2002, starting pitcher Odalis Perez hit a solo home run with two outs in the bottom of the fifth inning off of Arizona’s Rick Helling for the only run in a 1-0 Dodgers win.

On September 27, 2002, Paul LoDuca led off the bottom of the tenth inning with a home run off of San Diego’s Jeremy Fikac to break a scoreless tie and give the Dodgers a walkoff 1-0 victory.

From a context-neutral perspective, Perez’s and LoDuca’s home runs were exactly equal in value: 0.1377 wins, since both home runs took place in the same run-scoring environment – Dodger Stadium in 2002. From a prospective standpoint, on the other hand, LoDuca’s home run, which ended the game, was more than twice as valuable, 0.3617 wins, as Perez’s home run, at 0.1691 wins. These values correspond exactly to a WPA valuation system: Baseball-Reference.com, for example, reports the WPA of these home runs as being 37% for LoDuca and 17% for Perez.

In retrospect, Perez’s home run was more valuable than it seemed at the time, however, as it turned out to be the difference in the game. The normalization process which I employ reflects this by boosting the final value of this home run, relative to its prospective (WPA) value, to a final value of 0.2531 wins. On the other hand, LoDuca’s home run is reduced in value retrospectively to 0.3363 wins since the high context at the time of LoDuca's home run was created in large part by the events which came before (most notably the fine pitching of LoDuca's teammates Omar Daal and Eric Gagne). The final result is that LoDuca’s home run is still more valuable than Perez’s, because it was a game-ending home run, while the Diamondbacks still had four more innings to come back from Perez’s home run. But LoDuca’s home run is not twice as valuable as Perez’s but instead is less than one-third more valuable.

This is all summarized in the table below.

                                              |                       Player Wins                           |              Contexts
Date       Batter        Situation            | Context-Neutral  Prospective (Inter-Game)  Context-Dependent | Inter-Game  Intra-Game  Combined
8/28/2002  Odalis Perez  2 out, bottom 5      | 0.1377           0.1691                    0.2531            | 1.2287      1.4966      1.8388
9/27/2002  Paul LoDuca   Lead-off, bottom 10  | 0.1377           0.3617                    0.3363            | 2.6271      0.9298      2.4428


I think that these results are a reasonable reflection of the relative value of these two home runs.
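
The context columns in the table above appear to be simple ratios of the three Player Win values, which the following Python sketch checks. Treating the contexts as ratios is an inference from the published numbers rather than a stated definition, and because the inputs are rounded to four decimal places, the ratios agree with the table only to about three decimals.

```python
home_runs = {
    "Perez": {"neutral": 0.1377, "prospective": 0.1691, "final": 0.2531},
    "LoDuca": {"neutral": 0.1377, "prospective": 0.3617, "final": 0.3363},
}


def contexts(values):
    """Return (inter-game, intra-game, combined) context as ratios of values."""
    inter = values["prospective"] / values["neutral"]
    intra = values["final"] / values["prospective"]
    return inter, intra, inter * intra


for batter, values in home_runs.items():
    inter, intra, combined = contexts(values)
    print(f"{batter}: inter={inter:.3f} intra={intra:.3f} combined={combined:.3f}")

# Relative final value of the two home runs.
ratio = home_runs["LoDuca"]["final"] / home_runs["Perez"]["final"]
print(f"LoDuca / Perez = {ratio:.3f}")
```

The final ratio comes out just under 1.333, matching the statement that LoDuca's home run ends up less than one-third more valuable than Perez's.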

pWins vs. WPA, Example 3: Value of Breaking a Game Open Early vs. Coming up Clutch Late
Later in this article, I compare the 2005 seasons of David Ortiz and Alex Rodriguez when they finished 1-2 in voting for the AL MVP award.

David Ortiz's MVP candidacy that year rested in large part on his having been particularly clutch. For example, Big Papi led the American League in Win Probability Added (WPA) that season. WPA tracks what I call inter-game win adjustments. But I also adjust for what I call intra-game adjustments, which normalize the total number of pWins (and pLosses) to be constant across all team wins and losses.

While David Ortiz beat Alex Rodriguez (and everybody else) in inter-game win adjustments, Rodriguez beat Ortiz in intra-game adjustments. A comparison of two of their games is instructive in this regard.

On September 29, 2005, David Ortiz went 3-5 including a home run leading off the bottom of the 8th inning to tie the score 4-4 and a walkoff RBI single with one out in the bottom of the 9th inning. Baseball-Reference.com credits Ortiz with a WPA of 0.584 for the game. Obviously, those hits were huge for the Red Sox and Ortiz was rightly celebrated as the hero of that game.

On April 26, 2005, the Yankees defeated the Los Angeles Angels of Anaheim (or whatever they were calling themselves that season) 12-4. The Yankees took a 3-0 lead in the bottom of the first inning and led 10-2 by the end of the 4th inning. Obviously, there weren't a lot of "clutch" situations in this game. It was over early. Do you know why it was over early? Because Alex Rodriguez hit a 2-out, 3-run home run in the bottom of the first inning to give the Yankees that 3-0 lead, he hit a 2-out, 2-run home run in the bottom of the third inning to extend the Yankees' lead to 5-2, and he capped it off with a 2-out grand slam in the bottom of the 4th inning to give the Yankees that aforementioned 10-2 lead. For all of that, Baseball-Reference.com only credits Alex Rodriguez with a WPA of 0.490 for that game.

Take Ortiz's two RBIs off the scoreboard for the Red Sox in that September 29th game, and the Blue Jays would have won that game 4-3. Then again, if Ortiz made a (single) out in his final at-bat, Manny Ramirez would have come to bat with the potential winning run still in scoring position (albeit with two outs).

Take Rodriguez's ten RBIs off the scoreboard for the Yankees on April 26th and the Angels would have won that game 4-2. Moreover, all three of Rodriguez's home runs came with two outs in the inning. Turn them into outs and the Yankees would have had no further opportunities in any of those innings.

In retrospect, Alex Rodriguez's performance that day was not merely every bit as valuable as Ortiz's, but almost certainly more so, even if it was less "clutch" by a conventional inter-game "win probability" reckoning. My Player won-lost records credit David Ortiz with a batting won-lost record that day of 0.476 - 0.020, good for 0.456 net wins. Alex Rodriguez had a batting won-lost record on his big day of 0.908 - 0.000, good for 0.907 net wins.

As with the example of the home runs at Dodger Stadium, I think my adjustments result in more reasonable measures of the relative values of these two batting performances.

Context-Neutral Player Won-Lost Records: eWins and eLosses

In addition to pWins and pLosses, which tie to team wins, I also construct a set of player wins and losses which attempt to control for context. I call these eWins and eLosses, where the "e" stands for "expected". These are, in effect, how many wins (and losses) a player would have been expected to contribute to a team if he played in a perfectly neutral context with perfectly average teammates.

These eWins and eLosses are built up from three pieces: context-neutral win probabilities, expected context, and an expected team win adjustment.

Context-Neutral Win Probabilities
Traditionally, win-probability systems are purely context-dependent. In fact, however, I do not think that this is necessarily the appropriate starting point for measuring player value. Rather, I am interested in beginning with an assessment of players’ performances in the absence of the contexts in which the players actually performed. That is, what would the expected won-lost record be for a player, given his actual performance, assuming that performance had come in a neutral context? To answer this question, I construct a set of context-neutral Player Game Points. Once these are constructed, I can then add back in the contextual information in a way that clearly identifies how players’ values were affected by the context in which they performed.

Player Game Points are divided into three categories for the purpose of calculating context-neutral win probabilities: independent events, base-state dependent events, and purely contextual events.
1.    Independent Events
Most events can happen regardless of the base-out situation. One can strike out at any time, regardless of how many baserunners or outs there are. Similarly, a triple could happen at any time regardless of the number of baserunners. All batter results, except for double plays (which are base-state dependent), intentional walks, and bunts, fall into the category of independent events. Intentional walks and bunts are treated as purely contextual events, which are described below.

For independent events, the expected win probability of such an event is calculated for each event within the league-year using the Win Probability Matrix for the ballpark in which the event took place.

For example, the win probability of a home run at Wrigley Field in 2005 is calculated by taking every plate appearance that took place in a National League ballpark in 2005 and calculating, for that plate appearance, what the added win probability would have been had the game been played in Wrigley Field and the batter hit a home run. The context-neutral win probability of a home run at Wrigley Field in 2005 is then equal to the average of all of these probabilities. In this case, the average win probability added by a home run at Wrigley Field in 2005 was 0.140 wins.

In the case of events which may or may not lead to baserunner advancement – e.g., outs, singles, doubles – expected results are calculated based on average baserunner advancement, just as is done with contextual Player Game Points.
2.    Base-State Dependent Events
Some events can only happen given certain baserunners or a certain number of outs. For example, one can only ground into a double play with at least one baserunner on and fewer than two outs. Any Player Game Points accumulated by a baserunner on third base can, of course, only be accumulated in a base-out state that includes a runner on third base.

For baserunner game points (except for stolen bases, which are treated as purely contextual events and discussed below) and double plays, the context-neutral win probability of the event is calculated the same as for independent events, except that the average win probability is only calculated across events with relevant base-out states.

So, for example, the context-neutral Player Game Points associated with a double play are calculated as the average win probability, given the ballpark in which the game takes place, added from hitting into a double play across double-play situations (runner on first base and fewer than two outs). For a ground ball to the shortstop at Wrigley Field in 2005, the average win probability added by a double play is 0.015 losses from the batter’s perspective, on top of the 0.046 losses accrued from the initial ground-out.

For baserunner advancements and baserunner outs, context-neutral win probabilities are only averaged given the specific batting event and hit type. That is, the context-neutral Player Game Points for a runner on third base advancing on a fly out are calculated only considering plays in which a runner on third base advances on a fly out. Similarly, the context-neutral Player Game Points for a runner on first base who only advances to second base on a single are calculated only considering plays in which a runner on first base does not advance to third on a single.
3.    Purely Contextual Events
While it is possible to remove much, if not all, of the context from most plays, there are certain plays which are, essentially, purely elective plays, and are therefore inextricably tied to the context in which they take place. In my opinion, it would be wrong to attempt to divorce these plays from their context.

Three types of plays fall into this category: intentional walks, stolen base attempts (including stolen bases, caught stealings, pickoffs, and balks), and bunts (regardless of either situation or outcome). In each of these three cases, the context-neutral Player Game Points are simply set equal to context-dependent Player Game Points.
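
The three categories above differ only in which game states are averaged over. The Python sketch below illustrates that idea under stated assumptions: `wp_added` stands in for a lookup against a ballpark-specific Win Probability Matrix, and the game states and probabilities are toy values invented for the example, not real data.

```python
from statistics import mean


def context_neutral_value(event, states, wp_added, is_eligible=None):
    """Average win probability added by `event` over a set of game states.

    states: every game state (one per plate appearance) in the league-year
    wp_added: hypothetical callable (state, event) -> change in win probability
    is_eligible: optional filter, used for base-state dependent events
    """
    if is_eligible is not None:
        states = [s for s in states if is_eligible(s)]
    return mean(wp_added(s, event) for s in states)


# Toy game states: (outs, runner_on_first, leverage). All values are made up.
states = [(0, False, 0.5), (1, True, 1.0), (2, True, 1.5), (1, True, 2.0)]


def toy_wp_added(state, event):
    """Made-up stand-in for a win probability lookup."""
    base = {"home_run": 0.10, "double_play": -0.02}[event]
    return base * state[2]


def double_play_possible(state):
    outs, runner_on_first, _ = state
    return runner_on_first and outs < 2


# Independent event: average over every state.
hr_value = context_neutral_value("home_run", states, toy_wp_added)
# Base-state dependent event: average only over eligible states.
dp_value = context_neutral_value(
    "double_play", states, toy_wp_added, double_play_possible
)
# Purely contextual events (intentional walks, steal attempts, bunts) skip the
# averaging entirely: their context-neutral value is the context-dependent one.
print(round(hr_value, 3), round(dp_value, 3))
```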

Constructing eWins and eLosses from Context-Neutral Win Probabilities
Context-neutral Player Wins and Losses are normalized to be equal to aggregate Context-dependent Player Wins and Losses for each component and sub-component. Hence, the total number of Context-Neutral Player Wins accumulated for a particular type of event or sub-event – say, home runs – will equal the total number of Context-Dependent Player Wins accumulated over the same set of events. This normalization is done at the season/league level. At the game or team level, however, the total number of context-neutral player decisions need not be equal to the number of context-dependent decisions, either at the component level or in the aggregate.
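
A season-level normalization of this kind can be sketched as a single scale factor per event type. The Python below is an illustration only: the data layout, the function name, and the sample values are assumptions, and a real implementation would handle wins and losses separately for each component and sub-component.

```python
from collections import defaultdict


def normalize_to_context_dependent(plays):
    """Rescale context-neutral values so each event type's league-season
    total matches its context-dependent total.

    plays: list of dicts with keys "event", "neutral", "dependent"
    Returns the rescaled context-neutral value for each play, in order.
    """
    neutral_total = defaultdict(float)
    dependent_total = defaultdict(float)
    for play in plays:
        neutral_total[play["event"]] += play["neutral"]
        dependent_total[play["event"]] += play["dependent"]
    scale = {e: dependent_total[e] / neutral_total[e] for e in neutral_total}
    return [play["neutral"] * scale[play["event"]] for play in plays]


# Made-up league-season consisting of two home runs and two singles.
plays = [
    {"event": "home_run", "neutral": 0.14, "dependent": 0.10},
    {"event": "home_run", "neutral": 0.14, "dependent": 0.25},
    {"event": "single", "neutral": 0.05, "dependent": 0.02},
    {"event": "single", "neutral": 0.05, "dependent": 0.06},
]
rescaled = normalize_to_context_dependent(plays)
print([round(value, 3) for value in rescaled])
```

Each home run keeps an identical context-neutral value after rescaling, but the home-run total now equals the context-dependent total for the same plays.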

Having completed this normalization process, one might think that the construction of eWins and eLosses is complete. In fact, however, eWins and eLosses are intended to reflect expected wins and losses. As such, two more adjustments are made to produce final eWins and eLosses.

Specifically, context-neutral Player Game Points are converted into eWins and eLosses by making two expected contextual adjustments: Context and Win Adjustments.

Expected Context
In relating player wins and losses to team wins and losses, the context in which a player’s performance takes place matters. This is reflected in two context measures related to Context-Dependent player decisions: inter-game context and intra-game context.

In calculating context-neutral player decisions, one might think that the most obvious thing to do would be to simply set inter-game and intra-game context both equal to 1 for all players. In fact, however, this will lead to there being a clear and obvious relationship between players’ positions and their tendency to have more or fewer Context-Dependent player decisions (pWins, pLosses) than Context-Neutral decisions (eWins, eLosses). Because of this, I think that it is more appropriate to calculate an Expected Context for each player, based on the position(s) which the player played. This is done as follows.

Offense: Batting and Baserunning

Expected contexts are calculated for four different positions: pinch hitter, pinch runner, pitcher, and other. For each of these positions, expected context is set equal to the average context for the position for the league and season in question.

Pitching

Starting pitchers have an average inter-game context of 1.000 and an average intra-game context of 1.078, a combined average context of 1.078. For relief pitchers, the numbers are 1.000, 0.821, and 0.821, respectively. Expected contexts for starting pitchers are set equal to the average context for starting pitchers for the relevant league and season. The same is true for relief pitchers: expected context is set equal for all relief pitchers – closers, setup men, mopup men – regardless of their actual context.

Fielding

Over the Retrosheet Era, there is no obvious relationship between context and fielding position. Hence, expected context is set equal to 1 for all fielding player decisions.

Final Results

Expected context for a player is calculated by taking the weighted average of the expected contexts for the player’s offensive, pitching, and fielding decisions.
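
As a rough illustration of this weighted average, the Python sketch below weights each expected context by the number of player decisions earned in that role. The choice of weights is an assumption (the text does not say what the weights are), and the decision totals and the offensive context of 0.95 are invented; only the 1.078 and 1.000 figures come from the text above.

```python
def expected_context(decisions, contexts):
    """Weighted average of role-level expected contexts.

    decisions: dict role -> player decisions earned in that role (the weights)
    contexts: dict role -> expected context for that role
    """
    total = sum(decisions.values())
    return sum(decisions[role] * contexts[role] for role in decisions) / total


# Hypothetical starting pitcher. Only 1.078 (starting pitching) and
# 1.000 (fielding) come from the text; the other numbers are invented.
decisions = {"offense": 4.0, "pitching": 22.0, "fielding": 1.0}
contexts = {"offense": 0.95, "pitching": 1.078, "fielding": 1.000}

print(round(expected_context(decisions, contexts), 3))
```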

Expected Win Adjustments
One of the key implications of my work is that the difference between winning and losing is very small in Major-League Baseball. In an average Major-League Baseball game during the Retrosheet Era, for example, the winning team accumulated around 1.9 positive Player Game Points – the building block of Player Wins – and 1.4 negative Player Game Points – the building block of Player Losses. In other words, the average winning team compiled a team winning percentage of 1.000 (by definition), but the Players on that team compiled a combined winning percentage of something like 0.576, which works out to about a 93-69 record in a 162-game schedule.

Looking at the issue from the opposite direction, teams whose players compiled a combined Player winning percentage around 0.510 (0.505 – 0.515) had an average team winning percentage of about 0.564 (91-71), while teams whose players compiled a combined Player winning percentage around 0.490 (0.485 – 0.495) had an average team winning percentage around 0.442 (72-90).

Being a little above average helps a lot in producing team victories.
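
The conversions quoted in this section are straightforward to reproduce. The Python sketch below uses only the figures given above; the helper names are illustrative.

```python
def win_pct(positive, negative):
    """Winning percentage implied by positive and negative Player Game Points."""
    return positive / (positive + negative)


def record_over_season(pct, games=162):
    """Express a winning percentage as a won-lost record over a full season."""
    wins = round(pct * games)
    return wins, games - wins


print(round(win_pct(1.9, 1.4), 3))  # players on the average winning team
print(record_over_season(0.576))    # that percentage as a 162-game record
print(record_over_season(0.564))    # teams with player pct around 0.510
print(record_over_season(0.442))    # teams with player pct around 0.490
```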

For pWins and pLosses, this is reflected in the Intra-Game Win Adjustment which ties Player Won-Lost records to team won-lost records. In looking at intra-game win adjustments, it is clear that they correlate reasonably well with player winning percentages – good players tend to have positive intra-game win adjustments, while weaker players tend to have negative intra-game win adjustments. This correlation is not perfect, as one’s intra-game win adjustments are also affected by one’s teammates.

To recognize this correlation, eWins and eLosses are adjusted for intra-game win adjustments. But to maintain the context-neutrality of the results, these records are adjusted based upon expected intra-game win adjustments.

Expected intra-game win adjustments for a player are calculated based on the expected impact of the player on the record of a 0.500 team. The exact process by which these are calculated is described in some detail here. Expected win adjustments will be positive for players with context-neutral winning percentages over 0.500 and negative for players with context-neutral winning percentages below 0.500. Hence, this has the effect of increasing the spread in context-neutral player winning percentages among players.

The top (and bottom) 100 players in career regular-season eWins (total, as well as over positional average and replacement level) can be found here.

pWins and eWins: Some Details

Contextual Factors
As explained above, I calculate two measures of Player won-lost records: pWins & pLosses and eWins & eLosses. Comparing the results for these two sets of Player records, it is possible to isolate and identify the specific contextual factors that affect how player performance translates into team wins.

pWins and pLosses are tied to team wins: the players on a team earn a total of 2 pWins and 1 pLoss in every team win, and 1 pWin and 2 pLosses in every team loss. These records are highly contextual. That is, hitting a grand slam with two outs in the bottom of the ninth inning with your team trailing by three runs will earn more pWins than hitting a solo home run leading off the top of the 8th inning with your team trailing 13-1. Positive events that contribute to wins are more valuable than positive events that end up going for naught in team losses. I believe that a good case can be made that pWins and pLosses do a better job of truly capturing player value - which is an inevitable function of the context in which it occurs. Nevertheless, calculating Player wins and losses in this way leads to player value being due, at least in part, to factors outside of a player's control - the quality of his teammates, the timing of his performance.

Because of this, I also calculate a set of Player won-lost records which attempt to control for the quality of a player's teammates and the context in which he performed. I call these expected Player won-lost records, or eWins and eLosses.

Most sabermetric measures - e.g., Linear Weights, bWAR, fWAR, WARP, et al. - are designed to be context-neutral, and are therefore most comparable to my eWins and eLosses. Bill James's Win Shares do tie to team wins, but the linkage of team wins to player Win Shares is done via an across-the-board adjustment based on end-of-season data, rather than linking to team wins on a game-by-game basis, like my pWins and pLosses. Context does come into play for some subsets of players for some statistics. For example, both Baseball-Reference and Fangraphs incorporate leverage into their WAR statistics for relief pitchers.

There are two ways in which pWins & pLosses might differ from eWins & eLosses, which I call "context" and "win adjustments".

Context refers to the importance of a specific play in terms of determining team victories relative to a play of average importance. Differences in context will affect the total number of player decisions, so that, for example, a player who performed in an above-average context (>1) will earn more context-dependent player decisions (pWins + pLosses) than context-neutral player decisions (eWins + eLosses).

Win Adjustments measure differences in a player's player winning percentage across different situations, i.e., the increase in a team's probability of victory relative to the average increase in win probability associated with a particular event. So, for example, a player who hits better in the clutch than at other times may have a higher winning percentage when measured using pWins and pLosses than based on eWins and eLosses. The player's "win adjustment" would be the difference between these two winning percentages.

Context and Win Adjustments can both differ across two dimensions: inter-game or intra-game. Inter-game refers to differences in the relative importance of situations within a single game. Intra-game refers to differences in the relative importance of situations across different games.

Here are a few specific examples of players whose pWins differ from their eWins and why.

Trevor Hoffman, for example, earned 46.0 more pWins than eWins in his career (84%) mostly because the actual context in which he performed was 68% greater than his expected context.

Derek Jeter has earned 10.6 more pWins than eWins in his career largely because he has had the good fortune to play with above-average teammates, so that Jeter's positive contributions have been more likely to contribute to victories than would have been expected. This is reflected in what I call Jeter's intra-game win adjustment, which has raised his pWinning percentage by 0.021 (to 0.539) over Jeter's expected winning percentage (accounting for 13.4 additional wins).

David Ortiz earned 2.3 more pWins than eWins in 2005 mostly because of the timing of his hits (i.e., Big Papi was a great clutch hitter that year). The timing of Ortiz's hits, which I call his inter-game win adjustment, increased his winning percentage by 0.038, adding 1.2 wins to Ortiz's record.

Certainly, some people may feel that some of these differences are more "real" than others. In theory, one could adjust player records based on individual contextual factors. I have not allowed for that possibility (yet) on my custom-weight page. I do, however, allow one to assign different weights to pWins and eWins.

The choice between pWins and eWins will likely depend on one's purposes in putting together a list. One could think of pWins as measuring what actually happened, while eWins perhaps measure what should have happened. Personally, I think both of these measures provide us with useful and interesting information.

I discuss the contextual factors underlying Player won-lost records in considerably more detail in a separate article.

Components of Player Wins and Losses
Player Wins and Losses are calculated using a nine-step process, each step of which assumes average performance in all subsequent steps. Each step of the process is associated with a Component of Player Wins and Losses (Player Decisions). These nine components are outlined briefly below. Each of these components is discussed in detail in a separate article. There are four basic positions from which a player can contribute toward his baseball team’s probability of winning: Batter, Baserunner, Pitcher, and Fielder. Player Decisions are allocated to each of these four positions, as appropriate, within each of the following nine Components.

Component 1: Basestealing

Player Decisions are assessed to baserunners, pitchers, and catchers for stolen bases, caught stealing, pickoffs, and balks.

Component 2: Wild Pitches and Passed Balls

Player Decisions are assessed to baserunners, pitchers, and catchers for wild pitches and passed balls.

Component 3: Balls not in Play

Player Decisions are assessed to batters and pitchers for plate appearances that do not involve the batter putting the ball in play: i.e., strikeouts, walks, and hit-by-pitches.

Component 4: Balls in Play

Player Decisions are assessed to batters and pitchers on balls that are put in play, including home runs, based on how and where the ball is hit.

Component 5: Hits versus Outs on Balls in Play

Player Decisions are assessed to batters, pitchers, and fielders on balls in play, based on whether they are converted into outs or not.

Component 6: Singles versus Doubles versus Triples

Player Decisions are assessed to batters, pitchers, and fielders on hits in play, on the basis of whether the hit becomes a single, a double, or a triple.

Component 7: Double Plays

Player Decisions are assessed to batters, baserunners, pitchers, and fielders on ground-ball outs in double-play situations, based on whether or not the batter grounds into a double play.

Component 8: Baserunner Outs

Player Decisions are assessed to batters, baserunners, and fielders on the basis of baserunner outs.

Component 9: Baserunner Advancements

Player Decisions are assessed to batters, baserunners, and fielders on the basis of how many bases, if any, baserunners advance on balls in play.

For components where Player decisions are shared across multiple players (e.g., pitchers and fielders in Component 5), I divide credit between players based on the extent to which player winning percentages within the particular component persist over time. I describe this process in more detail in a separate article.

The distribution of Player Wins and Losses by Component varies across seasons and across leagues, depending on the exact distribution of plays. The average distribution of Player decisions by Component across all seasons of the Retrosheet Era (1938 - 2013 for now) is as follows.

Breakdowns of Player Game Points by Component: 1938 - 2013
Distribution of Player Decisions

                                         Percentage of Component Decisions Allocated to Player Decisions
                             Percent
                             of Total    Batters    Baserunners    Pitchers    Fielders
Component 1                    2.3%        0.0%       100.0%         45.2%       54.8%
Component 2                    1.4%        0.0%       100.0%         75.8%       24.2%
Component 3                   15.0%      100.0%         0.0%        100.0%        0.0%
Component 4                   35.6%      100.0%         0.0%        100.0%        0.0%
Component 5                   32.2%      100.0%         0.0%         30.5%       69.5%
Component 6                    3.5%      100.0%         0.0%         25.2%       74.8%
Component 7                    1.6%       86.0%        14.0%         36.6%       63.4%
Component 8                    2.3%       41.5%        58.5%          0.0%      100.0%
Component 9                    6.2%       45.1%        54.9%          0.0%      100.0%
Total Off. / Def. Decisions               91.4%         8.6%         64.0%       36.0%
Total Player Decisions                    45.7%         4.3%         32.0%       18.0%
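
The two total rows of this table can be approximately rebuilt from the component rows, as the Python sketch below shows. It assumes each role's overall share is the component shares weighted by each component's percent of total, with offense and defense each accounting for half of all player decisions. Because the published inputs are rounded (the component percentages sum to 100.1%), the results land within about 0.1 of the published totals rather than matching exactly.

```python
# Rows: component -> (percent of total, batters, baserunners, pitchers, fielders).
# All figures are percentages copied from the table above.
COMPONENTS = {
    1: (2.3, 0.0, 100.0, 45.2, 54.8),
    2: (1.4, 0.0, 100.0, 75.8, 24.2),
    3: (15.0, 100.0, 0.0, 100.0, 0.0),
    4: (35.6, 100.0, 0.0, 100.0, 0.0),
    5: (32.2, 100.0, 0.0, 30.5, 69.5),
    6: (3.5, 100.0, 0.0, 25.2, 74.8),
    7: (1.6, 86.0, 14.0, 36.6, 63.4),
    8: (2.3, 41.5, 58.5, 0.0, 100.0),
    9: (6.2, 45.1, 54.9, 0.0, 100.0),
}
ROLES = ("batters", "baserunners", "pitchers", "fielders")


def role_shares(components):
    """Weight each role's share by the component's share of all decisions."""
    total = sum(row[0] for row in components.values())
    return {
        role: sum(row[0] * row[i] for row in components.values()) / total
        for i, role in enumerate(ROLES, start=1)
    }


shares = role_shares(COMPONENTS)
for role, share in shares.items():
    # Offense and defense each account for half of all player decisions.
    print(f"{role}: {share:.1f}% of its side, {share / 2:.1f}% of all decisions")
```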


I discuss the individual components of Player won-lost records in considerably more detail in a separate article.

The next two sections of this article try to go into a bit more detail about exactly what Player won-lost records are and how they can be used as analytical tools via example. First, an example Player won-lost record is shown for one player in one season: Dontrelle Willis in 2005. I then use Player won-lost records to analyze the 2005 AL MVP race between Alex Rodriguez and David Ortiz.



Example of Player Won-Lost Record: Dontrelle Willis, 2005

Perhaps the easiest way to explain what Player Won-Lost records measure and how they measure it is with an example. For my example, I will use the player that I believe was the most valuable player in the National League in 2005: Dontrelle Willis.

I present Player Won-Lost records in two ways: “context-dependent” (pWins, pLosses) and “context-neutral” (eWins, eLosses). The final Player Won-Lost records for Dontrelle Willis in 2005, expressed in these two ways, are shown below.

Basic Player Wins and Losses



pWins   pLosses   pWin Pct.   pWOPA   pWORL   |   eWins   eLosses   eWin Pct.   eWOPA   eWORL
18.6    11.7      0.615       4.5     6.0     |   16.5    11.3      0.594       3.5     4.9


‘Context-Dependent’ Player won-lost records (pWins/pLosses) adjust for context, the timing of the player’s performance, as well as the interaction between a player and his teammates. These Player won-lost records are constructed such that a team’s won-lost record can be uniquely determined by the sum of its player won-lost records. Dontrelle Willis’s ‘context-dependent’ won-lost record in 2005 was 18.6 - 11.7, a 0.615 winning percentage, and 6.0 context-dependent wins over replacement level.

‘Context-Neutral’ Player Won-Lost records (eWins/eLosses) disregard differences across players in the relative importance of their performance (e.g., they do not distinguish between the innings pitched by “closers” and those pitched by starting pitchers) as well as differences in the timing of performance by a player (i.e., they do not give any credit to “clutch” hitting over and above “non-clutch” hitting). So-called ‘context-neutral’ Player Won-Lost records also attempt to control for the talent of one’s teammates. Dontrelle Willis accumulated a ‘context-neutral’ Player Won-Lost record of 16.5 - 11.3, a ‘context-neutral’ winning percentage of 0.594. I estimate that this led Willis to accumulate 4.9 context-neutral wins over replacement level.

A brief decomposition of these records follows. The calculation and description of these factors are given more thorough treatment in separate articles that are linked below as appropriate.

Context-Neutralized, Teammate-Adjusted Won-Lost Records by Factor
The initial building blocks for Player Won-Lost records are context-neutralized and teammate-adjusted won-lost records by component. These component Player Won-Lost records are summed up by factor: batting, baserunning, pitching, and fielding. These figures, summed across all nine components, for Dontrelle Willis are as follows:

Context-Neutralized, Teammate-Adjusted, Won-Lost Records by Factor

Batting        Baserunning    Pitching       Fielding
Wins  Losses   Wins  Losses   Wins  Losses   Wins  Losses
1.5   1.9      0.2   0.1      12.9  9.0      0.3   0.3


Dontrelle Willis amassed a 0.447 batting winning percentage, a 0.670 baserunning winning percentage, a 0.588 pitching winning percentage, and a 0.469 fielding winning percentage.

Combined, these add to a basic record of 14.9 – 11.3, a 0.569 winning percentage over 26.3 basic player decisions. These translate into the Context-Neutral and Context-Dependent won-lost records shown above through a series of adjustments which are described next.
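As a rough illustration, the summation by factor can be sketched in Python. This is not the code used to build the records. It uses only the rounded figures from the table above, and the names `factor_records` and `winning_pct` are invented for the example. Because the inputs are rounded to one decimal, the per-factor percentages it prints will differ somewhat from the unrounded percentages quoted in the text, although the combined record still comes out at 14.9 - 11.3 and 0.569.

```python
# Rounded factor records (wins, losses) for Dontrelle Willis, 2005,
# copied from the table above. Names here are illustrative only.
factor_records = {
    "batting": (1.5, 1.9),
    "baserunning": (0.2, 0.1),
    "pitching": (12.9, 9.0),
    "fielding": (0.3, 0.3),
}


def winning_pct(wins, losses):
    """Winning percentage: wins divided by total decisions."""
    return wins / (wins + losses)


# Sum wins and losses across the four factors to get the basic record.
basic_wins = sum(w for w, _ in factor_records.values())
basic_losses = sum(l for _, l in factor_records.values())
basic_pct = winning_pct(basic_wins, basic_losses)

for factor, (w, l) in factor_records.items():
    print(f"{factor:12s} {w:5.1f} - {l:4.1f}  pct {winning_pct(w, l):.3f}")
print(f"{'basic':12s} {basic_wins:5.1f} - {basic_losses:4.1f}  pct {basic_pct:.3f}")
```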

Adjustments to Basic Player Won-Lost Records
Basic context-neutral player records are adjusted five ways to tie Player won-lost records to team wins and losses.

Context: Inter-Game and Intra-Game

The first two of these are Context multipliers which adjust the player’s total games based on the relative importance of the timing of his performance.

Inter-game context measures the importance of a player’s performance at the time at which it took place. This is analogous to the sabermetric concept of Leverage (the linked article also has parts two and three). Dontrelle Willis’s inter-game context in 2005 was 0.8724.

Intra-game context adjusts Player Won-Lost records so that the total number of Player games is the same in all games. Dontrelle Willis’s intra-game context in 2005 was 1.3230.

Inter-game and intra-game contexts are somewhat negatively correlated as games with higher-than-average inter-game context (i.e., lots of high-leverage situations) will generate more Player wins and losses than games with fewer high-context situations. One can see this negative correlation between inter-game and intra-game context in the case of Dontrelle Willis in 2005.

Willis’s inter-game context was fairly low at 0.872 (i.e., 12.8% below average). This is a fairly common (if slightly low) context level for a starting pitcher. In very close games, particularly extra-inning games, starting pitchers are long gone by the highest-context moments late in these games. Pitcher plate appearances (of which Willis had 101 in 2005) tend to be extremely low-context, since (a) in high-context plate appearances with runners on base, pitchers tend to bunt and I treat bunts as purely context-dependent, which caps the context of these plate appearances at one, and (b) in later-inning high-context plate appearances, pitchers are typically pinch-hit for.

In contrast, Willis’s intra-game context is quite high (1.323 – 32.3% above average) because a large number of his games were extremely one-sided games which, consequently, had very low inter-game context and, hence, led to very few Player wins and losses. In approximately one-third of Willis’s starts (11), for example, the final run differential was 6 or more runs. In Willis’s first two starts of 2005, the Marlins scored single runs in the bottom of the first and second inning and never trailed as Willis won back-to-back complete-game shutouts 9-0 and 4-0. The average intra-game context for those two games was 2.317.

The combined effect of the inter-game and intra-game context multipliers (0.8724 × 1.3230 = 1.1542) is to increase Willis’s Player Games by 15.4% (to 30.3).
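The multiplication above is easy to check. The snippet below is a minimal sketch with made-up variable names. It starts from the rounded basic decision total of 26.3, so the adjusted total it produces lands within about 0.1 of the published 30.3, which presumably rests on unrounded inputs.

```python
# Context multipliers quoted in the text for Dontrelle Willis, 2005.
inter_game_context = 0.8724
intra_game_context = 1.3230

# Basic player decisions, rounded to one decimal in the text (an approximation).
basic_decisions = 26.3

# The two multipliers compound, so the combined context is their product.
combined_context = inter_game_context * intra_game_context
pct_change = (combined_context - 1.0) * 100.0
context_decisions = basic_decisions * combined_context

print(f"combined context: {combined_context:.4f} ({pct_change:+.1f}%)")
print(f"context-adjusted decisions: {context_decisions:.2f}")
```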

Win Adjustments: Inter-Game and Intra-Game

The next two adjustments modify a player’s winning percentage based on the timing of his performance. As with Context, the two considerations here are inter-game performance, and intra-game performance. Inter-game performance measures, in effect, clutch performance. A player who performs better in high-context situations than in low-context situations will have a positive inter-game win adjustment, while a player who performs better in low-context situations than in high-context situations will have a negative inter-game win adjustment. Dontrelle Willis performed very well in the clutch in 2005, as reflected by his inter-game win adjustment which serves to increase his winning percentage by 2.1%. As one example of the “clutchiness” of Willis’s performance, of the 11 home runs allowed by Willis in 2005, only 3 occurred with a runner on base and none occurred with any runners in scoring position. With the bases loaded, batters were 0-for-10 against Willis with one walk and one sacrifice fly.

Intra-game win adjustment adjusts for how Willis’s performance coincided with Marlins’ wins versus losses. Dontrelle Willis had an Intra-Game Win Adjustment of 2.9%. Overall, the Florida Marlins were 23-11 in Willis’s 34 starts. In the 23 Marlins wins, Willis pitched 173-1/3 innings with a 1.40 ERA. In the 11 Marlins losses in which Willis pitched, he pitched 64 innings with a 5.91 ERA. Willis’s performance was more valuable than average to the Florida Marlins in 2005 because he concentrated the best of his performance into games that the Marlins ended up winning.

Teammate Adjustments

The final adjustments to tie Player won-lost records to team decisions are teammate adjustments. These are the number of additional wins that a player is credited with because of his teammates based on shared offensive plays and shared defensive plays. In this case, Dontrelle Willis’s total teammate adjustment is -0.10. That is, Willis loses 0.10 wins because his teammates did a poorer-than-average job of playing behind him: e.g., advancing fewer bases than an average baserunner would have or converting fewer balls-in-play into outs than expected. The relationship between teammates on shared Player wins and losses is described here.

Adjustments to Context-Neutral Player Won-Lost Records: Expected Context, Expected Team Win-Normalization

The adjustments described above are not necessarily randomly distributed. Instead, some of these adjustments can be predicted, to some extent, based on either the position played or the performance of the player.

Context (inter-game and intra-game combined) can be predicted as a function of player position. On offense, pitchers tend to bat in below-average contexts, while pinch hitters and pinch runners perform in above-average contexts. On defense, starting pitchers tend to perform in a higher average context than relief pitchers (due to a higher intra-game context). Dontrelle Willis’s expected context in 2005 was 1.0575. This is about 8% lower than Willis’s actual context of 1.1542.

Expected Team Win Adjustment is an adjustment to the Player’s won-lost record to recognize the decentralizing influence of player wins on team wins. That is, a team of players who are each slightly over 0.500 will, in fact, win the overwhelming majority of their games. Expected team win adjustments are the intra-game win adjustment that would be expected on a team that was average with the exception of the particular player. Above-average players will have positive expected team win adjustments and below-average players will have negative expected team win adjustments. Dontrelle Willis had an Expected Intra-Game Win Adjustment of 2.5% in 2005. As noted above, Willis’s actual Intra-Game Win Adjustment was 2.9%, somewhat better than expectations.

For the most part, differences between a player’s expected team win adjustment and his actual intra-game win adjustment will reflect the overall talent of the player’s teammates – a player who plays for an above-average team will tend to have a better intra-game win adjustment than expected and vice-versa. The 2005 Florida Marlins had an above-average offense, finishing second in the National League in park-adjusted OPS (106), third in the NL in runs scored per game on the road (4.60), and, in fact, even managing to finish ninth in the NL in total runs scored despite playing in the second-best pitching park in the National League. Given that, it’s not too surprising that Dontrelle Willis’s intra-game win adjustment was slightly better than expected.

Dontrelle Willis’s final Context-Neutral won-lost record is calculated by taking his basic record of 14.9 - 11.3 and adjusting it in two ways. First, Willis’s total number of Player decisions is adjusted by multiplying his basic number of decisions (26.3) by his expected context (1.0575), giving him a total of 27.8 context-neutral player decisions. Willis’s context-neutral winning percentage is adjusted by taking his basic winning percentage (0.569) and adding his expected team-win adjustment (0.025) to produce a final context-neutral winning percentage of 0.594.

Multiplying Dontrelle Willis’s total context-neutral decisions (27.8) times his context-neutral winning percentage (0.594) produces 16.5 context-neutral wins for Dontrelle Willis. Subtracting this from his total decisions yields 11.3 context-neutral losses. So, Dontrelle Willis’s final context-neutralized, teammate-adjusted won-lost record in 2005 (eWins, eLosses) was 16.5 - 11.3.
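The two-step calculation just described can be written out as a short Python sketch. The function name and argument names are hypothetical, and the inputs are the rounded values quoted in the text, so this is an illustration of the arithmetic rather than the actual implementation.

```python
def context_neutral_record(basic_decisions, basic_pct,
                           expected_context, expected_team_win_adj):
    """Return (eWins, eLosses, eWinPct) from a basic record.

    Decisions are scaled by expected context; the winning percentage is
    shifted by the expected team-win adjustment, as described in the text.
    """
    decisions = basic_decisions * expected_context
    pct = basic_pct + expected_team_win_adj
    wins = decisions * pct
    losses = decisions - wins
    return wins, losses, pct


# Dontrelle Willis, 2005 (rounded inputs from the text).
e_wins, e_losses, e_pct = context_neutral_record(
    basic_decisions=26.3,
    basic_pct=0.569,
    expected_context=1.0575,
    expected_team_win_adj=0.025,
)
print(f"eWins {e_wins:.1f}, eLosses {e_losses:.1f}, eWinPct {e_pct:.3f}")
```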

Context-Dependent, Teammate-Dependent, Win-Dependent
Combining basic context-neutral Player wins and losses with the context multipliers and win adjustments outlined above produces a final set of context-dependent, teammate-dependent, win-dependent player wins and losses, pWins and pLosses. These won-lost records are constructed such that the sum of player wins for a team will be equal to team games plus team wins and the sum of player losses for a team will be equal to team games plus team losses. The rationale and methodology for tying player wins and losses to team wins and losses was described earlier.

The construction of a player’s context-dependent Won-Lost record is done as follows:

(1)     Adjust basic player wins using teammate adjustment

The teammate adjustment is added to basic context-neutral wins by factor. In the case of Dontrelle Willis, this produces 1.7 offensive wins plus 13.2 defensive wins plus a teammate adjustment of (-0.1) for a total of 14.8 teammate-adjusted wins and a teammate-adjusted winning percentage of 0.565.

(2)    Adjust total player decisions for inter-game and intra-game context

Basic player decisions (26.3 for Willis in 2005) are multiplied by inter-game (0.8724) and intra-game (1.3230) context to calculate total context-dependent player decisions. For Dontrelle Willis, this works out to 30.3 total context-dependent games.

(3)    Adjust player winning percentage with inter-game and intra-game win adjustments

Dontrelle Willis’s teammate-adjusted winning percentage, calculated above, 0.565, is adjusted by adding Willis’s inter-game win adjustment (2.1%) and his intra-game win adjustment (2.9%), to yield a final context-dependent winning percentage of 0.615.

(4)    Calculate total player wins and losses

Dontrelle Willis’s final ‘Context-Dependent’ wins total is then simply equal to games (30.3) times winning percentage (0.615), 18.6 wins. Player losses are equal to Games (30.3) minus Wins (18.6), in this case, 11.7. So Dontrelle Willis's final won-lost record in 2005 (pWins, pLosses) was 18.6 - 11.7.
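Steps (1) through (4) can be strung together in a few lines of Python. This is a sketch under stated assumptions: the function and variable names are invented, and every input is a rounded figure from the text. In particular, the teammate-adjusted winning percentage (0.565) is passed in directly rather than recomputed, since dividing the rounded 14.8 wins by the rounded 26.3 decisions would not reproduce it exactly. With rounded inputs the results fall within about 0.1 of the published 18.6 - 11.7.

```python
def context_dependent_record(basic_decisions, teammate_adj_pct,
                             inter_context, intra_context,
                             inter_win_adj, intra_win_adj):
    """Return (pWins, pLosses, pWinPct) following steps (2)-(4) above."""
    # Step 2: scale decisions by both context multipliers.
    decisions = basic_decisions * inter_context * intra_context
    # Step 3: add both win adjustments to the teammate-adjusted percentage.
    pct = teammate_adj_pct + inter_win_adj + intra_win_adj
    # Step 4: wins are decisions times percentage; losses are the remainder.
    wins = decisions * pct
    losses = decisions - wins
    return wins, losses, pct


# Step 1: teammate-adjusted wins (offense + defense + teammate adjustment).
teammate_adjusted_wins = 1.7 + 13.2 + (-0.1)

p_wins, p_losses, p_pct = context_dependent_record(
    basic_decisions=26.3,
    teammate_adj_pct=0.565,   # taken from the text, not recomputed
    inter_context=0.8724,
    intra_context=1.3230,
    inter_win_adj=0.021,
    intra_win_adj=0.029,
)
print(f"teammate-adjusted wins: {teammate_adjusted_wins:.1f}")
print(f"pWins {p_wins:.2f}, pLosses {p_losses:.2f}, pWinPct {p_pct:.3f}")
```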

Comparing Players Across Positions: Positional Averages and Positional Replacement Levels
While Player Won-Lost Records are (in my opinion) an excellent overall measure of player value, raw Player Won-Lost records are not really an effective tool for comparing players across positions. In constructing Player Won-Lost records, all events are measured against expected, or average, results across the event.

To facilitate cross-positional comparisons, I construct Positional Averages for every player, to provide the context against which their record should be measured. Positional Average is the expected won-lost record for the player given the position(s) which he played. In the case of Dontrelle Willis, the Positional Average (0.467) is below 0.500 because an average pitcher’s offensive winning percentage is considerably below 0.500 (0.314 in 2005). The "positional average" winning percentage for starting pitchers is also slightly below 0.500 (0.491 in 2005) while the "positional average" winning percentage for relief pitchers is slightly above 0.500 (0.518 in 2005).

Positional Replacement Level is one weighted standard deviation below Positional Average. Separate standard deviations are calculated for position players (4.7 percentage points in 2005) and pitchers (5.0 percentage points), which reflects the fact that pitchers are somewhat harder to replace than position players historically. Positional averages and Positional replacement levels are identical for context-neutral and context-dependent records. Wins over Replacement Level are simply equal to Wins minus the number of wins that a Replacement Level player would have been expected to earn in Willis’s games.

This works out to 4.9 Context-Neutral Wins over Replacement Level (eWORL) and a National League-leading 6.0 Context-Dependent Wins over Replacement Level (pWORL) for Dontrelle Willis in 2005.
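A small Python sketch shows how these figures follow from the definitions above. It assumes that Replacement Level is simply Positional Average minus the pitcher standard deviation (0.467 - 0.050), and that wins over a baseline are wins minus the baseline winning percentage times total decisions. The helper name `wins_over_baseline` is made up for the example, and the rounded inputs reproduce the published values only to within about 0.1.

```python
def wins_over_baseline(wins, losses, baseline_pct):
    """Wins minus the wins a baseline player would earn in the same decisions."""
    return wins - baseline_pct * (wins + losses)


positional_avg = 0.467          # Willis's Positional Average, from the text
pitcher_std_dev = 0.050         # 5.0 percentage points for pitchers in 2005
replacement_level = positional_avg - pitcher_std_dev

# Context-dependent (pWins, pLosses) and context-neutral (eWins, eLosses) records.
p_wopa = wins_over_baseline(18.6, 11.7, positional_avg)
p_worl = wins_over_baseline(18.6, 11.7, replacement_level)
e_wopa = wins_over_baseline(16.5, 11.3, positional_avg)
e_worl = wins_over_baseline(16.5, 11.3, replacement_level)

print(f"replacement level: {replacement_level:.3f}")
print(f"pWOPA {p_wopa:.2f}, pWORL {p_worl:.2f}")
print(f"eWOPA {e_wopa:.2f}, eWORL {e_worl:.2f}")
```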

One more example can hopefully help to show how Player won-lost records can be used as an analytical tool to compare players.



David Ortiz v. Alex Rodriguez: Who Was the 2005 American League MVP?

The 2005 American League Most Valuable Player race was a two-man race between Alex Rodriguez of the New York Yankees and David Ortiz of the Boston Red Sox. Rodriguez received 16 first-place votes to 11 for Ortiz (Vlad Guerrero received one first-place vote) and beat Ortiz overall 331 – 307. A comparison of the Player Won-Lost records of Ortiz and Rodriguez is very instructive, I think, in highlighting the real strengths of this system.

1.    Basic Offensive Statistics
Alex Rodriguez and David Ortiz put up similar offensive statistics in 2005. Their traditional statistics are shown below.

Basic Offensive Statistics, 2005
G PA AB H 2B 3B HR R RBI BB SO BA OBP SLG
Alex Rodriguez 162 715 605 194 29 1 48 124 130 91 139 0.321 0.421 0.610
David Ortiz 159 713 601 180 40 1 47 119 148 102 124 0.300 0.397 0.604



Rodriguez had 2 more plate appearances and 14 more hits, the latter of which was nearly offset by Ortiz drawing 11 additional walks. Ortiz hit 11 more doubles, but Rodriguez hit one more home run. While Ortiz drove in 18 more runs, Rodriguez managed to score 5 more runs. All in all, these are extremely similar season lines.

Turning to more advanced offensive metrics simply reinforces the same thing.

Advanced Offensive Statistics, 2005
OPS RC RC/27 OPS+
Alex Rodriguez 1.031 163 10.2 173
David Ortiz 1.001 149 9.0 158



Not surprisingly (in fact, encouragingly), the similarity between these two players carries over to their context-neutral batting Player Wins and Losses:

Player Won-Lost Record: Batting, Context-Neutral
                 eWins  eLosses  eWinPct  eWins over .500
Alex Rodriguez   17.1   10.8     0.613    3.2
David Ortiz      17.0   11.5     0.595    2.7



Once again, Rodriguez comes out slightly, but clearly, ahead, with 0.2 more wins* and 0.7 fewer losses.
*Some numbers may not look right here due to rounding. For example, the actual difference in wins here is 0.16.

2.    Everything Else
There’s more to playing baseball than simply batting, of course.

•     Baserunning

David Ortiz is a rather notoriously slow baserunner with 11 career stolen bases. Alex Rodriguez stole nearly twice as many bases in 2005 alone (21). Even beyond stolen bases, Rodriguez is, in general, a much better baserunner than David Ortiz. The context-neutral baserunning Player Game Points accumulated by each of them are shown below.

Player Won-Lost Record: Baserunning, Context-Neutral
                 eWins  eLosses  eWinPct  eWins over .500
Alex Rodriguez   1.2    1.3      0.487    -0.0
David Ortiz      0.5    0.9      0.374    -0.2



While Rodriguez is a better baserunner than Ortiz, they were actually both below-average baserunners in 2005. Nevertheless, Rodriguez gains about 0.1 net wins (Wins minus Losses) on Ortiz thanks to his baserunning.

•     Fielding

David Ortiz is a poor-fielding first baseman who played a total of 78 innings in the field in 2005. Alex Rodriguez is a former Gold-Glove winning shortstop who played 1,390 innings in the field in 2005.

Let’s see how they compare.

Player Won-Lost Record: Fielding, Context-Neutral
                 eWins  eLosses  eWinPct  eWins over .500
Alex Rodriguez   4.9    5.3      0.479    -0.2
David Ortiz      0.2    0.3      0.391    -0.0



While A-Rod looks to have been a below-average third baseman, he was nevertheless a better fielder, relative to his position, than Ortiz, by about 9 percentage points. Perhaps more importantly, because Ortiz was primarily a designated hitter in 2005, A-Rod accumulated 23 times as many context-neutral Fielding Wins as Big Papi. Because Rodriguez was a sub-.500 fielder, though, those extra decisions actually lead to Rodriguez accumulating fewer Fielding wins over .500.

Adding up what we have so far, here is how Alex Rodriguez and David Ortiz compare in terms of basic, context-neutral Player Wins and Losses.

Player Won-Lost Record: Context-Neutral
                 eWins  eLosses  eWinPct  eWins over .500
Alex Rodriguez   23.2   17.4     0.572    2.9
David Ortiz      17.7   12.7     0.582    2.5



So far, this really isn't much of a race. David Ortiz has the better winning percentage, 0.582 to 0.572, but because Rodriguez played the field all season, he leads Ortiz in Wins over .500, 2.9 to 2.5 and in total wins, 23.2 to 17.7.
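The way a lower winning percentage can still yield more wins over .500 is easy to see in code. The sketch below adds up the rounded batting, baserunning, and fielding records from the three tables above. It assumes that "wins over .500" means wins minus half of total decisions, i.e. (wins - losses) / 2, which is consistent with the published tables but is my reading rather than a stated formula. All names are illustrative.

```python
# Rounded context-neutral (eWins, eLosses) by factor, from the tables above.
records = {
    "Alex Rodriguez": {"batting": (17.1, 10.8),
                       "baserunning": (1.2, 1.3),
                       "fielding": (4.9, 5.3)},
    "David Ortiz": {"batting": (17.0, 11.5),
                    "baserunning": (0.5, 0.9),
                    "fielding": (0.2, 0.3)},
}


def summarize(factors):
    """Total wins, losses, winning pct, and wins over .500 across factors."""
    wins = sum(w for w, _ in factors.values())
    losses = sum(l for _, l in factors.values())
    pct = wins / (wins + losses)
    over_500 = (wins - losses) / 2  # assumed definition: wins minus half of decisions
    return wins, losses, pct, over_500


summary = {name: summarize(factors) for name, factors in records.items()}
for name, (wins, losses, pct, over_500) in summary.items():
    print(f"{name:15s} {wins:5.1f} - {losses:4.1f}  pct {pct:.3f}  over .500 {over_500:+.2f}")
```

More decisions at a slightly lower percentage outweigh fewer decisions at a slightly higher one, which is why Rodriguez leads in wins over .500 here.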

3.     Contextual Adjustments
So why did David Ortiz do as well as he did in the MVP voting?

The answer is hinted at in the one basic offensive statistic in which Ortiz beats Rodriguez fairly convincingly: RBI. David Ortiz led the American League with 148 runs batted in, while Rodriguez, despite one more home run as well as a higher batting average and slugging percentage, managed to finish only 4th in the American League with 130 RBIs.

So how did Ortiz lead the league in RBIs? Well, supposedly, he had a phenomenal season batting in the clutch. The MVP argument in support of David Ortiz basically revolved around the notion that he was the best clutch hitter in all of baseball.

There is an argument that invariably arises every year at MVP voting time, that the "Most Valuable Player" is not the best player, but the player who contributed most to his team's success. In other words, the argument goes, to be the MVP, it doesn't just matter what you do, it also matters when you do it.

Of course, Player Wins and Losses are perfectly designed to exactly measure the extent to which a player's performance contributed to real wins and real losses by his team.

Player Wins and Losses are adjusted in two ways to reflect the impact of the timing of player performance: Inter-Game and Intra-Game.

•     Inter-Game Adjustments: Performance in the Clutch

Inter-game contextual factors adjust for the relative importance of a player's performance within the context of a given game. In other words, hitting a home run in the bottom of the ninth inning of a tie game is worth more than hitting a home run leading off the top of the 5th inning of a game in which the player's team is already leading 12-1.

There are two inter-game adjustments to Player won/lost records: Context and Win Adjustments.

•     Inter-Game Context

Inter-game context is basically what some other people refer to as Leverage. This measures the relative importance of situations within the context of a single game. In 2005, Alex Rodriguez performed in an average inter-game context of 0.989, about 1.1% below average. This serves to lower A-Rod's total player decisions by 0.5 games.

In contrast, David Ortiz performed in an average inter-game context of 1.034, about 3.4% above average, which increased Papi's total player decisions by 1.0 games.

•     Inter-Game Win Adjustment

Of course, the issue is not simply how many high-leverage situations a player performs in, but how well he does in those situations. The MVP argument for David Ortiz was not simply that he had a lot of high-leverage at-bats (which, as we just saw, he did), but that he rose to the occasion in those situations, performing even better in those high-leverage situations than his already-excellent self.

In this regard, David Ortiz excelled. Overall, in 2005, Ortiz batted .300/.397/.604. With runners in scoring position, he improved that to .352/.462/.580. With two outs and runners in scoring position, he batted .368/.507/.719. When the score was tied, Ortiz batted .289/.405/.583. In "late and close" situations, Ortiz batted .346/.447/.846. No matter how you slice the data, Papi delivered big-time in the clutch in 2005. Because of this, his effective winning percentage was better than his context-neutral winning percentage of 0.582. In fact, his inter-game win adjustment increased his winning percentage by 0.038 to an inter-game adjusted winning percentage of 0.620.

Alex Rodriguez, on the other hand, while not as "un-clutch" as maybe some people thought at the time, performed almost exactly the same regardless of the inter-game context, so that his inter-game win adjustment was 0.000.

Taking inter-game context and inter-game win adjustments into account, the comparison between Alex Rodriguez and David Ortiz looks thus.

Player Won-Lost Records, Inter-Game Adjusted: 2005

                 Wins   Losses  WinPct  Wins over .500
Alex Rodriguez   23.1   17.1    0.574   3.0
David Ortiz      19.6   11.8    0.623   3.9

Player wins are also adjusted to control for the performance of one's teammates in shared components.

Now we see why David Ortiz did so well in MVP voting. Taking inter-game performance into account, Ortiz moves much more decisively ahead of Rodriguez in winning percentage, 0.623 - 0.574, and also moves ahead of him in wins over 0.500, 3.9 - 3.0.

•     Intra-Game Adjustments: Performance in Team Wins versus Team Losses

In addition to adjusting for inter-game context, I also adjust for intra-game context. As with inter-game adjustments, I adjust for two factors here: Context and Win Adjustments.

•     Intra-Game Context

Intra-game context adjusts player wins and losses to normalize the total number of player decisions per game to be equal to exactly three decisions per team per game.

Alex Rodriguez played in an average Intra-Game context of 1.071, about 7.1% above average. This increased Rodriguez's total player decisions by 2.9.

David Ortiz had an average Intra-Game context of 1.017 (1.7% above average), increasing his total player decisions by 0.5.

The intra-game context adjustment basically gives A-Rod as much of an edge in player decisions over Ortiz as the inter-game context adjustment gave to Papi. This is because intra-game context is somewhat negatively correlated with inter-game context: games with lots of high-leverage plays will tend to generate more raw Player Game Points than games with relatively few high-leverage plays. But, at the end of the day, all games count exactly the same in the standings: a team can only win a game once no matter how many clutch hits its players managed to get.
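To see how the two context multipliers move each player's decision totals, here is a short Python sketch. It starts from the rounded context-neutral decision totals (eWins plus eLosses) and applies the inter-game and then the intra-game context quoted in the text. The names are made up, and because the inputs are rounded the results match the published totals only to within about 0.1.

```python
# Rounded context-neutral decisions (eWins + eLosses) and context values from the text.
players = {
    "Alex Rodriguez": {"decisions": 23.2 + 17.4, "inter": 0.989, "intra": 1.071},
    "David Ortiz": {"decisions": 17.7 + 12.7, "inter": 1.034, "intra": 1.017},
}


def apply_contexts(decisions, inter, intra):
    """Scale decisions by inter-game context, then by intra-game context."""
    after_inter = decisions * inter
    after_intra = after_inter * intra
    return after_inter, after_intra


results = {}
for name, p in players.items():
    after_inter, after_intra = apply_contexts(p["decisions"], p["inter"], p["intra"])
    results[name] = (after_inter, after_intra)
    print(f"{name:15s} {p['decisions']:.1f} -> {after_inter:.1f} -> {after_intra:.1f}")
```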

•     Intra-Game Win Adjustment

There is one final adjustment that I make to Player Won-Lost records. This adjusts player wins and losses such that the players on a team earn exactly two player wins in any team win and exactly one win in any team loss, and that players earn exactly two player losses in any team loss and exactly one loss in any team win. In this adjustment, positive events which contributed to wins are weighted more heavily than positive events which happened in team losses, while negative events which contributed to team losses get more weight than negative events which happened in team wins.
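The team-level bookkeeping implied by this rule can be checked with a few lines of Python. The function name is hypothetical; the rule itself (two player wins and one player loss per team win, one player win and two player losses per team loss) comes straight from the paragraph above, and the 95-67 record is that of the 2005 Yankees.

```python
def team_player_decisions(team_wins, team_losses):
    """Total player wins and losses for a team under the 2-1 / 1-2 rule."""
    player_wins = 2 * team_wins + team_losses      # equals team games + team wins
    player_losses = 2 * team_losses + team_wins    # equals team games + team losses
    return player_wins, player_losses


team_wins, team_losses = 95, 67                    # 2005 New York Yankees
games = team_wins + team_losses
player_wins, player_losses = team_player_decisions(team_wins, team_losses)

print(f"player wins: {player_wins}, player losses: {player_losses}")
print(f"decisions per game: {(player_wins + player_losses) / games:.1f}")
```

The total always works out to three player decisions per team game, which is the normalization that intra-game context enforces.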

This final adjustment improves David Ortiz's player winning percentage by 0.006 and A-Rod's winning percentage by 0.023.

This final adjustment benefits both Rodriguez and Ortiz, as they both tended to perform better in games which their teams won than they did in games which their teams lost. Of course, this is true of most players (that’s why their teams win those games after all). Rodriguez and Ortiz were both also helped by the fact that their teams won 95 games apiece.

While this adjustment helped both players, the help to Ortiz was fairly minimal, an extra 0.2 wins (and a reduction of 0.2 losses). Rodriguez, on the other hand, gained more than 5 times as many wins as Ortiz (1.0) by virtue of having produced better in Yankee victories than in Yankee losses.

Does this make sense?

Well, here are A-Rod’s numbers.

Alex Rodriguez's Batting Line in 2005
G PA BA OBP SLG Runs RBI
Yankee Wins 95 437 0.376 0.490 0.736 101 101
Yankee Losses 67 278 0.241 0.313 0.430 23 29



Now, as I said, most players perform better in games that their team wins than in games that their team loses. On aggregate that would have to be true; that’s why the winning teams win and the losing teams lose. But, let’s compare Rodriguez’s numbers to David Ortiz’s numbers.

David Ortiz's Batting Line in 2005
G PA BA OBP SLG Runs RBI
Red Sox Wins 94 431 0.332 0.441 0.685 92 104
Red Sox Losses 65 282 0.253 0.330 0.490 27 44



Now, compare just those top lines. A-Rod outhit Ortiz in victories by 0.044 in batting average, 0.049 in OBP, and 0.051 in slugging (which makes his OPS a full 0.100 higher than Ortiz's). He outscored him by 9 runs and had only 3 fewer RBIs. And remember, the Yankees and Red Sox won the same number of games (although Ortiz sat out one Red Sox victory).

So, while Ortiz performed better in situations that were very valuable within the context of a particular game - inter-game context - A-Rod performed better in situations that ended up contributing to Yankee wins - intra-game context. And, in fact, combining batting, baserunning, and fielding, these effects were nearly offsetting: Ortiz gained about 1.3 wins on Rodriguez through inter-game adjustments while Rodriguez got 0.9 of those wins back through intra-game adjustments.

Let me try to illustrate with an example. I presented this example earlier in this article, but repeat it here for convenience.

On September 29, 2005, David Ortiz went 3-5 including a home run leading off the bottom of the 8th inning to tie the score 4-4 and a walkoff RBI single with one out in the bottom of the 9th inning. Baseball-Reference.com credits Ortiz with a WPA of 0.584 for the game. Obviously, those hits were huge for the Red Sox and Ortiz was rightly celebrated as the hero of that game.

On April 26, 2005, the Yankees defeated the Los Angeles Angels of Anaheim (or whatever they were calling themselves that season) 12-4. The Yankees took a 3-0 lead in the bottom of the first inning and led 10-2 by the end of the 4th inning. Obviously, there weren't a lot of "clutch" situations in this game. It was over early. Do you know why it was over early? Because A-Rod hit a 2-out, 3-run home run in the bottom of the first inning to give the Yankees that 3-0 lead, he hit a 2-out, 2-run home run in the bottom of the third inning to extend the Yankees' lead to 5-2, and he capped it off with a 2-out grand slam in the bottom of the 4th inning to give the Yankees that aforementioned 10-2 lead. For all of that, Baseball-Reference.com only credits A-Rod with a WPA of 0.490 for that game.

Take Ortiz's two RBIs off the scoreboard for the Red Sox in that September 29th game, and the Blue Jays would have won that game 4-3. Then again, if Ortiz made a (single) out in his final at-bat, Manny Ramirez would have come to bat with the potential winning run still in scoring position (albeit with two outs).

Take Rodriguez's ten RBIs off the scoreboard for the Yankees on April 26th and the Angels would have won that game 4-2. Moreover, all three of Rodriguez's home runs came with two outs in the inning. Turn them into outs and the Yankees would have had no further opportunities in any of those innings.

In retrospect, A-Rod's performance that day was not merely every bit as valuable as Ortiz's, but almost certainly more so, even if it was less "clutch" by a conventional inter-game "win probability" reckoning. My Player won-lost records credit David Ortiz with a batting won-lost record that day of 0.476 - 0.020, good for 0.456 net wins. Alex Rodriguez had a batting won-lost record on his big day of 0.908 - 0.000, good for 0.907 net wins.

4.    Comparing a Third Baseman to a Designated Hitter
Taking everything into account, here is where we stand with Alex Rodriguez and David Ortiz in 2005.

Final Player Won-Lost Records: 2005

                 pWins  pLosses  pWinPct  Wins over .500
Alex Rodriguez   25.7   17.4     0.596    4.2
David Ortiz      20.1   11.8     0.629    4.1



So, Alex Rodriguez earned more Player Wins (pWins) than David Ortiz, 25.7 - 20.1, while Ortiz had a higher winning percentage, 0.629 - 0.596. A-Rod leads in pWins over .500, 4.152 - 4.106, but by an amount that can probably best be described as trivial.

But is this really a totally fair comparison? In terms of fielding wins, is an average third baseman worth the same as an average first baseman or, worse, an “average” designated hitter? Clearly, an average third baseman is a better fielder than an average first baseman and is considerably more valuable than an average designated hitter.

Why? Think of it this way. To replace David Ortiz, all the Boston Red Sox would have had to do in 2005 would have been to find the best possible hitter they could find. That hitter would surely be a good deal worse than David Ortiz, but the pool of possible replacements for Ortiz was nevertheless fairly large: the population of major-league caliber hitters.

On the other hand, if the New York Yankees had to replace Alex Rodriguez, they would have had to find not just a hitter, but a hitter who could also play third base. The pool of possible replacement candidates for Rodriguez (major-league caliber third basemen) would clearly be smaller than the pool of possible replacements for Ortiz.

•     Positional Average

A player who hit (and ran) like an average third baseman given Alex Rodriguez’s batting opportunities, and fielded like an average third baseman given A-Rod’s fielding opportunities, would have been expected to compile a 0.501 winning percentage. In contrast, a player who hit (and ran) like an average DH/1B given David Ortiz’s batting opportunities, and fielded like an average first baseman given Ortiz’s fielding opportunities, would have been expected to compile a 0.515 winning percentage.

Using these figures for “average”, then, Alex Rodriguez’s final won-lost record was 4.1 pWins over Positional Average (pWOPA) while David Ortiz compiled a pWOPA of 3.6, a somewhat more decisive lead for A-Rod.

•     Replacement Level

Alex Rodriguez earned 35% more Player decisions than Ortiz because he played so many more innings in the field than Ortiz. If Rodriguez had earned the same number of decisions as Ortiz (if, say, he missed 40 games to injury), is it likely that the Yankees could have found an average player (which, in Rodriguez’s case, means a 0.501 player) to make up those extra decisions? No, it is not. Instead, the most likely scenario is that the Yankees would have had to make up those Player decisions with a below-average player. Consider who the Yankees played at third base in April of 2009 while A-Rod recovered from a hip injury: Cody Ransom, who batted a robust .190/.256/.329 for the Yankees.

Hence, instead of comparing A-Rod and Papi to average players, a more relevant measure of the relative value contributed by Alex Rodriguez and David Ortiz is how many Wins they contributed over Replacement Level (WORL). In my work, I set Replacement Level one standard deviation below positional average. The standard deviation of Player winning percentages for non-pitchers in 2005 was 4.7%, so the relevant replacement level for Rodriguez is 0.454 (0.501 - 0.047) and for Ortiz is 0.467.

Wins over Replacement Level for Rodriguez and Ortiz are shown below.

Final Player Won-Lost Records: 2005

                                                 Wins over
                  pWins   pLosses   pWinPct   Positional Average   Replacement Level
Alex Rodriguez     25.7      17.4     0.596          4.1                  6.1
David Ortiz        20.1      11.8     0.629          3.6                  5.1


Alex Rodriguez was 6.1 Wins over Replacement Level in 2005 for the New York Yankees. David Ortiz was 5.1 Wins over Replacement Level for the Boston Red Sox. Both Rodriguez and Ortiz had excellent seasons that were extremely valuable to their respective teams, but, ultimately, I think that the voters got this one right: Alex Rodriguez deserved to be the Most Valuable Player in the American League in 2005.
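
The figures above can be reproduced approximately from the rounded numbers quoted in this section. The sketch below assumes that wins over a baseline equal pWins minus the baseline winning percentage times total Player decisions (pWins + pLosses). That formula is inferred from the published figures, not quoted from the underlying calculations, and the names PlayerRecord and wins_over are made up for illustration. Because the inputs are rounded, the results land within about 0.1 of the published values but do not match them exactly.

```python
from dataclasses import dataclass


@dataclass
class PlayerRecord:
    """Hypothetical container for one player's season totals."""
    name: str
    p_wins: float
    p_losses: float
    positional_avg: float  # winning pct expected of an average player at his position

    @property
    def decisions(self) -> float:
        # Total Player decisions: pWins plus pLosses.
        return self.p_wins + self.p_losses


def wins_over(record: PlayerRecord, baseline_pct: float) -> float:
    """pWins beyond what a baseline player would earn in the same
    number of decisions (assumed formula, see the lead-in)."""
    return record.p_wins - baseline_pct * record.decisions


# Standard deviation of non-pitcher Player winning percentages, 2005 (from the text).
STD_DEV_2005 = 0.047

arod = PlayerRecord("Alex Rodriguez", 25.7, 17.4, 0.501)
ortiz = PlayerRecord("David Ortiz", 20.1, 11.8, 0.515)

for player in (arod, ortiz):
    wopa = wins_over(player, player.positional_avg)
    # Replacement level is set one standard deviation below positional average.
    worl = wins_over(player, player.positional_avg - STD_DEV_2005)
    print(f"{player.name}: {player.decisions:.1f} decisions, "
          f"pWOPA {wopa:.2f}, pWORL {worl:.2f}")
```

The same function serves both comparisons, and only the baseline changes. That is why Rodriguez's larger number of decisions widens his lead as the baseline drops from average to replacement level.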



Value vs. Talent

Sabermetricians often distinguish between two measures of player performance: value and true talent. My Player won-lost records, pWins and pLosses, are purely the former, a value measure. Unfortunately, as anybody who has ever read an MVP debate knows, the word "value" can have different definitions to different people.

The first link above defines value thusly: "A player's value is his contributions to his team based upon his on-field performance (hitting, running, fielding and pitching) in a neutral context." I would define value slightly differently. My definition of value would be thus: A player's value is his contributions to his team's on-field success. Player value is a retrospective evaluation, which quantifies what happened in the past.

True talent, on the other hand, is defined in the latter link above as the "probabilistic expectations of a player’s output at a given point in time, given that we know everything there is to know about that player." In other words, "true talent" is a prospective measure of expected performance, which predicts what will happen.

As I said, Player won-lost records are a measure of player value, by which I mean a player's (on-field) contributions to his team's on-field performance, measured in wins and losses. Value, defined in this way, is highly dependent on context. Several key types of context which affect player value include the following.

1.    Run-Scoring Environment
Runs are more valuable in a lower run-scoring environment. Scoring one run is more likely to lead to winning in an environment where 1-0 victories are fairly common than in an environment where the average final score is 8-6. This is why Player won-lost records control for the run-scoring environment, both for the season and league in which the game took place as well as for the ballpark in which the game was played.
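
A rough way to see this effect is Bill James's Pythagorean expectation, which estimates winning percentage from runs scored and allowed. This is a swapped-in illustration, not the method used to build Player won-lost records, which rest on play-by-play data. The exponent of 2 and the two run environments (3 and 7 runs per team per game) are assumptions chosen for the example.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Classic Pythagorean estimate of winning percentage."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def gain_from_extra_run(runs_per_game: float) -> float:
    """Increase in expected winning pct when an otherwise average (.500)
    team scores one additional run per game."""
    return pythagorean_win_pct(runs_per_game + 1, runs_per_game) - 0.5


low_scoring = gain_from_extra_run(3.0)   # 4 runs scored vs. 3 allowed
high_scoring = gain_from_extra_run(7.0)  # 8 runs scored vs. 7 allowed

print(f"Extra run per game, 3-run environment: +{low_scoring:.3f}")
print(f"Extra run per game, 7-run environment: +{high_scoring:.3f}")
```

Under these assumptions, the same extra run buys roughly twice as much winning percentage in the low-scoring environment as in the high-scoring one.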

2.    Timing of Events
The timing of events within a game can affect the value of those events. Hits which drive in runners on base can be viewed as more valuable than hits with the bases empty which do not produce runs. Home runs are more valuable in tie games than when the score is 15-0 (in either direction).

3.    Retrospective Context
The value of a win is greater than the value of a loss. Retrospectively, one can argue that this means that the value of events is greater if they contribute to a win than if they contribute to a loss.

I have to concede at this point that value is ultimately subjective and, hence, my Player won-lost records are ultimately subjective. The main point of subjectivity is the value of a win versus the value of a loss. I value team wins at 2 pWins and 1 pLoss, and I value team losses at a pWin-pLoss record of 1-2. I explained and attempted to defend that choice above.

There is also some inherent subjectivity in the assignment of value to specific players. I have attempted to make these assignments as objectively as possible. Again, my choices in this respect are explained briefly above, and in somewhat more detail elsewhere. Note, however, that given the overall value of team wins and team losses, the total value for a team is fixed, which means that, to the extent one assigns too much value to one player on a team it must be at the expense of assigning too little value to one of his teammates.
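
The fixed team total can be sketched as follows. The sketch reads the 2-1 and 1-2 valuations literally as per-game totals shared among a team's players. The function name team_player_decisions and the 95-67 record are illustrative assumptions.

```python
def team_player_decisions(team_wins: int, team_losses: int) -> tuple[int, int]:
    """Total pWins and pLosses available to a team's players, assuming each
    team win is worth 2 pWins and 1 pLoss and each team loss is worth
    1 pWin and 2 pLosses."""
    p_wins = 2 * team_wins + team_losses
    p_losses = team_wins + 2 * team_losses
    return p_wins, p_losses


# A hypothetical 95-67 team.
p_wins, p_losses = team_player_decisions(95, 67)
print(p_wins, p_losses, round(p_wins / (p_wins + p_losses), 3))  # 257 229 0.529
```

However credit is divided among the individual players, it has to add up to these totals, so any pWins credited to one player in error come out of his teammates' records.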

Player won-lost records, as I calculate them, represent a complete accounting of all value accumulated within a major-league baseball game. Note that this means that "luck" has to be accounted for somewhere, regardless of whether we think the accumulation of that "luck" was the result of any skill, whether any such skill "persists", or whether there is any predictive ability associated with such events.

Value versus True Talent
So what is the difference between "value" and "true talent"? The key difference, as I see it, is that "value" can be directly observed, while "true talent" can only be inferred. Going one step further, "true talent" can only be inferred from value. Hence, to my mind, measuring value is a necessary first step to being able to assess true talent.

Unfortunately, I think that too often there is confusion between value and true talent, where "true talent" measures make their way into what are intended to be "value" measures. For example, in his Win Shares system, Bill James increases the fielding Win Shares for third basemen if they played for a team with a below-average number of innings pitched by left-handed pitchers. The rationale for this is that left-handed pitchers allow more balls hit toward the third baseman (because LHP face more RHB).

I assume that this is true, but, even if it is true, that would simply mean that third basemen are less valuable with right-handed pitchers on the mound than with lefties pitching. This is also a good example of why a single-number value system can be misleading, although Bill James has corrected for this by adding Loss Shares to his Win Shares system.

Another example of a "value" system that slips some "true talent" into its calculations is Fangraphs' calculation of WAR (Wins above Replacement). For pitchers, Fangraphs calculates WAR based on FIP (Fielding Independent Pitching). Rather than considering the actual number of runs allowed by a pitcher, FIP calculates how many runs a pitcher would be expected to allow given his walks, strikeouts, and home runs allowed. As such, FIP doesn't explain what did happen; it explains what would be expected to have happened.
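
For reference, here is a minimal sketch of the commonly published form of FIP. The coefficients (13, 3, 2) and the inclusion of hit batters follow that published form. The league constant varies by season, so the default of 3.10 is a placeholder assumption, as are the function name and the sample stat line.

```python
def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings_pitched: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching in its commonly published form.

    `constant` rescales FIP to the league ERA and changes each season;
    3.10 is only a placeholder (assumption)."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings_pitched + constant


# Hypothetical stat line: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
print(round(fip(20, 50, 5, 200, 200.0), 3))
```

Runs allowed do not appear anywhere in the function's arguments. Two pitchers with this stat line receive the same FIP no matter how many runs actually scored against them.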

Now, there's an argument to be made for using FIP and it's baked right into the name: it controls for the fielders behind the pitcher. The fielders are then valued based on their fielding (using UZR). The problem is that UZR controls for the hardness of the balls-in-play, for the hit types, for the handedness of the pitcher and hitter, etc. In other words, UZR controls for a bunch of things that are NOT captured in FIP, which leaves those things not captured anywhere. So we're left with WAR measuring what we would have expected players to be worth, not what they really were worth.

So What's the Point of Context-Neutral Wins and Losses (eWins, eLosses)?
So, if context is a necessary condition of measuring player value, then what is the point of the context-neutral wins and losses that I calculate, eWins and eLosses? By constructing wins and losses that are stripped of context, it becomes possible to distinguish the value of what players do (eWins, eLosses) from when players do these things via the Contextual Factors that relate eWins and eLosses to pWins and pLosses.

In this way, value can be divided into its myriad sub-components, not simply batting versus baserunning versus pitching versus fielding, or basestealing versus baserunner outs versus baserunner advancement, but also inter-game context versus inter-game win adjustments versus the impact of one's teammates on one's fielding, etc. In this way, I believe that Player won-lost records can serve as something of the Platonic ideal of baseball statistics, with everything expressed in the same units - wins and losses - and with everything accounted for in a way which ties back perfectly to what actually happened on the baseball field.

I hope you enjoy reading about and playing with Player won-lost records as much as I enjoyed creating them.



The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org". Baseball player won-lost records have been constructed by Tom Thress. Feel free to contact me by e-mail or follow me on Twitter.