In addition to pWins and pLosses, which tie to team wins, I also construct a set of player wins and losses which attempt to control for context. I call these eWins and eLosses, where the "e" stands for "expected". These are, in effect, how many wins (and losses) a player would have been expected to contribute to a team if he played in a perfectly neutral context with perfectly average teammates.
These eWins and eLosses are built up from three pieces: context-neutral win probabilities, expected context, and an expected team win adjustment.
Context-Neutral Win Probabilities
Traditionally, win-probability systems are purely context-dependent. I do not think, however, that this is necessarily the appropriate starting point for measuring player value. Rather, I am interested in beginning with an assessment of players’ performances in the absence of the contexts in which the players actually performed. That is, what would the expected won-lost record be for a player, given his actual performance, assuming that performance had come in a neutral context? To answer this question, I construct a set of context-neutral Player Game Points. Once these are constructed, I can then add back in the contextual information in a way that clearly identifies how players’ values were affected by the context in which they performed.
Player Game Points are divided into three categories for the purpose of calculating context-neutral win probabilities: independent events, base-state dependent events, and purely contextual events.
1. Independent Events
Most events can happen regardless of the base-out situation. One can strike out at any time, regardless of how many baserunners or outs there are. Similarly, a triple could happen at any time regardless of the number of baserunners. All batter results, except for double plays (which are base-state dependent), intentional walks, and bunts, fall into the category of independent events. Intentional walks and bunts are treated as purely contextual events, which are described below.
For independent events, the expected win probability of such an event is calculated for each event within the league-year using the Win Probability Matrix for the ballpark in which the event took place.
For example, the win probability of a home run at Wrigley Field in 2005 is calculated by taking every plate appearance that took place in a National League ballpark in 2005 and calculating, for that plate appearance, what the added win probability would have been had the game been played in Wrigley Field and the batter hit a home run. The context-neutral win probability of a home run at Wrigley Field in 2005 is then equal to the average of all of these probabilities. In this case, the average win probability added by a home run at Wrigley Field in 2005 was 0.140 wins.
In the case of events which may or may not lead to baserunner advancement – e.g., outs, singles, doubles – expected results are calculated based on average baserunner advancement, just as is done with contextual Player Game Points.
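The averaging described above for independent events can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual implementation: the state labels, the two lookup tables standing in for a ballpark's Win Probability Matrix, and all of the numbers are hypothetical.

```python
from statistics import mean

def context_neutral_value(plate_appearances, wp_before, wp_after_event):
    """Average win probability added by one event type across every
    plate appearance in the league-year, as if each had ended in that event.

    wp_before / wp_after_event are hypothetical lookup tables keyed by
    game state; a real system would read them from the park's
    Win Probability Matrix.
    """
    return mean(wp_after_event[s] - wp_before[s] for s in plate_appearances)

# Toy data (made up): three game states and the batting team's win probability
# before the plate appearance and after a hypothetical home run.
wp_before = {"s1": 0.50, "s2": 0.30, "s3": 0.80}
wp_after_hr = {"s1": 0.62, "s2": 0.48, "s3": 0.86}

# One entry per plate appearance observed in the league-year.
observed_states = ["s1", "s1", "s2", "s3"]

hr_value = context_neutral_value(observed_states, wp_before, wp_after_hr)
```

Because every observed plate appearance contributes to the average, states that occur more often carry more weight, which is what makes the result reflect the typical context rather than any one player's context.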
2. Base-State Dependent Events
Some events can only happen given certain baserunners or a certain number of outs. For example, one can only ground into a double play with at least one baserunner on and fewer than two outs. Any Player Game Points accumulated by a baserunner on third base can, of course, only be accumulated in a base-out state that includes a runner on third base.
For baserunner game points (except for stolen bases, which are treated as purely contextual events and discussed below) and double plays, the context-neutral win probability of the event is calculated the same as for independent events, except that the average win probability is only calculated across events with relevant base-out states.
So, for example, the context-neutral Player Game Points associated with a double play are calculated as the average win probability, given the ballpark in which the game takes place, added from hitting into a double play across double-play situations (runner on first base and fewer than two outs). For a ground ball to the shortstop at Wrigley Field in 2005, the average win probability added by a double play is 0.015 losses from the batter’s perspective, on top of the 0.046 losses accrued from the initial ground-out.
For baserunner advancements and baserunner outs, context-neutral win probabilities are only averaged given the specific batting event and hit type. That is, the context-neutral Player Game Points for a runner on third base advancing on a fly out are calculated only considering plays in which a runner on third base advances on a fly out. Similarly, the context-neutral Player Game Points for a runner on first base who only advances to second base on a single are calculated only considering plays in which a runner on first base does not advance to third on a single.
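For base-state dependent events, the only change from the independent case is that the average is taken over the relevant base-out states. A minimal sketch, assuming a hypothetical record layout (`runner_on_first`, `outs`, `dp_value`) and made-up values:

```python
from statistics import mean

def is_double_play_situation(play):
    # Runner on first base and fewer than two outs, as described in the text.
    return play["runner_on_first"] and play["outs"] < 2

def base_state_dependent_value(plays, is_relevant, value_key):
    """Average win probability added, restricted to relevant base-out states."""
    relevant = [p for p in plays if is_relevant(p)]
    if not relevant:
        raise ValueError("no plays with a relevant base-out state")
    return mean(p[value_key] for p in relevant)

# Toy data (made up): "dp_value" is the extra win probability a double play
# would add, from the batter's perspective, in each plate appearance.
plays = [
    {"runner_on_first": True, "outs": 0, "dp_value": -0.020},
    {"runner_on_first": True, "outs": 1, "dp_value": -0.010},
    {"runner_on_first": False, "outs": 1, "dp_value": 0.0},  # excluded: no runner on first
    {"runner_on_first": True, "outs": 2, "dp_value": 0.0},   # excluded: two outs
]

dp_points = base_state_dependent_value(plays, is_double_play_situation, "dp_value")
```

The same function could serve baserunner advancements by swapping in a predicate that also checks the batting event and hit type.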
3. Purely Contextual Events
While it is possible to remove much, if not all, of the context from most plays, there are certain plays which are, essentially, purely elective plays, and are therefore inextricably tied to the context in which they take place. In my opinion, it would be wrong to attempt to divorce these plays from their context.
Three types of plays fall into this category: intentional walks, stolen base attempts (including stolen bases, caught stealings, pickoffs, and balks), and bunts (regardless of either situation or outcome). In each of these three cases, the context-neutral Player Game Points are simply set equal to context-dependent Player Game Points.
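The treatment of purely contextual events amounts to a simple rule: use the context-dependent value unchanged. A sketch of that rule follows; the event labels are hypothetical names chosen for illustration, and every number except the 0.140 home-run figure quoted earlier is made up.

```python
# Hypothetical event labels for the three purely contextual play types:
# intentional walks, stolen base attempts (including caught stealings,
# pickoffs, and balks), and bunts.
PURELY_CONTEXTUAL = {
    "intentional_walk",
    "stolen_base",
    "caught_stealing",
    "pickoff",
    "balk",
    "bunt",
}

def context_neutral_points(event_type, context_dependent, averaged):
    """Pick the context-neutral Player Game Points for one play.

    Purely contextual events keep their context-dependent value;
    everything else uses the averaged (context-neutral) value.
    """
    if event_type in PURELY_CONTEXTUAL:
        return context_dependent
    return averaged

ibb_points = context_neutral_points("intentional_walk", 0.031, 0.012)
hr_points = context_neutral_points("home_run", 0.210, 0.140)
```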
Constructing eWins and eLosses from Context-Neutral Win Probabilities
Context-Neutral Player Wins and Losses are normalized to be equal to aggregate Context-Dependent Player Wins and Losses for each component and sub-component. Hence, the total number of Context-Neutral Player Wins accumulated for a particular type of event or sub-event – say, home runs – will equal the total number of Context-Dependent Player Wins accumulated over the same set of events. This normalization is done at the season/league level. At the game or team level, however, the total number of context-neutral player decisions need not be equal to the number of context-dependent decisions, either at the component level or in the aggregate.
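One simple way to satisfy this constraint is to rescale every context-neutral value for a component by a single league-wide factor. The text does not say that proportional scaling is the method actually used, so the sketch below is an assumption, and the numbers are made up.

```python
def normalize_component(neutral_values, dependent_total):
    """Rescale context-neutral values for one component so that their
    league-season total matches the context-dependent total.

    Proportional scaling is an assumption made for this sketch.
    """
    neutral_total = sum(neutral_values)
    if neutral_total == 0:
        raise ValueError("cannot rescale a component with a zero total")
    scale = dependent_total / neutral_total
    return [v * scale for v in neutral_values]

# Toy data (made up): context-neutral home-run wins for three players,
# and the league-wide context-dependent home-run wins for the same events.
neutral_hr_wins = [10.0, 20.0, 10.0]
dependent_hr_total = 42.0

normalized_hr_wins = normalize_component(neutral_hr_wins, dependent_hr_total)
```

Because the factor is shared across the whole league-season, totals match at that level while individual games and teams remain free to differ.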
Having completed this normalization process, one might think that the construction of eWins and eLosses is complete. In fact, however, eWins and eLosses are intended to reflect expected wins and losses. As such, two more adjustments are made to produce final eWins and eLosses.
Specifically, context-neutral Player Game Points are converted into eWins and eLosses by making two adjustments: an Expected Context adjustment and Expected Win Adjustments.
In relating player wins and losses to team wins and losses, the context in which a player’s performance takes place matters. This is reflected in two context measures related to Context-Dependent player decisions: inter-game context and intra-game context.
In calculating context-neutral player decisions, one might think that the most obvious thing to do would be to simply set inter-game and intra-game context both equal to 1 for all players. Doing so, however, would produce a clear relationship between players’ positions and their tendency to have more or fewer Context-Dependent player decisions (pWins, pLosses) than Context-Neutral decisions (eWins, eLosses). Because of this, I think that it is more appropriate to calculate an Expected Context for each player, based on the position(s) which the player played. This is done as follows.
Offense: Batting and Baserunning
Expected contexts are calculated for four different positions: pinch hitter, pinch runner, pitcher, and other. For each of these positions, expected context is set equal to the average context for the position for the league and season in question.
Pitching
Starting pitchers have an average inter-game context of 0.999 and an average intra-game context of 1.078, a combined average context of 1.078. For relief pitchers, the numbers are 0.827, respectively. Expected contexts for starting pitchers are set equal to the average context for starting pitchers for the relevant league and season. The same is true for relief pitchers: expected context is set equal for all relief pitchers – closers, setup men, mopup men – regardless of their actual context.
Fielding
Over the Retrosheet Era, there is no obvious relationship between context and fielding position. Hence, expected context is set equal to 1 for all fielding player decisions.
Expected context for a player is calculated by taking the weighted average of the expected contexts for the player’s offensive, pitching, and fielding decisions.
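The weighted average can be sketched as follows. The text does not specify the weights, so weighting by each role's number of player decisions is an assumption. The 1.078 (starting pitchers) and 1.0 (fielding) figures come from the discussion above; the 0.95 batting figure and the decision counts are made up.

```python
def expected_context(decisions_by_role, expected_context_by_role):
    """Decision-weighted average of role-level expected contexts.

    Weighting by player decisions is an assumption made for this sketch.
    """
    total = sum(decisions_by_role.values())
    if total == 0:
        raise ValueError("player has no decisions")
    weighted = sum(
        decisions_by_role[role] * expected_context_by_role[role]
        for role in decisions_by_role
    )
    return weighted / total

# Hypothetical starting pitcher: mostly pitching decisions, plus a few
# fielding and batting decisions.
decisions = {"pitching": 20.0, "fielding": 2.0, "batting": 3.0}
contexts = {"pitching": 1.078, "fielding": 1.0, "batting": 0.95}

player_expected_context = expected_context(decisions, contexts)
```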
Expected Win Adjustments
One of the key implications of my work is that the difference between winning and losing is very small in Major-League Baseball. In an average Major-League Baseball game during the Retrosheet Era, for example, the winning team accumulated around 1.9 positive Player Game Points – the building block of Player Wins – and 1.4 negative Player Game Points – the building block of Player Losses. In other words, the average winning team compiled a team winning percentage of 1.000 (by definition), but the Players on that team compiled a combined winning percentage of something like 0.576, which works out to about a 93-69 record in a 162-game schedule.
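The arithmetic behind these figures is easy to reproduce from the rounded values quoted above (1.9 and 1.4 Player Game Points per game); the variable names are illustrative only.

```python
positive_points = 1.9  # Player Game Points toward Player Wins, per winning-team game
negative_points = 1.4  # Player Game Points toward Player Losses, per winning-team game

# Combined Player winning percentage for the average winning team.
player_win_pct = positive_points / (positive_points + negative_points)

# The same percentage expressed as a record over a 162-game schedule.
games = 162
equivalent_wins = round(player_win_pct * games)
equivalent_losses = games - equivalent_wins

print(f"{player_win_pct:.3f} -> {equivalent_wins}-{equivalent_losses}")
```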
Looking at the issue from the opposite direction, teams whose players compiled a combined Player winning percentage around 0.510 (0.505 – 0.515) had an average team winning percentage of about 0.564 (91-71), while teams whose players compiled a combined Player winning percentage around 0.490 (0.485 – 0.495) had an average team winning percentage around
Being a little above average helps a lot in producing team victories.
For pWins and pLosses, this is reflected in the Intra-Game Win Adjustment, which ties Player Won-Lost records to team won-lost records. Intra-game win adjustments correlate reasonably well with player winning percentages – good players tend to have positive intra-game win adjustments, while weaker players tend to have negative intra-game win adjustments. This correlation is not perfect, as one’s intra-game win adjustments are also affected by one’s teammates.
To recognize this correlation, eWins and eLosses are adjusted for intra-game win adjustments. But to maintain the context-neutrality of the results, these records are adjusted based upon expected intra-game win adjustments.
Expected intra-game win adjustments for a player are calculated based on the expected impact of the player on the record of a 0.500 team. The exact process by which these are calculated is described in some detail elsewhere. Expected win adjustments will be positive for players with context-neutral winning percentages over .500 and negative for players with context-neutral winning percentages below .500. Hence, this has the effect of increasing the spread in context-neutral player winning percentages among players.
The top (and bottom) 100 players in career regular-season eWins (total as well as over positional average and replacement level) can be found here.