Fielding Player Won-Lost Records vs. Other Fielding Measures
To the best of my knowledge, two other fielding systems rely on largely the same data source that I use (Retrosheet play-by-play data) and have publicly presented career fielding records. The first is Defensive Runs Saved (DRS), which was originally presented by Sean Smith in a Hardball Times article and is now available online at Baseball-Reference. The second is Defensive Regression Analysis (DRA), which was created by Michael Humphreys, who explained the system in his wonderful book, Wizardry: Baseball’s All-Time Greatest Fielders Revealed.
In his book, Humphreys presents career fielding numbers (measured in net runs) for all players who played a significant amount of time (typically, more than 3,000 innings) at six defensive positions: second base, third base, shortstop, and each of the three outfield positions. At the time of publication, Humphreys’ book included statistics through 2009.
Baseball-Reference presents DRS values for players from 1953 to the present.
To compare my results to Smith’s and Humphreys’ numbers, therefore, I compared career results for all of the players listed by Humphreys whose careers started in 1953 or later. I then excluded any data after 2009. This left a total of 958 players, ranging from 152 left fielders to 168 second basemen.
The first table summarizes the results for DRA (Humphreys), DRS (Smith), and (context-neutral, teammate-adjusted) Net Fielding Wins (eWins minus eLosses).
[Table: Net Fielding Runs (per 1000 innings), Net Fielding eWins (per 1000 innings), and No. of Players, by position]
The first thing that we have to do before we can compare DRA and DRS to Player wins is to put them on the same scale. DRA and DRS are expressed in runs while Player wins are, of course, expressed in wins. Traditionally, in sabermetric measures, one win is equivalent to approximately 10 runs. Looking at the standard deviations in the above table, however, the ratio of DRA/DRS to Player wins is closer to 20. In other words, even if you converted DRA and DRS to wins, using a conventional run-to-win translation, the spread of players' DRA and DRS is roughly double the spread of players' net fielding wins.
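As a rough check on that ratio, the right-field standard deviations quoted later in this article (6.10 runs for DRA, 5.73 runs for DRS, and 0.311 wins for net fielding wins, all per 1000 innings) can simply be divided. The sketch below is illustrative only: the function and variable names are mine, and the 10-runs-per-win figure is the conventional approximation mentioned above, not a fitted value.

```python
# Right-field standard deviations per 1000 innings, as quoted in this article.
SD_DRA_RUNS = 6.10
SD_DRS_RUNS = 5.73
SD_NFW_WINS = 0.311

RUNS_PER_WIN = 10.0  # conventional approximation, assumed here


def implied_runs_per_win(sd_runs, sd_wins):
    """Runs-per-win ratio implied by the two standard deviations."""
    return sd_runs / sd_wins


def spread_ratio(sd_runs, sd_wins, runs_per_win=RUNS_PER_WIN):
    """Spread of the run-based stat, converted to wins, relative to the win-based stat."""
    return (sd_runs / runs_per_win) / sd_wins


print(round(implied_runs_per_win(SD_DRA_RUNS, SD_NFW_WINS), 1))  # 19.6
print(round(implied_runs_per_win(SD_DRS_RUNS, SD_NFW_WINS), 1))  # 18.4
print(round(spread_ratio(SD_DRA_RUNS, SD_NFW_WINS), 2))          # 1.96
```

For right field, at least, both ratios come out a little under 20, which is where the "roughly double" description comes from.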
Why is the Spread on Player Fielding Wins lower than Defensive Runs?
I believe that the spread on my net fielding wins is less than the spread of other fielding measures because I assign more credit on balls-in-play to pitchers, whereas stand-alone fielding measures implicitly assign all of the credit on balls-in-play to fielders, since that's all that they are measuring.
Specifically, looking at my Components 4 (excluding home runs), 5, 6, 7, 8, and 9, I assign 54.5% of the (defensive) credit for these to pitchers and only 45.5% of the (defensive) credit to fielders.
Putting Things on the Same Scale
In order to really compare DRA, DRS, and what I'll start calling NFW (net fielding wins), it is necessary to put them all on the same scale. To do this, I created "z-scores" for all three statistics. The basic formula for the z-score of a variable x is (x - m) / s, where m is the mean of the statistic and s is its standard deviation. I calculated z-scores for each player for all three fielding stats using a value of m equal to zero (since all three statistics are constructed to be relative to league average) and the standard deviations from the above table.
For example, Al Cowens scores at -2.16 DRA (per 1000 innings in RF), 1.22 DRS, and 0.236 NFW. From the previous table, the standard deviations associated with these three numbers are 6.10, 5.73, and 0.311, respectively. This translates, therefore, into z-scores for Al Cowens in right field of -0.35 for DRA, 0.21 for DRS, and 0.76 for NFW.
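The Al Cowens arithmetic can be reproduced in a few lines. This is a minimal sketch using only the numbers quoted above; the function name and the dictionary layout are my own choices for illustration.

```python
def z_score(x, s, m=0.0):
    """z-score of x given standard deviation s and mean m (zero by default)."""
    return (x - m) / s


# (value per 1000 innings in RF, positional standard deviation), from the text.
cowens_rf = {
    "DRA": (-2.16, 6.10),
    "DRS": (1.22, 5.73),
    "NFW": (0.236, 0.311),
}

cowens_z = {stat: round(z_score(value, sd), 2) for stat, (value, sd) in cowens_rf.items()}
print(cowens_z)  # {'DRA': -0.35, 'DRS': 0.21, 'NFW': 0.76}
```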
I did this for every player referenced in the earlier table. I then calculated simple correlations between DRA, DRS, and NFW by position.
[Table: correlations by position for DRA v. DRS, DRA v. NFW, and DRS v. NFW]
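For readers who want to replicate this step, here is a minimal sketch of a simple (Pearson) correlation computed separately for each position. The z-scores in the sample rows are made up purely for illustration and are not taken from any of the three systems; the function names and the row layout are also my own assumptions.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Simple (Pearson) correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlations_by_position(rows):
    """rows: iterable of (position, z_dra, z_drs, z_nfw) tuples."""
    by_position = defaultdict(list)
    for position, z_dra, z_drs, z_nfw in rows:
        by_position[position].append((z_dra, z_drs, z_nfw))
    result = {}
    for position, scores in by_position.items():
        dra, drs, nfw = zip(*scores)
        result[position] = {
            "DRA v. DRS": pearson(dra, drs),
            "DRA v. NFW": pearson(dra, nfw),
            "DRS v. NFW": pearson(drs, nfw),
        }
    return result


# Hypothetical z-scores, for illustration only.
sample_rows = [
    ("2B", 1.2, 1.5, 0.4),
    ("2B", -0.3, 0.1, 0.2),
    ("2B", 0.8, 0.6, 0.9),
    ("2B", -1.1, -0.9, -0.7),
]
for position, corrs in correlations_by_position(sample_rows).items():
    print(position, {pair: round(r, 3) for pair, r in corrs.items()})
```

Because a simple correlation is unchanged by rescaling either variable, dividing by a positional standard deviation does not affect the correlation within a single position. The rescaling matters when players from different positions are pooled.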
I'm not always exactly sure how to interpret correlations. If we thought that one of the other two measures (DRA or DRS) was a very bad measure of fielding, for example, then we probably would prefer a fairly low correlation. On the other hand, if we thought that one of the other two measures was a perfect measure of fielding, then we could view NFW's correlation with it as a measure of how close to perfect Player won-lost records are at measuring fielding.
Of course, neither of these hypotheticals is true. DRA and DRS are both quite good, but nevertheless imperfect, measures of fielding.
Given that, the correlations here, which are all very high, strike me as very good. I'm probably not doing something terribly wrong here and, perhaps, I'm even doing something a little more right than some other people.
My Player won-lost records (NFW) correlate more strongly with DRS (Sean Smith's numbers, as found at Baseball-Reference.com) than with DRA (Michael Humphreys' numbers from his book Wizardry). This makes sense, since both DRS and NFW are constructed play by play, whereas DRA data are calculated (rather well) at a seasonal level.
The next section of this article looks more closely at how DRA, DRS, and NFW compare on a position-by-position basis.
For the 168 second basemen evaluated here, the average difference in z-scores between DRA and NFW (DRA minus NFW) is -0.035. The average absolute difference in z-scores between DRA and NFW is 0.596. For DRS, the corresponding numbers are -0.039 and 0.493.
There are a total of nine players for whom the difference in z-scores is greater than one (in absolute value) for both DRA vs. NFW and DRS vs. NFW. These players are shown in the next table.
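The summary statistics and the "greater than one for both" filter used for each position can be sketched as follows. The career z-scores are the ones quoted elsewhere in this article for four players; they come from four different positions, so the averages here are meaningless and only the mechanics are of interest. The function name and the data layout are assumptions made for illustration.

```python
def compare_to_nfw(players, threshold=1.0):
    """players: dict mapping name -> (z_dra, z_drs, z_nfw)."""
    n = len(players)
    diff_dra = {name: dra - nfw for name, (dra, drs, nfw) in players.items()}
    diff_drs = {name: drs - nfw for name, (dra, drs, nfw) in players.items()}
    summary = {
        "mean_diff_dra": sum(diff_dra.values()) / n,
        "mean_abs_diff_dra": sum(abs(d) for d in diff_dra.values()) / n,
        "mean_diff_drs": sum(diff_drs.values()) / n,
        "mean_abs_diff_drs": sum(abs(d) for d in diff_drs.values()) / n,
    }
    # Flag players who disagree with NFW by more than the threshold on BOTH measures.
    flagged = [
        name for name in players
        if abs(diff_dra[name]) > threshold and abs(diff_drs[name]) > threshold
    ]
    return summary, flagged


# Career z-scores (DRA, DRS, NFW) as quoted in this article.
quoted = {
    "Bill Mazeroski": (1.322, 1.688, 0.316),
    "Garry Maddox": (1.279, 1.296, 0.111),
    "Greg Luzinski": (-2.192, -1.860, -2.348),
    "Al Cowens": (-0.35, 0.21, 0.76),
}
summary, flagged = compare_to_nfw(quoted)
print(flagged)  # ['Bill Mazeroski', 'Garry Maddox']
```

Al Cowens misses the list because only his DRA z-score differs from NFW by more than one, and Greg Luzinski misses it because all three systems rate him similarly.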
The first player in this table, Bill Mazeroski, is a good candidate for a somewhat closer look.
Bill Mazeroski was elected to the Baseball Hall of Fame in 2001 on the basis of two things: hitting a World Series winning home run in 1960 and being considered by many to be the best defensive second baseman in major-league history.
Michael Humphreys ranks Mazeroski as the second-best defensive second baseman in MLB history and his career DRA record rates as a z-score of 1.322. He scores even better in DRS, with a career z-score of 1.688. In contrast, his Player fielding record, while not bad, is much more pedestrian, with a career z-score of only 0.316. The next table compares Mazeroski's season-by-season z-scores for DRA (Humphreys), DRS (Smith), and NFW (Thress).
Retrosheet has released play-by-play data for every game since 1937, i.e., for every game of Bill Mazeroski's career. For some games in the 1950s and 1960s (and earlier), however, these play-by-play data were deduced from box score and newspaper accounts. In these cases, the information available is very rudimentary, including, in many cases, a lack of specificity on outs. That is, there are many plays for which it is known that the batter made an out, but there is no information on which fielder(s) recorded the out. The Pirates are one team for whom this data is particularly sparse in some of these seasons. The result of this is that, in many cases, I do not know which fielder recorded certain outs for the Pirates. When this happens, I spread the fielding credit for these plays in proportion to league-wide out distributions for known plays. This probably results in me under-crediting good fielders and over-crediting bad fielders on a team for plays made (and over-debiting good fielders and under-debiting bad fielders for hits allowed).
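To make the proportional spreading concrete, here is a minimal sketch. It assumes the only inputs are a count of outs with no identified fielder and a league-wide distribution of known outs by position. The counts are hypothetical, and the actual calculation behind Player won-lost records may condition on more than this.

```python
def spread_unknown_outs(unknown_outs, league_known_outs):
    """Split outs with no identified fielder in proportion to league-wide known outs.

    league_known_outs: dict mapping position -> outs with a known fielder.
    """
    total_known = sum(league_known_outs.values())
    if total_known == 0:
        raise ValueError("need at least one known out to form proportions")
    return {
        position: unknown_outs * count / total_known
        for position, count in league_known_outs.items()
    }


# Hypothetical league-wide counts of known outs, for illustration only.
league_known_outs = {"1B": 100, "2B": 300, "3B": 200, "SS": 400}
credit = spread_unknown_outs(50, league_known_outs)
print(credit)  # {'1B': 5.0, '2B': 15.0, '3B': 10.0, 'SS': 20.0}
```

Every second baseman receives the same league-average share under this approach, which is why an unusually good fielder would be under-credited when many of his team's outs lack a recorded fielder.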
Retrosheet's play-by-play data generally gets more reliable over time. And, in fact, as the above table indicates, my view of Bill Mazeroski's fielding (a) improves and (b) becomes much more consistent with DRA and DRS in the latter part of his career. From 1956 - 1962, Mazeroski's z-scores are 1.339 for DRA, 1.254 for DRS, and -0.496 for Net Fielding wins. From 1963 - 1972, the z-scores are 1.324, 2.043, and 1.019, respectively. From 1965 - 1972, the three z-scores are 0.749, 1.752, and 1.289.
In this case, DRA and DRS may be better measures of Bill Mazeroski's career fielding record, or at least the first half of it.
For the 155 third basemen evaluated here, the average difference in z-scores between DRA and NFW (DRA minus NFW) is -0.007. The average absolute difference in z-scores between DRA and NFW is 0.594. For DRS, the corresponding numbers are 0.004 and 0.413.
There are a total of 4 players for whom the difference in z-scores is greater than one (in absolute value) for both DRA vs. NFW and DRS vs. NFW. These players are shown in the next table.
Former Dodger Jim "Junior" Gilliam appears on each of the previous two lists. Both DRA and DRS view Gilliam as an average fielder at both second and third base. Player won-lost records, on the other hand, view Gilliam as an excellent fielder at both positions. Player won-lost records also rate Gilliam as having been excellent in his (more limited) time in left field. DRS actually agrees that Gilliam was an excellent left fielder (+21 runs in 203 innings), while Humphreys did not report Gilliam's DRA for LF because he had too few innings.
Gilliam was before my time, retiring two years before I was born, so I will leave it to others to judge whether Gilliam was an average or excellent defensive infielder.
For the 163 shortstops evaluated here, the average difference in z-scores between DRA and NFW (DRA minus NFW) is 0.036. The average absolute difference in z-scores between DRA and NFW is 0.553. For DRS, the corresponding numbers are 0.037 and 0.474.
There are a total of 6 players for whom the difference in z-scores is greater than one (in absolute value) for both DRA vs. NFW and DRS vs. NFW. These players are shown in the next table.
For the 152 left fielders evaluated here, the average difference in z-scores between DRA and NFW (DRA minus NFW) is 0.057. The average absolute difference in z-scores between DRA and NFW is 0.619. For DRS, the corresponding numbers are 0.064 and 0.533.
There are a total of 8 players for whom the difference in z-scores is greater than one (in absolute value) for both DRA vs. NFW and DRS vs. NFW. These players are shown in the next table.
Most fielding measures focus primarily on what is almost certainly the most significant aspect of fielding: how well a fielder turns balls in play into outs. It is my understanding that this is the primary focus of both DRA and DRS. Both DRA and DRS do, however, attempt to incorporate infielders' ability to turn double plays and outfielders' ability to throw out runners and/or prevent baserunner advancement.
Decomposition of Fielding Value
Player won-lost records assign Fielding won-lost records within five components. Component 5 measures whether balls in play become hits or outs and is, therefore, perhaps most directly comparable to other fielding systems. Component 6 measures whether hits become singles, doubles, or triples. To the best of my knowledge, no other fielding system attempts to measure anything comparable to this. Component 7 measures whether ground balls are converted into double plays in double play situations (runner on first, less than two outs). I believe that both Humphreys (DRA) and Smith (DRS) make some attempt to incorporate similar information within their systems. Component 8 measures whether fielders are able to put baserunners out on the bases. Component 9 measures the extent to which baserunners are able to advance more or less than average on a particular play. Many fielding systems (including both DRA and DRS, I believe) make at least some effort to incorporate these latter two factors for outfielders. My system goes a step farther, however, and calculates Component 8 and 9 player won-lost records for infielders as well.
Some of the differences, then, between how Player won-lost records view some players' fielding vis-a-vis DRA and DRS (and other systems) arise because Player won-lost records incorporate additional aspects of these players' fielding skills.
For example, of the eight players on the above list, only two of them - Gates Brown and Pete Rose - would appear on a comparable list comparing Net Component 5 Fielding wins to DRA and DRS.
Curiously, though, Net Component 5 Fielding Wins are actually less strongly correlated with DRA and DRS than total Net Fielding Wins are for outfielders. And while only two of the eight players above differ by more than one z-score in Net Component 5 Wins from both DRA and DRS, four other players differ by at least one z-score in Component 5 Wins but not by as much once total Net Fielding Wins are considered. This is likely because, while DRA and DRS do not (in my opinion) model all of the other aspects of fielding as accurately as Player won-lost records, they nevertheless do capture some of these aspects, and do so reasonably well. Still, I believe this is an example of how the imperfect correlations between Player won-lost records and other fielding systems indicate that Player won-lost records are doing a better job of measuring many aspects of fielding.
For the 153 right fielders evaluated here, the average difference in z-scores between DRA and NFW (DRA minus NFW) is -0.102. The average absolute difference in z-scores between DRA and NFW is 0.661. For DRS, the corresponding numbers are 0.018 and 0.544.
There are a total of 10 players for whom the difference in z-scores is greater than one (in absolute value) for both DRA vs. NFW and DRS vs. NFW. These players are shown in the next table.
[Table: includes Jose Cruz, Sr.]
For the 167 center fielders evaluated here, the average difference in z-scores between DRA and NFW (DRA minus NFW) is -0.076. The average absolute difference in z-scores between DRA and NFW is 0.666. For DRS, the corresponding numbers are -0.053 and 0.573.
There are a total of 16 players for whom the difference in z-scores is greater than one (in absolute value) for both DRA vs. NFW and DRS vs. NFW. This is the most players for any of the six positions compared here. These players are shown in the next table.
[Table: includes Brian L. Hunter]
Two center fielders in the above table perhaps warrant some further discussion: Amos Otis and Garry Maddox.
I rate Amos Otis much more highly than either Humphreys or Smith. In my original version of this comparison, the difference was even more stark. At that time, Amos Otis ranked first in career net fielding wins in center field among all players for whom I had calculated Player won-lost records. I revised my Player won-lost records somewhat this past spring and Amos Otis does not look quite so good (his z-score here fell from 1.838 to 1.092). But Humphreys' and Smith's systems, on the other hand, think that Amos Otis was a below-average defensive centerfielder over the course of his career, so Otis still shows up on the above table.
Humphreys quotes Bill James calling Otis "a 'magnificent' fielding center fielder", and he did win 3 Gold Gloves in his career (in 1971, 1973, and 1974). But everybody can surely think of at least one fielder who won a Gold Glove award or two that he didn't deserve. And anyway, there's a pretty large gap between "3-time Gold Glover" and "best fielder of the past 65 years".
In my original article, I expressed "doubt" that Amos Otis was really the "best centerfielder of the past 65 years" and I am actually somewhat relieved that my revised results agree that he was not quite that good (although he's still top 10). That said, my revised results still think much more highly of the "magnificent" fielding of 3-time Gold Glove winner Amos Otis than DRA and DRS.
I think that one reason why my system loves Amos Otis's defense so much has to do with his home ballpark in Kansas City. The next table shows team ballpark factors for the Kansas City Royals in the 1970s (the seasons when Amos Otis was their everyday centerfielder). Numbers here are expressed in relation to the batting team with 100 being average, so, for example, a Doubles factor of 102 would mean that doubles are 2% more common in Royals games than in the AL in general (because of the ballparks, not the players).
The numbers bounce around a bit from year to year but, in general, Kansas City's ballpark boosted run-scoring by boosting doubles and triples while suppressing home runs. The result is a higher-than-average number of balls in play in Kansas City with a higher-than-average number of these balls falling in for hits in general, and for extra-base hits in particular.
Because hits-in-play were more plentiful in Kansas City, the value of outs on balls-in-play there was greater than average. By measuring value using ballpark-specific win probabilities, my Player won-lost records (fielding, batting, baserunning, and pitching) implicitly adjust for ballpark context. So, my Player won-lost records like Amos Otis's defense better because he played in a ballpark where outfield hits were more common, making it a more difficult ballpark to play centerfield.
I think that this is a real advantage of my fielding (and batting, baserunning, and pitching) won-lost records.
One of the more troubling results I encountered when I was first evaluating my Player won-lost records was the fielding record of Garry Maddox. Garry Maddox won 8 consecutive Gold Gloves from 1975 through 1982 and was considered the gold standard of centerfield defense.
My Player won-lost records, on the other hand, show Garry Maddox to have been an average defensive centerfielder over the course of his career.
Now, as anyone familiar with Gold Gloves knows, they are not necessarily the best measure of fielding prowess - far from it in many cases. And similarly, one of the lessons of modern fielding metrics is that looks can frequently be deceiving when it comes to judging major-league fielding ability.
But DRA and DRS both agree with the consensus of the time: Garry Maddox was a great centerfielder. He led his league in DRS among centerfielders 4 times (1976, 1978-80) and finished second 3 other times (1975, 1977, 1981). For his career, his z-score in DRA is 1.279 and for DRS it's 1.296. But for Fielding Player won-lost records, his net fielding wins earn a z-score of 0.111.
This concerned me: it seemed like an obvious mistake on the part of my Player won-lost records. But then I read Michael Humphreys' entry on Garry Maddox in his book:
"[Maddox] was at best an average fielder when he came up with the Giants. Traded to the Phillies, he played next to possibly the worst outfielder of all time: Greg "The Bull" Luzinski. On almost all teams, the centerfielder takes all chances in the outfield that he can, including soft flies that could be handled in the gaps by the corner outfielders. But with The Bull, Maddox may have taken what would normally be fly ball chances of the left fielder. Maddux had only one good season when he wasn't playing next to Luzinski, the strike-shortened 1981." (Wizardry, p. 302)
Here's how Garry Maddox's record looks in the three measures I'm comparing here season by season. The seasons where Maddox was not teamed with Luzinski are bolded.
My numbers for Maddox are much more stable with and without Greg Luzinski as a teammate. But does that mean that my numbers are the ones that are right?
One possible problem for my system with the Luzinski/Maddox outfields is that Maddox may have ended up tracking down a fair number of hits that were Luzinski's fault. Under my system, one key (perhaps "the key") defining characteristic of balls-in-play is the first fielder to touch the ball. Specifically, certain assumptions about the probability that a play could have been turned into an out, and by whom, are calculated based on who the first fielder to touch the ball was. So, for example, if a double is fielded by the center fielder, the system assigns more "blame" for that double to the center fielder than to the adjoining fielders.
If Garry Maddox routinely ran down hits that were Greg Luzinski's fault, my system might be under-rating Maddox (in Components 5 and 6) and offsettingly over-rating Greg Luzinski. While this seems plausible to me, in fact, I basically agree with Humphreys' and Smith's assessment of Luzinski's fielding. For his career in left field, he gets z-scores of -2.192 in DRA, -1.860 in DRS, and -2.348 from me. If anything, I am scoring Luzinski a bit more harshly than Humphreys and Smith.
Let me pick out one season. I don't claim this season is representative, it's just the first one that I looked at. According to Player won-lost records, the 1979 Phillies accumulated a total of approximately 1.2 net fielding wins overall and the Phillies outfield accumulated approximately -0.6 net fielding wins. This ranked them 6th in the National League that year in net fielding wins. According to Baseball-Reference.com, on the other hand, the Phillies led the National League in Defensive Runs Saved with +54, with their starting outfield scoring a combined +24 (+26 by Maddox, +18 by Bake McBride, and -20 by Luzinski).
At the team level, we should be able to get a pretty good sense of how good a team's defense is by looking at the team's Defensive Efficiency Rating (DER, the percentage of balls-in-play turned into outs). According to Baseball-Reference, the Phillies ranked 6th in the NL in DER in 1979 at 0.703 vs. a league-wide value of 0.700. Those numbers are perfectly in line with my assessment of Phillies' team fielding. And while it's a much worse measure of just fielding, it might also be worth noting that the Phillies ranked 9th in the 12-team NL in runs allowed per game (4.40 vs. league-avg of 4.22).
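As defined above, DER is simply outs divided by balls in play. The sketch below uses hypothetical counts chosen only so that the results land on the 0.703 and 0.700 figures cited above. They are not the 1979 Phillies' or National League's actual totals, and published DER figures may treat edge cases such as errors differently.

```python
def defensive_efficiency(outs_on_balls_in_play, balls_in_play):
    """Share of balls in play that the defense turned into outs."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return outs_on_balls_in_play / balls_in_play


# Hypothetical counts, for illustration only.
team_der = defensive_efficiency(3163, 4500)
league_der = defensive_efficiency(37800, 54000)
print(round(team_der, 3), round(league_der, 3))  # 0.703 0.7
```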
I'm reluctant to shout this result from the rafters and claim that I have established definitively that Garry Maddox was wildly overrated as a defensive centerfielder. But he might have been.
Infielders versus Outfielders
There were a total of 16 infielders (counting Junior Gilliam twice) whose z-scores differed by more than 1.0 when comparing my NFW to both DRA and DRS, about 5 per position and 3.3% of the total infielders that I compared. For outfielders, the number is 34 players, 11 per position and 7.2% of all outfielders that I compared.
Overall, I'm quite pleased with the results here. The overall correlation, across all six positions (958 players) investigated here, between Humphreys' DRA and my Fielding won-lost records - expressed in terms of z-scores - was 0.692. The z-scores associated with these two systems differed by more than 1 in 191 cases (19.9%). The correlation between Smith's DRS and Fielding won-lost records was 0.792 with disagreements of 1 or more in 108 cases (11.3%). For a little context, the correlation between DRA and DRS was 0.747 and the two systems disagreed by more than 1 z-score in 147 cases (15.3%).
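The percentages in the preceding paragraph follow directly from the counts. A quick check, using only numbers quoted in this article (the variable names are mine, and the position labels on the sample sizes are omitted because only the total matters here):

```python
# Number of players compared at each of the six positions, as quoted above.
players_by_position = [168, 155, 163, 152, 153, 167]
total_players = sum(players_by_position)
print(total_players)  # 958

# Cases in which two systems' z-scores differ by more than 1.
disagreements = {"DRA v. NFW": 191, "DRS v. NFW": 108, "DRA v. DRS": 147}
shares = {
    pair: round(100 * count / total_players, 1)
    for pair, count in disagreements.items()
}
print(shares)  # {'DRA v. NFW': 19.9, 'DRS v. NFW': 11.3, 'DRA v. DRS': 15.3}
```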
As discussed above, my results disagree with DRA and DRS more often for outfielders than for infielders. Even here, however, my overall correlations with DRA and DRS are 0.660 and 0.759, respectively. Moreover, as I discuss above, much of the lower correlation in this case is because I incorporate outfielders' (and infielders') individual ability to prevent extra-base hits. In this case, therefore, I believe this modestly lower correlation is an indication of the extent to which I am doing a somewhat better job of fully measuring the overall fielding value of these players.
This is not to say that my Fielding won-lost records are necessarily better than alternative fielding measures (including DRA and DRS). But I am confident that my Fielding records stack up very well with the best alternative fielding measures out there.
This article was updated to incorporate revised data for Player won-lost records on October 9, 2014
All articles are written so that they pull data directly from the most recent version of the Player won-lost database. Hence, any numbers cited within these articles should automatically incorporate the most recent update to Player won-lost records. In some cases, however, the accompanying text may have been written based on previous versions of Player won-lost records. I apologize if this results in nonsensical text in any cases.