**Matchup Formula**

One of the coolest formulas I’ve come across in sabermetrics is the Matchup Formula, sometimes called the Log5 Formula.

If a team with a 0.667 winning percentage faces a team with a 0.450 winning percentage, how often would you expect the 0.667 team to win?

If a 0.300 hitter faces a pitcher with a 0.290 batting average against, in a league with a 0.250 batting average, how well do we expect the batter to hit?

If 49.0% of a particular type of ball were turned into outs in a league when 70.2% of all balls-in-play were turned into outs, what percentage of these would be outs in a league where only 69.4% of all balls-in-play were converted into outs?

All of these questions can be answered with the Matchup Formula.

1. Overview of Matchup Formula

I’ll begin with the simplest version of the Matchup Formula. Let W_{1} be the winning percentage of Team 1 and W_{2} be the winning percentage of Team 2. Then:

Probability of Team 1 Winning = W_{1}•(1 – W_{2}) / [W_{1}•(1 – W_{2}) + W_{2}•(1 – W_{1})]
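To make the basic formula concrete, here is a minimal Python sketch applied to the 0.667 versus 0.450 question from the introduction. The function name `matchup` is my own label for illustration, not something defined elsewhere in this article.

```python
def matchup(w1: float, w2: float) -> float:
    """Probability that side 1 beats side 2, given their winning percentages.

    Assumes both percentages were compiled against average (0.500) opposition.
    """
    numerator = w1 * (1 - w2)
    return numerator / (numerator + w2 * (1 - w1))


# The 0.667 team against the 0.450 team.
p = matchup(0.667, 0.450)
print(f"{p:.3f}")
```

Under these assumptions the 0.667 team would be expected to win about 71% of the time. Two evenly matched teams come out at exactly 0.500, and the two teams' probabilities always sum to one.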

The formula implicitly assumes that both of these teams faced average (or at least equivalent) opposition in compiling those winning percentages. So, what if Team 1 has a 0.667 winning percentage, but the average record of their opponents was only 0.440, while Team 2’s 0.450 winning percentage was amassed against opponents with an average winning percentage of 0.520?

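Let W_{1} and W_{2} be the two teams’ winning percentages, as before, and let O_{1} and O_{2} be the average winning percentages of their respective opponents (0.440 and 0.520 in this example). Then: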

Probability of Team 1 Winning = W’_{1}•(1 – W’_{2}) / [W’_{1}•(1 – W’_{2}) + W’_{2}•(1 – W’_{1})]

where

W’_{1} = W_{1}•O_{1} / [W_{1}•O_{1} + (1- W_{1})•(1- O_{1})]

W’_{2} = W_{2}•O_{2} / [W_{2}•O_{2} + (1- W_{2})•(1- O_{2})]
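The opponent adjustment can be sketched the same way, using the numbers from the example above (a 0.667 team that faced 0.440 opposition against a 0.450 team that faced 0.520 opposition). The function names are hypothetical labels chosen for this illustration.

```python
def matchup(w1: float, w2: float) -> float:
    """Basic Matchup Formula: probability that side 1 beats side 2."""
    numerator = w1 * (1 - w2)
    return numerator / (numerator + w2 * (1 - w1))


def adjust_for_opposition(w: float, o: float) -> float:
    """W' = W*O / [W*O + (1 - W)*(1 - O)], restating W against 0.500 opposition."""
    numerator = w * o
    return numerator / (numerator + (1 - w) * (1 - o))


w1_adj = adjust_for_opposition(0.667, 0.440)  # Team 1 beat up on weak opponents
w2_adj = adjust_for_opposition(0.450, 0.520)  # Team 2 faced a tougher schedule
p = matchup(w1_adj, w2_adj)
print(f"W'1 = {w1_adj:.3f}, W'2 = {w2_adj:.3f}, P(Team 1 wins) = {p:.3f}")
```

With these inputs, Team 1's adjusted winning percentage falls to roughly 0.611, Team 2's rises to roughly 0.470, and Team 1's expected chance of winning drops from about 0.710 to about 0.640. An opponent average of exactly 0.500 leaves a winning percentage unchanged.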

There is one more piece of information to account for. The formula so far assumes that all of the numbers within it are relative to a 0.500 context. What if we return to our batting average example from the introduction? If a 0.300 hitter faces a pitcher with a 0.290 batting average against in a league with a 0.250 batting average, how well do we expect the batter to hit? Relating this to our earlier formulae, the 0.300 corresponds to W_{1}.

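Let P_{0} be the matchup probability calculated as if in a 0.500 context, let L be the league average (0.250 in this example), and let P_{1} be the probability restated in the league’s context. Then: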

P_{0} = W’_{1}•(1 – W’_{2}) / [W’_{1}•(1 – W’_{2}) + W’_{2}•(1 – W’_{1})]

where

W’_{1} = W_{1}•O_{1} / [W_{1}•O_{1} + (1- W_{1})•(1- O_{1})]

W’_{2} = W_{2}•O_{2} / [W_{2}•O_{2} + (1- W_{2})•(1- O_{2})]

and

P_{1} = P_{0}•(1 – L) / [P_{0}•(1 – L) + L•(1 – P_{0})]
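As a rough illustration, the batting average example can be run through these formulas in Python. Two things here are assumptions on my part rather than statements from the text: the pitcher's "winning percentage" W_{2} is taken to be 1 minus his batting average against (the rate at which he prevents hits), and both opponent adjustments are skipped (equivalent to O_{1} = O_{2} = 0.500). The function names are hypothetical.

```python
def matchup(w1: float, w2: float) -> float:
    """Basic Matchup Formula in a 0.500 context."""
    numerator = w1 * (1 - w2)
    return numerator / (numerator + w2 * (1 - w1))


def league_adjust(p0: float, league: float) -> float:
    """P1 = P0*(1 - L) / [P0*(1 - L) + L*(1 - P0)]."""
    numerator = p0 * (1 - league)
    return numerator / (numerator + league * (1 - p0))


batter_avg = 0.300
pitcher_avg_against = 0.290
league_avg = 0.250

w1 = batter_avg
w2 = 1 - pitcher_avg_against  # ASSUMPTION: pitcher "wins" when the batter does not hit

p0 = matchup(w1, w2)
p1 = league_adjust(p0, league_avg)
print(f"P0 = {p0:.3f}, P1 = {p1:.3f}")
```

Under these assumptions the batter would be expected to hit about 0.344. As a sanity check, a league-average hitter (0.250) facing a league-average pitcher (0.250 against) comes out at exactly the league average, and setting L to 0.500 leaves P_{0} unchanged.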

2. Use of Matchup Formula to Estimate Event Weights

The probabilities that underlie the calculation of basic Player Game Points are dependent on the exact location of the ball and how it was hit. For example, the probability of driving in a runner from third is vastly different on a ground out to the pitcher (16.5% in the 2006 National League) versus a fly out to center field (84.5% in the 2006 National League). Hence, in theory, ball-in-play probabilities should be calculated for each unique location/hit type combination.

My data source is Retrosheet event files. The amount of information on locations and hit types provided by Retrosheet event files varies considerably by year. Full location and hit type information is available for most balls-in-play for the years 1989 – 1999. For other years, event probabilities are imputed using the Matchup Formula, based on final outcomes in the year of interest and location probabilities for the 1989 – 1998 period. This process is described here.

Hopefully, one example will give some indication how this works. Based on 1989 – 1998 data, a line drive single to left field had an a priori probability of being an out of 18.81%. This is W

Plugging in all of that, then, we would expect line drive singles to left field to have an a priori probability of having been outs of 13.75% in the 2005 National League.

3. Use of Matchup Formula in Allocating Credit for Player Game Points

For those components where multiple players share credit for Player Game Points, such as pitchers and catchers with respect to stolen bases, the relative credit is divided between the relevant players through a process described here.

The major drawback to Player Won-Lost records that are tied to team records as developed here is that, for a particular play, the pitcher and catcher are assumed to bear equal responsibility – not in terms of equivalent Player Game Points, but in terms of the fact that wins are credited to both pitchers and catchers for plays in which the defensive team earns wins and losses are debited to both pitchers and catchers for plays in which the defensive team earns losses. In reality, it is perfectly reasonable to envision a scenario whereby, for example, a pitcher does a terrible job of holding a baserunner on and is only saved by a perfect throw from the catcher to catch the runner stealing. In such a case, it may be more reasonable to credit the pitcher with a loss for his role in preventing stolen bases while crediting the catcher with more wins than he currently receives. Another example of this would be a catcher who, while normally excellent at preventing wild pitches and avoiding passed balls, has the misfortune of regularly catching a knuckleball pitcher.

In terms of Context-Dependent Wins and Losses (pWins/pLosses), where the object is to ensure that Player Wins and Losses relate perfectly to team wins and losses, such a situation is largely unavoidable. If one wants to neutralize individual player records in order to move beyond team records, however, then, at a seasonal level, one could use the Matchup Formula to adjust for the performance of the other players with whom a particular player shared credits.

Suppose, for example, that a pitcher compiled a Component 1 (basestealing) winning percentage of 0.515 but that the catchers with whom he shared that Component 1 credit compiled an average winning percentage (weighted by the number of Component 1 points which they shared with this particular pitcher) of 0.535.

In such a case, the Matchup Formula can be used to adjust the pitcher’s Component 1 winning percentage. Here, the pitcher’s winning percentage (0.515) would correspond to W

In order to properly adjust both pitchers’ and catchers’ winning percentages in this way, one needs to use an iterative process. That is, one first adjusts pitchers’ winning percentages given the winning percentages of their catchers. One would then need to adjust catchers’ winning percentages given the adjusted winning percentages of their pitchers. Having adjusted the catchers’ winning percentages, however, one would want to re-estimate adjusted winning percentages for pitchers using these new adjusted catcher winning percentages. In principle, this process would continue until neither pitcher nor catcher winning percentages change between iterations. In practice, the process is repeated four times here. These results are used in constructing Context-Neutral, Teammate-Adjusted Player Won-Lost records (eWins and eLosses) as well as to determine the appropriate allocation of Player Game Points across players.

4. Adjusting Player Game Points for the Level of Competition

The final way in which the Matchup Formula could be useful in adjusting Player Game Points would be to adjust Player Game Points based on differences in the average level of competition faced by different players. That is, if two batters compiled identical offensive winning percentages (say 0.510), but one faced pitchers with an average winning percentage above 0.500 (say 0.505) and the other faced pitchers with an average winning percentage below 0.500 (say 0.495), the former batter would actually be a better hitter. The Matchup Formula, in fact, would say that the first batter (0.510 versus 0.505 pitchers) actually accumulated an adjusted winning percentage of 0.515 while the second batter, with a 0.510 winning percentage against 0.495 pitchers, accumulated an adjusted winning percentage of 0.505.
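The two adjusted figures above can be reproduced with the opponent-adjustment formula from the overview, W’ = W•O / [W•O + (1 – W)•(1 – O)], treating the batter's winning percentage as W and the average winning percentage of the pitchers he faced as O. A minimal sketch, with a function name of my own choosing:

```python
def adjust_for_opposition(w: float, o: float) -> float:
    """W' = W*O / [W*O + (1 - W)*(1 - O)]."""
    numerator = w * o
    return numerator / (numerator + (1 - w) * (1 - o))


# Two batters with identical 0.510 offensive winning percentages.
vs_tougher_pitchers = adjust_for_opposition(0.510, 0.505)
vs_weaker_pitchers = adjust_for_opposition(0.510, 0.495)
print(f"{vs_tougher_pitchers:.3f} {vs_weaker_pitchers:.3f}")
```

This prints `0.515 0.505`, matching the adjusted winning percentages quoted above. It covers only a single pass of the adjustment, not the iterative process that a full adjustment would require.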

As with the shared Player Game Points, adjustments of this type would have to be made through an iterative process. As of now, I have not yet made any such adjustments.