Thursday, March 26, 2009

Another one bites the clutch

Many words have been spent on clutch hitting, most of them by people smarter than me.
Still, I would like to share some work I did a few years ago on my Italian website. That work went through some important refinements before giving birth to this post.

At the time I wrote the Italian article, the wonderful FanGraphs didn't have the Clutch stat yet, thus I was looking for some way to measure if (and possibly how much) a particular season by a particular player was clutch.
(Those were the times when A-Rod won the MVP award and many Sox fans - and Yankee haters - cried that Big Papi was more deserving due to his clutch performance).

The following was my line of thinking.

1. Using average Run Values for batting events, I can assign a value to each plate appearance of a particular player. This value measures the outcome of the at bat without taking into account the importance (leverage... more on that in a few moments) of the situation.

2. I also have Leverage, a number that measures the importance of the moment without looking at the outcome of the AB.

Suppose there's a player who is the clutchiest guy on the face of the earth. Whatever his batting line, he gets his greatest production (HRs) in the highest leverage situations and his least production (Ks) in the lowest leverage situations.
Take all the at bats of this player, put the run values in a spreadsheet column, then put the corresponding leverage index in the next column.
Sort the spreadsheet by the run value column, then sort it by the leverage column: for this particular guy the order of the rows doesn't change.

Obviously such a player doesn't exist, but I can use the correlation between run values and leverage index to look at a player's clutchiness.
If a player has a correlation of zero, then he achieved his best and worst results randomly across the leverage spectrum; the closer to one, the more clutch he has been; the closer to minus one, the more of a choker he has been.
But how close to 1 (or -1) can a player get in a season (or a career)?
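The spreadsheet thought experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the run values, their frequencies and the leverage distribution are all invented, and `pearson` is a plain textbook implementation, not the code used for the numbers in this post.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented season of 600 plate appearances (none of these numbers are real).
rng = random.Random(2009)
outcome_values = [-0.29, 0.30, 0.47, 0.77, 1.40]  # out, walk, single, double, homer
outcome_weights = [0.66, 0.10, 0.15, 0.05, 0.04]
run_values = rng.choices(outcome_values, weights=outcome_weights, k=600)
leverage = [rng.lognormvariate(0.0, 0.7) for _ in range(600)]

# Outcomes paired with leverage in the order they happened to be drawn.
as_played = pearson(run_values, leverage)
# Super clutch player: sorting both columns pairs the best outcomes with
# the highest leverage, the largest correlation these values allow.
super_clutch = pearson(sorted(run_values), sorted(leverage))

print(f"as played:    {as_played:+.3f}")
print(f"super clutch: {super_clutch:+.3f}")
```

Because the run values take only a handful of distinct values, even the sorted pairing stays below 1.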

This leads to the first enhancement I have made over my original article. Since at bat run values can only assume a limited set of values, the correlation between RV and LI cannot be exactly one even for the hypothetical super clutch player. Thus, when I show confidence intervals for the correlation, they have been calculated using bootstrapping techniques.
The other improvement has to do with the way Leverage is calculated. At that time I used something like Keith Woolner's formula (Baseball Prospectus 2005); now it's something more like Tango's and Elan Fuld's (Fuld uses the acronym PIOGO for his index, and his work can be found here).

Note: I realize that Tango's numbers for LI are now widely used and I would like to perform my analyses using his values, but I had a table with my own values that was already formatted as I needed. I checked some of my values with the corresponding ones on Tango's website and I didn't find big discrepancies (save for the fact that he chooses to put LI = 1 at the average leveraged situation, while I put LI = 1 at the beginning of the game).

There's a third refinement, that occurred when I was ready to post (thus forcing me to redo all you read from now on).
Run Values are centered at 0 and are additive: a natural scale is the right thing to use.
Leverage Index is a ratio, centered at one; a LI of 0.5 and a LI of 2 are conceptually equidistant from a LI of 1: we need a logarithmic transformation to get the right behaviour from the numbers (e.g. log(1)=0; log(0.5)=-0.69; log(2)=0.69).
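Here is a minimal sketch of how a bootstrap confidence interval for the correlation between run values and log leverage could be computed. It assumes a simple percentile bootstrap that resamples plate appearances with replacement; the post does not say which bootstrap variant was actually used, and the data below are invented.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def bootstrap_ci(run_values, leverage, n_boot=1000, alpha=0.05, seed=0):
    """Return (correlation, lower, upper) for corr(run value, log leverage)."""
    rng = random.Random(seed)
    log_li = [math.log(li) for li in leverage]  # LI is a ratio, so use its log
    n = len(run_values)
    resampled = []
    for _ in range(n_boot):
        # Resample plate appearances with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        resampled.append(pearson([run_values[i] for i in idx],
                                 [log_li[i] for i in idx]))
    resampled.sort()
    lower = resampled[int(n_boot * alpha / 2)]
    upper = resampled[int(n_boot * (1 - alpha / 2)) - 1]
    return pearson(run_values, log_li), lower, upper


# Invented data with no built-in clutch effect (assumption, not real PAs).
rng = random.Random(42)
run_values = rng.choices([-0.29, 0.30, 0.47, 0.77, 1.40],
                         weights=[0.66, 0.10, 0.15, 0.05, 0.04], k=500)
leverage = [rng.lognormvariate(0.0, 0.7) for _ in range(500)]

corr, lower, upper = bootstrap_ci(run_values, leverage)
print(f"correlation {corr:+.3f}, 95% CI {lower:+.3f} to {upper:+.3f}")
```

With a few hundred plate appearances the interval is wide compared with the size of the effects discussed here, which is one way to see why single seasons rarely clear zero.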

Well, let's look at some players.
Fangraphs has Pujols in 2006 as one of the best recent seasons for clutch hitting performance (Clutch = 3.25). Correlation between his RVs and the LI at which he compiled them is 0.05, with a 95% confidence interval of -0.02 - 0.13.
Such a "clutch" season might have occurred by chance, since the confidence interval contains zero.

Here's Big Papi in 2005, the clutch season par excellence (FanGraphs has it at a 3.31 Clutch score).
Correlation: 0.02 (95% confidence interval: -0.06 - 0.09).
We can make a couple of observations on these numbers. Even a very clutch season likely won't result in a correlation significantly different from zero (i.e. one whose 95% confidence interval excludes zero), and should we find one we'd have to look at the second decimal place, at least.

And now, A-Rod (A-Choke?) 2008 (-3.16 by FanGraphs metric).
Correlation: -0.09 (95% confidence interval: -0.18 - 0.004).

It seems really difficult to get a value significantly different from zero. Here we must note one important thing: to get a high correlation, a player should overperform in high leverage situations AND underperform in low leverage situations. Players like Pujols or Ortiz, who always perform near the level of excellence, have little room to overachieve in high leverage situations. For such hitters I believe that a correlation significantly different from zero on the positive side is noteworthy nonetheless, even if we have to look at the second or third decimal place. For the same reason it's no surprise that the closest thing to a significant (though minuscule) effect shows up on the negative side, for a very good hitter like Rodriguez, whose interval only just reaches zero.

In Weaver on Strategy, the Earl of Baltimore talked about Eddie Murray as a guy who always produced when the game was on the line, but tanked a bit in lopsided games. Let's have a look at his run value / leverage index correlation values.

Year - Correlation (95% Confidence Interval)
1977 - 0.004 (-0.09 - 0.07)
1978 - 0.06 (-0.003 - 0.12)
1979 - 0.02 (-0.06 - 0.08)
1980 - 0.04 (-0.03 - 0.11)
1981 - -0.07 (-0.19 - 0.01)
1982 - 0.01 (-0.09 - 0.09)
1983 - -0.05 (-0.14 - 0.02)
1984 - 0.07 (0.01 - 0.14)
1985 - 0.10 (0.03 - 0.16)
1986 - 0.01 (-0.08 - 0.09)
1987 - 0 (-0.08 - 0.06)
1988 - 0 (-0.08 - 0.07)
1989 - 0.01 (-0.07 - 0.08)
1990 - 0.08 (0.01 - 0.14)
1991 - 0.06 (-0.01 - 0.13)
1992 - 0.01 (-0.08 - 0.10)
1993 - -0.02 (-0.10 - 0.04)
1994 - 0.04 (-0.06 - 0.12)
1995 - -0.03 (-0.13 - 0.05)
1996 - -0.003 (-0.08 - 0.06)
1997 - 0.05 (-0.07 - 0.16)

There are a few seasons in which Eddie beat zero, only one of them under Weaver (1985, and the book came out one year earlier).
I repeat: when we find an effect, it is very small (so small that we can doubt it really is an effect). Still, this player, who was perceived as clutch by one of his managers, never had a choke season (defined as one for which the entire 95% CI is below zero).

The next natural step is to look at correlation values for a career.
Since I reported every season for Eddie Murray, here's his career clutch line:
Correlation: 0.02 (95% confidence interval: 0 - 0.03).

Here are the other players I mentioned so far.
Albert Pujols: Correlation: 0.02 (95% confidence interval: -0.01 - 0.04).
David Ortiz: Correlation: 0.03 (95% confidence interval: 0.004 - 0.05).
Alex Rodriguez: Correlation: -0.002 (95% confidence interval: -0.02 - 0.02).

Since we are looking at a very small effect, it's easier to find something statistically significant (if there is any effect) when looking at a career - i.e. at more observations.

I'm tempted to write a few more lines of code in order to calculate correlation values with confidence intervals for every player in the Retrosheet era (for every single season, and for entire careers). Maybe I will do that. Would it be useful, and for what purpose? Surely you can't say that Ortiz's career values are better than Murray's: the respective confidence intervals overlap a lot, so it won't help single out the clutchiest player ever.
I'm curious to see whether there are players who beat the zero correlation year in and year out (I doubt it) or players who show both clutch and choke seasons.

Thursday, March 19, 2009

Toward valuing a hit kept in the infield

Sunday night I was playing with my newly populated Retrosheet Database (thanks a lot Colin Wyers!) and watching Japan vs Cuba.
As I finished calculating run values for various events, Johjima was on second with one out; Iwamura hit a groundball that looked headed to centerfield. Instead second baseman Yulieski Gourriel made a diving grab, turned his body and tried to get the runner at first. Akinori was safe, but the play by Gourriel prevented Johjima from scoring (he undoubtedly would have, had the ball gone through).
Fielding metrics that compare outs made vs expected outs treat such a play as a failure for the infielder; maybe Gourriel wouldn't be charged with a great negative contribution, since the ball was in a zone where not many outs are expected to be made. Anyway I believe the Cuban second baseman might deserve more than a mild negative score on that play.
Using data from the last five years, I calculated that a single that goes to the outfield has a run value of 0.477, while a single that stays in the infield is worth 0.409 runs.
Those values are averaged across all possible base/out situations. In the specific environment in which the play occurred, the difference would be between 0.681 and 0.482.
Let's stick with the average values for singles. This implies that infielders try just as hard to keep the ball in the infield when the bases are empty (can we consider this to be true?).
Let's assume that the ball hit by Iwamura, on average, goes to the outfield 80% of the time, ends up as an infield hit 15% of the time and is converted into an out the remaining 5% of the time (numbers entirely made up on the spot). The expected run value is 0.8 x 0.477 + 0.15 x 0.409 + 0.05 x (-0.288) = 0.429 (-0.288 is the value for the out I got from the 2004-08 data).
Gourriel kept the ball in the infield, for a run value of 0.409, thus we should credit him for preventing 0.02 runs.
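The arithmetic above is small enough to check in a few lines of Python. The run values are the ones quoted in this post; the outcome probabilities are the made-up ones from the example, and the variable names are only illustrative.

```python
# Average run values from the post (2004-08 data, all base/out situations).
run_value = {
    "outfield_single": 0.477,
    "infield_single": 0.409,
    "out": -0.288,
}

# Outcome probabilities for this grounder: invented in the post, not measured.
probability = {
    "outfield_single": 0.80,
    "infield_single": 0.15,
    "out": 0.05,
}

# Expected run value of the batted ball before anyone fields it.
expected_rv = sum(probability[k] * run_value[k] for k in run_value)

# Credit for the fielder: expected value minus the value of what happened.
runs_prevented = expected_rv - run_value["infield_single"]

print(f"expected run value: {expected_rv:.3f}")
print(f"runs prevented:     {runs_prevented:.3f}")
```

Swapping in the situation-specific values (0.681 and 0.482) or measured probabilities only requires changing the two dictionaries.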
I haven't read the second edition of the Fielding Bible, but I suspect that the plus/minus system doesn't properly value a play like this.
I'm pretty sure that David Pinto's PMR (that I prefer to +/-) doesn't deal with this either (PMR is outs made vs expected outs).
It's been a long time since I've read MGL's work on UZR at Baseball Primer, so I don't remember if UZR addresses similar events.
I don't know if factoring this kind of play into a fielder's evaluation has a significant impact on his contribution, but I believe the matter is worth investigating.

Note: the percentages I made up have been chosen by design to give a positive value for Gourriel's play. If we delve into real data and find real probabilities, infield singles will give the fielder negative contribution most of the time (less negative than outfield singles, anyway).

Wednesday, March 11, 2009

Arm 2.0

A few years ago I did some work on right fielders' arms on my Italian website. I won't go into the details of that work because it was very similar to what John Walsh did at THT. Actually, my first article and his first article saw the light on the very same day.
Conceptually our works were very similar: both used the run expectancy matrix to compare the runs prevented (allowed) by an outfielder's arm with the league average.
The main difference between our analyses was that I focused my research on a very limited subset of plays. I considered only singles fielded by right fielders with a man on first and second base empty. This choice, while leaving out a lot of opportunities in which an outfielder's arm is tested, was intended to compare the RFs on equal opportunities. I supposed that singles to right are very similar to one another (i.e. they are all fielded in a small area of the field), while that's not true of doubles (which may go into the gap or down the line) or fly outs (which can be made close to the infield or against the wall).
All the work was done using Retrosheet data.

Enter Gameday data.
In the following lines I'll try to see if my assumption, that singles to right are created (nearly) equal (relative to "arm testing"), is true, and if outfielders' arm evaluations need to (and can) be corrected using batted ball location data.

Again, I take every base/out situation (in my restricted example it's just 0-1-2 outs and man on 1st / men on 1st and 3rd), and calculate the runs prevented (allowed) by an average arm right fielder on a single.
This time I add the location information. Ideally I want to have the average run value at every possible location where a right fielder can collect a single.
Obviously there can't be enough observations to have every single point on the field covered, thus I used loess smoothing.
Very quickly on loess smoothing: you estimate the (run-)value of every point on the field using the run values of points in the neighborhood which have observations - nearer points have more weight on the estimation.
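To make the idea concrete, here is a stripped-down sketch of location-based smoothing. It is a simplification and not the procedure used for the charts: real loess fits a local polynomial (usually linear or quadratic) around each point, while this version only takes a tricube-weighted average of the nearest observations. The coordinates and run values are invented.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at u >= 1."""
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def smooth_at(x, y, observations, span=0.5):
    """Estimate the run value at (x, y) from nearby observations.

    observations: list of (obs_x, obs_y, run_value) tuples.
    span: share of the observations used as the neighbourhood.
    """
    by_distance = sorted(
        (math.hypot(ox - x, oy - y), rv) for ox, oy, rv in observations
    )
    k = max(2, math.ceil(span * len(by_distance)))
    neighbours = by_distance[:k]
    # Bandwidth is the distance to the farthest neighbour in the window.
    bandwidth = neighbours[-1][0] or 1.0
    weights = [tricube(d / bandwidth) for d, _ in neighbours]
    total = sum(weights)
    if total == 0:
        # All neighbours sit exactly at the bandwidth: fall back to the mean.
        return sum(rv for _, rv in neighbours) / len(neighbours)
    return sum(w * rv for w, (_, rv) in zip(weights, neighbours)) / total


# Invented fielding locations (arbitrary units) and run values.
observations = [
    (200, 150, 0.10), (210, 160, 0.12), (230, 140, 0.02),
    (250, 120, -0.05), (260, 110, -0.08),
]
print(round(smooth_at(205, 155, observations, span=0.8), 3))
```

A smaller `span` follows the data more closely and is more sensitive to single observations; a larger one gives a flatter surface.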

The chart above shows how much the expected run value of a single with a man on first is increased (lighter blue) or decreased (darker blue) due to where the ball is collected by the right fielder.
It makes some sense. Maybe I was expecting something like the following (the greater the distance from home plate, and from third, the more likely the runners are to advance extra bases):

But looking back at the previous chart, I think I can give a reasonable explanation to the increasing run value of the areas close to the infield: while the throwing distance is short when you field a ball in those zones, it's likely that you had to run many more yards to get there - and the runners were circling the bases in the meantime.

OK, so for every outfielder I adjust his expected run value according to where he fielded the singles batted at him.
Here's a table containing both unadjusted and adjusted arm run values for the right fielders with at least 30 chances (singles fielded with man on first and second base open) in 2008.

player               N     arm  adj_arm  adjustment
Abreu Bobby         64    0.52     0.70        0.18
Church Ryan         40    0.10     0.39        0.29
Drew J.D.           46    1.12     0.47       -0.65
Dye Jermaine        51   -0.05    -0.22       -0.17
Ethier Andre        42   -0.42    -0.43       -0.02
Francoeur Jeff      43   -0.51    -0.56       -0.05
Fukudome Kosuke     45   -0.87    -0.90       -0.03
Giles Brian         53    1.23     1.24        0.00
Griffey Jr. Ken     31    0.29     0.25       -0.05
Gross Gabe          35   -1.01    -1.00        0.01
Guerrero Vladimir   32    0.85     0.79       -0.06
Guillen Jose        31   -0.68    -0.53        0.15
Gutierrez Franklin  50   -1.28    -1.59       -0.31
Hart Corey          53    0.66    -0.40       -1.06
Hawpe Brad          62   -0.02    -0.31       -0.29
Hermida Jeremy      60   -0.76    -0.74        0.02
Kearns Austin       34   -0.39    -0.52       -0.14
Ludwick Ryan        48   -0.52    -0.59       -0.06
Markakis Nick       60    0.53     0.35       -0.18
Nady Xavier         38   -0.06    -0.32       -0.26
Ordonez Magglio     63    0.95     0.88       -0.07
Pence Hunter        52   -2.30    -2.35       -0.05
Span Denard         32   -0.14    -0.01        0.12
Suzuki Ichiro       32   -1.67    -1.68       -0.00
Teahen Mark         33    1.81     1.68       -0.13
Upton Justin        43    1.59     1.59       -0.01
Winn Randy          43    1.40     0.99       -0.42

In many cases the adjustment is quite small, but there are some exceptions.
J.D. Drew, for example, sees a sizable adjustment (-0.65 runs) when batted ball locations are taken into account: as we see in the following graph, he fielded a couple of singles very far away from home.

Corey Hart has even more balls collected in unusual (for a single) places.

Maybe the adjustment I applied using loess smoothing is way too big, since a couple of outlying observations can change the valuation of a fielder a lot. Anyway, I think the work I've done shows that some correction is due when evaluating outfielders' arms. It's possible that the couple of "long singles" against Drew are the product of some unusual event (the batter slipping while rounding first, or not running because of an injury, or whatever else); without looking at batted ball location data, we assume that those hits are just like any other single and that JD has an average chance of holding the runner at second or gunning him at third. Obviously, that's not the case and, if we don't trust the adjustments I proposed, we should at least consider dropping the two outliers.

As I did three years ago using Retrosheet data, I considered a limited subset of plays that test an outfielder's arm. The subset was intended to consist of very similar batted balls, but we ended up seeing at least some unusual observations. If we want a complete evaluation of the arms, I'm sure the locations will have a higher impact: as I said at the beginning of this post, doubles go in the gap or down the line, flies are caught calling off an infielder or against the wall, and this makes a huge difference in the chances an outfielder has of holding/killing the runner(s).

PS: as I was writing this, Pete Jensen published his normalization of Gameday coordinates at THT. While applying his correction would certainly affect the values of RF arms, I believe that what I wrote in the final paragraphs holds.

Saturday, March 7, 2009


I believe I'm in good company, among stats-oriented baseball fans, in having read Michael Lewis' article on Shane Battier on February 13th.
Michael tells a fascinating story, but numbers don't appear in his writing (and I think they shouldn't). So I was curious and did some checking, thanks to data provided at
It's quick and dirty work, but I think it's worth a look.

The following table shows how the Lakers perform with Kobe on the court, with Kobe off the court and with Kobe on the court against Battier and the Rockets (data from the full 2007/08 season).

Lakers production

                         pts/min  off reb/min  def reb/min  poss/min  pts/poss
with Bryant                 2.29         0.27         0.74      2.01      1.14
w/o Bryant                  2.08         0.29         0.70      1.94      1.07
with Bryant vs. Battier     1.79         0.32         0.70      1.90      0.94

Lakers scoring drops by 0.2 points per minute when Kobe is sitting on the bench. Their possessions per minute drop to some extent too: this may be something the team does intentionally (their best player is off the court, so they slow down the pace of the game to minimize the effect of his absence).
Something else may be going on, too: difficult shots that a superstar can take (and make) are not taken by other players, so the team needs more time to find a shooting opportunity.
The slower pace when Bryant is off the court is not entirely responsible for the drop in points, as points per possession also go down a bit; the fact that LA grabs more rebounds under the opponents' board when their star is out is likely due to a worse shooting percentage.
When Battier is playing against the Lakers, and Bryant is on the court, LA sees a substantial drop, performing even worse than in other games with Bryant sitting on the bench (LA's offensive rebounds rise, perhaps due to lower shooting percentages).
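As a rough check on the pace argument, the per-minute rates in the table can be split into a pace part and an efficiency part. This is one possible decomposition, chosen here for illustration (Bryant-on efficiency played at the Bryant-off pace), and it works from the rounded table values, so the results are approximate.

```python
# Per-minute rates from the "Lakers production" table (2007/08 season).
lakers = {
    "with Bryant": {"pts_per_min": 2.29, "poss_per_min": 2.01},
    "w/o Bryant": {"pts_per_min": 2.08, "poss_per_min": 1.94},
    "with Bryant vs. Battier": {"pts_per_min": 1.79, "poss_per_min": 1.90},
}


def pts_per_poss(row):
    """Points per possession = points per minute / possessions per minute."""
    return row["pts_per_min"] / row["poss_per_min"]


for label, row in lakers.items():
    print(f"{label:<24} {pts_per_poss(row):.2f} pts/poss")

on, off = lakers["with Bryant"], lakers["w/o Bryant"]
# Hypothetical middle step: Bryant-on efficiency at the Bryant-off pace.
slow_pace_pts = off["poss_per_min"] * pts_per_poss(on)
pace_effect = on["pts_per_min"] - slow_pace_pts
efficiency_effect = slow_pace_pts - off["pts_per_min"]
print(f"pace: {pace_effect:.2f}, efficiency: {efficiency_effect:.2f} pts/min")
```

Under this split, roughly 0.08 of the 0.21 points-per-minute drop comes from the slower pace and roughly 0.13 from lower efficiency, in line with the observation that pace alone doesn't explain the drop.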

Here's another comparison table.
It shows the Lakers with Bryant vs the Rockets with Battier, compared to the Lakers with Bryant vs the Rockets without Battier.

Lakers (with Bryant) production VS Rockets

              pts/min  off reb/min  def reb/min  poss/min  pts/poss
w/o Battier      2.02         0.26         0.74      1.97      1.03
with Battier     1.79         0.32         0.70      1.90      0.94

We see that LA (with Kobe playing) scores less against Houston (compared to what they do against the league), but Battier's presence on the court is what seems to bring the Lakers offense down most.
Since basketball has a lot more interaction between players than baseball, we can't conclude that the Lakers scoring less when Shane is playing is entirely Shane's responsibility. Maybe Battier always plays together with a very good defender and they get benched together.

I only looked at scoring (with a quick glance at rebounding) because those were the data readily available at; there's another wonderful source of basketball play-by-play data (, where one can look deeper at the issues I presented.

Friday, March 6, 2009

Predictability - A baseline for future analyses

Here are the most and least predictable pitchers... in theory.
What do I mean?
I set the conditions in my previous posts (I - II).
Every MLB pitcher I considered has his own repertoire and his own pitch mix.

The Minimum Level of Predictability of a pitcher is the percentage of correct guesses that can be made on his selection... if he chooses his pitches without being at all influenced by the situation (opponent, score, outs, inning, men on, etc.)

High Theoretical Predictability
pitcher MLP repertoire
Tim Wakefield 99.7% KN-FB-CB
Grant Balfour 86.2% FB-SL-CH-CB
Jonathan Papelbon 78.7% FB-SL-CH-SF
Neal Cotts 76.7% FB-SL-CH-CB
Matt Thornton 76.4% FB-SL-CT-CH
Brad Ziegler 75.2% FB-CB-SL-CH
Dennis Sarfate 69.7% FB-CB-SL
David Riske 69.5% FB-CH-SL
Joe Beimel 69.4% FB-CB-CH-SL
Daniel Cabrera 68.9% FB-SL-CH-CB

Low Theoretical Predictability
pitcher MLP repertoire
Andy Sonnanstine 24.1% FB-CT-CB-SL-CH
R.A. Dickey 24.2% KN-FB-CB-CH-SL-SF
Jorge Campillo 26.9% FB-SL-CH-CB-CT
Shaun Marcum 27.0% FB-SL-CH-CB-CT-SF
Bronson Arroyo 27.2% FB-CB-CH-SL-CT
Carlos Villanueva 27.2% FB-SL-CH-CB
Mike Mussina 28.1% FB-CB-SL-CH-SF
Doug Davis 28.7% CT-FB-CB-CH-SL
Lance Cormier 29.2% CB-SL-FB-CT-CH
Jered Weaver 29.7% FB-CH-SL-CT-CB

Some observations
The pitch repertoire is what comes out of MLBAM Gameday (not necessarily correct).

I showed the top-10 and bottom-10 because a lot of people like to see ranking tables, but I think in this case they don't make a lot of sense. Anyway, a quick look at them can't hurt.

First of all, my dataset has Wakefield throwing nearly 100% knuckleballs (thus yielding the highest predictability value); I don't know whether I did something wrong in importing the pitchf/x data or the MLBAM classification algorithm marks every offering by Tim as a knuckler. Anyway, Fangraphs (BIS data) has his split at 81% KN, 13% FB, 6% CB, and ESPN Insider at 82% KN, 13% FB, 5% CB (using these sources I would get predictability values of 67.7% and 69.2%, respectively).
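The two alternative values for Wakefield can be reproduced directly from the quoted splits. This sketch assumes, as in the earlier posts of this series, that the hitter guesses with the same proportions the pitcher throws, so the Minimum Level of Predictability is the sum of the squared pitch shares; the function name is just illustrative.

```python
def minimum_level_of_predictability(mix):
    """Sum of squared pitch shares; `mix` maps pitch type -> share (any scale)."""
    total = sum(mix.values())
    return sum((share / total) ** 2 for share in mix.values())


# Wakefield's pitch splits as quoted above (percentages).
fangraphs_bis = {"KN": 81, "FB": 13, "CB": 6}
espn_insider = {"KN": 82, "FB": 13, "CB": 5}

for source, mix in [("Fangraphs (BIS)", fangraphs_bis),
                    ("ESPN Insider", espn_insider)]:
    print(f"{source}: {100 * minimum_level_of_predictability(mix):.1f}%")
```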

Wakefield leads to another point: knuckleballs have a very high degree of variability, so knowing that the pitch coming at you is called a knuckleball won't tell you anything about the path the ball will travel.

You can make a similar case for Mariano's cutter (Mo's out of the top ten list): it's not really a single pitch, because he varies the position of his fingers to obtain different cuts.

Analyzing how predictable Wakefield or Rivera are in their selection is pretty useless, as it is for guys like Balfour who throw (nearly) nothing but fastballs: hitters already know what is going to come at them. Balfour, Wakefield and Mo don't live on surprising opponents.

Further analyses on this subject can be more useful if done on pitchers with, say, three very distinguishable pitches (e.g. Fastball, Slider, Changeup) none thrown more than 3/4 of the time, and that's what I'll try to do in the future.

Now you might ask what all this effort in getting theoretical predictabilities is for.

I'd like to look at individual pitchers' predictabilities using advanced statistical techniques, such as multinomial logistic regression: I would like to build models to help me predict what a pitcher will throw in a specific situation (e.g. against a right hander, late in a close game, with a runner on first, nobody out and following a slider); but I needed to set a minimum usefulness for the models... if a model I build performs worse than the Minimum Level of Predictability, then it is useless.
On the other hand, if a model for a pitcher significantly outperforms the Minimum Level of Predictability, then maybe the pitcher is falling into patterns, and this fact could be exploited by the opposition.

As I said, I'll tackle the models in the future.

Tuesday, March 3, 2009

Predictability (Play Ball!)

I found the answer to the predictability question from the last post with basic probability.
In such a hypothetical situation, what the pitcher is going to throw and what the hitter thinks is coming at him are independent events.

Thus we have four scenarios (the following example is for Pitcher B - the one who throws 90% Fastball, 10% slider).
  1. Pitcher B is going to throw a fastball (probability = 90%, or P(PF) = .9) and Hitter C is looking for a fastball (P(HF) = .9): the probability of this scenario is P(PF) * P(HF) = .9 * .9 = .81;
  2. Pitcher B is going to throw a fastball (P(PF) = .9) and Hitter C is looking for a slider (P(HS) = .1): the probability of this scenario is P(PF) * P(HS) = .9 * .1 = .09;
  3. Pitcher B is going to throw a slider (P(PS) = .1) and Hitter C is looking for a fastball (P(HF) = .9): the probability of this scenario is P(PS) * P(HF) = .1 * .9 = .09;
  4. Pitcher B is going to throw a slider (P(PS) = .1) and Hitter C is looking for a slider (P(HS) = .1): the probability of this scenario is P(PS) * P(HS) = .1 * .1 = .01;

The probability of a correct guess is the sum of the probabilities for first and last scenarios, i.e. 0.82.

You can be more general and calculate the Minimum Level of Predictability given any number of pitch types a pitcher has in his toolbox, any way he mixes them, and the information the hitter has on him (that might not be accurate).
It's as simple as:

Prob pitcher throws fastball * Prob hitter looks for fastball
+ Prob pitcher throws slider * Prob hitter looks for slider
+ Prob pitcher throws curve * Prob hitter looks for curve
+ and so on...
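The sum above translates into a one-line function. In this sketch the names `guess_rate`, `pitcher_mix` and `hitter_mix` are illustrative, and the "stale report" numbers are invented to show what happens when the hitter's information is not accurate.

```python
def guess_rate(pitcher_mix, hitter_mix):
    """Probability that an independent guess matches the pitch thrown.

    Both arguments map pitch type -> probability. A pitch type missing
    from the hitter's mix is treated as never guessed.
    """
    return sum(p * hitter_mix.get(pitch, 0.0)
               for pitch, p in pitcher_mix.items())


pitcher_b = {"FB": 0.9, "SL": 0.1}

# Hitter with an accurate scouting report: same proportions as the pitcher.
accurate = guess_rate(pitcher_b, pitcher_b)

# Hitter with an outdated report (invented numbers): expects 70-30.
stale_report = {"FB": 0.7, "SL": 0.3}
inaccurate = guess_rate(pitcher_b, stale_report)

print(f"accurate report: {accurate:.2f}")
print(f"stale report:    {inaccurate:.2f}")
```

When the two mixes are identical the result reduces to the sum of the squared pitch shares, which is the Minimum Level of Predictability.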

Obviously the Minimum Level of Predictability is lower for pitchers who
  • have more pitches in their repertoire
  • and use them in equal proportions.
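Why equal proportions give the lowest value can be shown in two lines. Writing p_i for the share of pitch i out of n pitch types (symbols introduced here for convenience) and assuming the hitter guesses with the same shares, the Cauchy-Schwarz inequality gives a floor of 1/n:

```latex
\mathrm{MLP} = \sum_{i=1}^{n} p_i^2, \qquad \sum_{i=1}^{n} p_i = 1

1 = \Bigl(\sum_{i=1}^{n} p_i \cdot 1\Bigr)^{2}
  \le \Bigl(\sum_{i=1}^{n} p_i^{2}\Bigr)\Bigl(\sum_{i=1}^{n} 1^{2}\Bigr)
  = n \cdot \mathrm{MLP}
\quad\Longrightarrow\quad
\mathrm{MLP} \ge \frac{1}{n}
```

Equality holds only when every p_i equals 1/n, so the floor is 50% with two pitches, 33.3% with three, 25% with four, and so on; a larger repertoire lowers the floor, but only an even mix reaches it.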

Here are some combinations coming out of my mind as examples:

# pitches in repertoire   selection percentages   MLP
2                         50-50                   50%
2                         90-10                   82%
3                         90-5-5                  81.5%
3                         50-30-20                38%
3                         34-33-33                33%
4                         50-30-15-5              36.5%
4                         90-5-3-2                81.4%
4                         25-25-25-25             25%

I think that's enough theory (well, not so fast... look at the note below). Next time I'll write some real pitcher names.

If you think that Pitcher B is never going to change his mix (90% FB, 10% SL), the guessing hitter will have more success if he always guesses fastball (he'll be correct 90% of the time instead of 82%).
Obviously, in such a case, after some time the pitcher will stop throwing fastballs to that hitter at all; then the hitter will adjust and look only for sliders. After some back and forth adjustments the pair will reach an equilibrium. Ideally that would be at 50-50, since it's the most unpredictable combo; but the pitcher, as we said in our example, might not have equal confidence in both pitches and/or one pitch might be more stressful for his body; thus he will reach a different equilibrium (90-10 in our example).
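The 82% vs 90% comparison for a pitcher who never changes his 90-10 mix can also be checked by simulation. This is a small Monte Carlo sketch with illustrative function names and an arbitrary seed; it only covers the fixed-mix case, not the back and forth adjustments described above.

```python
import random

PITCHER_MIX = {"FB": 0.9, "SL": 0.1}
PITCHES = list(PITCHER_MIX)
WEIGHTS = list(PITCHER_MIX.values())


def roulette_guess(rng):
    """Guess in the same proportions the pitcher throws (mental roulette)."""
    return rng.choices(PITCHES, weights=WEIGHTS)[0]


def always_fastball(rng):
    """Always look for the most frequent pitch."""
    return "FB"


def simulate_guessing(guess, n_pitches=100_000, seed=1):
    """Share of pitches on which the hitter's guess matches the pitch thrown."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_pitches):
        thrown = rng.choices(PITCHES, weights=WEIGHTS)[0]
        if guess(rng) == thrown:
            correct += 1
    return correct / n_pitches


roulette_rate = simulate_guessing(roulette_guess)
fastball_rate = simulate_guessing(always_fastball)
print(f"roulette guesser: {roulette_rate:.3f}")
print(f"always fastball:  {fastball_rate:.3f}")
```

The simulated rates should land close to 0.82 and 0.90, with small differences due to sampling noise.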

Monday, March 2, 2009

Predictability (ceremonial first pitch)

Can we measure how much a pitcher is predictable in choosing the type of pitch he will deliver?
Today I will introduce the Minimum Level of Predictability.

Let's say Pitcher A has two pitches in his arsenal, e.g. a fastball and a slider. He is so confident in both pitches that he throws the fastball 50% of the time and the slider the other 50%. Suppose also that he is perfectly random in his selection: no matter the count, the on base situation, the score, the hitter, the last pitch he has thrown, every delivery is just like a coin toss.

Pitcher B has a fastball and a slider too, but he chooses the hard one 90% of the time. Though his predilection for the fastball is huge, he doesn't care about the situation either: whatever is happening (or has just happened) on the field, his selection will be absolutely random (though heavily unbalanced).

Now we have Hitter C, who is an extreme guess hitter. Every time he's about to face a pitch he spins a roulette wheel in his head to decide what pitch to look for. He is not entirely devoted to Lady Luck: he talks a lot with the advance scouts and takes notes.
The scouts have told him that Pitcher A throws fastball-slider in a 50-50 proportion and Pitcher B throws the same combination but with a 90-10 ratio.
He sets up his mental roulette wheel accordingly.

How many times will he correctly guess offerings from Pitcher A and from Pitcher B?
Short answer: 50% correct against Pitcher A, 82% correct against Pitcher B.
Long answer: ...see you tomorrow night!