Using Pitchf/x data, I looked at the players who frequently swing at bad pitches, and I tried to figure out whether they hurt themselves by trying to hit everything.
Initially I planned to look at every swing at a pitch that was a ball according to the rulebook strike zone; then I decided to do things differently.
Using spatial smoothing, I calculated the probability of a pitch being called a strike, given its location. A previous study I ran on my Italian website, and other research by other authors at THT and elsewhere, had shown that batter handedness influences umpire decisions more than pitcher handedness, so I calculated separate probabilities for RHB and LHB.
A couple of charts will summarize this part.
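For readers who want to experiment, here is a minimal sketch of one way to smooth called-strike probability over pitch location. It uses a simple Gaussian kernel average, which is an assumption: the article does not say which smoothing method was used. The function name, the bandwidth, and the sample pitches are all made up for illustration. To get separate RHB and LHB estimates, you would call it once per group of taken pitches.

```python
import math

def called_strike_prob(px, pz, taken_pitches, bandwidth=0.25):
    """Kernel-weighted share of called strikes near location (px, pz).

    taken_pitches: iterable of (x, z, is_called_strike) for pitches the
    batter did not swing at, all from batters of one handedness.
    bandwidth: Gaussian kernel width in feet (an assumed value, not
    taken from the article).
    """
    num = 0.0
    den = 0.0
    for x, z, is_strike in taken_pitches:
        d2 = (x - px) ** 2 + (z - pz) ** 2
        w = math.exp(-d2 / (2 * bandwidth ** 2))
        num += w * is_strike
        den += w
    # If every weight underflows to zero there is no nearby data.
    return num / den if den > 0 else float("nan")

# Made-up taken pitches against right-handed batters: (x, z, called strike?)
rhb_taken = [
    (0.0, 2.5, 1), (0.1, 2.4, 1), (-0.2, 2.6, 1),
    (1.8, 2.5, 0), (1.9, 2.3, 0), (2.0, 2.7, 0),
]

p_middle = called_strike_prob(0.0, 2.5, rhb_taken)
p_outside = called_strike_prob(1.9, 2.5, rhb_taken)
```

With real data the bandwidth matters a lot: too small and the surface is noisy, too large and the edge of the zone gets blurred.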
I decided to classify a pitch as a "bad ball" when its probability of being called a strike is lower than 10% (the cutoff value is purely subjective, and I'd welcome suggestions for a different choice).
Here are the top ten bad ball swingers:
No Vlad on the list? I must have done something wrong!... No, he's eleventh, just off the table, at 37%.
I compared my full list with the one at FanGraphs and, while they don't coincide, they are quite similar. They don't have to coincide anyway, since FG charts outside zone swing %, while I'm charting "really outside zone swing %".
Here's the bottom of the list.
For this work I selected players who have seen at least 300 bad pitches. I don't know if this choice caused some players who don't swing at bad pitches to be left out.
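The two thresholds mentioned so far (a called-strike probability below 10% and at least 300 bad pitches seen) could be applied along these lines. This is only a sketch: the function name and the input layout, one `(batter, strike_prob, swung)` tuple per pitch, are assumptions made for the example.

```python
from collections import defaultdict

BAD_BALL_CUTOFF = 0.10   # called-strike probability below this is a "bad ball"
MIN_BAD_PITCHES = 300    # minimum bad pitches seen to qualify

def bad_ball_swing_rates(pitches, min_seen=MIN_BAD_PITCHES):
    """Return (batter, swing rate on bad balls) pairs, highest rate first.

    pitches: iterable of (batter, strike_prob, swung), a hypothetical
    layout where strike_prob is the smoothed called-strike probability.
    """
    seen = defaultdict(int)
    swings = defaultdict(int)
    for batter, strike_prob, swung in pitches:
        if strike_prob < BAD_BALL_CUTOFF:
            seen[batter] += 1
            swings[batter] += bool(swung)
    # Keep only batters who saw enough bad pitches.
    rates = {b: swings[b] / n for b, n in seen.items() if n >= min_seen}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
```

Passing a different `min_seen` is an easy way to check whether the 300-pitch cut is hiding any very patient hitters.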
The next thing I investigated is the run value of swings at bad balls.
Many players are notorious bad ball swingers, but they are also feared because they can do a lot of damage even when they chase pitches in the dirt.
In the following table I summed up the run values obtained by hitters when they swung at a bad pitch.
But what if the hitters had let all those pitches go by?
I calculated the net run value for bad pitches, which you will find in the next table, as follows:
- if the batter didn't swing, assign the run value of the pitch (likely the run value of a ball; but if the ump called it a strike, then the run value of a strike);
- if the batter swung, assign the run value of the event minus the expected run value of the pitch had the batter not swung (that is something like 90+% * run value of a ball + 10-% * run value of a strike).
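The two rules above can be sketched as a small function. The run values in the example are placeholders, not the ones used for the tables; in practice the run value of a ball or a strike depends on the count, so they would be looked up per pitch. The function and parameter names are hypothetical.

```python
def expected_take_value(strike_prob, rv_ball, rv_strike):
    """Expected run value of the pitch if the batter does not swing."""
    return (1 - strike_prob) * rv_ball + strike_prob * rv_strike

def net_run_value(swung, strike_prob, rv_ball, rv_strike,
                  rv_event=0.0, called_strike=False):
    """Net run value of one bad pitch, following the two rules above.

    rv_event is the run value of what happened on the swing
    (miss, foul, out, hit, ...); it is ignored when the batter took.
    """
    if not swung:
        # Taken pitch: credit the call the umpire actually made.
        return rv_strike if called_strike else rv_ball
    # Swing: outcome minus what taking the pitch was expected to be worth.
    return rv_event - expected_take_value(strike_prob, rv_ball, rv_strike)

# Placeholder run values, for illustration only.
RV_BALL, RV_STRIKE = 0.05, -0.06

# A swing and miss at a pitch with a 5% chance of being called a strike.
whiff = net_run_value(True, 0.05, RV_BALL, RV_STRIKE, rv_event=RV_STRIKE)
```

Summing this quantity over every bad pitch a batter saw gives a net figure of the kind shown in the table.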
As we see from the table, Gary Sheffield is the only player in MLB with a positive value for his bad pitch swinging (at 29%, his swinging percentage on bad pitches is middle-of-the-pack). Jose Reyes, the worst in this ranking, has lost 32 runs by swinging at balls way out of the zone.
Players like Ryan Howard and Vladimir Guerrero can have a gross production of more than 11 runs when swinging at bad balls, but when you look at what they would have produced had they let those pitches go by you get a net loss of nearly 30 runs.
I think I made you wander too long in the dark by giving just a few top-ten tables. Here's a spreadsheet with all the players that made my cut of 300 bad pitches seen.