Saturday, July 4, 2009

Fastball vs Change-up, featuring Johan Santana

Here is my latest effort at THT.
Master of Fooling featuring Johan Santana and his feared change-up... and a few nice 3D charts.

Friday, June 19, 2009

Friday, June 5, 2009

The best locations for each pitch

A followup to my first article at The Hardball Times.
This time both horizontal and vertical locations are taken into account (but not together yet).

Meanwhile, in case you missed it, the previous article generated a lot of discussion at The Book Blog.

Saturday, May 23, 2009

"Weighted" shift

My third article at The Hardball Times deals with defensive alignment too.
Some of the issues that arose from my first post here are taken into account.
There is already some discussion going on at The Book's Blog.

Sunday, May 10, 2009

On the shift again

Here's my second article at THT.
I'm looking at defensive alignment one more time. This time I try to see if moving slightly according to the pitch selection may help the fielders.

Friday, April 24, 2009

Painting the corners... at THT!

My plan was to analyze the table I posted a couple of weeks ago: I would still like to figure out which pitchers induce swings on balls way off the zone.

I changed plans because I got the call from The Hardball Times.
Here you can find my first article for them.
I should be writing there every other Thursday.

Please come back here now and then, maybe I will add something here too.

Saturday, April 18, 2009

The year of The Bird

I like doing statistical analysis and producing charts, so this is my tribute to Mark Fidrych.


Click on the chart to zoom in.
A red cross (plus day of the week) is placed on games started by The Bird. Information on attendance is not available for a couple of his starts at the end of the season.
Thanks a lot to Retrosheet for providing the data.

Monday, April 13, 2009

Foolers - The other side of the coin

This will be a post consisting of three tables and very little commentary.

In my next to last post I looked at batters swinging at really bad pitches. Here I will present the numbers for the pitchers.

Here are the ten pitchers with the highest percentage of swings induced on balls way out of the zone (probability of being called a strike lower than 10%).

LAST FIRST n swung pct
Johnson Josh 309 162 52%
Volstad Chris 334 160 48%
Hampton Mike 338 150 44%
Balester Collin 329 138 42%
Nippert Dustin 300 124 41%
Liriano Francisco 314 128 41%
Davies Kyle 506 197 39%
Lackey John 556 209 38%
Baker Scott 575 216 38%
Sabathia C.C. 880 329 37%

And here are the bottom ten.

LAST FIRST n swung pct
Gorzelanny Tom 444 74 17%
McGowan Dustin 385 60 16%
Penny Brad 376 57 15%
Miller Andrew 451 64 14%
Chacon Shawn 389 53 14%
Eaton Adam 440 55 12%
Wang Chien-Ming 343 41 12%
Glavine Tom 372 44 12%
Bedard Erik 304 31 10%
Dumatrait Phil 362 31 9%

I have yet to look carefully at the tables.
At first glance I haven't been able to spot any pattern in the rankings, but here is the full table (2008 pitchers with at least 300 bad pitches thrown), in case you can find anything quicker than I can.

LAST FIRST n swung pct
Johnson Josh 309 162 52%
Volstad Chris 334 160 48%
Hampton Mike 338 150 44%
Balester Collin 329 138 42%
Nippert Dustin 300 124 41%
Liriano Francisco 314 128 41%
Davies Kyle 506 197 39%
Lackey John 556 209 38%
Baker Scott 575 216 38%
Sabathia C.C. 880 329 37%
Street Huston 363 135 37%
Kazmir Scott 743 276 37%
Sowers Jeremy 447 162 36%
Miner Zach 583 206 35%
Halladay Roy 798 281 35%
Campillo Jorge 674 237 35%
Webb Brandon 815 286 35%
Correia Kevin 481 168 35%
Martinez Pedro 502 174 35%
Rusch Glendon 302 104 34%
Garza Matt 814 278 34%
Morton Charlie 329 112 34%
Wellemeyer Todd 738 251 34%
Shields James 730 248 34%
Lidge Brad 393 133 34%
Kuroda Hiroki 751 254 34%
Harden Rich 620 209 34%
Liz Radhames 348 117 34%
Perez Odalis 692 232 34%
Buehrle Mark 815 273 33%
Moehler Brian 463 155 33%
Baek Cha Seung 506 169 33%
Fogg Josh 366 122 33%
Hamels Cole 759 251 33%
Perkins Glen 403 133 33%
Slowey Kevin 449 148 33%
Dempster Ryan 816 267 33%
Sonnanstine Andy 575 188 33%
Lester Jon 784 256 33%
Perez Oliver 810 264 33%
Kershaw Clayton 415 135 33%
Nolasco Ricky 637 207 32%
Madson Ryan 317 103 32%
Young Chris 406 131 32%
de la Rosa Jorge 575 184 32%
Lincecum Tim 791 253 32%
Oswalt Roy 626 200 32%
Pineiro Joel 456 145 32%
Broxton Jonathan 302 96 32%
Haren Dan 746 237 32%
Hernandez Felix 656 208 32%
Galarraga Armando 696 220 32%
Peavy Jake 792 250 32%
Santana Johan 720 226 31%
Marcum Shaun 615 193 31%
McClellan Kyle 306 96 31%
Blackburn Nick 552 173 31%
Santana Ervin 812 254 31%
Greinke Zack 716 222 31%
Ramirez Ramon 313 97 31%
Beckett Josh 619 191 31%
Moyer Jamie 869 266 31%
Durbin Chad 407 124 30%
Bergmann Jason 572 174 30%
Maddux Greg 503 153 30%
Billingsley Chad 880 267 30%
Danks John 612 185 30%
Matsuzaka Daisuke 748 226 30%
Lowe Derek 950 287 30%
Johnson Randy 649 196 30%
Pettitte Andy 792 239 30%
Wolf Randy 803 242 30%
Franklin Ryan 342 103 30%
Sheets Ben 646 194 30%
Byrd Paul 483 145 30%
Maholm Paul 644 193 30%
Olsen Scott 596 178 30%
Vazquez Javier 757 226 30%
Garland Jon 717 213 30%
Carmona Fausto 465 138 30%
Cain Matt 738 219 30%
Looper Braden 803 238 30%
Bonser Boof 413 122 30%
Masterson Justin 404 119 29%
Lee Cliff 520 153 29%
Burnett A.J. 915 269 29%
Pelfrey Mike 837 246 29%
Jurrjens Jair 827 243 29%
Rowland-Smith Ryan 409 120 29%
Cueto Johnny 805 236 29%
Jackson Edwin 739 216 29%
Parra Manny 625 182 29%
Floyd Gavin 798 232 29%
Park Chan Ho 394 114 29%
Lannan John 730 211 29%
Lilly Ted 564 163 29%
Robertson Nate 583 168 29%
Olson Garrett 552 159 29%
Wainwright Adam 462 133 29%
Grilli Jason 351 101 29%
Litsch Jesse 602 173 29%
Affeldt Jeremy 348 100 29%
Chamberlain Joba 470 135 29%
Harang Aaron 670 192 29%
Cormier Lance 311 89 29%
Marquis Jason 664 190 29%
Dickey R.A. 406 116 29%
Davis Doug 649 185 29%
Rodriguez Wandy 509 145 28%
Saunders Joe 720 205 28%
Gallagher Sean 528 150 28%
Bush Dave 695 197 28%
Backe Brandon 682 193 28%
Weathers David 364 103 28%
Sampson Chris 354 100 28%
Jimenez Ubaldo 812 229 28%
Snell Ian 649 183 28%
Cordero Francisco 401 113 28%
Rasner Darrell 401 113 28%
Howell J.P. 396 111 28%
Mussina Mike 655 183 28%
Villanueva Carlos 445 124 28%
Delcarmen Manny 316 88 28%
Ponson Sidney 504 140 28%
Lohse Kyle 749 208 28%
Zambrano Carlos 729 202 28%
Green Sean 307 85 28%
Weaver Jered 695 192 28%
Heilman Aaron 375 103 27%
Smith Greg 883 242 27%
Verlander Justin 831 227 27%
Bannister Brian 719 196 27%
Bell Heath 305 83 27%
Hammel Jason 324 88 27%
Banks Josh 321 87 27%
Blanton Joe 838 226 27%
Myers Brett 802 216 27%
Volquez Edinson 844 227 27%
Zito Barry 819 220 27%
Rogers Kenny 686 184 27%
Romero J.C. 358 96 27%
Duke Zach 567 152 27%
Lincoln Mike 342 91 27%
Guthrie Jeremy 654 174 27%
Gaudin Chad 340 90 26%
Arroyo Bronson 811 214 26%
Sarfate Dennis 430 113 26%
Redding Tim 709 186 26%
Suppan Jeff 744 193 26%
Millwood Kevin 541 140 26%
Guerrier Matt 369 95 26%
Hanrahan Joel 402 103 26%
Burres Brian 523 134 26%
Meche Gil 855 217 25%
Bass Brian 332 84 25%
Torres Salomon 329 83 25%
Cook Aaron 580 146 25%
Wright Jamey 330 83 25%
Yates Tyler 327 82 25%
Grabow John 329 82 25%
Hochevar Luke 514 128 25%
Cabrera Daniel 754 186 25%
Feldman Scott 532 130 24%
Padilla Vicente 621 150 24%
Sanchez Jonathan 647 156 24%
Buchholz Clay 307 74 24%
Silva Carlos 561 135 24%
Duchscherer Justin 488 117 24%
Kendrick Kyle 627 150 24%
Marmol Carlos 368 88 24%
Reyes Jo-Jo 470 112 24%
Rupe Josh 389 92 24%
Hernandez Livan 747 175 23%
Francis Jeff 631 147 23%
Rivera Saul 391 91 23%
Hendrickson Mark 482 111 23%
Washburn Jarrod 666 153 23%
Wakefield Tim 565 129 23%
Maine John 597 133 22%
Eveland Dana 686 146 21%
Batista Miguel 530 111 21%
Ledezma Wilfredo 309 64 21%
McClung Seth 433 87 20%
Bennett Jeff 404 79 20%
Hudson Tim 569 111 20%
Gregg Kevin 352 67 19%
Owings Micah 405 72 18%
Pinto Renyel 312 55 18%
Laffey Aaron 358 62 17%
Bonderman Jeremy 334 57 17%
Contreras Jose 468 78 17%
Gorzelanny Tom 444 74 17%
McGowan Dustin 385 60 16%
Penny Brad 376 57 15%
Miller Andrew 451 64 14%
Chacon Shawn 389 53 14%
Eaton Adam 440 55 12%
Wang Chien-Ming 343 41 12%
Glavine Tom 372 44 12%
Bedard Erik 304 31 10%
Dumatrait Phil 362 31 9%

Monday, April 6, 2009

Hitting where the ball is pitched

I play amateur baseball. I'm a second baseman with no power who makes a lot of contact. Since I rarely hit a ball past the outfielders, I always try to hit 'em where they ain't, to spray the ball to all fields, to hit it where it's pitched... pull the inside pitch, place the outside one in the opposite field.

Big leaguers have power.
Some of them are known to pull every ball, others seem to place the ball according to where it is pitched.
In the next paragraphs we will look at this issue.

I looked for a number to quantify the amount to which a hitter responds to the location of a pitch.

Here's what I've done.

Using Gameday hit locations I calculated the trajectory of every batted ball, ranging from -45° (down the LF line) to 45° (down the RF line); then I rescaled the trajectory value so that it ranges from -1 to 1 (this made the following calculations easier).


Note: the trajectory of some hits, e.g. ground balls that leave the infield inside the bag and then roll into foul territory, actually exceeds the -45°/+45° boundary, but I coded those hits as having a trajectory of (-)45°.
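
The trajectory step can be sketched in a few lines of Python. This is only an illustration: the home-plate pixel coordinates below are placeholders (the values actually used are not given here), the function name `spray_value` is made up, and the y axis is assumed to grow downward as in image coordinates.

```python
import math

# Hypothetical home-plate position in Gameday pixel coordinates.
# These are placeholder values, not numbers taken from the post.
HOME_X, HOME_Y = 125.0, 199.0

def spray_value(hit_x, hit_y, home_x=HOME_X, home_y=HOME_Y):
    """Return the batted-ball direction rescaled to [-1, 1].

    -1 is the left-field line (-45 degrees), +1 the right-field line.
    Assumes y grows downward, as in image coordinates.
    """
    dx = hit_x - home_x          # positive toward right field
    dy = home_y - hit_y          # positive toward center field
    angle = math.degrees(math.atan2(dx, dy))
    # Balls that end up beyond the foul lines are coded as +/-45 degrees.
    angle = max(-45.0, min(45.0, angle))
    return angle / 45.0
```

With these placeholder coordinates, a ball fielded straight up the middle gets a value of 0, and anything that ends up beyond a foul line is capped at -1 or +1.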


Then, from PITCHf/x, I took the horizontal coordinate of the pitches resulting in balls in play. Again I rescaled the values so that they are bounded between -1 (inside for a RHB) and +1.


Note: I did not assign the value 1 to the outermost ball (for a RHB) in my dataset, because I had some severe outliers; instead I rescaled the 95th percentile coordinate to 1. Same thing for -1 and LHBs.
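
One way to do a percentile-based rescaling like this is sketched below. The details are assumptions made for illustration: the helper names are invented, zero is kept at the center of the plate, both tails use the same percentile, values beyond the bounds are clipped, and the sample is assumed to contain pitches on both sides of the plate.

```python
def percentile(values, pct):
    """Linear-interpolation percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    frac = pos - low
    return ordered[low] + (ordered[high] - ordered[low]) * frac

def rescale_locations(px_values, tail_pct=95.0):
    """Map horizontal pitch locations to [-1, 1], clipping the outliers.

    The tail_pct percentile maps to +1 and the (100 - tail_pct) percentile
    to -1; zero stays at the center of the plate.
    """
    outer = percentile(px_values, tail_pct)          # positive-side bound
    inner = percentile(px_values, 100.0 - tail_pct)  # negative-side bound
    scaled = []
    for px in px_values:
        bound = outer if px >= 0 else -inner
        scaled.append(max(-1.0, min(1.0, px / bound)))
    return scaled
```

Clipping keeps a handful of wild pitches from stretching the scale for everything else, which is the point of using a percentile instead of the maximum.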


With this pair of values for every ball in play I have all I need to get a value for hitting 'em where they're pitched.
I just need to calculate, for all the hitters, the correlation between the trajectory values and the location values: high correlation (near 1) means the hitter has a tendency to pull inside pitches and hit outside pitches the other way; zero correlation means that where the ball is put into play by the hitter is absolutely unrelated to where the pitch was located; inverse correlation (near -1) would mean the hitter hits... inside-out and outside-in.
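
For readers who want to reproduce the calculation, here is a minimal sketch of a per-hitter correlation. It also includes an interval estimate based on the Fisher z-transformation; that is one common way to put limits around a correlation and is an assumption here, not necessarily how the intervals in the tables were obtained. The function names are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_interval(r, n, z_crit=1.96):
    """Approximate 95% interval for a correlation r from n pairs (n > 3)."""
    z = math.atanh(r)                       # Fisher z-transformation
    half_width = z_crit / math.sqrt(n - 3)  # standard error is 1/sqrt(n-3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

Here `xs` would be a hitter's rescaled pitch locations and `ys` the matching rescaled trajectories. The interval narrows as the number of balls in play grows, so two hitters with the same correlation can have limits of different widths.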

Here are the top ten and bottom ten lists. I reported the correlation coefficient (rho) and the 95% confidence interval (lcl = lower confidence limit, hcl = higher confidence limit).
At the end of the post I'll give you a table containing all the players with at least 200 balls in play in 2008.

Top ten

LAST FIRST rho lcl hcl
Mientkiewicz Doug 0.42 0.31 0.52
Byrd Marlon 0.42 0.32 0.50
Morneau Justin 0.41 0.33 0.48
Dobbs Greg 0.40 0.27 0.51
Roberts Brian 0.40 0.32 0.47
Payton Jay 0.39 0.29 0.49
Fukudome Kosuke 0.39 0.30 0.47
Johjima Kenji 0.39 0.30 0.48
DeJesus David 0.38 0.30 0.46
Lopez Jose 0.38 0.31 0.45

Bottom ten
LAST FIRST rho lcl hcl
Willingham Josh 0.08 -0.04 0.20
Kendrick Howie 0.08 -0.04 0.19
Martinez Victor 0.07 -0.07 0.20
Rodriguez Alex 0.06 -0.04 0.16
Beltran Carlos 0.05 -0.03 0.14
Hall Bill 0.05 -0.06 0.17
Aybar Willy 0.04 -0.07 0.15
Berroa Angel 0.04 -0.11 0.18
Rollins Jimmy 0.01 -0.08 0.09
Hawpe Brad 0.00 -0.10 0.11

OK, nobody hits the "wrong way".

I wondered if hitting the ball the "right way" improves the outcome of the batted ball.
A first (expected) answer is positive, since no player in the Majors hits inside-out and outside-in.
But I wanted some more.

I divided the batted balls into four quadrants, according to the sign of the trajectory and of the location values. Thus a pitch thrown outside to a RHB (location value approaching 1) hit to right field (trajectory value approaching 1) is in the first quadrant; similarly a ball with negative location value and positive trajectory value is in the second quadrant, negative-negative is in the third and positive-negative is in the fourth.
Actually I removed a bunch of batted balls and put them in a fifth box; they are the balls with either the location or the trajectory value between -0.2 and 0.2.
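
The quadrant assignment and the averaging that follows can be sketched like this. The function names are invented for illustration, and whether the ±0.2 boundary itself belongs to the fifth box is a guess.

```python
from collections import defaultdict

def quadrant(location, trajectory, dead_zone=0.2):
    """Return 1-4 for the quadrant, or 5 for the 'middle' box.

    A ball goes to box 5 when either value lies strictly inside
    (-dead_zone, dead_zone); the boundary handling is an assumption.
    """
    if abs(location) < dead_zone or abs(trajectory) < dead_zone:
        return 5
    if location > 0:
        return 1 if trajectory > 0 else 4
    return 2 if trajectory > 0 else 3

def mean_run_value_by_box(balls):
    """balls: iterable of (location, trajectory, run_value) tuples."""
    totals = defaultdict(lambda: [0.0, 0])   # box -> [sum, count]
    for location, trajectory, run_value in balls:
        box = totals[quadrant(location, trajectory)]
        box[0] += run_value
        box[1] += 1
    return {q: total / count for q, (total, count) in totals.items()}
```

Running `mean_run_value_by_box` separately on the balls hit by RHBs and by LHBs gives one table per batting side.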

Here are the average run values for balls in play by quadrant (quadrant numbers in the corners in Roman numerals).


As we would expect in the hypothesis that hitting the ball where it's pitched has a positive effect, we see higher values for the first and third quadrant: those are the boxes representing outside pitches hit the other way and inside pitches pulled by RHBs, respectively (those boxes also contain, respectively, pulled inside pitches and "pushed" outside pitches by LHBs).
Let's split RHBs and LHBs.

RHBs

LHBs



In accordance with common sense we get higher values in the third quadrant for RHBs and in the first quadrant for LHBs: those are their pull quadrants for inside pitches.
The relative differences among quadrants also make some sense: an inside pitch pulled by a LHB can be snatched by one of the right-side infielders and converted into an out, while on a ball pulled by a RHB, the shortstop (or third baseman) who makes a diving stop on a hard roller still has a lot to do to get the out. The opposite reasoning goes for opposite-field batted balls on outside pitches (hence the larger gap between the 1st and 3rd quadrants for RHBs).

There's still something that might be polluting my numbers. Maybe the first and third quadrants contain a lot of balls hit by pull hitters who wait for the inside pitch. So we probably have some bias: those two quadrants may simply contain many balls hit by the best batters.

I redid the calculations.
For every batted ball I subtracted, from the outcome run value, the hitter's (the one who put the ball in play) average run value on batted balls. Now the values in the quadrants should be "cleaner".
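
The adjustment amounts to centering each batted ball on its hitter's own average. A minimal sketch, with a made-up function name and input layout:

```python
from collections import defaultdict

def center_by_hitter(batted_balls):
    """Subtract each hitter's mean run value from his batted balls.

    batted_balls: list of (hitter_id, run_value) pairs.
    Returns the adjusted run values in the same order as the input.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hitter, run_value in batted_balls:
        totals[hitter] += run_value
        counts[hitter] += 1
    return [rv - totals[h] / counts[h] for h, rv in batted_balls]
```

After this step a good hitter and a weak hitter both average zero, so a quadrant can no longer look good just because better hitters land there more often.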

RHBs

LHBs


Again we see the effects described before.

I don't think there's anything groundbreaking in this post. I just wanted to quantify a couple of things that are fairly well known.


Note: I haven't considered the vertical coordinate of the pitch location at all. I acknowledge this can be an issue.


Here is the full table I promised before.

LAST FIRST rho lcl hcl
Mientkiewicz Doug 0.42 0.31 0.52
Byrd Marlon 0.42 0.32 0.50
Morneau Justin 0.41 0.33 0.48
Dobbs Greg 0.40 0.27 0.51
Roberts Brian 0.40 0.32 0.47
Payton Jay 0.39 0.29 0.49
Fukudome Kosuke 0.39 0.30 0.47
Johjima Kenji 0.39 0.30 0.48
DeJesus David 0.38 0.30 0.46
Lopez Jose 0.38 0.31 0.45
Polanco Placido 0.38 0.30 0.45
Blanco Gregor 0.38 0.28 0.47
Dellucci David 0.37 0.26 0.48
Matsui Kazuo 0.37 0.27 0.46
Scutaro Marco 0.37 0.28 0.45
Gutierrez Franklin 0.37 0.26 0.46
Torrealba Yorvit 0.36 0.23 0.48
Cairo Miguel 0.36 0.23 0.48
Helton Todd 0.36 0.24 0.47
Jackson Conor 0.36 0.28 0.44
Ordonez Magglio 0.36 0.28 0.44
Nady Xavier 0.36 0.27 0.44
Lee Carlos 0.36 0.27 0.44
Belliard Ronnie 0.36 0.24 0.47
Reed Jeremy 0.36 0.24 0.46
Lowrie Jed 0.36 0.23 0.47
Infante Omar 0.36 0.25 0.45
Hairston Scott 0.36 0.24 0.46
Burriss Emmanuel 0.35 0.23 0.47
Butler Billy 0.35 0.26 0.44
Anderson Garret 0.35 0.27 0.43
Hernandez Ramon 0.35 0.26 0.43
Miles Aaron 0.34 0.24 0.43
Gerut Jody 0.34 0.23 0.44
Pedroia Dustin 0.34 0.27 0.41
Kinsler Ian 0.34 0.25 0.42
Catalanotto Frank 0.34 0.21 0.45
Cabrera Melky 0.33 0.23 0.42
Pujols Albert 0.33 0.24 0.41
Bradley Milton 0.33 0.22 0.43
Braun Ryan 0.33 0.24 0.40
Barajas Rod 0.32 0.21 0.42
Grudzielanek Mark 0.32 0.21 0.42
Castillo Luis 0.32 0.20 0.43
Eckstein David 0.32 0.21 0.42
Weeks Rickie 0.32 0.22 0.41
Lamb Mike 0.32 0.19 0.44
Cabrera Orlando 0.32 0.24 0.39
Iwamura Akinori 0.31 0.23 0.39
Kapler Gabe 0.31 0.18 0.44
Lopez Felipe 0.31 0.22 0.40
Inglett Joe 0.31 0.20 0.41
Youkilis Kevin 0.31 0.23 0.39
Callaspo Alberto 0.31 0.18 0.43
Kubel Jason 0.31 0.21 0.40
Floyd Cliff 0.31 0.18 0.43
Hill Aaron 0.31 0.16 0.44
Molina Yadier 0.31 0.22 0.39
Wilson Jack 0.31 0.19 0.41
Reyes Jose 0.31 0.23 0.38
Ortiz David 0.31 0.21 0.40
Loretta Mark 0.31 0.18 0.42
German Esteban 0.30 0.17 0.43
Francoeur Jeff 0.30 0.22 0.38
Young Michael 0.30 0.22 0.38
Zimmerman Ryan 0.30 0.20 0.40
Tejada Miguel 0.30 0.22 0.38
Crosby Bobby 0.30 0.21 0.38
Gonzalez Edgar 0.30 0.18 0.41
Counsell Craig 0.30 0.17 0.42
Hinske Eric 0.30 0.18 0.40
Drew J.D. 0.30 0.19 0.40
Vizquel Omar 0.30 0.18 0.41
Rivas Luis 0.30 0.15 0.43
Theriot Ryan 0.29 0.21 0.37
Furcal Rafael 0.29 0.14 0.43
Matsui Hideki 0.29 0.18 0.40
Lind Adam 0.29 0.18 0.40
Berkman Lance 0.29 0.20 0.37
Brown Emil 0.29 0.19 0.39
Navarro Dioner 0.29 0.20 0.38
Dye Jermaine 0.29 0.21 0.37
Suzuki Ichiro 0.29 0.22 0.36
Martin Russell 0.29 0.20 0.37
Bautista Jose 0.29 0.18 0.39
Schumaker Skip 0.29 0.20 0.37
DeWitt Blake 0.29 0.18 0.39
Dunn Adam 0.29 0.19 0.38
Byrnes Eric 0.29 0.14 0.43
Hardy J.J. 0.29 0.20 0.37
Spilborghs Ryan 0.29 0.15 0.41
Taveras Willy 0.29 0.19 0.37
Sheffield Gary 0.29 0.18 0.38
Buck John 0.28 0.17 0.39
Izturis Maicer 0.28 0.17 0.39
Sanchez Freddy 0.28 0.20 0.36
Izturis Cesar 0.28 0.19 0.37
Beltre Adrian 0.28 0.19 0.36
Boone Aaron 0.28 0.13 0.42
Schneider Brian 0.28 0.17 0.39
Hairston Jerry 0.28 0.15 0.40
Wigginton Ty 0.28 0.17 0.38
Fontenot Mike 0.28 0.14 0.40
Inge Brandon 0.28 0.16 0.39
Ibanez Raul 0.28 0.19 0.36
McLouth Nate 0.28 0.19 0.36
Glaus Troy 0.27 0.18 0.36
Ross Cody 0.27 0.17 0.37
Aurilia Rich 0.27 0.17 0.37
Coste Chris 0.27 0.15 0.39
Tulowitzki Troy 0.27 0.17 0.37
Rios Alex 0.27 0.19 0.35
Stairs Matt 0.27 0.14 0.39
Casilla Alexi 0.27 0.17 0.37
Damon Johnny 0.27 0.18 0.35
Zaun Gregg 0.27 0.13 0.40
Michaels Jason 0.27 0.13 0.39
Ellsbury Jacoby 0.27 0.18 0.35
Suzuki Kurt 0.27 0.18 0.35
Sweeney Ryan 0.27 0.16 0.37
Lewis Fred 0.27 0.16 0.36
Kotchman Casey 0.26 0.18 0.35
McCann Brian 0.26 0.17 0.35
Gross Gabe 0.26 0.15 0.37
Markakis Nick 0.26 0.18 0.35
Lee Derrek 0.26 0.18 0.34
Huff Aubrey 0.26 0.18 0.34
Bay Jason 0.26 0.17 0.35
LaRoche Adam 0.26 0.16 0.36
Ellis Mark 0.26 0.16 0.36
Blake Casey 0.26 0.17 0.35
Erstad Darin 0.26 0.14 0.37
Burrell Pat 0.26 0.17 0.35
Bartlett Jason 0.26 0.17 0.35
Ethier Andre 0.26 0.17 0.34
Hamilton Josh 0.26 0.17 0.34
Atkins Garrett 0.26 0.17 0.34
Guillen Carlos 0.26 0.15 0.35
Murphy David 0.25 0.15 0.35
Snyder Chris 0.25 0.12 0.37
Molina Jose 0.25 0.12 0.37
Loney James 0.25 0.17 0.33
Bako Paul 0.25 0.12 0.38
Barton Daric 0.25 0.15 0.35
Lowell Mike 0.25 0.15 0.35
Chavez Endy 0.25 0.13 0.36
Edmonds Jim 0.25 0.13 0.36
Blalock Hank 0.25 0.11 0.38
Betancourt Yuniesky 0.25 0.16 0.33
Longoria Evan 0.25 0.15 0.34
Cabrera Miguel 0.25 0.16 0.33
Giambi Jason 0.25 0.14 0.34
Crawford Carl 0.25 0.15 0.33
Gload Ross 0.24 0.14 0.34
Shoppach Kelly 0.24 0.11 0.37
Fielder Prince 0.24 0.15 0.33
Carroll Jamey 0.24 0.13 0.35
Barmes Clint 0.24 0.14 0.35
Phillips Brandon 0.24 0.15 0.33
Sizemore Grady 0.24 0.16 0.32
Amezaga Alfredo 0.24 0.12 0.35
Pierre Juan 0.24 0.14 0.33
Keppinger Jeff 0.24 0.14 0.32
Konerko Paul 0.24 0.13 0.33
Abreu Bobby 0.24 0.15 0.32
Guerrero Vladimir 0.24 0.15 0.32
Escobar Yunel 0.23 0.14 0.32
Kent Jeff 0.23 0.14 0.33
Hunter Torii 0.23 0.14 0.32
Thomas Frank 0.23 0.09 0.37
Young Delmon 0.23 0.14 0.32
Bourn Michael 0.23 0.13 0.33
Cust Jack 0.23 0.12 0.34
Kennedy Adam 0.23 0.12 0.34
DeRosa Mark 0.23 0.13 0.32
Jones Chipper 0.23 0.13 0.32
Ramirez Manny 0.23 0.14 0.31
Iannetta Chris 0.23 0.10 0.35
Matthews Gary 0.23 0.12 0.33
Soto Geovany 0.22 0.12 0.32
Kouzmanoff Kevin 0.22 0.13 0.31
Pena Tony 0.22 0.08 0.36
Bowker John 0.22 0.10 0.34
Mathis Jeff 0.22 0.08 0.35
Renteria Edgar 0.22 0.12 0.31
Punto Nick 0.22 0.10 0.33
Garko Ryan 0.22 0.12 0.31
Moss Brandon 0.22 0.06 0.36
Rivera Juan 0.22 0.09 0.34
Hudson Orlando 0.22 0.11 0.32
Aviles Mike 0.22 0.11 0.31
Patterson Corey 0.22 0.10 0.32
Utley Chase 0.21 0.13 0.29
Rodriguez Luis O. 0.21 0.07 0.35
Pierzynski A.J. 0.21 0.12 0.30
Gathright Joey 0.21 0.09 0.33
Giles Brian 0.21 0.12 0.30
Granderson Curtis 0.21 0.12 0.30
Molina Bengie 0.21 0.12 0.29
Buscher Brian 0.21 0.06 0.35
Werth Jayson 0.21 0.10 0.31
Milledge Lastings 0.21 0.11 0.30
Jenkins Geoff 0.21 0.08 0.33
Easley Damion 0.21 0.09 0.32
Lugo Julio 0.20 0.07 0.34
Cano Robinson 0.20 0.12 0.29
Sexson Richie 0.20 0.06 0.34
Rolen Scott 0.20 0.10 0.31
Holliday Matt 0.20 0.11 0.29
Wells Vernon 0.20 0.10 0.30
Iguchi Tadahito 0.20 0.07 0.32
Kearns Austin 0.20 0.08 0.32
Guillen Jose 0.20 0.11 0.29
Wright David 0.20 0.11 0.28
Cantu Jorge 0.20 0.12 0.28
Harris Brendan 0.20 0.09 0.30
Gomez Carlos 0.20 0.11 0.29
Prado Martin 0.20 0.06 0.33
Howard Ryan 0.20 0.10 0.29
Figgins Chone 0.20 0.10 0.29
Feliz Pedro 0.20 0.10 0.29
Tracy Chad 0.20 0.06 0.32
Baker Jeff 0.19 0.06 0.32
Ruiz Carlos 0.19 0.09 0.30
Gonzalez Luis 0.19 0.08 0.30
Hermida Jeremy 0.19 0.09 0.29
Guzman Cristian 0.19 0.11 0.28
Ramirez Aramis 0.19 0.10 0.28
Reynolds Mark 0.19 0.09 0.30
Olivo Miguel 0.19 0.06 0.32
Young Chris 0.19 0.10 0.28
Crede Joe 0.19 0.07 0.30
Drew Stephen 0.19 0.10 0.27
Cuddyer Michael 0.19 0.05 0.32
Delgado Carlos 0.19 0.10 0.28
Johnson Kelly 0.19 0.10 0.28
Rodriguez Ivan 0.19 0.08 0.29
Laird Gerald 0.19 0.07 0.30
Francisco Ben 0.19 0.08 0.29
Vazquez Ramon 0.19 0.06 0.31
Gonzalez Carlos 0.18 0.05 0.31
Overbay Lyle 0.18 0.09 0.28
Winn Randy 0.18 0.10 0.27
Johnson Reed 0.18 0.06 0.30
Davis Rajai 0.18 0.03 0.32
Gordon Alex 0.18 0.08 0.28
Velez Eugenio 0.18 0.05 0.30
Millar Kevin 0.18 0.08 0.27
Hart Corey 0.18 0.09 0.26
Cabrera Asdrubal 0.17 0.06 0.29
Duncan Chris 0.17 0.02 0.32
Greene Khalil 0.17 0.06 0.29
Jacobs Mike 0.17 0.07 0.27
Swisher Nick 0.17 0.07 0.27
Jeter Derek 0.17 0.09 0.26
Kendall Jason 0.17 0.08 0.26
Scott Luke 0.17 0.07 0.27
Span Denard 0.17 0.05 0.28
Vidro Jose 0.17 0.05 0.28
Castillo Jose 0.17 0.06 0.27
Mauer Joe 0.17 0.08 0.25
Crisp Coco 0.16 0.05 0.27
Teahen Mark 0.16 0.07 0.25
Cedeno Ronny 0.16 0.01 0.31
Gonzalez Adrian 0.16 0.07 0.25
Encarnacion Edwin 0.16 0.06 0.26
Stewart Shannon 0.16 -0.01 0.32
Rowand Aaron 0.16 0.06 0.25
Upton B.J. 0.16 0.06 0.25
Thames Marcus 0.16 0.02 0.28
Flores Jesus 0.16 0.02 0.28
Thome Jim 0.16 0.05 0.26
Victorino Shane 0.16 0.07 0.24
Varitek Jason 0.15 0.05 0.26
Jones Adam 0.15 0.05 0.25
Kemp Matt 0.15 0.06 0.24
Griffey Jr. Ken 0.15 0.05 0.25
Durham Ray 0.15 0.04 0.26
Helms Wes 0.15 0.00 0.29
Doumit Ryan 0.15 0.04 0.25
Cameron Mike 0.15 0.03 0.26
Peralta Jhonny 0.15 0.06 0.23
Tatis Fernando 0.15 0.01 0.28
Bruntlett Eric 0.14 -0.00 0.29
Soriano Alfonso 0.14 0.04 0.25
Hannahan Jack 0.14 0.03 0.25
Kotsay Mark 0.14 0.04 0.24
Ramirez Alexei 0.14 0.04 0.23
Aybar Erick 0.13 0.02 0.24
Headley Chase 0.13 -0.00 0.26
Ramirez Hanley 0.13 0.04 0.22
Harris Willie 0.13 0.01 0.24
Ludwick Ryan 0.13 0.03 0.23
Dukes Elijah 0.13 -0.01 0.26
Blum Geoff 0.13 0.00 0.24
Quentin Carlos 0.12 0.02 0.22
Wilkerson Brad 0.12 -0.02 0.26
Pence Hunter 0.12 0.03 0.21
Ankiel Rick 0.12 0.01 0.23
Church Ryan 0.12 -0.01 0.25
Davis Chris 0.12 -0.03 0.26
Choo Shin-Soo 0.11 -0.02 0.24
Bruce Jay 0.11 -0.00 0.22
Pena Carlos 0.11 0.00 0.21
Ausmus Brad 0.11 -0.05 0.26
Teixeira Mark 0.10 0.01 0.19
Uggla Dan 0.10 -0.00 0.21
Mora Melvin 0.10 0.01 0.20
Uribe Juan 0.10 -0.02 0.22
Ojeda Augie 0.10 -0.04 0.24
Votto Joey 0.10 0.00 0.19
Upton Justin 0.10 -0.04 0.23
Balentien Wladimir 0.09 -0.07 0.24
Marte Andy 0.09 -0.06 0.23
Willingham Josh 0.08 -0.04 0.20
Kendrick Howie 0.08 -0.04 0.19
Martinez Victor 0.07 -0.07 0.20
Rodriguez Alex 0.06 -0.04 0.16
Beltran Carlos 0.05 -0.03 0.14
Hall Bill 0.05 -0.06 0.17
Aybar Willy 0.04 -0.07 0.15
Berroa Angel 0.04 -0.11 0.18
Rollins Jimmy 0.01 -0.08 0.09
Hawpe Brad 0.00 -0.10 0.11

Wednesday, April 1, 2009

Bad ball swingers

As the new season is ready to roll, I take some time to give away the 2008 Vladimir Guerrero Award.
Using PITCHf/x data I looked at the players who swing frequently at bad pitches, and I tried to figure out whether they hurt themselves by trying to hit everything.

Initially I planned to look at all swings that occurred on pitches that were balls according to the rulebook strike zone; then I decided to do things differently.
Using spatial smoothing I calculated the probability of a pitch being called a strike, given its location. A previous study I ran on my Italian website, and other research by authors at THT and elsewhere, had shown that batter handedness influences umpire decisions more than pitcher handedness, so I calculated separate probabilities for RHBs and LHBs.

A couple of charts will summarize this part.


I decided to classify a pitch as a "bad ball" when its probability of being called a strike is lower than 10% (the cutoff value is purely subjective, and I'd welcome suggestions for a different choice).
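
To give an idea of what such a smoothing can look like, here is a small sketch using a Gaussian kernel average over pitches that were taken. The smoother, the bandwidth, and the function names are all assumptions chosen for illustration; the method behind the charts may well differ.

```python
import math

def strike_probability(px, pz, called_pitches, bandwidth=0.25):
    """Kernel-weighted share of called strikes near location (px, pz).

    called_pitches: iterable of (px, pz, is_strike) for pitches taken.
    bandwidth is in the same unit as the coordinates (an arbitrary choice).
    """
    num = 0.0
    den = 0.0
    for x, z, is_strike in called_pitches:
        dist_sq = (x - px) ** 2 + (z - pz) ** 2
        weight = math.exp(-dist_sq / (2.0 * bandwidth ** 2))
        num += weight * (1.0 if is_strike else 0.0)
        den += weight
    # With no nearby pitches there is nothing to estimate from.
    return num / den if den > 0 else float("nan")

def is_bad_ball(px, pz, called_pitches, cutoff=0.10):
    """True when the estimated called-strike probability is below cutoff."""
    return strike_probability(px, pz, called_pitches) < cutoff
```

To get separate probabilities by batter handedness, the function would be fed only the pitches taken by RHBs, or only those taken by LHBs.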

Here are the top ten bad ball swingers:

LAST FIRST swing_pct
Soriano Alfonso 43%
Barmes Clint 42%
Headley Chase 41%
Span Denard 41%
Ramirez Alexei 41%
Aviles Mike 40%
Stewart Ian 40%
Gomez Carlos 39%
Uribe Juan 38%
Lowrie Jed 38%


No Vlad on the list? I must have done something wrong!... No, he's the eleventh, just out of the table, at 37%.
I compared my full list with the one at FanGraphs and while they don't coincide, they are quite similar. Anyway they don't have to coincide, since FG charts outside zone swing %, while I'm charting "really outside zone swing %".

Here's the bottom of the list.

LAST FIRST swing_pct
Matthews Gary 19%
Dellucci David 19%
Lewis Fred 19%
Murphy David 18%
Crede Joe 18%
Matsui Hideki 16%
Sexson Richie 16%
Upton Justin 16%
Castillo Luis 10%
Helton Todd 5%


For this work I selected players who have seen at least 300 bad pitches. I don't know if this choice caused some players who don't swing at bad pitches to be left out.
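
The swing percentages and the 300-pitch cut can be reproduced with a simple count per player. This sketch assumes one record per bad ball seen; the function name and input layout are made up.

```python
from collections import defaultdict

def bad_ball_swing_rates(bad_pitches, min_seen=300):
    """Swing rate on bad balls for players with at least min_seen of them.

    bad_pitches: iterable of (player, swung) pairs, one per bad ball seen.
    Returns (player, rate) pairs sorted from highest to lowest rate.
    """
    seen = defaultdict(int)
    swung = defaultdict(int)
    for player, did_swing in bad_pitches:
        seen[player] += 1
        if did_swing:
            swung[player] += 1
    rates = {p: swung[p] / n for p, n in seen.items() if n >= min_seen}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

Lowering `min_seen` is a quick way to check whether the cut leaves out patient hitters who simply saw fewer bad pitches.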

The next thing I investigated was the run value of swings on bad balls.
Many players are notorious bad ball swingers, but they are also feared because they can do a lot of damage even when they chase pitches in the dirt.
In the following table I summed up the run values obtained by hitters when they swung at a bad pitch.

LAST FIRST swing_run_value
Sheffield Gary 9.37
Garko Ryan 7.28
Stewart Ian 5.86
Mauer Joe 4.99
Glaus Troy 4.87
Ortiz David 4.24
Cust Jack 4.02
Kubel Jason 3.95
Huff Aubrey 3.74
Hinske Eric 2.55

But what if all those pitches were let go by?
I calculated the net run value for bad pitches, which you will find in the next table, as follows:
- if the batter didn't swing, assign the run value of the pitch (likely the run value of a ball; but if the ump called it a strike, then the run value of a strike);
- if the batter swung, assign the run value of the event minus the expected run value of the pitch had the batter not swung (that is something like 90+% * run value of a ball + 10-% * run value of a strike).
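
The two rules above translate into a short function. This is a sketch only: the function and parameter names are invented, and the run values of a ball and a strike (which depend on the count) have to be supplied by the caller.

```python
def expected_take_value(p_strike, ball_value, strike_value):
    """Expected run value of a pitch had the batter not swung."""
    return (1.0 - p_strike) * ball_value + p_strike * strike_value

def net_run_value(swung, outcome_value, p_strike, ball_value, strike_value):
    """Net run value of one bad pitch.

    outcome_value is the run value of what actually happened: the umpire's
    call when the pitch was taken, the result of the swing otherwise.
    """
    if not swung:
        return outcome_value
    return outcome_value - expected_take_value(p_strike, ball_value, strike_value)
```

Summing `net_run_value` over all the bad pitches a hitter saw gives his entry in the net table, with `p_strike` being the called-strike probability estimated for that location (below 10% by definition of a bad ball).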

LAST FIRST net_run_value
Sheffield Gary 1.85
Stewart Ian -1.77
Glaus Troy -2.34
Hinske Eric -2.76
Cust Jack -3.08
Murphy David -3.39
Mauer Joe -3.84
Kubel Jason -3.85
Helton Todd -3.97
Bourn Michael -4.31


As we see from the table, Gary Sheffield is the only player in MLB to have a positive value for his bad pitch swinging (at 29% his swinging percentage on bad pitches is middle-of-the-pack). Jose Reyes, the worst in this ranking, has lost 32 runs by swinging at balls way out of the zone.
Players like Ryan Howard and Vladimir Guerrero can have a gross production of more than 11 runs when swinging at bad balls, but when you look at what they would have produced had they let those pitches go by you get a net loss of nearly 30 runs.

I think I made you wander too long in the dark by giving just a few top-ten tables. Here's a spreadsheet with all the players that made my cut of 300 bad pitches seen.