Showing posts with label chase utley. Show all posts

Monday, February 23, 2009

Refining the shift

I worked some more on the subject of my last post.
First of all, as I said at the end of that post, I reran the cluster analysis after removing the short grounders from my data set.
Here's the resulting chart.

No more fielders behind the rubber... I like it.
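For those who want to try the filtering step, here is a minimal sketch in Python. The record layout (a `type` field plus `x` and `y` in feet from home plate) and the 60-foot cutoff are assumptions made for illustration, not the values used for the chart above.

```python
import math


def drop_short_grounders(balls, min_distance=60.0):
    """Return the batted balls with the very short grounders removed.

    Each ball is a dict with keys "type" ("GB" for a grounder), "x" and "y",
    with home plate at (0, 0). Both the layout and the default cutoff are
    hypothetical choices for this sketch.
    """
    kept = []
    for ball in balls:
        distance = math.hypot(ball["x"], ball["y"])
        if ball["type"] == "GB" and distance < min_distance:
            continue  # a dribbler for the pitcher or catcher: skip it
        kept.append(ball)
    return kept
```

Only grounders are dropped; a short fly stays in the data set whatever its distance.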

Then I tried to do things separately for infielders and outfielders. This way I have more confidence in constraining the clusters to have equal sizes, which makes sense especially for the outfielders.

Here is a first chart for infielders: I ran a model for four players.

I'm sorry for the questionable choice of colors, but I haven't fully grasped how plot customization works in the package I'm using (Mclust for R, for those interested).

And here are the outfielders (three for the moment).

Then I tried to move one player from the infield to the outfield. Following are the charts for three infielders and four outfielders, respectively.

Summing up.
Playing Utley shifted is the right thing to do. You don't need anybody playing near the third base bag. Putting a fielder in short right makes sense too: the statistical analysis puts him there to catch short flies, but (as managers who employ the shift know) he is mainly valuable for handling the grounders not collected by the infielders on that side.
When data on batted ball velocity become available, a new dimension can be added to the cluster analysis: balls hit along the lines that reach the infielders more quickly would be treated appropriately, producing clusters of different sizes (in the spatial dimensions considered here) for third basemen and first basemen.

Sunday, February 22, 2009

Chase-ing the FieldF/x

In the first inning of the last World Series the Rays fielders played Chase Utley extremely shifted, the way teams defend against guys like Big Papi, Giambi, or Howard. Chase tried to take advantage of the alignment by laying down a bunt, but it rolled foul outside the third base line; then he decided to ignore the opponents' placement and hit a ball that no shift can take care of.

I hadn't watched many Phillies games during the regular season (living in Europe I'm usually exposed to the Cubbies... unless my boss accepts me sleeping on my desk in the morning), so I didn't know whether it is commonplace for teams to defend shifted against Utley. From what I heard during the telecast, the Rays' alignment was pretty unusual for that guy.

Using data from both Gameday and Retrosheet, I tried to figure out an optimal positioning against Chase.
Here's what I've done.

First I plotted the location of every one of Utley's batted balls, using coordinates from Gameday.



I plotted grounders in red. All the GBs you see in the outfield have gone through the infield and, theoretically, could have been caught by a well-positioned infielder (or one that happened to be where the ball was hit). That's why people working at Project Scoresheet used to record, for groundball hits, the place where the ball left the infield rather than the place where it was picked up in the outfield (you can see this good habit in the Retrosheet files of the '80s). As we can see, that's not the case with Gameday stringers.

I corrected for this. I dusted off a couple of geometry books and projected every groundball that went to the outfield to the place it left the infield.
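A minimal sketch of that projection in Python, for the curious. It assumes coordinates in feet with home plate at the origin, and it treats the edge of the infield as a 95-foot arc centered on the pitcher's rubber; both choices, like the function name, are assumptions for this example rather than the exact geometry used for the graph. The ball is slid back along the straight line from home plate until it meets that arc.

```python
import math

# Assumed geometry, in feet: home plate at the origin, y axis toward second base.
RUBBER = (0.0, 60.5)    # assumed centre of the arc that closes the infield
INFIELD_RADIUS = 95.0   # assumed radius of that arc


def project_to_infield_edge(x, y):
    """Move a grounder back along the line from home plate to the infield arc.

    Balls that were fielded inside the arc are returned unchanged.
    """
    cx, cy = RUBBER
    pp = x * x + y * y
    if pp == 0.0:
        return (x, y)
    pc = x * cx + y * cy
    cc = cx * cx + cy * cy
    # Points on the ray are t * (x, y); solve |t*p - c|^2 = r^2 for t > 0.
    # Home plate lies inside the arc, so the discriminant is always positive.
    disc = pc * pc - pp * (cc - INFIELD_RADIUS ** 2)
    t = (pc + math.sqrt(disc)) / pp
    if t >= 1.0:
        return (x, y)  # already inside the infield
    return (t * x, t * y)
```

A grounder picked up 300 feet from home straight up the middle ends up at 155.5 feet, which is where the assumed arc crosses that line.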

Here's the new graph.



Finally I added some random noise, because the groundball hits (those that went through) were plotted one on top of the other and it wasn't easy to distinguish (or count) them.
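The jittering can be sketched in a few lines of Python. The amount of noise and the fixed seed are arbitrary choices for this example, not the values used for the graph.

```python
import random


def jitter(points, amount=2.0, seed=0):
    """Return a copy of points with uniform noise added to each coordinate.

    The noise is drawn from [-amount, amount]; a fixed seed makes the
    plot reproducible. Both defaults are arbitrary.
    """
    rng = random.Random(seed)
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]
```

The noise only serves the plot: it pulls overlapping dots apart so they can be counted.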
Here's something easier to read.



"You can observe a lot just by watching". In this case, just watching you can see where batted balls tend to cluster.

Cluster analysis is a statistical tool that can help our eyes in this task.
I hoped that, with the help of cluster analysis, I could more or less identify where to place seven fielders against Chase Utley (pitcher and catcher are not so free to move around the field).

I must admit I didn't expect to be so lucky, and I don't expect to be as lucky when I perform similar analyses on other batters.
It turned out that I forgot to put a constraint on the number of clusters I wanted to find, but the algorithm I used chose exactly seven as the optimal result. Should that happen in every analysis I do from now on, it would be a scientific demonstration that nine players on the field is the perfect number for baseball.

Here's a chart showing the defensive alignment that cluster analysis suggests for Utley. You put a player in the middle of each circle and he's responsible for every ball in that circle (or in the neighborhood, just look at the colors).
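To make the idea of "one fielder per cluster" concrete, here is a rough sketch in Python. Note that it is plain k-means with the number of clusters fixed in advance, a much simpler stand-in for the model-based clustering behind the chart; the function name and the initialization rule are choices made only for this example. Each center is where a fielder would stand, and each ball is assigned to the nearest fielder.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means on (x, y) tuples; returns (centers, labels)."""
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(iterations):
        # Assign every ball to the nearest center (the nearest fielder).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean location of its balls.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centers, labels
```

Calling `kmeans(locations, 7)` on a list of batted ball locations would return seven candidate positions, but unlike the analysis above it cannot tell you whether seven is the right number.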



Were it not for that third baseman (I suppose) just behind the pitcher's mound, the result would have looked too good to be true, and I would have understood everyone accusing me of making up the data.

It looks like the Rays' alignment made sense. I'm curious about the midway position (between infield and outfield) where the second baseman usually plays in this kind of shift: the clustering algorithm places a player there to take care of short flies, while managers put the man there to handle hard grounders too.

This analysis doesn't take into account the range a player can realistically cover. That leaves the player behind the pitcher's mound having to take care of all the infield grass; besides, you have to accept that, however well positioned they are, you can't count on the pair of fielders on the right side scooping up every groundball.

I didn't expect to run a simple cluster analysis (well, it's not so simple, to be honest) and find results that make so much baseball sense.

Maybe if I remove from the data set the very short grounders (those that become infield hits or that are catcher/pitcher responsibility), I can even get that third baseman off the pitching mound!