A lot of words have been spent on clutch hitting, many of them by people smarter than me.

Anyway, I would like to share some work I did a few years ago on my Italian website. That work has gone through some important refinements before giving birth to this post.

At the time I wrote the Italian article, the wonderful FanGraphs didn't have the Clutch stat yet, thus I was looking for some way to measure if (and possibly how much) a particular season by a particular player was clutch.

(Those were the times when A-Rod won the MVP award and many Sox fans - and Yankee haters - cried that Big Papi was more deserving due to his clutch performance).

1. Using average Run Values for batting events, I can assign a value to each plate appearance of a particular player. This value measures the outcome of the at bat without taking into account the importance (leverage... more on that in a few moments) of the situation.

2. And I have Leverage, a number that measures the importance of the moment without looking at the outcome of the AB.

Suppose there's a player who is the clutchiest guy on the face of the earth. He may have any batting line, but he will get his greatest production (HRs) in the highest leverage situations and his least production (Ks) in the lowest leverage situations.

Take all the at bats of this player, put the run values in a spreadsheet column, then put the corresponding leverage index in the next column.

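Sort your spreadsheet by the run value column; now sort by the leverage column. Nothing changes for this particular guy: his best outcomes line up with his highest leverage plate appearances, so the correlation between the two columns is as high as it can get.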

If a player has a correlation of zero, then he achieved his best and worst results randomly across the leverage spectrum; the closer to one the more he has been clutch, the closer to minus one the more he has been a choker.
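The spreadsheet exercise can be sketched in a few lines of Python. This is only an illustration: it assumes the plain Pearson coefficient (the post just says "correlation"), and both the run values and the leverage numbers below are made up for six plate appearances, not taken from any real player.

```python
import math

def clutch_correlation(run_values, leverage):
    """Pearson correlation between per-PA run values and leverage index."""
    n = len(run_values)
    mean_rv = sum(run_values) / n
    mean_li = sum(leverage) / n
    cov = sum((r - mean_rv) * (l - mean_li) for r, l in zip(run_values, leverage))
    var_rv = sum((r - mean_rv) ** 2 for r in run_values)
    var_li = sum((l - mean_li) ** 2 for l in leverage)
    return cov / math.sqrt(var_rv * var_li)

# Hypothetical run values for six plate appearances (illustrative numbers,
# not official linear weights), sorted from worst to best outcome.
rv = [-0.30, -0.30, -0.27, 0.47, 0.78, 1.40]

# Hypothetical leverage indexes for the same six plate appearances.
li_clutch = [0.2, 0.4, 0.7, 1.1, 1.9, 3.5]  # best outcomes in the biggest spots
li_choke = li_clutch[::-1]                  # best outcomes in the smallest spots

r_clutch = clutch_correlation(rv, li_clutch)
r_choke = clutch_correlation(rv, li_choke)
print(f"clutch ordering: {r_clutch:.3f}")
print(f"choke ordering:  {r_choke:.3f}")
```

With the leverage column sorted the same way as the run values the coefficient is positive; reversing the leverage column flips the sign.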

But how close to 1 (or -1) can a player get in a season (or a career)?

This leads to the first enhancement I have made over my original article. Since at-bat run values can only take a limited set of values, the correlation between RV and LI cannot be exactly one, even for the hypothetical super-clutch player. Thus, when I show confidence intervals for the correlation, they have been calculated using bootstrapping techniques.
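Here is a rough sketch of both ideas: the ceiling a player could reach, and a bootstrap interval. The post does not say which bootstrap variant was used, so the percentile bootstrap below (resampling plate appearances with replacement) is an assumption, and the simulated season (outcome values, their frequencies, the leverage distribution) is entirely hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation; returns NaN if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def max_correlation(rv, li):
    """Ceiling for the super-clutch player: best outcomes paired with highest leverage."""
    return pearson(sorted(rv), sorted(li))

def bootstrap_ci(rv, li, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling (RV, LI) pairs with replacement."""
    rng = random.Random(seed)
    n = len(rv)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([rv[i] for i in idx], [li[i] for i in idx])
        if not math.isnan(r):
            stats.append(r)
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return low, high

# Hypothetical season of 300 plate appearances: a handful of possible run
# values and a skewed leverage distribution, drawn independently of each other.
rng = random.Random(1)
outcomes = [-0.3, 0.3, 0.5, 0.8, 1.1, 1.4]        # made-up run values
weights = [0.66, 0.08, 0.15, 0.05, 0.01, 0.05]    # made-up frequencies
rv = rng.choices(outcomes, weights=weights, k=300)
li = [rng.lognormvariate(0.0, 0.8) for _ in range(300)]

observed = pearson(rv, li)
ceiling = max_correlation(rv, li)
low, high = bootstrap_ci(rv, li)
print(f"observed {observed:.3f}, ceiling {ceiling:.3f}, 95% CI ({low:.3f}, {high:.3f})")
```

Sorting both columns gives the largest correlation any reordering of the same plate appearances can produce, and with only a few distinct run values it stays below one.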

Well, let's look at some players.

Here's Big Papi in 2005, the clutch season par excellence (FanGraphs has it at a 3.31 Clutch score).

We can make a couple of observations on these numbers. Even a very clutch season likely won't result in a correlation significantly different from zero (i.e. one whose 95% confidence interval doesn't contain zero), and should we find one, we would have to look at the second decimal place at least.

It seems that it's really difficult to get a value significantly different from zero. Here we must note one important thing: to get a high correlation, a player should overperform in high leverage situations AND underperform in low leverage situations. Thus players like Pujols or Ortiz, who always perform near the level of excellence, can hardly overachieve in high leverage situations. I believe that a positive correlation significantly different from zero is noteworthy nonetheless, even though we have to look at the second or third decimal place; and it's no surprise that we have found a significant (though minuscule) effect on the negative side for a very good hitter like Rodriguez.

Year: correlation (95% CI)
1977: 0.004 (-0.09 to 0.07)
1978: 0.06 (-0.003 to 0.12)
1979: 0.02 (-0.06 to 0.08)
1980: 0.04 (-0.03 to 0.11)
1981: -0.07 (-0.19 to 0.01)
1982: 0.01 (-0.09 to 0.09)
1983: -0.05 (-0.14 to 0.02)
1984: 0.07 (0.01 to 0.14)
1985: 0.10 (0.03 to 0.16)
1986: 0.01 (-0.08 to 0.09)
1987: 0 (-0.08 to 0.06)
1988: 0 (-0.08 to 0.07)
1989: 0.01 (-0.07 to 0.08)
1990: 0.08 (0.01 to 0.14)
1991: 0.06 (-0.01 to 0.13)
1992: 0.01 (-0.08 to 0.10)
1993: -0.02 (-0.10 to 0.04)
1994: 0.04 (-0.06 to 0.12)
1995: -0.03 (-0.13 to 0.05)
1996: -0.003 (-0.08 to 0.06)
1997: 0.05 (-0.07 to 0.16)

There are a few seasons in which Eddie beat zero, only one of them under Weaver (1985, and the book came out one year earlier).

I repeat: when we find an effect, it is very small (so small that we can doubt it really is an effect). Anyway, this player, who was perceived to be clutch by one of his managers, never had a choke season (defined as one for which the entire 95% CI is under zero).
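The rule used for Eddie's seasons can be written down explicitly. The sketch below copies the yearly correlations and 95% intervals from the list above; the function name and the labels it returns are my own, chosen for illustration.

```python
# year: (correlation, CI lower bound, CI upper bound), copied from the list above
seasons = {
    1977: (0.004, -0.09, 0.07),
    1978: (0.06, -0.003, 0.12),
    1979: (0.02, -0.06, 0.08),
    1980: (0.04, -0.03, 0.11),
    1981: (-0.07, -0.19, 0.01),
    1982: (0.01, -0.09, 0.09),
    1983: (-0.05, -0.14, 0.02),
    1984: (0.07, 0.01, 0.14),
    1985: (0.10, 0.03, 0.16),
    1986: (0.01, -0.08, 0.09),
    1987: (0.0, -0.08, 0.06),
    1988: (0.0, -0.08, 0.07),
    1989: (0.01, -0.07, 0.08),
    1990: (0.08, 0.01, 0.14),
    1991: (0.06, -0.01, 0.13),
    1992: (0.01, -0.08, 0.10),
    1993: (-0.02, -0.10, 0.04),
    1994: (0.04, -0.06, 0.12),
    1995: (-0.03, -0.13, 0.05),
    1996: (-0.003, -0.08, 0.06),
    1997: (0.05, -0.07, 0.16),
}

def classify(ci_low, ci_high):
    """Label a season by where its whole 95% interval sits relative to zero."""
    if ci_low > 0:
        return "clutch"
    if ci_high < 0:
        return "choke"
    return "not significant"

clutch_years = [y for y, (_, lo, hi) in seasons.items() if classify(lo, hi) == "clutch"]
choke_years = [y for y, (_, lo, hi) in seasons.items() if classify(lo, hi) == "choke"]
print("clutch:", clutch_years)
print("choke: ", choke_years)
```

Applied to the list, this flags 1984, 1985 and 1990 on the positive side and no season on the negative side.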

If you're selecting the top clutch seasons using one metric from the thousands of player-seasons out there, isn't it almost guaranteed that you'll show clutch using another (presumably highly correlated) metric with 95% confidence? Obviously not, since you didn't, so my real question is if you corrected for this and, if so, how.

I'm not sure I have understood your question.

Are you saying that by running multiple correlations I can get significance by chance?

If this is your point, I haven't corrected for this, but I have looked at only a few (though selected) seasons.

If I go on and run correlations for every player/season, I would definitely check if the number of significant occurrences exceeds the 5% value.

Yes, I think you've got it. Basically, I'm saying if clutch doesn't exist and we've got 2000 player-seasons, we'd expect 100 of them to show significance at the 5% level. If we select the 2 seasons that show the highest amount of clutch using 1 statistic (that is, significant at about the 2/2000 = 0.1% level using that statistic given randomness), shouldn't we expect they'd be significant at the 5% level using a different clutch statistic? If FanGraphs' Clutch and your clutch correlation are measuring roughly the same thing, an extreme outlier in one should be at least a moderate outlier in the other, no?
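For reference, the arithmetic in this exchange can be sketched as follows. The 2000 player-seasons and the 5% level come from the comment above; the observed count of 130 is purely hypothetical, and treating player-seasons as independent trials is a simplifying assumption.

```python
import math

def log_binom_pmf(n, k, p):
    """Log of the Binomial(n, p) probability of exactly k successes."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    return sum(math.exp(log_binom_pmf(n, i, p)) for i in range(k, n + 1))

n_seasons = 2000   # from the comment above
alpha = 0.05       # significance level of each individual test

# Number of "significant" seasons expected by chance alone if clutch doesn't exist.
expected_by_chance = n_seasons * alpha

# Hypothetical outcome: suppose 130 seasons came out significant.
observed_significant = 130
p_value = binom_tail(n_seasons, observed_significant, alpha)

print(f"expected by chance: {expected_by_chance:.0f}")
print(f"P(at least {observed_significant} by chance): {p_value:.4f}")
```

A count close to the chance expectation would say nothing about clutch; only a count well above it would suggest something beyond randomness.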

I chose a few extreme seasons by FanGraphs metrics to look at how much correlation we can find in a "very clutch" season... and the answer seems to be very little.

I don't see how you can say those 2 seasons are significant at the 0.1% level using FG's stat. They are two of the most extreme, but nothing says that those two are significantly different from every other season.