I’ll use this space to write down some notes as I explore the dataset provided by MCFC Analytics and Opta Sports.
Maybe I’ll use it for other things related to sports data analysis—who knows, maybe I’ll do some work on NHL or NBA data.
Back to today’s topic, which is soccer and, in particular, the English Premier League.
I have only worked on the so-called “Light” dataset, and as far as I know, the “Advanced” one has not been made available yet.
This is a first, very cursory look at passing.
Each team performs, on average, 463 passes per game, where passes are defined as “An intentional played ball from one player to another.” Both successful and unsuccessful passes are counted here.
And here is a chart comparing each team’s own passing game with how it influences its opponents’ passing game.
Arsenal (lower right corner), for example, makes roughly 100 more passes per game than the average team. Conversely, teams playing against Arsenal attempt about 60 fewer passes per game than average.
Note: The numbers have been adjusted for “strength of schedule”. This kind of adjustment is crucial in North American sports, where teams do not face each other the same number of times. In the EPL, the only reason to apply the correction is that teams do not play against themselves.
There is a strong inverse correlation (r = -0.77) between own passes and allowed passes.
This should not be a surprise. A team that passes the ball a lot has the ball most of the time, thus preventing the other team from making passes on their part.
Also, not unexpectedly, we find top teams in the lower right part of the chart (they pass more, they allow fewer passes) and bad teams in the top left area.
Scanning the data points from southeast to northwest, you roughly get the 2011/12 standings.
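For the curious, here is one way the strength-of-schedule adjustment from the note above could be sketched. This is only an illustration, not necessarily the exact procedure behind the chart: the function name `adjusted_passes`, the row layout, and the three-team numbers are all made up for the example. The idea is to compare the passes a team makes against an opponent with what that opponent allows to everybody else, so a team is never measured against a baseline that includes itself.

```python
from collections import defaultdict

def adjusted_passes(rows):
    """Passes per game above (or below) what the opponents usually allow.

    rows: (team, opponent, passes_made) tuples, one per team per game.
    Assumes every opponent has faced at least one other team.
    """
    # allowed[x] lists (who made the passes, how many) for games against x
    allowed = defaultdict(list)
    for team, opponent, passes in rows:
        allowed[opponent].append((team, passes))

    diffs = defaultdict(list)
    for team, opponent, passes in rows:
        # Baseline: what this opponent allows to teams other than `team`
        others = [p for t, p in allowed[opponent] if t != team]
        baseline = sum(others) / len(others)
        diffs[team].append(passes - baseline)

    return {team: sum(d) / len(d) for team, d in diffs.items()}

# Made-up three-team league, one game per pairing (not real EPL data)
rows = [
    ("A", "B", 600), ("B", "A", 400),
    ("A", "C", 620), ("C", "A", 380),
    ("B", "C", 480), ("C", "B", 450),
]
adjusted = adjusted_passes(rows)
print(adjusted)
```

The same function could be run on passes allowed by swapping the roles of team and opponent in the rows.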
There are two notable exceptions.
Swansea City ended last season as a middle-of-the-pack team, but its passing pattern is similar to that of the big clubs. Conversely, Newcastle United finished fifth, but it is grouped with the losing teams.
One team obviously stands out. Fulham is the only club that does not fall close to the straight line running through the data points. Removing it from the data set would make the correlation even stronger, at r = -0.87.
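To show how a single off-the-line team can weaken a correlation, here is a small sketch that computes Pearson’s r with and without one outlier. The numbers are invented for illustration (they are not the values from the chart), and the names `pearson_r`, `own`, `allowed`, and `outlier` are just placeholders.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented passes per game relative to the league average (not real data)
own = [100, 60, 30, 0, -30, -60, -100]
allowed = [-60, -30, -25, 5, 15, 40, 55]

# One hypothetical team that passes more than average AND allows more
outlier = (40, 50)

r_without = pearson_r(own, allowed)
r_all = pearson_r(own + [outlier[0]], allowed + [outlier[1]])

print(f"with outlier:    r = {r_all:.2f}")
print(f"without outlier: r = {r_without:.2f}")
```

With so few data points (20 teams in the real case), one team sitting far from the line is enough to move r noticeably.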
We’ll see if we can figure out the peculiarity of that team (and maybe the two named earlier) in another post.