Tuesday, 8 August 2017

"It's All about The Distribution Part 2"

First, the disclaimer: this isn't a "smart after the event" explanation for Leicester's title season.

It is a list of the occasional nasty or pleasant surprises that can occur, and of the limitations of trying to second guess them when using a linear, ratings based model.

Models built around numbers and averages do work extremely well for the majority of teams in the majority of seasons.

But as the financial world found, to the cost of others, neglecting distributions, especially ones that appear normal but hide fatter than usual tails, can leave you unprepared for the once in a lifetime event.

The previous post looked at a hypothetical five team scenario, where the lowest rated but under-exposed side had a much better chance of winning a contest than implied by the respective ratings, simply because the distribution of potential ratings was markedly different for this side.
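To make that point concrete, here is a minimal sketch of the five competitor idea. The ratings and spreads are hypothetical numbers chosen for illustration, not the figures from the previous post, and drawing each performance from a normal distribution is an assumption made for simplicity.

```python
import random

def win_shares(means, sds, trials=20000, seed=1):
    """Share of simulated contests won by each competitor.

    Each trial draws one performance per competitor from a normal
    distribution (an assumption) and awards the win to the highest draw.
    """
    rng = random.Random(seed)
    wins = [0] * len(means)
    for _ in range(trials):
        draws = [rng.gauss(m, s) for m, s in zip(means, sds)]
        wins[draws.index(max(draws))] += 1
    return [w / trials for w in wins]

# Hypothetical ratings: the fifth competitor has the lowest average.
means = [100, 99, 98, 97, 94]

# Case 1: all five are equally well exposed (same spread).
exposed = win_shares(means, [3, 3, 3, 3, 3])

# Case 2: the fifth is lightly exposed, so its spread is much wider.
unexposed = win_shares(means, [3, 3, 3, 3, 10])

print("Fifth competitor, same spread as rivals:", round(exposed[4], 3))
print("Fifth competitor, wide spread:          ", round(unexposed[4], 3))
```

The averages are identical in both cases. Only the spread of the fifth competitor changes, and that alone lifts its share of wins.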

Again, full disclosure: this model wasn't from football. It was a five runner race run at Uttoxeter, and Team 5 was actually a very lightly raced horse against exposed rivals.

I assumed that the idea that distributions of potential performance sometimes matter also carries over into football, and the obvious example of an unconsidered team taking a league by storm was Leicester's 2015/16 title winning season.

I went back to 2014/15 and produced some very simple expected goals ratings for all 20 sides going into the 2015/16 season.

I also looked at how diverse and spread out the performance ratings from 2014/15 were for each side.

The three teams whose performances had fluctuated most, and who might therefore be considered to have a bit more meat in their distribution tails and be less likely to adhere to their "average" expectations, were the champions Chelsea, West Ham and Leicester.

I then set up a distribution for each team based around their average rating and the standard deviation of their individual game by game performances in 2014/15.
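As a rough sketch of that step, the average and standard deviation can be taken straight from a list of game by game ratings. The two lists below are made-up values, not real 2014/15 figures, and the function name `team_distribution` is just a label for this illustration.

```python
from statistics import mean, stdev

def team_distribution(game_ratings):
    """Return (average rating, sample standard deviation) for one team."""
    return mean(game_ratings), stdev(game_ratings)

# Hypothetical per-game ratings for two sides with the same average.
steady_games = [0.2, 0.3, 0.1, 0.2, 0.2]
volatile_games = [1.0, -0.6, 0.9, -0.7, 0.4]

steady_mean, steady_sd = team_distribution(steady_games)
volatile_mean, volatile_sd = team_distribution(volatile_games)

print("Steady side:  ", round(steady_mean, 2), round(steady_sd, 2))
print("Volatile side:", round(volatile_mean, 2), round(volatile_sd, 2))
```

A single rating would treat these two sides as identical. Keeping the standard deviation alongside the average is what lets the in and out side show up later.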

I then drew from these tailored distributions as a basis to simulate each game in the 2015/16 season, Leicester's winning season.
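The sketch below shows one way such a simulation could be put together; it is not the exact model used for this post. It assumes each rating is a goal difference per game, draws a fresh rating for each side in every match from a normal distribution, turns the gap into expected goals around a hypothetical baseline of 1.35 goals per team, and ignores home advantage. The six teams and their numbers are invented.

```python
import math
import random

def poisson(rng, lam):
    """Knuth's method for a Poisson draw; adequate for small lam."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_season(teams, rng, base_goals=1.35):
    """Play a double round robin and return team names in finishing order.

    teams maps name -> (average rating, standard deviation).
    base_goals is an assumed league-average goals per team per game.
    """
    points = {name: 0 for name in teams}
    for home in teams:
        for away in teams:
            if home == away:
                continue
            # Draw each side's rating for this game from its own distribution.
            edge = (rng.gauss(*teams[home]) - rng.gauss(*teams[away])) / 2
            home_goals = poisson(rng, max(0.1, base_goals + edge))
            away_goals = poisson(rng, max(0.1, base_goals - edge))
            if home_goals > away_goals:
                points[home] += 3
            elif home_goals < away_goals:
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
    # Ties on points are broken at random to keep the sketch short.
    return sorted(teams, key=lambda t: (points[t], rng.random()), reverse=True)

def position_shares(teams, seasons=1000, seed=7):
    """Share of simulated seasons in which each team finishes in each place."""
    rng = random.Random(seed)
    counts = {name: [0] * len(teams) for name in teams}
    for _ in range(seasons):
        for place, name in enumerate(simulate_season(teams, rng)):
            counts[name][place] += 1
    return {name: [c / seasons for c in row] for name, row in counts.items()}

# Hypothetical league: (average rating, standard deviation) per team.
teams = {
    "Strong": (0.8, 0.3), "Solid": (0.4, 0.3), "Average": (0.0, 0.3),
    "Volatile": (0.0, 1.2), "Poor": (-0.4, 0.3), "Weak": (-0.8, 0.3),
}
shares = position_shares(teams)
for name, row in shares.items():
    print(name.ljust(9), [round(x, 2) for x in row])
```

Each row of output is that team's spread of finishing positions, from first place to last, which is the kind of table the simulations here produce for a full 20 team league.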

And this is how the Foxes and their fellow in and out teams fared in simulations that take from a distribution, rather than a rating.


Leicester project as a top half team who were as likely to finish in the top two as they were to be relegated. West Ham put themselves about all over the place, but predominantly in the top half, which is where they ended up.

Chelsea have a minute chance of ending up tenth, so kudos to Mourinho for breaking this particular model.

There are some really interesting figures emerging today, both for teams and players, and usually it's fine to run with the average.

But these averages live in distributions and when these distributions throw up something inevitable, if unexpected, as the bankers found out, someone has to pay.
