Wednesday, 30 November 2016

Was Aguero Quite So Lucky in 2015/16?

By now, expected goals needs very little introduction.

It attempts to quantify the importance of pre-shot variables in determining the likelihood that a goal will be scored. In essence it is a measure of chance quality and is largely determined by such things as shot type and location.

The majority of models output the likelihood that an average Premier League player would score from a given position and shot type. By aggregating the individual expected goals for each attempt and comparing this to a player's actual output we can broadly suggest the level of under or over performance.
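As a minimal sketch of that aggregation step, the snippet below sums per-attempt expected goals and compares the total to actual output. The shot values and goal count are hypothetical placeholders, not figures from any model discussed here.

```python
# Hypothetical per-attempt scoring probabilities for one player (not real data).
shot_xg = [0.05, 0.32, 0.11, 0.76, 0.08, 0.15]
actual_goals = 3  # hypothetical goal count from those attempts


def expected_goals(xg_values):
    """Sum the per-attempt probabilities to get the expected goal total."""
    return sum(xg_values)


def over_performance(goals, xg_values):
    """Positive values mean the player scored more than the model expected."""
    return goals - expected_goals(xg_values)


total_xg = expected_goals(shot_xg)
diff = over_performance(actual_goals, shot_xg)
print(f"xG {total_xg:.2f}, goals {actual_goals}, difference {diff:+.2f}")
```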

Here's how the two leading non-penalty scorers of 2015/16 fared compared to the aggregated total of their expected goals.

Both over-performed, Aguero more so than Kane, but we can better visualise this disconnect by simulating each of the 111 non-penalty attempts taken by Aguero to see the range of season-long goal totals predicted by the model.

There's around an 8% chance that the average player model would equal or better Aguero's 20 non-penalty goals from his 111 chances in 2015/16.
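For anyone wanting to try this kind of simulation, here is a rough sketch. Each attempt is treated as an independent coin flip weighted by its expected goals value. Aguero's actual per-attempt values aren't published here, so the list below is a flat placeholder and the function names are my own.

```python
import random


def simulate_season_totals(xg_values, n_sims=10_000, seed=0):
    """Simulate a season n_sims times; each attempt scores with its own xG."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for p in xg_values) for _ in range(n_sims)]


def prob_at_least(totals, goals):
    """Share of simulated seasons with at least `goals` goals."""
    return sum(t >= goals for t in totals) / len(totals)


# Placeholder only: 111 attempts all given the same value. Real attempts
# vary widely in quality, which changes the shape of the distribution.
hypothetical_xg = [0.13] * 111

totals = simulate_season_totals(hypothetical_xg)
p_20_plus = prob_at_least(totals, 20)
print(f"Share of simulated seasons with 20+ goals: {p_20_plus:.3f}")
```

With real per-attempt values in place of the placeholder list, the histogram of `totals` gives the range of season-long goal totals and `prob_at_least` gives the tail probability.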

Thereafter the interpretation becomes more subjective.

We may, presumptuously, assume that the model is perfect and Aguero was merely lucky.

281 individual players tried to score in 2015/16, so that's a lot of individual trials, and someone is likely to over-perform to the level that Aguero did.
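A back-of-the-envelope version of that argument is sketched below. It assumes, purely for illustration, that every one of the 281 players independently has the same 8% chance of over-performing by this much. That is a simplification, as players with only a handful of attempts can't produce a tail event of the same size.

```python
n_players = 281   # players who attempted to score in 2015/16
tail_prob = 0.08  # assumed identical chance per player (a simplification)

# Chance that at least one player over-performs to this degree.
p_at_least_one = 1 - (1 - tail_prob) ** n_players

# Number of players we'd expect to see out in that tail.
expected_count = n_players * tail_prob

print(f"P(at least one) = {p_at_least_one:.6f}")
print(f"Expected number = {expected_count:.1f}")
```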

This suggests that he may subsequently enjoy more normal levels of luck and his performance may be less extreme in the future.

Or we might prefer to believe that Aguero's 20 goals are partly driven by luck, but also contain an element of finishing skill that exceeds that granted to the average player whose out-of-sample data went into producing the model.

As suggested by the title of the above graph, we can produce a second expected goals model that while not explicitly tailored to Aguero's (potential) finishing prowess, does contain elements that may act as a proxy for elusive finishing ability.


If we now simulate Aguero's 111 chances, but using a model that incorporates statistically significant variables that "may" relate to finishing skill, he becomes less "lucky". His 20 goals are now much less unlikely. The new model predicts he would score 20 or more in nearly 40% of seasons.

Overall, this new set of variables (I can't be more specific, sorry) inflates the individual expected goals values of players such as Aguero and Kane, who possess the new variable, and reduces the figures for those who don't.

Overall, a model that allows for a differential in finishing ability across all players who attempt to score in a typical season reduces such indicators as the RMSE in out-of-sample data.
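For reference, RMSE is straightforward to compute once you have each model's predictions for held-out data. The sketch below applies it to season goal totals per player. That level of aggregation is my assumption, and every number is a made-up placeholder rather than output from either model.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predictions and observed values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical held-out goal totals and two sets of model predictions.
actual_goals = [12, 7, 3, 9]
average_player_pred = [9.1, 7.8, 3.9, 7.2]    # placeholder values
finishing_proxy_pred = [11.0, 7.3, 3.4, 8.5]  # placeholder values

print("average player model:", round(rmse(average_player_pred, actual_goals), 3))
print("finishing proxy model:", round(rmse(finishing_proxy_pred, actual_goals), 3))
```

The lower figure indicates the model whose predictions sit closer to what actually happened in data it never saw during fitting.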

Under a model that includes a proxy term for finishing skill, Aguero scores only one more goal than predicted in out-of-sample data from 2015/16, and Kane scores exactly the number predicted by the model.

Perhaps more importantly, Aguero's 2015/16 season shows a substantially better goodness of fit at the individual attempt level under the second model than under the first.
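One common way to score fit attempt by attempt is the average log loss, where lower is better. I'm using it here as an assumed stand-in, since the exact goodness-of-fit measure isn't the point, and the probabilities below are invented for illustration.

```python
import math


def log_loss(probs, outcomes, eps=1e-12):
    """Mean negative log-likelihood of 0/1 outcomes given predicted probabilities."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)


# Hypothetical attempts: 1 = goal, 0 = no goal.
outcomes = [1, 0, 0, 1, 0]
model_one = [0.10, 0.05, 0.20, 0.30, 0.08]  # placeholder xG values
model_two = [0.18, 0.06, 0.22, 0.41, 0.09]  # placeholder xG values

print("model one:", round(log_loss(model_one, outcomes), 4))
print("model two:", round(log_loss(model_two, outcomes), 4))
```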

Tuesday, 22 November 2016

Burnley's Unsustainable Survival Technique.

Monday night's live game pitted two of the Premier League's more dour sides against each other.

WBA are the magnificent Tony Pulis' current port of call, where they are the recipients of his exclusive brand of pundit-flummoxing survival techniques.

Meanwhile, Burnley are getting by on a meagre 0.8 expected goals per game. They are conceding an average of 2.1 expected goals per game and, through the grace of the probabilistic gods, actually allowing just 1.4 real goals.

That's not a Pulis approved survival approach, at least in the long term, but it has given Sean Dyche's side a few notable results.

Top of the tree of upsets was Burnley's 2-0 early season win at home to Liverpool, where Dyche tired out his opponents, not by engaging them in a pressing foot race, but by nicking an early lead and then handing them dozens of goal attempts.

All of which they missed.

The blueprint of being overwhelmed, but showcasing the England credentials of your defence, was wheeled out again at Old Trafford for the approval of Jose. And while Burnley didn't quite manage to nick a goal here, they did keep their goal intact for a welcome point.

Sandwiched in between was another expected goals beating at the hands of a top six contender where the reality better reflected the distribution of the quality and quantity of chances created in the game.

Chelsea's invite left Burnley nursing a 3-0 loss.

On the surface, Burnley had made a comfortable start to their renewed acquaintance with the Premier League. "They look far better equipped for survival this time around, sitting comfortably in 9th place" might have been something that was written about the Clarets prior to Monday's game.

But scratch beneath the media soundbites and Burnley's well-being is supported by a large helping of unsustainable variance.

Hats off to the 14 Burnley players who withstood the battering from an 11-man and then ten-man Manchester United in late October, but simulate the exercise thousands of times and a United win is by far the most likely outcome of the three possible results.
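A single match can be replayed in much the same way as a player's season. In this sketch every attempt is an independent trial at its expected goals value, and the two sides' totals are compared in each replay. The shot lists are hypothetical stand-ins for a very one-sided game, not the actual attempts from Old Trafford.

```python
import random


def simulate_match(home_xg, away_xg, n_sims=10_000, seed=0):
    """Return the share of simulated matches ending home win, draw, away win."""
    rng = random.Random(seed)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        home_goals = sum(rng.random() < p for p in home_xg)
        away_goals = sum(rng.random() < p for p in away_xg)
        if home_goals > away_goals:
            counts["home"] += 1
        elif home_goals < away_goals:
            counts["away"] += 1
        else:
            counts["draw"] += 1
    return {result: n / n_sims for result, n in counts.items()}


# Hypothetical attempts: a dominant home side against a visitor with few chances.
home_shots = [0.08] * 25 + [0.30] * 3
away_shots = [0.05] * 4

result_probs = simulate_match(home_shots, away_shots)
print(result_probs)
```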

Simulate all 120 matches, along with the multitude of possible tables, thousands of times and Burnley's most likely current position is... bottom, rather than the more comfortable 9th they occupied prior to match week 12.
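Extending the idea to a whole table needs only a result probability for each fixture. The sketch below draws a result for every match, totals the points, ranks the sides and counts how often a team lands in each position. The three-team fixture list and its probabilities are invented for illustration, and ties on points are broken at random rather than by goal difference.

```python
import random
from collections import Counter


def simulate_positions(fixtures, team, n_sims=5_000, seed=0):
    """fixtures: (home, away, p_home_win, p_draw, p_away_win) tuples.

    Returns {position: share of simulations} for `team`.
    """
    rng = random.Random(seed)
    teams = sorted({side for fixture in fixtures for side in fixture[:2]})
    positions = Counter()
    for _ in range(n_sims):
        points = dict.fromkeys(teams, 0)
        for home, away, p_home, p_draw, _p_away in fixtures:
            r = rng.random()
            if r < p_home:
                points[home] += 3
            elif r < p_home + p_draw:
                points[home] += 1
                points[away] += 1
            else:
                points[away] += 3
        # Simplification: level points are separated by a random tiebreak.
        ranked = sorted(teams, key=lambda t: (points[t], rng.random()), reverse=True)
        positions[ranked.index(team) + 1] += 1
    return {pos: n / n_sims for pos, n in sorted(positions.items())}


# Hypothetical mini-league with made-up result probabilities.
fixtures = [
    ("A", "B", 0.60, 0.25, 0.15),
    ("B", "C", 0.40, 0.30, 0.30),
    ("C", "A", 0.20, 0.25, 0.55),
]
position_probs = simulate_positions(fixtures, "B")
most_likely = max(position_probs, key=position_probs.get)
print(position_probs, "most likely position:", most_likely)
```

The per-fixture probabilities could themselves come from replaying each match's attempts, so the table simulation sits on top of the match simulation.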

Of course, points already won are kept, no matter how ill-gotten or deserved, and should Burnley continue their idiosyncratic survival process, coupled with their recent showing in the Championship, they probably won't finish in their current expected position of bottom in May.

They'll most probably finish 19th.

If you want to check out all of Burnley's shot maps, along with all Premier League games for the last three seasons, download the free Infogol app