Saturday, 9 December 2017

Know Your Limits

All predictions come with the caveat that there is a spread of uncertainty either side of the most likely outcome.

A side may be odds on to win almost all of their matches over a season, as Manchester City have very nearly shown in 2017/18, but there is a finite, if extremely small, chance that they will actually lose all 38 matches.

Similarly, there is a bigger chance that they will win all 38, but the most likely scenario sits between these two extremes and for the current best team in the Premier League, winning the title with around 96 points is the most expected final outcome in May.

While single, definitive predictions are more newsworthy, they imply a precision that is never available about the longer term futures, especially about a sporting contest, such as a Premier League season that comprises low scoring matches spread over 380 games.

It's therefore useful to attach the degree of confidence we have in our predictions to any statements we make about a future outcome, particularly as new information about teams feeds into the system and the competition progresses, turning probabilistic encounters into actual outcomes of 0, 1 or 3 points.

Here's the range of points that a simulation of the 2016/17 Premier League, using xG based ratings for each team, produced for Swansea before a ball was kicked.

Swansea had been in relative decline since their impressive introduction into the top tier, playing much admired possession football, mainly as a defensive tactic, that had seen them finish as high as 8th in 2014/15, 21 points clear of the drop zone.

2015/16 had seen them fall to 12th, just ten points from the drop zone and much of their xG rating for 2016/17 was based around this less impressive performance.

The top end of their points totals over 10,000 simulations resulted in a top 10 finish with 52 points, but the lower end left them relegated with 27 points and their mode of 36 final points suggested a season of struggle.
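A minimal sketch of how such a range can be produced, assuming you already have win and draw probabilities for each of a side's matches. The fixture odds below are invented for illustration and are not the xG ratings used for Swansea; the function names and the choice of a 95% interval are likewise assumptions.

```python
import random
from collections import Counter

def simulate_points(fixtures, banked_points=0, n_sims=10_000, seed=1):
    """Simulate final points totals.

    fixtures: list of (p_win, p_draw) pairs, one per match still to play.
    banked_points: points already won in matches that have been played.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        points = banked_points
        for p_win, p_draw in fixtures:
            r = rng.random()
            if r < p_win:
                points += 3          # win
            elif r < p_win + p_draw:
                points += 1          # draw
        totals.append(points)
    return totals

def limits_and_mode(totals, level=0.95):
    """Return (lower limit, most common total, upper limit)."""
    ordered = sorted(totals)
    tail = (1 - level) / 2
    low = ordered[int(tail * len(ordered))]
    high = ordered[int((1 - tail) * len(ordered)) - 1]
    mode = Counter(ordered).most_common(1)[0][0]
    return low, mode, high

# Hypothetical struggling side: the same made-up odds in all 38 matches.
fixtures = [(0.25, 0.25)] * 38
low, mode, high = limits_and_mode(simulate_points(fixtures))
print(low, mode, high)
```

The same function can be rerun during the season by passing the points already won as banked_points and only the remaining fixtures, which is what makes the limits narrow as games are played.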

This is illustrated by the dial plot, where the range extends well into the red zone that signifies relegation.

After ten games, we now have more information, both about Swansea and the other 19 Premier League teams and about the most likely survival cut off points in the 2016/17 league.

At the time, Swansea were 19th with five points from ten games and while the grey portion of mid table is still achievable, it has shrunk and the Swans' low point has fallen deeper into the red.

After thirty games, so with just eight left, the gap between the upper and lower limits for Swansea after the full 38 games has narrowed. They are still more likely than not to be relegated, according to the updated xG model, but there is still some chance that they will survive.

In reality, Swansea were in the bottom three with three games left. A win for them and a defeat for Hull in game week 36 was instrumental in retaining their top flight status, and it was as close as the final plot suggested it might be.

Adding indications of confidence in your model enhances any information you may wish to convey.

It's also essential when using xG simulations to "predict" the past, such as drawing conclusions about a player's individual xG and his actual scoring record.

Adding high and low limits will highlight if any over or under performance against an average model based simulation is noteworthy or not.
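As a rough sketch of that idea, the snippet below simulates a player's shots many times and checks whether his actual goal tally falls outside the simulated limits. The shot values and the goal tally are invented, and goal_limits is a hypothetical helper, not part of any particular xG model.

```python
import random

def goal_limits(shot_xg, n_sims=10_000, level=0.95, seed=1):
    """Simulate every shot as an independent chance and return the
    lower and upper limits of the simulated goal totals."""
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.random() < xg for xg in shot_xg) for _ in range(n_sims)
    )
    tail = (1 - level) / 2
    return totals[int(tail * n_sims)], totals[int((1 - tail) * n_sims) - 1]

# Hypothetical player: 60 shots of mixed quality (values invented).
shot_xg = [0.05] * 30 + [0.10] * 20 + [0.35] * 10
actual_goals = 11  # invented

low, high = goal_limits(shot_xg)
# Only call the over or under performance noteworthy if it sits outside the limits.
noteworthy = not (low <= actual_goals <= high)
print(round(sum(shot_xg), 2), low, high, noteworthy)
```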

One final point. The upper and lower limits can be chosen to illustrate different levels of confidence, typically 95%. But this does not mean that a side's final points total and thus finishing position has a 95% chance of lying within these two limits.

It is more your model that is on trial.

There is a 95% chance that any new prediction made for a team by your model will lie within these upper and lower limits.

Hopefully, your model will have done a decent job of evaluating a side, in this case Swansea from 2016/17. But if it hasn't, Swansea's actual finishing position may lie elsewhere.

Wednesday, 29 November 2017

Over Performers Aren't Always Just Lucky.

Firstly, this isn't another post about whether Burnley are good at blocking shots because "yes they are".

Instead it's about applying some kind of context to levels of over or under performance in a side's performance data, and attempting to attribute how much is the result of the ever present random variation in inevitably small samples and how much is perhaps due to a tactical wrinkle and/or differing levels of skill.

Random variation termed as "luck" is probably the reddest of rags to a casual fan or pundit who is uninterested in, or outwardly hostile to, the use of stats to help to describe their beautiful game.

It's the equivalent for anyone with a passing interest in football analytics of "clinical" being used ad nauseam, all the way to the mute button by Owen Hargreaves.

Neither of these two catch-all, polar opposite terms is particularly helpful when used in isolation. Most footballing events are an ever shifting, complex mixture of the two.

I first started writing about football analytics through being more than mildly annoyed that TSR (or Total Shot Ratio, look it up) and its supporters constantly branded Stoke as being that offensive mix of "rubbish at Premier League football" and constantly lucky enough to survive season after season.

And then choosing the Potters as the trendy stats pick for relegation in the next campaign as their "luck" came deservedly tumbling down.

It never did.

Anyone bothered enough to actually watch some of their games could fairly quickly see that through the necessity of accidentally getting promoted with a rump of Championship quality players, Stoke or more correctly Tony Pulis, were using defensive shapes and long ball football to subvert both the beautiful game and the conclusions of the helpful, but deeply flawed and data poor, TSR stat.

There weren't any public xG models around in 2008. To build one meant sacrificing most of Monday collecting the data by hand and Thursday as well when midweek games were played.

But, shot data was readily available, hence TSR.

At its most pernicious, TSR assumed an equality of chance quality.

So getting out-shot, as Stoke's setup virtually guaranteed they would be every single season, was a cast iron guarantee of relegation once your luck ran out in this narrow definition of "advanced stats".

Quantifying chance quality in public was a few years down the road, but even with simple shot numbers, luck could be readily assigned another constant bedfellow in something we'll call "skill".

There comes a time when a side's conversion rate on both sides of the ball is so far removed from the league average rates that TSR relied upon that you had to conclude that something (your model) was badly broken when applied to a small number of teams.

We don't need to build an xB model to see Burnley as being quite good at blocking shots, just as we didn't need a laboriously constructed expected goals model to show that Stoke's conversion disconnects were down to them taking fewer, good quality chances and allowing many more, poorer quality ones back in 2008.

Last season, the league average rate at which open play attempts were blocked was 28%. Burnley faced 482 such attempts and blocked 162, or 34%.

A league average team would have only blocked 137 attempts under a naive, know nothing but the league average, model.
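The arithmetic behind the naive model is easy to check, and a binomial tail gives a feel for how unusual Burnley's count would be if every attempt really were blocked at the league rate. This is a sketch using only the rounded figures quoted above; the rounded 28% gives an expectation of about 135, so the small gap to 137 is presumably down to rounding in the league rate.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) when X is the number of successes in n independent
    trials, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

attempts, blocks, league_rate = 482, 162, 0.28

# Blocks expected from a side that knows nothing but the league average.
expected_blocks = attempts * league_rate

# Chance that such a side blocks at least as many attempts as Burnley did.
p_at_least = binomial_tail(attempts, blocks, league_rate)

print(round(expected_blocks), round(p_at_least, 4))
```

The tail probability comes out well below 1%, which is why it is hard to put the whole of Burnley's return down to randomness.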

Liverpool had the lowest success rate under this assumption that every team has the same inbuilt blocking intent/ability. They successfully blocked just 21% of the 197 opportunities they had to put their bodies on the line.

You're going to get variation in blocking rate, even if each team has the same inbuilt blocking ability and the likelihood of a chance being blocked evens out over the season.

But you're unlikely to get the extremes of success rates epitomized by Burnley and Liverpool last season.

You'll improve this cheap and cheerful, TSR type blocking model for predictive purposes by regressing the observed blocking rates of both Liverpool and Burnley towards the mean.

You'll need to regress Liverpool's more because they faced many fewer attempts, but the Reds will still register as below average and the Claret and Blues above.
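One common way to do this kind of regression, shown here purely as an illustrative sketch, is to add a fixed number of league average "phantom" attempts to each side's record. The weight of 300 phantom attempts is an arbitrary assumption, not a fitted value, and Liverpool's block count of 41 is derived from the rounded 21% of 197.

```python
def regressed_rate(blocks, attempts, league_rate, prior_attempts):
    """Shrink an observed blocking rate towards the league rate by mixing
    in prior_attempts phantom attempts blocked at the league rate."""
    return (blocks + league_rate * prior_attempts) / (attempts + prior_attempts)

LEAGUE_RATE = 0.28
PRIOR_ATTEMPTS = 300  # hypothetical weight given to the league average

burnley = regressed_rate(162, 482, LEAGUE_RATE, PRIOR_ATTEMPTS)

liverpool_blocks = round(0.21 * 197)  # about 41, derived from the rounded rate
liverpool = regressed_rate(liverpool_blocks, 197, LEAGUE_RATE, PRIOR_ATTEMPTS)

# Share of each estimate that comes from the league average rather than the side.
burnley_shrink = PRIOR_ATTEMPTS / (482 + PRIOR_ATTEMPTS)
liverpool_shrink = PRIOR_ATTEMPTS / (197 + PRIOR_ATTEMPTS)

print(round(burnley, 3), round(liverpool, 3))
print(round(burnley_shrink, 2), round(liverpool_shrink, 2))
```

With fewer attempts faced, Liverpool's estimate leans more heavily on the league average, yet both sides stay on their original side of 28%.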

In short, you can just use counts and success rates to analyse blocking in the same way as TSR looked at goals, but you can also surmise that the range and difference in blocking ability that you observe may be down to a bit of tactical tinkering/skillsets as well as randomness in limited trials.

In the real world, teams will face widely differing volumes, the "blockability" of attempts will vary and perhaps not even out for all sides and some managers will commit more potential blockers, rather than sending attack minded players to create havoc at the other end of the field.

With more data, and I'm lucky to have access to it in my job, you can easily construct an xB model. And some teams will outperform it (Burnley). But rather than playing the "luck" card you can stress test your model against these outliers.

There's around a 4% chance that a model populated with basic location/shot type/attack type parameters adequately describes Burnley's blocking returns since 2014.
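One way such a stress test might be run, offered as a sketch rather than the method used here, is to simulate every attempt a side faced using the model's block probability for that attempt and count how often the simulated total reaches the observed one. The per-attempt probabilities and the observed total below are invented and are not Burnley's data.

```python
import random

def share_of_sims_reaching(block_probs, observed_blocks, n_sims=10_000, seed=1):
    """Fraction of simulations in which the model produces at least
    as many blocks as were actually observed."""
    rng = random.Random(seed)
    reached = 0
    for _ in range(n_sims):
        blocks = sum(rng.random() < p for p in block_probs)
        if blocks >= observed_blocks:
            reached += 1
    return reached / n_sims

# Hypothetical xB values for 500 attempts faced (invented, in three bands).
block_probs = [0.15] * 200 + [0.30] * 200 + [0.45] * 100
observed_blocks = 152  # invented observed total

model_expectation = sum(block_probs)
tail_share = share_of_sims_reaching(block_probs, observed_blocks)
print(round(model_expectation), tail_share)
```

A small share suggests the model, rather than luck alone, is the thing to question for that side.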

That's perhaps a clue that Burnley are a bit different and not just "Stoke" lucky.

The biggest over-performing disconnect is among opponent attempts that Burnley faced that were quite likely to be blocked in the first place. So that's the place to begin looking.

And as blocking ability above and beyond inevitably feeds through into Burnley's likelihood of conceding actual goals, you've got a piece of evidence that may implicate Burnley as being a more acceptable face of over-performance in the wider realms of xG for the enlightened analytical crowd to stomach than Stoke were a decade ago.

Wednesday, 22 November 2017

An xG Timeline for Sevilla 3 Liverpool 3.

Expected goals is the most visible public manifestation of a data driven approach to analyzing a variety of footballing scenarios.

As with any metric (or subjective assessment, so beloved of Soccer Saturday) it is certainly flawed, but useful. It can be applied at a player or team level and can be used as the building block to both explain past performance or track and predict future levels of attainment.

Expected goals is at its most helpful when aggregated over a longer period of time to identify the quality of a side's process, and it may more accurately predict the course of future outcomes than relying on the more statistically noisy conclusions that arise from simply taking scorelines at face value.

However, it is understandable that xG is also frequently used to give a more nuanced view of a single game, despite the intrusion of heaps of randomness and the frequent tactical revisions that occur because of the state of the game.

Simple addition of the xG values for each goal attempt readily provides a process driven comparison against a final score, but this too has obvious, if easily mitigated flaws.

Two high quality chances, within seconds of each other can hardly be seen as independent events, although a simple summation of xG values will fail to make the distinction.

There were two prime examples from Liverpool's entertaining 3-3 draw in Sevilla, last night.

Both Firmino goals followed on within seconds of another relatively high quality chance, the first falling to Wijnaldum, the second to Mane.

Liverpool may have been overwhelming their hosts in the first half hour, and they were alert enough to have Firmino on hand to pick up the pieces from two high quality failed chances, but a simple summation of these highly related chances must overstate Liverpool's dominance to a degree.

The easy way around this problem is to simulate highly dependent scoring events as such, to prevent two goals occurring from two chances separated by one or two seconds.

It's also become commonplace to expand on the information provided by the cumulative xG "scoreline" by simulating all attempts in a game, with due allowance for connected events, to quote how frequently each team wins an iteration of this shooting contest and how often the game ends stalemated.
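A bare bones version of such a simulation might look like the sketch below. Attempts are grouped into "chains" of connected chances, and a chain can produce at most one goal. The grouping convention, the function name and every xG value are assumptions made for illustration; they are not the figures from this match.

```python
import random

def simulate_match(home_chains, away_chains, n_sims=10_000, seed=1):
    """Each chain is a list of xG values from one connected passage of
    play. A chain scores if any attempt in it is converted, but it can
    only ever contribute a single goal."""
    rng = random.Random(seed)

    def goals(chains):
        return sum(any(rng.random() < xg for xg in chain) for chain in chains)

    results = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_sims):
        home, away = goals(home_chains), goals(away_chains)
        if home > away:
            results["home"] += 1
        elif away > home:
            results["away"] += 1
        else:
            results["draw"] += 1
    return {outcome: count / n_sims for outcome, count in results.items()}

# Invented example: the home side has a penalty and some low quality efforts,
# the away side has several good chances, two pairs of them connected.
home_chains = [[0.76], [0.05], [0.08], [0.10], [0.04]]
away_chains = [[0.15, 0.35], [0.30, 0.40], [0.45], [0.25], [0.10]]

probs = simulate_match(home_chains, away_chains)
print(probs)
```

Note the effect on a connected pair such as 0.30 and 0.40: summed, they contribute 0.70 xG, but as a single event the chance of a goal is 1 - 0.70 x 0.60 = 0.58.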

Here's the xG shot map and cumulative totals from last night's match from the InfoGolApp.

There's a lot of useful information in the graphic. Liverpool outscored Sevilla in xG. They had over half a dozen high quality chances, some connected, compared to a single penalty and other, lower quality efforts for the hosts.

Once each attempt is simulated and the possible outcomes summed, Liverpool win just under 60% of these shooting contests, Sevilla 18%, with the remainder drawn.

Simulation is an alternative way of presenting xG outputs, rather than as totals, that accounts for connected events and for the variance inherent in lots of lower quality attempts compared to fewer, better chances. It also describes the most likely match outcomes in a probabilistic way that some may be more comfortable with.

Liverpool "winning" 2.95-1.82 xG may be a more intuitive piece of information for some (although as we've seen it may be flawed by failing to adequately describe distributions and multiple, common events), compared to Liverpool "winning" nearly 6 out of ten such contests.

None of this is ground breaking and I've been blogging about this type of application for xG figures for years. But there's no real reason why we need to wait until the final whistle to run such simulations of the attempts created in a game.

xG timelines have been used to show the accumulation of xG by each team as the game progresses, but suffer particularly from a failure to highlight connected chances.

In a simulation based alternative, I've run 10,000 attempt simulations of all attempts that had been taken up to a particular stage in last night's game.

I've then plotted the likelihood that either Liverpool or Sevilla would be leading, or that the game would be level, based on the outcome of those attempt simulations.

Liverpool's first dual attempt event came in the first minute. Wijnaldum's misplaced near post header, immediately followed by Firmino's far post shot.

Simulated as a single event, there's around a 45% chance Liverpool lead, 55% chance the game is still level and (not having had an attempt yet) a 0% chance Sevilla are ahead.
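Here is a sketch of how one point on such a timeline could be computed. It assumes each attempt is recorded as (minute, team, chain id, xG), with connected attempts sharing a chain id. The xG values are invented; the opening pair was picked so that its combined chance of a goal, 1 - 0.85 x 0.65, lands near the 45% quoted above, and the team labels simply treat the visitors as "away".

```python
import random
from collections import defaultdict

def state_probabilities(attempts, minute, n_sims=10_000, seed=1):
    """Chance that the home side leads, the away side leads, or the game
    is level, simulating only attempts taken up to the given minute."""
    chains = defaultdict(list)
    for shot_minute, team, chain_id, xg in attempts:
        if shot_minute <= minute:
            chains[(team, chain_id)].append(xg)

    rng = random.Random(seed)
    tally = {"home": 0, "level": 0, "away": 0}
    for _ in range(n_sims):
        score = {"home": 0, "away": 0}
        for (team, _chain), xgs in chains.items():
            # A chain of connected attempts yields at most one goal.
            score[team] += any(rng.random() < xg for xg in xgs)
        if score["home"] > score["away"]:
            tally["home"] += 1
        elif score["away"] > score["home"]:
            tally["away"] += 1
        else:
            tally["level"] += 1
    return {state: count / n_sims for state, count in tally.items()}

# Invented attempts: a connected pair for the visitors in the first minute,
# then two separate low quality efforts for the hosts.
attempts = [
    (1, "away", 1, 0.15),
    (1, "away", 1, 0.35),
    (19, "home", 2, 0.08),
    (19, "home", 3, 0.06),
]

timeline = {m: state_probabilities(attempts, m) for m in (1, 19)}
print(timeline)
```

Plotting the three probabilities against the minute at which they were calculated gives the simulation based timeline.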

If you re-run the now four attempt simulation following Nolito's & Ben Yedder's efforts after 19 minutes, a draw is marginally the most likely current state of the game, followed by a lead for either team.

A flurry of high quality chances then makes the Reds a near 90% chance to reach half time with a lead, enabling the halftime question as to whether Liverpool are deservedly leading to be answered with a near emphatic, yes.

Sevilla's spirited, if generally low quality second half comeback does eat into Liverpool's likelihood of leading throughout the second half, but it was still a match that the visitors should have returned from with an average of around two UCL points.