
Tuesday, 21 May 2013

How Important Are "Six Pointers"?

Imagine an alternative end of the season. Wigan and Aston Villa find themselves locked on 40 points going into week 38, trailing a host of teams, each with 42 points. So there is an anxious final 90 minutes in store, not only for Wigan and Villa in the fight to avoid the Premiership's final relegation spot, but also for the raft of sides two points in advance of them.

Except, because of the vagaries of the fixture list, only Villa and Wigan are at risk of relegation. Villa visit Wigan on the final Sunday and so everyone else is perfectly safe because the two lowly rivals cannot both get three more points to overhaul the pack. One of the two final day rivals is guaranteed to finish the season in the final relegation spot.

Such considerations quickly become major factors when simulating potential points totals for various teams or scenarios. The fixture list is thoroughly entwined and if Stoke defeat Arsenal at the Britannia in one particular simulation, the result for Arsenal on their travels to the Potteries for this particular iteration must also reflect this.

The hypothetical Wigan/Villa scenario boils down to a winner (or draw) takes all, but the importance of so called "six pointers", where teams who are likely to be locked together in the final table meet head to head, has long been recognized. The most eagerly dissected head to head confrontation this season involved long term rivals Arsenal and Tottenham and, with all title hopes extinguished by January, their fight for fourth place and Champions League football in 2013/14.

With all due respect to Everton, who found themselves sandwiched between the two London sides at the end of January, I am going to merely simulate the post transfer window campaign from the perspective of Arsenal and Spurs, taking particular account of the effect on the simulation of the result of the North London derby played on March 3rd at White Hart Lane. A true "six pointer", with Arsenal still trailing Spurs by 4 points at the time.

Once the window shut in January both sides had 14 games left. Arsenal, it would transpire, had the easier run in. The median position of their opponents at the time of each match during the final third of the season was 12th compared to a more elevated 9.5 for Tottenham. That advantage was partially counter balanced by the use Spurs had already made of a less onerous set of fixtures. They led Arsenal at the start of February by three points after each side had played 24 games.

The race for fourth therefore appeared very tight and so it proved. In the simulations, which account for strength of schedule and simulate matches played from February onwards, Spurs won 51% of the races where there was a clear league points winner. However, 6% of the races ended with both sides tied on league points and Arsenal's already superior goal difference at the start of February would make them much more likely to win the majority of these simulated contests on the tie breaker. If we take the best case scenario for Arsenal, where they win every tied race on the tie breaker, they now grab fourth spot in 52% of the simulations.

So how pivotal in the simulations was the head to head encounter in March? In simulations where Spurs beat Arsenal, their share of the Champions League winning spoils rose from around 50% of the simulations to nearly 70%, with almost the absolute reverse being the case on the occasions where the Gunners triumphed. A drawn game saw Tottenham and Arsenal winning virtually identical percentages of the trials.
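A minimal sketch of this kind of conditional simulation is shown here. Every number in it is an illustrative assumption (the starting gap, the win and draw probabilities and the 13 non derby games per side are hypothetical inputs, not the ratings used for the figures above). The one structural point it does capture is that the derby is drawn once per iteration and the same result is applied to both teams.

```python
import random

def simulate_race(n_sims=20000, seed=1):
    """Share of decided races won by Spurs, split by the derby result."""
    rng = random.Random(seed)
    start = {"Spurs": 3, "Arsenal": 0}            # hypothetical points gap
    # Hypothetical (win, draw) probabilities for each side's 13 other games.
    other = {"Spurs": (0.48, 0.26), "Arsenal": (0.58, 0.24)}
    derby_spurs_win, derby_draw = 0.40, 0.28      # hypothetical derby odds
    spurs_ahead = {"spurs_win": 0, "draw": 0, "arsenal_win": 0}
    decided = dict(spurs_ahead)
    for _sim in range(n_sims):
        pts = dict(start)
        for team, (p_win, p_draw) in other.items():
            for _game in range(13):
                r = rng.random()
                pts[team] += 3 if r < p_win else 1 if r < p_win + p_draw else 0
        # One draw decides the derby for BOTH sides, keeping fixtures consistent.
        r = rng.random()
        if r < derby_spurs_win:
            outcome = "spurs_win"
            pts["Spurs"] += 3
        elif r < derby_spurs_win + derby_draw:
            outcome = "draw"
            pts["Spurs"] += 1
            pts["Arsenal"] += 1
        else:
            outcome = "arsenal_win"
            pts["Arsenal"] += 3
        if pts["Spurs"] != pts["Arsenal"]:        # ignore races tied on points
            decided[outcome] += 1
            spurs_ahead[outcome] += pts["Spurs"] > pts["Arsenal"]
    return {k: spurs_ahead[k] / decided[k] for k in decided}

shares = simulate_race()
for outcome, share in shares.items():
    print(f"{outcome}: Spurs finish ahead in {share:.0%} of decided races")
```

Splitting the tallies by derby outcome is what shows the "six pointer" effect: the same simulated seasons are simply grouped by who won the head to head.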

Bale's opening strike in the North London derby would ordinarily have led to Champions League football.
So in simulations, a head to head result from a near level break in January appears to give a large boost to the winning side. Spurs did in reality beat Arsenal in March, but they still came up short in May. Arsenal were always likely to gather more points than Spurs during the post window run in (they did so in 73% of the simulations), but Spurs' win at White Hart Lane would, more often than not, have been decisive. Expensive, post Europa losses to the likes of Fulham may fuel another debate and, combined with Arsenal's impressively fine run in, they mean the Gunners can look forward to high quality European action next term.

I'll flesh out the numbers in a later post, but more good stuff based around the race for 4th spot can be found here from Simon Gleave and here from James Grayson.


Sunday, 19 May 2013

Possession, Opponents And Match Outcome.

One of the early additions to the usually quoted football stats of goals scored and conceded was the amount of possession enjoyed by each side during a match and as such it is almost universally quoted today. It is therefore understandable that much effort has gone into determining the connection between possession and success or otherwise on the field.

Naturally from a supporter's viewpoint it feels more secure if your own side has the ball and similarly the ability to keep possession is often, quite rightly connected with talent and skill. If we further include that the most successful and well resourced teams in most major leagues are largely based around a passing and therefore possession based style it is easy to see how increased amounts of possession became intertwined with an increasing likelihood of achieving a favourable outcome.

However, it is increasingly becoming apparent that the relationship between possession and wins is far from straightforward. In this post I outlined an earlier view that possession merely tells you how long teams spent trying to do certain things on the pitch. It doesn't tell you what those actions were, it doesn't tell you how successfully the actions were translated into really important things, such as goals, and it doesn't tell you how effectively each side carried out those tasks. Game states, tactical approaches and relative skill levels between the sides, and of course randomness, decide match outcomes, and how these are played out on the day decides the largely secondary statistical measure that is recorded as possession.

Extremes can often be used to illustrate more subtle differences which appear in all matches but are difficult to spot when the teams play with similar natures and intent. Barca are unlikely to ever compete in everyone's dream matchup against Stoke at the perennially wet and windy Britannia, but pass loving Arsenal provide an adequate proxy for the Catalan giants. The outcome is fairly predictable at the Emirates, but less so at Stoke, where the Potters often claim all three points. But one universal constant persists, namely win, lose or draw, Arsenal always have much, much more of the ball than Stoke.

The match outcome is decided by the interplay of Stoke's direct, set piece centric approach, where defending is a chore undertaken largely without the ball, pitched against Arsenal's weaving, intricate brand of passing. Possession stats merely fall into place at the end of the game as a by product. So if possession in the case of Arsenal and Stoke contests is a predictable variable based around team styles, that is partially hard baked into their contests and is largely independent of match outcome, how do other less stylistically extreme matches fare?

Below I've plotted the amount of actual possession enjoyed by Stoke and Arsenal in every game from 2011/12 against the average possession of their opponents over a representative selection of matches.




The trend is clear and even more prominent across other EPL teams. A side's share of match possession is tied to the historical tendencies of both itself and its opponent. For example, when Stoke meet a side which also employs a tactical approach that shuns possession, then Stoke's share rises and when Arsenal face similarly possession loving sides their share falls. In short, the possession battle is decided largely before a ball is kicked and while it can be shifted slightly by such things as red cards, venue and scoreline, it is partly predictable with reference to the past styles of the competing teams.

The Stoke plot particularly highlights the futility of trying to connect possession stats to match outcome without reference to a side's preferred, and presumably most effective playing style. Stoke won eleven games in 2011/12, all achieved with less than 50% possession and they dominated possession in three games, winning none of the three.

League wide the connection persists. Pregame, historical possession stats for both sides, along with venue can predict individual match possession relatively well, as demonstrated by the plot below. Red cards in particular produce distorted extremes, but prior knowledge of each side's possession history leads to an adequate estimation of how often each side will see the ball in a single match.
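As a rough illustration of how two historical averages can be combined into a single match estimate, the sketch below uses a simple odds ratio. This is a stand in chosen for brevity, not the model behind the plot, and both the function name and the `home_edge` adjustment are hypothetical.

```python
def expected_possession(team_avg, opp_avg, home_edge=0.0):
    """Estimate a side's share of possession (0 to 1) in a single match.

    team_avg and opp_avg are each side's historical average share.
    home_edge is an assumed additive venue adjustment (hypothetical).
    """
    team_odds = team_avg / (1 - team_avg)
    opp_odds = opp_avg / (1 - opp_avg)
    # Two sides with identical histories split the ball evenly.
    return team_odds / (team_odds + opp_odds) + home_edge

# A 40% side sees less than usual against a 60% side, more against a 40% side.
print(round(expected_possession(0.40, 0.60), 3))
print(round(expected_possession(0.40, 0.40), 3))
```

The useful property is the one described above: a side's expected share moves up or down with the opponent's own appetite for the ball, before a ball is kicked.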


Stoke have managed to secure virtually all of their Premiership wins with less than 50% of match possession, and to sever the last remaining connection between possession and performance, we need to see if more possession relative to a side's normal, average share, as opposed to the lion's share of possession, correlates to more successful results.

Spurs were in a more celebratory mood when seeing less possession in 2011/12.
Defining performance over a single match is difficult because of the discrete nature of 3, 1 or 0 points awarded for each possible outcome, but we can partly overcome this by seeing if above average performance compared to pregame estimates is seen where possession figures are also above average for individual teams.

Stoke won 85% of their pregame points expectation combined in matches when they had above average (for them) possession, but 125% of their pregame points expectation when possession fell below their typical average. So more possession was generally an indication that Stoke were doing time consuming actions that, for them, were connected to under performance. Other teams shared this trait, from Everton, Spurs and QPR to both Manchester clubs.
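The comparison can be sketched as follows, assuming each match record carries the side's possession, the points won and a pregame points expectation (3 times the win probability plus the draw probability). The field names and the four example matches are hypothetical, made up purely to show the calculation.

```python
def expected_points(p_win, p_draw):
    """Pregame points expectation from win and draw probabilities."""
    return 3 * p_win + p_draw

def points_vs_expectation(matches):
    """Actual points as a fraction of expected points, split by possession.

    Matches are split at the side's own average possession, not at 50%.
    """
    avg_poss = sum(m["possession"] for m in matches) / len(matches)
    groups = {"above": [], "below": []}
    for m in matches:
        groups["above" if m["possession"] > avg_poss else "below"].append(m)
    return {
        label: sum(m["points"] for m in group)
        / sum(m["expected_points"] for m in group)
        for label, group in groups.items()
    }

# Hypothetical matches for a side that does better with less of the ball.
matches = [
    {"possession": 55, "points": 0, "expected_points": expected_points(0.20, 0.40)},
    {"possession": 50, "points": 1, "expected_points": expected_points(0.20, 0.40)},
    {"possession": 35, "points": 3, "expected_points": expected_points(0.40, 0.30)},
    {"possession": 40, "points": 3, "expected_points": expected_points(0.20, 0.40)},
]
print(points_vs_expectation(matches))
```

A ratio below 1 in the "above" group and above 1 in the "below" group is the pattern described for Stoke.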

Teams spread the length and breadth of the EPL table, from Arsenal and Chelsea to Bolton, WBA and Villa, demonstrated the reverse preference. Possession figures above their average led to better than average results and the less they saw of the ball the poorer their relative performance became.

Once again we are looking at what teams did and how long those actions took and ultimately how successful and lucky they were with their changing approaches. Possession largely appears to be a statistic that more defines a side's stylistic approach to gaining, defending or retrieving a desired result. As a stand alone number, its usefulness fails to survive the inevitable lack of more detailed context driven investigation.

Saturday, 18 May 2013

What Chance A Premiership Playoff Game At Villa Park?

Arsenal and Chelsea go into the final Sunday of the Premiership season with the prize of automatic qualification for the next year's Champions League still to be decided. Chelsea hold their fate in their own hands and victory at home to Everton will render Arsenal's result away at already safe Newcastle an irrelevance.

However, with final placings decided firstly on points gained, then on goal difference and finally on goals scored, there is a possibility that Chelsea and Arsenal could end up stalemated for third place.

Chelsea Result.    Arsenal Result.    Chance Of Both Occurring.
0-0                2-1                1 in 120
1-1                3-2                1 in 450
2-2                4-3                1 in 14,000
3-3                5-4                1 in 1,600,000
4-4                6-5                1 in 500,000,000

The Premier League have taken the possibility of Arsenal and Chelsea ending up level under all three tiebreakers so seriously that they have provisionally scheduled a playoff game to be played at Villa Park on May the 26th. Above I've listed the combinations of results that will trigger Villa Park to prepare for their biggest game of the season. Of the relevant scores, Arsenal winning 2-1 is their most likely outcome and Chelsea being held to a 1-1 draw has the highest probability of occurring for them.

Fortunately for the fate of Chelsea's American tour, which straddles the chosen date for the playoff, these two most likely individual outcomes don't pair up. A goalless game at Stamford Bridge and a 2-1 win for Arsenal away at Newcastle is the most likely combination and the cumulative chance of a 39th game for both sides comes in at around 1 chance in 90.
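Because the listed score combinations are mutually exclusive, the cumulative chance is just the sum of the individual chances. A quick check using the rounded figures from the table (so the answer is only approximate):

```python
from fractions import Fraction

# "1 in N" chances for each pair of results, as listed in the table.
one_in = {
    ("0-0", "2-1"): 120,
    ("1-1", "3-2"): 450,
    ("2-2", "4-3"): 14_000,
    ("3-3", "5-4"): 1_600_000,
    ("4-4", "6-5"): 500_000_000,
}

# Only one of these combinations can happen, so the probabilities add.
total = sum(Fraction(1, n) for n in one_in.values())
cumulative_one_in = float(1 / total)
print(f"about 1 in {cumulative_one_in:.0f}")
```

With these rounded inputs the total comes out at about 1 in 94, in the same region as the quoted figure, and it is dominated almost entirely by the first two rows.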

Stranger things have happened on the final day of the Premiership season. But for those contemplating a much more outlandish finale to the campaign, a 6-6 draw for Spurs coupled with a 15-0 defeat of Arsenal by Newcastle would result in a playoff between those two sides for the final Champions League spot. The 6-6 draw alone carries around a once in 18,000,000 chance.

Thursday, 16 May 2013

The Art And Talent Of The Corner Kick.

Stoke City's innovative style of play, involving making the very best of the limited assets available, wasn't merely restricted to the Delap/set piece routines so familiar to recent Premiership audiences. Paul Maguire, floating in the near post corners, and Brendan O'Callaghan, providing the headed goal or at worst the delicate flick on, were a staple ingredient of match days at the Victoria Ground in the early to mid eighties. Both are still fondly remembered, especially O'Callaghan, who announced his Stoke City debut with a goal within 10 seconds, as a substitute.....from a corner. However, players move on, tactics change and strategies are developed to combat every successful system and in Stoke's more recent non Premiership past, they were regarded as a side that couldn't buy a goal from a corner. Once back in the top flight, Stoke reacquainted themselves with the joys of scoring from set pieces in general and corners in particular.

If converting corners into goals is a talent that is distributed unevenly between teams and therefore, ebbs and flows across the decades, we should be able to see both repeatable team traits across seasons and conversion rates that diverge from those expected if the process was simply centred around the league average in a purely random manner.

The average goal conversion rate from corner kicks in 2011/12 was just over 3%. Highs of 5.5% were seen at Manchester City, lows of zero percent at Villa Park and an average of 200+ corners were attempted per team across the Premiership. Equality of opportunity was guaranteed for each corner at the outset of the kick because they are all taken from near identical pitch placements and the spread of the individual team success rates polarized by City and Villa over the last completed campaign, implies that some teams are more talented corner takers than others. If we attempt to account for the random variation component, we are left with conversion rates that are more indicative of the actual talent of each team and this figure is more likely to predict future performance than the actual conversion rates recorded by a side.

Manchester City were likely to have been good and lucky in 2011/12. So a conversion rate nearer to 4% than their actual figure of 5.5% should really be entered against their name and likewise a near 2% conversion rate is a more accurate legacy to Villa's corner converting prowess for the 2011/12 season. It is probable that they experienced the perfect storm of being both generally poor takers of a corner and unlucky and improvement through a variety of routes should have been expected in 2012/13.
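One common way to make this kind of adjustment is to blend each side's record with a block of league average corners. The sketch below is an assumption about the general approach rather than the exact method used here: the weight of 400 "prior" corners was picked only because it lands close to the adjusted figures quoted above, and the goal and corner counts for the two clubs are hypothetical round numbers consistent with 5.5% and 0%.

```python
def shrunk_rate(goals, corners, league_rate=0.03, prior_corners=400):
    """Pull a raw conversion rate towards the league average.

    prior_corners is an assumed weight: the more corners a side has
    actually taken, the less its raw rate is moved.
    """
    return (goals + league_rate * prior_corners) / (corners + prior_corners)

# Hypothetical counts: a 5.5% side (15 from 273) and a 0% side (0 from 200).
city_like = shrunk_rate(goals=15, corners=273)
villa_like = shrunk_rate(goals=0, corners=200)
print(f"{city_like:.1%}, {villa_like:.1%}")
```

Under these assumptions the high side falls to about 4% and the scoreless side rises to 2%, which is the direction and rough size of the correction described.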

The eventual champions scored at least one goal from a corner once in every three matches and their closest challengers, United needed on average an extra match to do likewise. Overall, by Opta's definition, two matches out of every seven in 2011/12 saw at least one goal scored from a corner kick situation.

If we move on to the defensive side of the ball, the same effects are seen. The actual observed conversion rates allowed by each defence are more spread out than you would expect if each defence shared an identical ability to defend corner kicks. Also, by dragging extreme results closer to the league average and giving more weight to the raw figures recorded by sides which faced larger numbers of kicks, we produce numbers which are more predictive of future performance.

The poorest five performers at defending corner kicks in 2011/12 occupied the bottom five slots in the final Premiership table. Corner conversion has often been an avenue to excel at on the route to preserving top flight status, but by neglecting their duties at the other end of the field and leaking goals from corners at rates of at least one goal every three games, both Bolton and Blackburn ultimately suffered relegation. Although Wolves managed to narrowly see off the trifecta, they were only marginally better than Wigan and QPR and also experienced the first of multiple demotions.

Interestingly, the season on season correlation for defensive performance is stronger than the corresponding attacking situation. Possibly the ability to make something happen (score from a corner) attracts more attention than the ability to prevent something from occurring. Therefore proficient corner scoring teams are quickly identified and schemed against in future meetings (Delap's long throw survived as a potent weapon for barely three Premiership seasons and only remained effective thereafter in the unfamiliar territory of the cup competitions).

Villa employ a novel corner defence by attacking the ball.
An attribute such as aerial ability is an obvious advantage when both defending and attacking corner kicks and there is a weak correlation between corner competence at either end of the field. Above average converters of corners are slightly more likely than random to also be above average defenders of such a set piece. However, the weakness of the relationship hints at the diversity of talents that comprise a successful corner. An excellent delivery, as provided by a Robin van Persie or a Paul Maguire, doesn't help his side defend a corner, but a good header of the ball is an asset at both ends of the pitch.

Raw conversion rates can hint at different talent levels of corner conversion and a relatively strong season on season correlation also implies a repeatable skill is present. But a deeper analysis of corner strategy requires isolation of every associated skill, from ball delivery to off the ball running and even the semi legal art of blocking opponents. As a valuable scoring method, a 3% conversion rate may not initially impress. However, as @analyseFooty suggested in relation to this post, if we consider a corner kick as just another pass, compared to an average pass, it is a devastatingly efficient one!*

All data is taken from the MCFC release of 2011/12 data in conjunction with Opta.

*(on average an EPL team makes 450 passes a game and scores 1.3 goals, of which about 70% are from open play. Therefore conversion rate per pass is of the order of 0.2 to 0.3%. In 2011/12 over 4,000 corners produced 131 goals, therefore, conversion rate is around 3%. Even allowing for general passes which don't carry attacking intent and accepting that not every goal scoring corner is a first contact score, corner conversion rates still easily hold their own).
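The footnote's arithmetic, written out using only the rounded figures it quotes (4,000 is used for "over 4,000" corners, so the corner rate is a slight overestimate):

```python
goals_per_game = 1.3
open_play_share = 0.70
passes_per_game = 450

# Conversion per pass, counting open play goals only and then all goals.
pass_rate_open_play = goals_per_game * open_play_share / passes_per_game
pass_rate_all_goals = goals_per_game / passes_per_game

corner_goals = 131
corners = 4000
corner_rate = corner_goals / corners

print(f"per pass: {pass_rate_open_play:.2%} to {pass_rate_all_goals:.2%}")
print(f"per corner: {corner_rate:.2%}")
print(f"corner vs open play pass: {corner_rate / pass_rate_open_play:.0f}x")
```

On these figures a corner is roughly sixteen times as likely to lead to a goal as an average pass.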

Check out Ravi's site at http://analysefootball.com/

Tuesday, 14 May 2013

Game States And Team Quality.

In my previous post I looked at how Arsenal's attacking and shooting tendency was tailored towards the particular game and scoreline states in which they found themselves over the 2010/11 season. Arsenal were the pregame favoured team in virtually all of their 38 Premiership matches in that season and it was only in the four matches where they traveled to Liverpool, Chelsea and the two Manchester sides that they went into the contest as underdogs. Consequently, the scoreline state and game states mirrored each other fairly well. A lead was obviously a good game state, a draw could almost always be improved upon compared to pregame expectations and when trailing, the Gunners had both the incentive and almost always the potential ability to turn the scoreboard around.

However, in the case of more mediocre sides, these correlations aren't always as clear cut, especially when the game is stalemated.

The final 2010/11 table was a fairly typical example of the recent Premiership. Manchester United were comfortably crowned champions, Chelsea, Arsenal and Manchester City followed them home in a tight group of three and then came those aspiring to qualify for the Europa League. The mediocre EPL sides then begin to appear and going into the final round of matches just seven points separated 9th place from 19th. Therefore, Aston Villa, 13th after 37 games and 9th a game later, could reasonably be chosen as a typical, run of the mill side.

Villa were the favoured side in just 17 of their 38 games and unlike Arsenal, there would likely have been many more games where a draw would have been an acceptable result for the team from the West Midlands. So where Arsenal's approach would be consistently to tend towards pushing for a go ahead goal, the connection between Villa's scoreline state and game state is likely to be more ambiguous. A current point away to Fulham was most probably acceptable, (although they may harbour thoughts of capturing all three), but one at home to ultimately relegated Blackpool would be much less acceptable. In short, the scoreline states don't coincide as neatly with a side's perceived game state in the case of Villa compared to Arsenal.

Similarly when Villa trailed, their ability to match the desire to improve the scoreline with their capability of achieving that aim is also unlikely to tally with that of Arsenal. Villa trailed at some stage on 19 occasions, against teams who were as determined to hang onto their three points as Villa were to retrieve something from the match. So the change in scoring effort from Villa is likely to be a function of these shifting priorities shown by each side. When the same thing happened on 14 occasions to Arsenal, the Gunners had a more potent attacking force to call on for a more concerted retrieval approach than did the Villans in their various contests.

As with the previous Arsenal analysis, I've used the x, y data of the shot to determine a goal expectation, which in turn leads to an expected long term scoring rate in different scoreline states. At worst, this type of analysis can give an enhanced picture of how Villa tried to play during different phases of matches in that season and we may also begin to see the interaction between teams without painstakingly plotting minute by minute changes in game state.

Aston Villa's Goal Expectancy From Chances Created in Various Scoreline States. 2010/11.

Scoreline State.                          Ahead.                      Level.                      Behind.
Goal Expectation From Chances Created.    A Goal Every 72 Minutes.    A Goal Every 52 Minutes.    A Goal Every 58 Minutes.

We see a similar trend to that exhibited by Arsenal. Chance creation and long term scoring rates decline when Villa led, compared to other scorelines. Shots were less frequent and marginally of the lowest quality on average. Interestingly, potential scoring rates are actually highest when games were level, with Villa creating their best and most frequent chances in this scoreline state. Numerically, creation rates only fell away very slightly when they trailed, but quality was noticeably poorer.

All Hands On Defence As Villa Protect A Lead.
These changing rates, coupled with those produced by Arsenal in the same season, hint at the changing dynamics of a football game, where desire and capability are pitched against opponent ability and intent. The game state at level scorelines is likely to be less clear cut in the case of Villa compared to Arsenal. In the former, both sides may still be actively seeking a win, whereas the opponents facing Arsenal are likely to be more uniformly engaged in defending their point. In short, when drawing, Villa are more likely facing teams who are also willing to take a chance.

Once Villa trail the eventual priorities are more clear, but as Villa lack the attacking expertise of the top sides, exemplified by Arsenal, their ability to create valuable chances may now be less than they were capable of achieving in a more open situation where both sides may still have been trying to break a stalemate.

Overall the Villa figures show a similar general trend to Arsenal's in 2010/11. Both sides were at their least dangerous in goal scoring terms when already ahead. The differing potency of Arsenal and Villa at level or trailing scoreline states may simply be an artifact of sample size or it may represent a genuine difference between the very best in such situations and the mediocre.

Often in football analysis, such as the relevance of possession, the characteristics of the very best overwhelm the tendencies of the less gifted majority, in turn hiding a more complex reality and this may be the case in determining game states for different teams under the same scoreline, especially stalemates.

Ultimately, game states will have to be defined by the non trivial interplay of relative team quality, current scoreline and time remaining.

Saturday, 11 May 2013

Cranking Up The Goal Expectation When Doing Badly.

A football match is a contest that is constantly and subtly changing in many ways. Goals are the obvious major events that alter the balance by which teams either seek to consolidate an advantageous position or retrieve a potentially losing one. Goals come about through a combination of skill, random chance and no little effort and the varying degrees to which teams choose to attempt to impose these factors on an opponent determine how successful they will be. In this post I looked at how trailing teams are more likely to score than they had been previously when they concede the lead. The amount of time remaining is also a contributing factor, but sooner rather than later every team will launch a concerted effort to retrieve a losing position. They don't automatically become the most likely team to claim the next goal, if there is one, but they do, on average, become more dangerous in attack than had previously been the case.

The extra potency shown by such teams could previously only be quantified if their efforts produced a goal and over large enough samples their scoring rate when trailing can be shown to increase by upwards of 10%. However, by using models that predict goal expectations for individual goal attempts based on the x,y coordinates from where they were made, we can demonstrate how sides, on average, attempt to up their attacking game in certain match situations. They do so until either their opponents succumb, they themselves are caught on the counter attack or the game, excitingly, merely runs its full course.
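To give a flavour of what a location based goal expectation model looks like, here is a minimal sketch. It is not the model used for the figures in this post: the logistic form, the use of distance and goal mouth angle as inputs, and all three coefficients are hypothetical choices made purely for illustration.

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts

def shot_goal_expectation(x, y, b0=-1.0, b_dist=-0.12, b_angle=1.2):
    """Illustrative chance of scoring from a shot location.

    x: metres out from the goal line (x >= 0).
    y: metres sideways from the centre of the goal.
    b0, b_dist, b_angle: hypothetical coefficients, not fitted to data.
    """
    distance = math.hypot(x, y)
    # Angle (radians) that the goal mouth subtends from the shot location.
    angle = math.atan2(GOAL_WIDTH * x, x * x + y * y - (GOAL_WIDTH / 2) ** 2)
    z = b0 + b_dist * distance + b_angle * angle
    return 1 / (1 + math.exp(-z))

# Closer, more central attempts are worth more than distant or wide ones.
for label, (x, y) in {"six yard box": (5.5, 0), "penalty spot": (11, 0),
                      "25m central": (25, 0), "wide in the box": (11, 15)}.items():
    print(f"{label}: {shot_goal_expectation(x, y):.2f}")
```

Summing values like these over every attempt a side makes in a given scoreline state is what allows attacking intent to be measured even when no goal results.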

Arsenal, being a consistently successful side are less prone to ambiguous, stalemated game states, where doubt lies as to whether or not they are reasonably happy to be on level terms. Original game winning probabilities of around 25% or smaller are the break even point, whereby a side is theoretically content with a point and the vast majority of Arsenal's matches will see them quoted at greater probabilities than this to win at the outset. Therefore, Arsenal are almost certain to push for a winner at some point in almost every tied game unless they are visiting either Manchester club or Stamford Bridge.

Arsenal's Goal Expectancy from Chances Created in Various Scoreline States. 2010/11.

Scoreline State.                          Ahead.                      Level.                      Behind.
Goal Expectation from Chances Created.    A Goal every 55 minutes.    A Goal every 45 minutes.    A Goal every 45 minutes.
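A figure such as "a goal every 55 minutes" comes from dividing the minutes spent in a scoreline state by the summed goal expectation of the chances created in that state. The sketch below shows the conversion; the per shot values and the minutes are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def minutes_per_goal(shot_values, minutes_in_state):
    """Long term scoring rate implied by the chances created in a state.

    shot_values: goal expectation of each attempt made in that state.
    minutes_in_state: total minutes the side spent in that state.
    """
    total_expectation = sum(shot_values)
    if total_expectation == 0:
        return float("inf")  # no attempts, so no implied scoring rate
    return minutes_in_state / total_expectation

# Hypothetical: three attempts worth 0.6 goals in 33 minutes spent ahead.
print(round(minutes_per_goal([0.1, 0.2, 0.3], 33)))
```

Both the number of attempts and their individual quality feed into the rate, which is why a side can shoot more often when trailing yet see little change in the headline figure.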

The overall level of Arsenal's ability in 2010/11 was on par with a side expected to score, on average, a goal every 51 minutes. The goal expectancy based on the quality and quantity of the chances they created when they led suggests that they then played like a team capable of scoring only once every 55 minutes. So, as a team which had the lead, they moved into a more defensive mode to the detriment of their attacking expectations.

The Gunners' urge to improve during level and trailing scoreline states is reflected in their quantity and quality of goal attempts being the equivalent of a long term average scoring rate of a goal every 45 minutes. In 2010/11 they upped the rate of chance creation and partly maintained the quality in a level scoreboard state and upped creation even more, but at the cost of chance quality when behind.

In the absence of goals, we can still show the efforts, sometimes fruitless, made by Arsenal in losing or frustratingly stalemated situations. During the 79 minutes they trailed to Villa in their final home game of the season, Arsenal fired in enough goal attempts of varying quality to have scored at a long term rate of a goal every 30 minutes and their game long potential goal expectancy over the full 90 minutes was an equally urgent goal every 35 minutes. But the randomness of conversion rates saw them merely register an 89th minute consolation, despite their numerous efforts. They lost on the day, but through random variation rather than lack of trying.
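How unlucky was a single goal from that much pressure? A rough answer, assuming goals arrive as a Poisson process at the quoted rate (a simplifying assumption of this sketch, not something established above):

```python
import math

def prob_at_most(k, expected_goals):
    """Poisson probability of scoring k goals or fewer."""
    return sum(
        math.exp(-expected_goals) * expected_goals ** i / math.factorial(i)
        for i in range(k + 1)
    )

# 79 minutes spent trailing at a rate of a goal every 30 minutes.
expected_goals = 79 / 30
p_one_or_fewer = prob_at_most(1, expected_goals)
print(f"expected goals: {expected_goals:.2f}")
print(f"chance of one goal or fewer: {p_one_or_fewer:.0%}")
```

Under that assumption, a return of one goal or fewer happens around a quarter of the time, so the defeat sits comfortably within ordinary match day variation.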


Above I've plotted the overall, theoretical scoring rate suggested by all the chances created by Arsenal in each 2010/11 match against the average of the game state they encountered on the day. In matches where they were consistently chasing their hoped for outcome, they were able to up their attacking output, producing chances that would yield almost a goal every 20 minutes in their home loss to WBA and these bouts of increased effort often remain in the overall game figures. But, as with the Villa game, short term randomness again beat them.

At the opposite end of the plot, they were unable or unwilling to continue to take the fight to United and Chelsea when beating both. Defence was probably more of a priority once the lead had been secured. An early goal from the game's best chance against Wolves allowed Arsenal to dictate much of the game at Molineux. More goals would have been welcome, but weren't essential and a second goal only arrived in injury time as Wolves pressed forward.

It appears that all sides eventually tailor their attacking intent to suit the current score, their own pre game expectations, the quality of the opposition and time remaining and if random distribution of your innate talent is kind enough to gift you a three goal lead at home to Chelsea, there is little need to try to run up the score at the risk of opening up the game. Scoring further goals no longer remains a high priority.

Single game scoring efficiency is a heady mix of match day randomness that infrequently yields significant, talent driven events and the relative abilities of the contestants. High quality chances often fail to result in a goal, whereas poorer quality ones sometimes do, and these partly random outcomes often help to frame how the remainder of the match is played out.

For more interesting work on this essential context driven subject check out Paul's recent post at differentgame.