Tuesday, 31 January 2012

Home Advantage and Personal Duels in the EPL.

One of the most enduring and enigmatic features of modern day sport, and football in particular, is the existence of home advantage. Although the scale of a football team's home field advantage has been declining fairly steadily over the course of the last and present century, it is still a major factor in the sport. It's very common for individual teams to apparently buck the trend of teams performing better on their home turf, but these instances are always the result of small, single season sample sizes, and when measured over longer periods virtually every EPL side shows a preference for its home venue. Teams have goal differences that are around 0.8 of a goal inferior on the road compared to at home, and success rates hover just below 60% at home compared to just over 40% away.

We discussed and quantified home advantage in the EPL here and suggested a partial contributor to the effect here. In this post we'll try to use some of the more granular stats that are being made available and match these numbers to some of the more innovative research that has been done in the field.

Many factors have been proposed as playing a role in setting home field advantage, and whilst some have been discounted, others certainly play a partial role. Travel effects have in all probability declined with the improved transport network and overnight stays becoming the norm. Crowd influence is probably an overstated factor. A research paper indicating that referees give different decisions when shown footballing incidents with and without crowd noise was conducted with amateur refs from Staffordshire. From personal experience, not the most reliable of control groups. Familiarity with the playing surface is also probably less of a factor in football, where there is little scope for improvisation in the presentation of the pitch. Very few professional grounds have the pronounced slopes that were part of the folklore of giant killers such as Yeovil. Derby have long since departed from their quagmire at the Baseball Ground, and present day Stoke, with their narrow Highburyesque pitch, are the closest you'll find to a team/pitch combination tailored to the needs of the hosts.

A more novel approach to home advantage appeared in the early 2000s, when researchers found an increased level of natural testosterone in both soccer and ice hockey players prior to home games compared to away contests and non-competitive games. These increased levels could in theory lead to elevated aggression, better endurance and reaction time, and additionally an enhancement of the passive aggressive effect postulated by evolutionary biologists when "trespassers" meet adversaries on non-neutral ground.

It seems reasonable to assume that if this phenomenon is widespread in professional sport, it will show up in the results of the many dozens of individual battles that occur during a team sport. Opta, via its resellers such as EPLIndex and the Guardian Chalkboards, is gradually making more detailed statistics available. The integrity and accuracy of these numbers is inevitably going to be slightly suspect, through different interpretations of similar events by different data collectors coupled with the mammoth task of maintaining massive databases, but they do open up new lines for investigation.

The success rates for individual tackles and the results of individual duels between players could be the kind of footballing incidents that are affected by the venue specific swings in both testosterone levels and heightened aggression. Therefore, I recorded how successful teams were in individual game incidents when they instigated tackles or duels at home and then on the road.

I chose the 2008/09 season at random and plotted how the percentage success rates for the cumulative individual totals for all 20 teams from that season changed for home and away games. Before we look at the graphs, it's worth noting that one season is not sufficient to fully capture even such a fundamental feature as home advantage beyond doubt. The overall home and away trend should be evident, but there will be random outliers due to the relatively small sample size, and so we should expect similar variation when looking at tackle or duelling events.

How a Team's Success Rate in Ground Duels Varied by Venue in 2008/09.

The best correlation from the three categories of on field, individual challenges that I looked at was for one player taking on an opponent on the ground. It's hard to imagine a contest more suited to demonstrating the presence of an elevated state of endurance, aggression and reaction time than this type of challenge. If the testosterone study has validity, we are likely to see accumulated team success rates in these types of situations generally decline on the road, and on the limited evidence of one EPL season that is what appears to happen. Based on the regression line, a team which wins 50% of these types of challenges at home would likely win just over 49% of them on the road.

A Naturally Aggressive Home Player Wins this Ground Duel with two Passive Away Opponents. 

How a Team's Success Rate in Aerial Duels Varied by Venue in 2008/09.

A similarly strong correlation exists for aerial battles. For the vast majority of the line of best fit, where the bulk of the points lie, teams win a higher proportion of aerial battles at home than they do away from their home stadium. The situation is briefly reversed when we look at the few very worst performers. This may be a quirk of less robust sides or it may disappear in larger samples.

How a Team's Success Rate in Tackles Made Varied by Venue in 2008/09.

Possibly the most interesting set of results comes from the success rates of tackles made on opponents. There seems at first glance no reason why the percentage of successful tackles made by teams should not also benefit from the testosterone fuelled, non-passive approach that prevails at home. Yet we see the reverse for almost half of the trendline. At lower completion rates, teams actually appear to complete a higher proportion of tackles away from home than they do at home. However, a team completing 78% of its tackles at home is only predicted to complete 76% on its travels, which is more in keeping with the trends seen in duels.

A clue may be found in Opta's definition of a tackle. A tackle is deemed a success if the ball is played out of play and is "safe". If home teams are leading, as they more often are at home, then particularly late in the game they are quite happy to see the ball out of play and safe, especially as they still retain possession from the throw or kick. This allows the away side to notch up artificially higher tackle counts with the complicit agreement of the home side. There's not quite the same level of guaranteed benefit to the (leading) home side in allowing the visitors to win a ground or air duel. Tackles, like shots on goal, are much more hostages to game context and a rich subject for further, more detailed investigation.

So far we've shown that duels, if not tackles, are won more often by the naturally pumped home side, but before we can reasonably conclude that these numerous individual pitch battles accumulate and contribute to home advantage, we need to see if ground or aerial success rates are correlated with overall team success rate in games played.

How a Team's Seasonal Success Rate in Matches Correlates to its Seasonal Success Rate in Ground Duels.

As before, the trend and direction of the line of best fit is of more interest than the relatively weak, sample size driven correlation. The results are pooled from both home and away matches this time because we are simply interested in whether winning a higher proportion of your ground duels enhances your chances of winning the game. The upward direction of the slope indicates that this appears to be the case. A similar graph results from the data for aerial duels.

So to conclude. Various limited studies seem to indicate that home teams have elevated, naturally occurring testosterone levels compared to when they are the visitors, and evolutionary biology suggests that species have evolved to act more aggressively when on their own turf. These two interconnected ideas appear to be supported by home teams winning higher proportions of individual duels during matches, and winning more of these duels correlates with winning matches. This may go part of the way to explaining the universal occurrence of home field advantage in football.

Thanks to EPLIndex for access to the stats used here.

Friday, 27 January 2012

Individual Premiership Passing Stats Corrected for Difficulty or Why Leon Britton is Only the 66th Best Passing Midfielder.

When Alan Hudson returned to the top flight in 1983 for a swansong to his largely unfulfilled career, he again chose to swap the glamour of Chelsea for the bottle kilns of Stoke. A beautifully elegant passer in his prime (his biography was aptly titled "A Working Man's Ballet"), he would have won more than the two England caps he gained during his first spell in the Potteries were it not for a spate of injuries and personality clashes. Stoke were a struggling (old) Division One side upon Huddy's re-signing, and while his return wasn't as epic an occasion as that of Sir Stan's a generation earlier, it helped to stave off The Potters' descent into the lower reaches of the Football League... for twelve months at least.

An abiding memory of Hudson at Stoke in that final season was of him casually taking the ball from the keeper, usually in front of the Boothen End, before exchanging half a dozen square passes with various slightly terrified members of his own back four. It was only when an opposing forward was dispatched to chase the ball that Hudson stroked a forward pass into the vacated area, thus instigating a Stoke attack. This little cameo not only showcased his skills, it also gradually allowed his team mates to become more comfortable in possession, but as an unintentional by-product it also ramped up the great man's passing statistics. Were Opta around thirty years ago, Hudson would be near to the top of their passing charts.

More pertinently to the present, the scenario exposes some of the more obvious flaws of raw passing statistics. Square balls played under little or no pressure from the opposition are a lot easier to complete than forward passes played through the congested shipping lanes of midfield or the final third. Nowadays defenders are happy enough to play the Hudson role themselves, and even avid long ball teams such as the present day Stoke are content to pass the ball along the backline, accruing possession gold stars as they go. Comparing the raw passing stats for a defender who can gain cheap successes with those of an attack orientated midfield player who is constantly trying to thread the eye will inevitably lead to misleading conclusions unless some attempt is made to contextualise each player's passes.

In this post here I used the percentage of long balls a team plays as a proxy for the overall difficulty of the passes attempted, and I've attempted something similar for individual players over last season and the first half of this one. This time I've used the percentage of forward passes from EPLIndex as the proxy for pass difficulty, and I've grouped players by teams and split the sample into defenders and midfielders. The profile of passes played by defenders and midfielders is very different. Although defenders play proportionally more forward balls than do midfielders, 62% compared to 50%, they are faced with many more safe, low risk passes.

How Defensive Pass Success Declines with Increased Proportion of Forward Passes.EPL.2010-12.

The correlation appears reasonably strong, and an increase in forward passes leads to a decrease in successful passes, which is what you would expect. This allows us to derive a regression equation that can be used to predict a completion percentage for a certain proportion of forward passes, based on the record of all defenders who played in the EPL over the last one and a half seasons. If we therefore calculate the completion rate expected for an average player given an actual player's proportion of forward passes, we can see if that actual player's completion rate is above or below our prediction.
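As a rough illustration of the idea, the sketch below fits a straight line by ordinary least squares and compares one player's completion rate with the fitted baseline. The data points, the example player and the function names are all hypothetical, invented purely to show the mechanics. They are not the EPLIndex figures, and the actual regression used for the graph may take a different form.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def expected_completion(forward_pct, intercept, slope):
    """Completion % an average player would be predicted to post."""
    return intercept + slope * forward_pct


# HYPOTHETICAL (forward pass %, completion %) pairs, one per defender.
forward_pct = [45.0, 52.0, 58.0, 63.0, 70.0, 76.0]
completion_pct = [90.0, 86.0, 81.0, 78.0, 70.0, 66.0]

intercept, slope = fit_line(forward_pct, completion_pct)

# HYPOTHETICAL player: 70% of passes forward, 75% completed.
baseline = expected_completion(70.0, intercept, slope)
gap = 75.0 - baseline  # positive means better than the average player

print(f"slope per extra point of forward passes: {slope:.2f}")
print(f"predicted completion: {baseline:.1f}%, gap: {gap:+.1f} points")
```

A negative slope here corresponds to the downward trend described above: the more passes played forward, the lower the completion rate an average player would be expected to manage.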

For example, Arsenal's Thomas Vermaelen completed around 88% of his passes last season, and those raw numbers rank him as equal 16th out of defenders who have played at least 100 passes in a season over the two years. Vermaelen played 70% of his passes forward, and from the regression we can say that an average defender would expect to complete just 68% of his passes under such circumstances. So the Arsenal defender was well above average for pass completion last term, and if we repeat the calculation for all eligible defenders and re-sort the rankings, he jumps to 7th best overall.

From the Vermaelen case we can see that reassessing a defender's passing percentage based on the amount of forward passes he plays can reshuffle the pack and better reflect a player's enterprising passing style. The top defenders on raw passing completion percentages are listed below, followed by the ten biggest over achievers based on the predicted completion rate for an average defender making their percentage of forward passes.

Most Accurate Passing Defenders by Percentage Completion Rates.EPL 2010-12. 

Defender. Season. Pass Accuracy %.
William Gallas. 2011/12 92
John Terry. 2011/12 91
Garry Monk. 2011/12 91
Ledley King. 2011/12 91
Johan Djourou. 2010/11 90
Alex. 2010/11 90
Jonny Evans. 2011/12 90
Kolo Toure. 2011/12 90
Ashley Cole. 2010/11 89
Vincent Kompany. 2011/12 89
Per Mertesacker. 2011/12 89

None of the most accurate raw passers appear in the corrected list below, although John Terry makes the first list from this season and the corrected list from the previous campaign. Clint Hill is by far the biggest beneficiary from the correction, jumping over 200 places on the back of an astonishingly large proportion of forward passes. Stoke defenders dominate the bottom 25 places in terms of raw stats, with Wilkinson, Shawcross, Huth, Collins, Higginbotham and Wilson each making at least one appearance, but all move comfortably up the table under the revised terms. The EPL's worst corrected passer with at least 100 passes turns out to be Stoke's Ryan Shotton, who falls over 30 places to the bottom of the pile compared to his raw stats, although in mitigation he hasn't been used exclusively as a defender.

Most Accurate Passing Defenders Corrected for Proportion of Forward Passes.EPL 2010-12. 

Defender. Season. Pass Accuracy %. Predicted Accuracy %.
John Terry. 2010/11 88 61
Wayne Bridge. 2010/11 78 59
Assou-Ekotto. 2010/11 71 54
Brede Hangeland. 2011/12 81 62
Nicky Shorey. 2011/12 76 59
Clint Hill. 2011/12 68 53
Thomas Vermaelen. 2010/11 88 68
Gael Clichy. 2010/11 80 62
Jose Enrique. 2011/12 80 62
Daniel Agger. 2010/11 76 61

If we now repeat the process for EPL midfielders, we can similarly apply a correction to their pass completion rates. The correlation is weaker on this occasion, and the regression line is slightly distorted by the presence of the passing stats of the Big Five teams. Although passing rates are almost always attributed to the passer, it is self evident that there is also another player involved in the act of passing, namely the recipient. If we take Manchester City as an example, a pass from Silva to Aguero represents nearly £70 million of combined talent; similarly, Yaya Toure to Dzeko required a cash outlay of £60 million to bring about. The gap between the top and the rest of the EPL is never more evident than when you compare their respective attacks and midfields. By contrast, the combined "cost" of a midtable EPL team's passing tandem will rarely peak at much more than £10 million, and the average will be less. For example, Stoke's average midfield is worth around £4 million per player and its regular strike force around £7 million.

In short, the disparity between the Big Five and the rest is more evident when looking at passes originating from midfield compared to those originating from the defence. As with all high end improvements, small but noticeable gains require a disproportionately large input of cash, and in this case the presence of the stats of the hugely expensive Big Five teams' midfielders weakens the correlation between difficulty of pass and pass completion for the league as a whole.

However, if we press on with this caveat in mind, we see that Mikel and Denilson appear in both groups, and the currently injured Lucas, despite his comparatively lowly completion rate, can lay claim to being the top overall passer with 100 or more passes in each year because of his more adventurous attempts.

The much hyped Swansea midfielder, Britton, slips from the top spot in the raw stats all the way down to 66th in the corrected version by virtue of his reluctance to pass the ball forward.

Most Accurate Passing Midfielders by Percentage Completion Rates.EPL 2010-12. 

Midfielder. Season. Pass Accuracy %.
Leon Britton. 2011/12 93
Samir Nasri. 2011/12 92
John Obi Mikel. 2011/12 92
Jake Livermore. 2011/12 92
Nigel de Jong. 2011/12 92
Abou Diaby. 2010/11 91
Denilson. 2010/11 91
Paul Scholes. 2010/11 91
Mikel Arteta. 2011/12 91
Joe Allen. 2011/12 90

Most Accurate Passing Midfielders Corrected for Proportion of Forward Passes.EPL 2010-12. 

Midfielder. Season. Pass Accuracy %. Predicted Accuracy %.
Danny Rose. 2010/11 79 63
Lucas Leiva. 2010/11 83 67
Denilson. 2010/11 91 73
Josh McEachran. 2010/11 91 73
John Obi Mikel. 2010/11 90 74
Alex Song. 2011/12 85 70
John Obi Mikel. 2011/12 92 76
Gareth Barry. 2011/12 87 71
Lucas Leiva. 2011/12 86 71
Michael Carrick. 2011/12 90 75

Tuesday, 24 January 2012

Manchester City 3 Tottenham 2, Balotelli's Penalty Kick and an Imaginary Red Card.

Manchester City 3 Tottenham 2.

Anyone who found themselves stuck in a queue for an early second half pie and then decided to beat the rush at the end could have missed every beat of the action in this enthralling title eliminator (for Spurs at least). A quick glance at the Expected Points graph for the game soon pinpoints where the incidents fell, as Man City quickly took control of the likely points haul and then almost as quickly handed most of their gains straight back to their visitors.


Spurs' Expected Points total had just crept slowly past the one point they were tenaciously hanging on to in the 56th minute, when first Nasri and then Lescott drove the Londoners' total into the ground. Defoe's immediate response was quickly followed by the kind of breathtaking strike that is becoming commonplace from Bale, and ten minutes after the mayhem began the teams were back on level terms, with Spurs ten minutes nearer to a hoped for point.

Although he started on the bench, the game's remaining major talking points inevitably revolved around Balotelli. To start with his least controversial contribution: he's sent tumbling by King in the 94th minute, just as the game is expiring, and typically picks himself up to score the game's winning goal. Just prior to the foul, Spurs' Expected Points total was 0.9999, as near to 1 as you can get while allowing for the faint chance that they were slightly more likely than their opponents to concede a late, late winning goal in the seconds that remained. Similarly, City's Expected Points were a shade above one. As the referee awards the kick, Spurs see their EP plummet, but not all the way to zero. There's still around a 25% chance that the kick will be saved, or even that it will be scored and they'll storm down the other end and grab a dramatic second equaliser in the seconds that remain. But because it's Mario taking the kick, you just know he'll score and City will hold on... and they do.
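For readers who want the arithmetic behind an Expected Points swing, here is a minimal sketch. It uses the league's three points for a win and one for a draw, the rough 25% figure quoted above, and the simplifying assumption (mine, not a figure taken from the graph) that Spurs had no realistic chance of winning once the penalty was given. The function name is made up for the illustration.

```python
def expected_points(p_win, p_draw):
    """Expected league points: 3 for a win, 1 for a draw, 0 for a loss."""
    return 3.0 * p_win + 1.0 * p_draw


# After the penalty award. ASSUMPTIONS: Spurs keep a point about 25% of
# the time (saved kick or a late equaliser) and cannot win from here.
spurs_after = expected_points(p_win=0.0, p_draw=0.25)
city_after = expected_points(p_win=0.75, p_draw=0.25)

print(f"Spurs EP after the award: {spurs_after:.2f}")
print(f"City EP after the award:  {city_after:.2f}")
```

Under those assumptions the award alone moves Spurs from roughly one expected point to about a quarter of a point, before the kick is even taken.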

How the Game Changed in the 94th minute.

Inevitably, nothing is simple where Balotelli is concerned. Many thought at the time that he shouldn't have been around to win and take the decisive kick. Having already been booked during his short time on the pitch, he then appeared to stamp on Parker after 84 minutes. At the time the referee decided it was accidental, and certainly Mancini wasn't to be seen brandishing an imaginary red card on the touchline. Balotelli himself was quick to apologise to Parker, indicating he simply lost his balance. In retrospect this was one of the Italian's smarter reactions, given that the Football Association have since deemed the stamp intentional and banned him for four games.

How a Balotelli Red Card would have changed the Game.

Tottenham were naturally furious, although possibly significantly not in the immediate aftermath of the incident. Maybe, like Parker, they were simply stunned. But a quick look at the alternative reality shows that despite Balotelli inflicting real and actual harm to Spurs' title chances 10 minutes later, the game was at such an advanced stage that even if the Italian had been sent off, the respective Expected Points totals for each team wouldn't have changed much. If Harry thought Spurs would have taken a point against 10 man City, he should also recognise that they were almost as likely to take that point against a full strength City that late in the contest. Poor defending was a bigger contributor to Spurs returning south empty handed than poor refereeing.

Thursday, 19 January 2012

Passing Accuracy in the Premiership Corrected for Pass Length or Why Stoke are this Season's Best Passers.

Before we start to address passing percentage in the EPL, I'd like to use a short illustration from American Football. When the Houston Texans became the 32nd and most recent addition to the National Football League, they earned the right to select that year's most promising prospect, and they exercised that right by choosing Fresno State quarterback David Carr. Carr spent five largely disappointing seasons at Houston before swapping the AFC for the NFC, where he served time as a rarely used backup, most notably in New York with the Giants and in San Francisco.

Ironically, it was during his final season with the Texans that Carr produced his career high numbers for passing percentage. He was successful on over 68% of his 442 attempted throws during the 2006 season. Taken in the raw, these numbers appear to be rather impressive. Acknowledged future Hall of Fame candidates such as Peyton Manning and Tom Brady average only 64% over their most productive seasons.

By comparison with Carr's gaudy 2006 season figures, Atlanta quarterback Michael Vick completed just 57.5% of his 2006 passes and, like Carr, Vick wasn't retained by his then employers, the Falcons. Vick didn't return to Atlanta in 2007 for the very good reason that he was serving jail time for aggravated animal cruelty, but despite his apparently lowly completion figures he was able to claim a starting role with the Philadelphia Eagles upon his prison release in 2009, and unlike Carr he remains a current starting player.

So what made Carr's 68% completion rate unattractive to prospective employers, while Vick was able to win a mega bucks contract with his less than impressive 57.5%? Part of the answer can be found if we look at how far each player was throwing the ball through the air. In Carr's career high season his passing attempts were on average travelling less than 5 yards through the air, whereas Vick's passing attempts were going for over twice that distance. In short, Carr was attempting relatively easy, short throws compared to Vick's longer, more difficult ones.

The issue of passing distance therefore offers some mitigation for Vick's apparently poor completion rate, but we can go much further by regressing the completion rates for all starting quarterbacks against the average distance they throw the ball. We can then plug both Vick's and Carr's average throwing distances for the 2006 season into the regression line to see the completion rate we would expect from an average starting quarterback, and that quickly reveals the reason for Carr's ultimate demise. A typical starting quarterback who was asked to throw the kind of short passes required of Carr in his final season at Houston would likely have a completion rate of almost 72%, so Carr's recorded 68% marks him down as a well below average passer.

By contrast, Vick's contemporaries would have completed his depth of throws just over 55% of the time. Vick, for all of his faults on and off the field, was still an above average passer in 2006.
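The comparison above boils down to simple subtraction, sketched here. The expected rates are rounded from the text ("almost 72%" and "just over 55%"), so the values 72.0 and 55.0 are approximations rather than the exact regression outputs, and the function name is made up for the illustration.

```python
def points_above_expected(actual_pct, expected_pct):
    """Completion rate minus the rate expected of an average starter."""
    return actual_pct - expected_pct


# 2006 season. Expected values are APPROXIMATIONS rounded from the text.
quarterbacks = {
    "David Carr": {"actual": 68.0, "expected": 72.0},
    "Michael Vick": {"actual": 57.5, "expected": 55.0},
}

for name, rates in quarterbacks.items():
    gap = points_above_expected(rates["actual"], rates["expected"])
    print(f"{name}: {gap:+.1f} percentage points against expectation")
```

On these rounded inputs Carr comes out about four points below an average starter and Vick about two and a half above, which is the reversal of the raw figures described in the text.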

Back to the English Premiership, where the idea of correcting passing statistics to account for length of pass hasn't yet been embraced. The longer the pass, the more opportunity there is for the ball to deviate from its intended target and the greater the time available for an opponent to converge on the target or make an interception. In short, longer passes are more difficult to execute and carry an intrinsically bigger chance of failure.

Team passing statistics are increasingly becoming available for the EPL, but seldom in a form that makes analysis possible. However, passes are being broken down between short and long passes, and that allows a way to begin to put some context on the raw numbers. (NFL passes are similarly broken down as short or deep passes, and these designations can be used as a reasonable proxy for a player's actual air yards.) If we use the proportion of long passes a team makes over a season as a way of classifying a team's likely average passing length and plot this against that team's passing completion percentage, we immediately see that there is a very strong correlation. In general, and logically, the higher the proportion of long passes made by an EPL team, the lower its completion rate.

How an EPL Team's Pass Completion varies with Increasing % of Long Passes.2009-12.

As with the earlier quarterback comparison, we now have a means to input a team's actual percentage of long passes into the regression line to predict the rate at which a typical EPL team would complete those passes, and we can then see which teams are outperforming that rate and by how much. Equally, we can see teams who are underperforming the league average and whose completion figures are possibly flattered by a short and safe style. I've only taken data from the last two completed seasons and this year's part completed season.

Below I've listed the teams who have outperformed their predicted passing completion by most over the last three seasons. The table contains a variety of styles, ranging from shorter passing sides such as Chelsea, Swansea and Man Utd to long ball merchants such as Birmingham, Norwich and, inevitably, Stoke.

Tottenham must claim the crown of most accomplished passing side over the period of the data. They appear three times in the top ten and are using slightly shorter passes this season, but have outperformed the league average for their average pass length in all three seasons. Swansea also justify their reputation as a passing side. They don't go deep very often, but in terms of adjusted passing quality they slip in between Chelsea and Man Utd so far this year.

Glenn Whelan prepares to launch it long.

Almost inevitably, Stoke, the bête noire of the EPL, confound their critics by topping the table. Their unorthodox approach has won them few friends, but as with most things they do, they do it extremely well. They top the table for playing the largest percentage of long balls over the three seasons, and that inevitably means that they will prop up the table for raw completions, but when pass length is accounted for they rise to midtable for the previous two seasons and are the biggest over achievers this year. An average side would complete just under 65% of Stoke-like passes; Stoke manage almost 70%. Visually, I feel much of their improvement this year is down to the addition of Crouch. His ability to hold onto forward passes whilst holding off defenders is most impressive and highlights another flaw in raw passing statistics. Namely, you have to pass to someone, and the better the quality of the recipient, the easier it is for a passer to accumulate good raw passing stats.

Largest Over Achieving Passing Teams when Correcting for Pass Length. 

Team. Year. % of Long Passes. Pass Success %. Predicted Pass Success %. % Difference.
Stoke. 2011/12 23.5 69.8 64.7 +7.9
Tottenham. 2011/12 14.2 84.8 80.2 +6.0
Blackpool. 2010/11 19.3 76.4 72.1 +5.9
WBA. 2010/11 18.0 77.9 74.3 +4.8
Birmingham. 2010/11 21.4 71.6 68.5 +4.6
Tottenham. 2009/10 18.6 76.6 73.2 +4.6
Norwich. 2011/12 19.7 74.4 71.3 +4.3
Tottenham. 2010/11 16.4 80.1 77.0 +4.1
Wigan. 2009/10 19.3 74.8 72.1 +3.8
Wigan. 2011/12 15.5 81.1 78.3 +3.6
Chelsea. 2011/12 11.8 85.6 82.8 +3.4
Swansea. 2011/12 12.0 85.2 82.6 +3.1
West Ham. 2010/11 18.5 75.7 73.4 +3.1
Man Utd. 2011/12 12.9 84.1 81.6 +3.0
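The final column of the table looks to be the gap between actual and predicted pass success expressed as a percentage of the predicted figure. That reading is an inference from the numbers rather than a stated definition, and the function name below is made up for the sketch. Because the published inputs are rounded to one decimal place, some rows come out a tenth or two away from the printed value.

```python
def pct_difference(actual_pct, predicted_pct):
    """Gap between actual and predicted, relative to predicted, in %."""
    return (actual_pct - predicted_pct) / predicted_pct * 100.0


# (team and season, pass success %, predicted pass success %) from the table.
rows = [
    ("Stoke 2011/12", 69.8, 64.7),
    ("WBA 2010/11", 77.9, 74.3),
    ("Norwich 2011/12", 74.4, 71.3),
]

for label, actual, predicted in rows:
    print(f"{label}: {pct_difference(actual, predicted):+.1f}")
```

For these three rows the calculation matches the table to one decimal place, with Stoke's five point raw gap becoming a relative over achievement of just under 8%.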

For every winner there are losers, and a similar process can identify which teams should be completing a higher percentage of their passes given their choice of pass length. Fulham and Blackburn clog up the top five, although as a slight encouragement they don't appear in this season's guise. Perhaps it was a problem that was identified and addressed in the close season. Both teams are hitting slightly fewer long balls this year, so maybe they were particularly bad when going deep. Sunderland, Wolves and Bolton also feature prominently, but the presence of Arsenal's 2009/10 side and Liverpool's 2010/11 team will perhaps surprise those who judge solely by reputation.

Largest Under Achieving Passing Teams when Correcting for Pass Length. 

Team. Year. % of Long Passes. Pass Success %. Predicted Pass Success %. % Difference.
Fulham. 2009/10 14.7 74.3 79.4 -6.4
Blackburn. 2009/10 21.1 64.5 68.9 -6.4
Blackburn. 2010/11 21.2 65.2 68.8 -5.2
Aston Villa. 2010/11 15.6 74.1 78.1 -5.2
Fulham. 2010/11 14.4 76.2 79.8 -4.5
Liverpool. 2010/11 14.1 77.1 80.2 -3.9
Sunderland. 2010/11 18.0 71.8 74.4 -3.5
Sunderland. 2009/10 21.1 66.7 68.9 -3.2
Wolves. 2010/11 17.8 72.3 74.6 -3.1
Wolves. 2009/10 20.3 68.2 70.4 -3.1
Bolton. 2010/11 21.0 67.0 69.0 -2.9
WBA. 2011/12 14.5 77.3 79.6 -2.9
Arsenal. 2009/10 10.9 81.4 83.6 -2.6
Bolton. 2009/10 22.8 64.2 65.8 -2.5

These figures shouldn't be used to endorse long passing over short passing or vice versa. Unlike the NFL, where passing is pre-eminent, passing is but one component of football (or soccer, to avoid confusion). The numbers merely show that teams can and do choose to mix up their proportion of long and short passes, and it's very likely that some teams are better at completing different lengths of passes than other teams.

Tuesday, 17 January 2012

Blackburn v Fulham.

Blackburn 3 Fulham 1.

0-0,Red Card,Yakubu,(Blackburn),23'

It takes some serious self inflicted damage to make a run of the mill home relegation scrap a candidate for performance of the season. Yakubu's red card was harsh, but understandable in the current slightly chaotic refereeing climate. The striker's dismissal, his first in England, came after just 23 minutes and was therefore the equivalent of pre-game favourites Fulham leading the original 11 v 11 match up by just over a goal. At that point Blackburn's expected points total was driven down to a shade over 0.4 points, and 20 minutes without conceding was only good enough to cause it to climb by 0.1 points. However, two goals within a minute (although separated by the interval) catapulted Rovers to the status of strong favourites. The clock was very much their friend when Duff halved their advantage, and the climb towards three precious points was accelerated and virtually assured by substitute Formica's strike 10 minutes from time.

The transfer window round up here surmised that a Samba-less Blackburn defence, allied with a better than average attack for a struggling side, was their best chance to survive, and that's just what Rovers served up for their fans on Saturday.

Saturday, 14 January 2012

Thierry Henry.A Deal Well Done?

Arsenal welcomed back an old familiar face on Monday, and Thierry Henry immediately repaid their faith in the former spearhead of their strike force with the winning goal at the Emirates against Leeds United in the 3rd round of the FA Cup. Signed as cover for Arsenal's less than prolific departees for the African Cup of Nations, Henry's reappearance in the EPL after two seasons in the MLS provides an intriguing real life experiment.

How, for instance, will the formerly prolific scorer cope in the EPL now that he's 34?

Quantifying a player's level of performance in a predominantly team sport such as soccer is certainly problematical. A whole raft of new measurements are slowly being recorded, although their availability and how they are interpreted and applied to individual players is still up for debate. Fortunately, Henry in his time at Arsenal was predominantly a goal scorer, and goals scored are easy to record and can be used to readily define the course of a player's career. A goal scorer who stops scoring goals doesn't stay around in the EPL for long.

During his time in the EPL, Henry became a member of an exclusive club of players who scored 100 or more Premiership goals, a feat that singles those players out as being not only potent finishers, but also having the longevity to play at the top for multiple seasons. The 100 club contains around twenty current members and we can use their combined goal scoring records to devise an ageing pattern for their exploits as a group.

Goals scored per game does a good enough job of rating a goal scorer's ability; however, it does tend to favour players who were on teams which scored lots of goals. Therefore I usually express a goalscorer's core statistic in terms of his goals per game rate as a proportion of his team's goals per game rate, as this gives a little more context around the kind of goal scoring environment the player was performing in. It also helps to highlight how integral a player was to the team's overall goal grabbing ability. For example, if a player scores on average a goal a game for a team which scores on average 2 goals a game, his scoring rate is half that of his team. If his team scores 3 goals a game, his rate is only a third.
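As a rough sketch of the team adjusted rate described above, the calculation could look something like the following. The function name and the season totals are purely illustrative, and I'm assuming the team's rate is taken over the same number of games as the player's, which the post doesn't specify.

```python
def team_adjusted_rate(player_goals, player_games, team_goals, team_games):
    """Player's goals per game as a proportion of his team's goals per game."""
    if player_games <= 0 or team_games <= 0 or team_goals <= 0:
        raise ValueError("games and team goals must be positive")
    player_rate = player_goals / player_games
    team_rate = team_goals / team_games
    return player_rate / team_rate


# Hypothetical season totals chosen to mirror the worked example in the text:
# a goal a game for a side scoring 2 a game, then for a side scoring 3 a game.
half = team_adjusted_rate(player_goals=38, player_games=38, team_goals=76, team_games=38)
third = team_adjusted_rate(player_goals=38, player_games=38, team_goals=114, team_games=38)
print(f"{half:.2f} {third:.2f}")
```

The same raw goals per game figure therefore rates lower on a free scoring side, which is the point of the adjustment.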

To see how elite goalscorers age I recorded the cumulative,team adjusted,scoring rate for all members of the EPL 100 club and plotted the results.The sample is obviously biased because not only does it contain only the league's very best finishers,but it's also biased towards longevity.But we are interested in trying to predict a likely ageing curve for Thierry Henry and he is very much typical of the kind of player who makes up the larger sample.The curve wouldn't necessarily apply to goal scorers as a group or indeed outfield players as a group.

Scoring Performance Curve for All Time Elite Strikers in the EPL. 

As you'd expect, players who stay around the EPL long enough to net at least 100 career goals are performing to a high level at a young age and peak at around 25 years of age, but as they enter their thirties they are still playing at a similar standard to when they were just 20. Once into their thirties, though, their overall scoring contribution to the team begins to decline with increasing steepness. Remember also that this graphic does not account for appearances made by the player, just his average scoring rate when he does make it onto the field. So although the graph may capture how injuries impact on a player's ability to score goals, it doesn't tell us anything about how injuries can limit playing time. Older players may therefore appear more attractive because of the omission of game time limiting injury data.

The profile of the plot does tell us that typically an elite striker is performing as badly as he has ever done by the time he reaches his 32nd birthday. They are still productive, but they no longer have any upside in terms of resale value or increased goal getting capacity. Scoring capacity then declines rapidly, as does sample size as players in the sample drop down in grade or out of the game completely.

So let's now see what a similarly plotted graph looks like for Henry. He spent five years at Monaco before making a mid season switch to Juve, where he played mostly in a wide attacking role, before joining Arsenal in 1999.

 Scoring Performance Curve for Thierry Henry for Arsenal. 

The plot is understandably much less tight as we are now dealing with smaller sample sizes relating to just one player, but the general shape of the line of best fit is very reminiscent of the plot for elite strikers as a whole. Notably, Henry's average seasonal scoring rate compares more favourably with that of his team than does the team adjusted average for strikers as a whole over the same stretch of years. That's an impressive personal statistic for Henry, especially because he played at least 30 times in each season except his last one. He also peaked a couple of years later than was usual for his contemporaries, possibly as a result of playing initially in less physically demanding leagues or possibly as a trait of Henry himself. By his last season his career curve was beginning the downward turn typical of the group and his appearances were almost half their number when he was at his peak, but he was still playing and contributing at levels that well exceeded the expected norm. This set of exceptional figures, combined with a fully expected onset of inevitable decline, goes some way to explaining why Arsenal chose to sell Henry, and it also shows how they were able to demand such a large transfer fee from Barca for a player approaching his thirties. It also fully justifies the decision to honour him with a statue.

So what can Arsenal fans expect from their returning star of yesteryear? The deal is merely a short term loan, so there aren't any issues surrounding Henry's declining ability to play out an entire season. He played over 30 total games in each of his three seasons at Barca and has since managed almost 40 games in the less demanding MLS, so he should easily manage a short loan period. It's likely that his effectiveness will be limited simply by his ageing curve.

Throughout his EPL career, Henry's contribution to Arsenal's scoring was on average 30% higher than that of a typical elite striker. Therefore, if we input his age to the line of best fit for all strikers and increase that value by 30%, we will get a reasonable approximation of where Henry's ability currently resides. A 34 year old top striker who is still good enough to be playing in the EPL will be scoring at around 18% of the total scoring rate of his team. If we include the "Henry" premium, that rises to about 24%, the equivalent rate for an elite striker who is still just the right side of 30 rather than approaching his mid thirties as Henry is. If we now use Arsenal's average rate of goalscoring over the season, we can deduce that Henry should be able to score an average of 2 goals for every 5 appearances he makes.
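For anyone wanting to retrace the arithmetic, here is a minimal sketch. The 18%, the 30% premium and the 2 goals in 5 games are the figures quoted above; Arsenal's goals per game isn't stated in the post, so rather than invent a value the code backs out the team rate that those figures imply. The function and variable names are my own.

```python
def adjusted_share(group_share, premium):
    """Share of the team's scoring rate after applying a player specific premium."""
    return group_share * (1 + premium)


GROUP_SHARE_AT_34 = 0.18    # typical elite striker aged 34, read off the group curve
HENRY_PREMIUM = 0.30        # Henry's historical edge over the group
PROJECTED_RATE = 2 / 5      # goals per appearance quoted in the post

henry_share = adjusted_share(GROUP_SHARE_AT_34, HENRY_PREMIUM)

# The team scoring rate that would make the quoted projection hold.
implied_team_rate = PROJECTED_RATE / henry_share

print(f"Henry's share of team scoring: {henry_share:.1%}")
print(f"Implied team goals per game: {implied_team_rate:.2f}")
```

The share comes out a touch under 24%, and the projection corresponds to a side scoring a little over 1.7 goals a game.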

The number of games Henry will play during his loan period will make it virtually impossible to conclude if he has performed to his predicted level, but the mere fact that Wenger has done the deal means that more analytically talented football brains than this one consider Thierry well worth a punt.

In a follow up post I'll include career curves for more members of the 100 club and reveal which current EPL striker is a real one-off.

Thursday, 12 January 2012

Optimising Your Transfer Window Purchases.

Crack open the Bovril and heat up the Wright's pie because the EPL has reached the half way stage and it's time for the transfer window to provide a last chance for teams to cement that better than expected start and transform it into tangible rewards come May.Or more stressfully,it's a final opportunity to get enough able and willing bodies on board to fight off the dreaded specter of relegation and a reluctant starring role on BBC's soon to be axed late night Championship Show.

Artificial and contrived it may be,there's no doubting the January window is a dramatic and compelling addition to the football calendar.As a footballing antidote to the post Christmas blues,it's second to none,be you a fan with opinions to share or a slightly unsettled player with a fleet of high performance cars in urgent need of chrome plating.

So what's the best way to ensure that your January sales purchases provide the biggest potential boost to your current fortunes by addressing the weakest part of your squad, instead of merely reinforcing an already well performing aspect of team performance? The first stage of the process should be to be realistic about your goals. Teams at the foot of the table are obviously there because they lack quality throughout the side. It's a futile exercise to compare the current state of teams like Bolton, Wigan and Blackburn to even the league average benchmarks, or you'd simply conclude that they each needed at least half a dozen new improved players. And that isn't going to happen because the money isn't available. If it were, such teams would struggle to attract quality players anyway. And if they somehow did attract them, such a large number of new signings would need time to gel and possibly adapt to the Premiership, by which time we'd have reached April Fools Day and continuing under performance would likely be assured.

What's needed is a way to compare the attacking and defensive qualities of teams while also accounting for their current overall league position.As any young aspiring athlete knows,improving your weaknesses is often a quicker and more efficient way to advance a team than trying to improve further on your strengths.We therefore now need a team stat that can be broken down into defensive and attacking components,but as we are just half way through the season,it also has to correlate well with itself.We can then use the information derived from this to decide where the rebuilding is going to be targeted.

We saw here how well team statistics auto correlate from the first half of the season to the second, and it's clear that a shot based approach may be the best. However, we also need to know how strongly the statistics correlate with team success rate, because ultimately that is what every team, not just the struggling ones, is striving to improve. Shots are only moderately correlated to success rate and that correlation doesn't improve if we add a shot efficiency component to the analysis. More importantly, we've seen here that shot efficiency is very context dependent even over entire seasons. Trailing teams are less efficient than teams who are playing with a lead. Scoring go ahead goals is partly luck driven and this shows up more in limited sample sizes. Thus, it should come as no surprise that team shot efficiencies show virtually no correlation for EPL teams as a whole when measured over half seasons.

We've seen here that goal difference is far more strongly correlated to success rate than any other commonly recorded team statistic. It's also moderately correlated with itself. This combination of reasonably strong auto correlation and very strong success rate correlation makes goal difference a much better choice to define a team's August to December record.

As a final strand in our massive game of Connect Four, we have broken down the various components that go towards a team's overall goal difference in this post. Armed with this analysis we can take a side's current goal difference, compare that side's attacking and defensive performances with the average performances expected from a side having a similar goal difference and see where the team's rotten underbelly really lies.

The Historical Scoring Contribution to a Side's Goal Difference in the EPL 1999-2010.

The Historical Defensive Contribution to a Side's Goal Difference in the EPL 1999-2010.

I don't want this post to become a tedious number crunching fest, so after a pared down illustrative example, I'll merely list each team's strength, weakness or, for fairly balanced sides, neither. Blackburn probably aren't the best example to use because most people would question their board's commitment or competence, but Rovers do have a readily discernible talent divide. Their current goal difference is -0.7 goals per game and their position of bottom, but just three points from 16th, reflects this. To achieve this goal difference they've scored 1.45 goals per game and conceded 2.15 goals per game. If we now look at each of the graphs above, we can see that historically a team with that goal difference would have scored around a goal a game and allowed their opponents to score an average of around 1.7 goals each game. Therefore, we can conclude that Rovers are an above average attacking side for their current position (they've put 4 past both Arsenal and Swansea and 3 past United at Old Trafford), but a below average defensive one.
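The Blackburn comparison can be sketched in a few lines. The four rates are the ones quoted above; the function name, the labels and the 0.1 goals per game tolerance used to call a unit "balanced" are my own assumptions, not thresholds taken from the historical graphs.

```python
def diagnose(scored, conceded, expected_scored, expected_conceded, tolerance=0.1):
    """Label attack and defence relative to sides with a similar goal difference.

    All inputs are goals per game. The tolerance is an assumed cut off.
    """
    def label(gap):
        if gap > tolerance:
            return "strength"
        if gap < -tolerance:
            return "weakness"
        return "balanced"

    attack_gap = scored - expected_scored         # positive means scoring more than expected
    defence_gap = expected_conceded - conceded    # positive means conceding less than expected
    return label(attack_gap), label(defence_gap)


# Blackburn: 1.45 scored and 2.15 conceded, against historical norms of
# roughly 1.0 and 1.7 for a side with a goal difference of -0.7 per game.
attack, defence = diagnose(1.45, 2.15, expected_scored=1.0, expected_conceded=1.7)
print(attack, defence)
```

Both gaps are about 0.45 goals per game in size, one in Rovers' favour and one against, which is why the overall goal difference hides the imbalance.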

Realistically they should be aiming to become an average defensive side in their sphere,whilst trying to maintain their attacking prowess,thereby improving their goal difference and moving up the EPL ladder by a couple of rungs.It's likely that their attacking numbers will regress toward the mean for poor EPL sides after January,but they should still be above the norm for strugglers.Likewise their defence is likely to improve slightly because of random fluctuations,but genuine improvement from such a low level should be achievable through drafting in better loan or permanent signings and better coaching.Many coaches are of the opinion that poor defences can be made markedly better through easily implemented organisational routines,but no amount of coaching can teach a striker the Cruyff turn.Another reason why strikers attract a premium,they tend to be born rather than manufactured.

Whether the Blackburn board are willing or able to play ball is another matter.They do at least have a saleable defensive asset in Samba who could be used to finance a couple of purchases that may decrease the talent of the parts,but increase the effectiveness of the whole.

If we repeat the process for every EPL team we can highlight the weaknesses sides are showing given their current league position and goal difference by comparing them to historical precedents. To make the following table more easily readable I've simply colour coded the findings. Green means good to go, red means in need of attention and amber means teams are fairly close to being an historically well balanced attack and defence. Current injury concerns aren't factored into the analysis. For example, Everton only have two fit central defenders, two fewer than Stoke habitually include in their starting line up. So following Jagielka's long term injury they will have to ignore their attacking needs and get in defensive reinforcements instead.

Where Teams Should be Making their January Purchases. 

TEAM. State of the Attack. State of the Defence.
Sunderland. ATTACK. DEFENCE.

Rather counter intuitively, it is the teams without a gaping hole in their side who will find it most difficult to actually move far from their current position. They have been playing at a level on both sides of the ball that is largely consistent with their current position. They will struggle to attract signings that are any better than the type of players they already have on board, so any purchases are more likely to provide squad depth than any great leap forward. Wigan, for example, are playing relegation football but with no obvious route to improvement, whereas both relegation rivals, Bolton and Blackburn, have an open sore defensively and a means to bring in players at the cost of one outstanding member of a grossly under performing unit. They at least have a survival strategy. On a level playing field you would worry for QPR and Wigan.

Will this season see Wigan's lack of quality finally catch them out?
(Spot the brave Wigan fan in the Stoke end!)
To conclude, and to provide an embarrassing recap come May, we can assume that each team does manage to haul their under performing unit up to league average levels for their current position while maintaining their strengths. Armed with their likely new and improved goal difference, we can then simulate the remaining 18 games of the season, add the result to their current points total and predict a final table.
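The post doesn't spell out how the remaining games were simulated, so the following is only one plausible sketch: it treats goals for and against in each match as independent Poisson draws around a side's per game rates, which is a common simplification and not necessarily the model used for the table below. All function names are my own and it won't reproduce the table's figures.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_points(goals_for, goals_against, games=18, rng=None):
    """Points won over a run of games, given scoring and conceding rates per game."""
    if rng is None:
        rng = random.Random()
    points = 0
    for _ in range(games):
        scored = poisson(rng, goals_for)
        conceded = poisson(rng, goals_against)
        if scored > conceded:
            points += 3
        elif scored == conceded:
            points += 1
    return points


def expected_points(goals_for, goals_against, games=18, runs=5000, seed=1):
    """Average points over many simulated runs of the remaining fixtures."""
    rng = random.Random(seed)
    total = sum(simulate_points(goals_for, goals_against, games, rng) for _ in range(runs))
    return total / runs


# Blackburn as they are (1.45 scored, 2.15 conceded) against a version whose
# defence improves to the 1.7 goals a game norm quoted earlier.
print(expected_points(1.45, 2.15), expected_points(1.45, 1.7))
```

Adding the simulated points to a side's current total gives a projected finish; a fuller version would also account for the strength of each remaining opponent.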

Predicted Final EPL Table 2011/12.

TEAM. Predicted
Man City 97
Man Utd. 89
Tottenham. 81
Chelsea. 73
Arsenal. 69
Liverpool. 65
Newcastle. 61
Sunderland. 51
Norwich. 51
Stoke. 49
Everton. 49
Swansea. 48
Aston Villa. 46
Fulham. 46
West Brom. 43
Blackburn. 36
Wolves. 35
Bolton. 34
QPR. 33
Wigan. 27

No huge changes in positions, as the table is usually very well established by mid term. But there are certainly opportunities for teams at the bottom who are prepared to target their spending. The now customary log jam at the bottom looks likely to re-occur and there's welcome slippage at the top, with the possibility of some previously permanent members of the Big Four making way for a couple of fresher faces.

Tuesday, 10 January 2012

The Manchester Derby and the Penalty that Wasn't Given.

I hope this blog isn't starting to look like a Chris Foy bashing arena, but the Lancastrian referee has been involved in one contentious decision after another and therefore he makes great copy for my win probability model. It's very easy, but ultimately too simplistic, to add or subtract goals that should or shouldn't have been given to the final match score in an attempt to quantify the effect dubious decisions have had on a game. Unless a goal is wrongly given or disallowed with the very last kick of a game, the correct approach is to see how the various alternative scenarios would have changed each team's chances of winning, losing or drawing the game as they stood immediately prior to the incident. This approach is particularly pertinent when dealing with penalties. Around a quarter of them are missed, so it's wrong to even suggest that a penalty awarded is a goal guaranteed.
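To illustrate why a penalty awarded isn't a goal guaranteed, the match outcome probabilities at the moment the whistle blows can be treated as a weighted blend of the "scored" and "missed" scenarios. The 75% conversion rate follows from the quarter of penalties missed mentioned above; the outcome probabilities fed in are made up purely for illustration and are not output from the win probability model.

```python
def blend_outcomes(if_scored, if_missed, conversion_rate=0.75):
    """Outcome probabilities when a penalty is awarded but not yet taken."""
    return {
        outcome: conversion_rate * if_scored[outcome]
        + (1 - conversion_rate) * if_missed[outcome]
        for outcome in if_scored
    }


# Hypothetical probabilities for a side leading by a goal late in a game,
# with a penalty given against them.
if_scored = {"leaders win": 0.35, "draw": 0.55, "trailers win": 0.10}
if_missed = {"leaders win": 0.93, "draw": 0.06, "trailers win": 0.01}

awarded = blend_outcomes(if_scored, if_missed)
for outcome, probability in awarded.items():
    print(f"{outcome}: {probability:.3f}")
```

With these made up inputs the leading side remains the most likely winner even after the award, and the draw only becomes favourite once the kick has actually gone in.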

Obviously the biggest talking point to arise from the Manchester derby is the dismissal of Kompany for his reckless two footed lunge/superbly timed Bobby Mooresque tackle. (Delete as appropriate to your allegiances.) The incident has been dealt with here and it certainly swung the game massively in United's favour. Their position in the game immediately following the red card was the equivalent of playing City 11 vs 11, but with at least a two goal lead rather than the one goal lead they actually had at the time.

However, an equally decisive moment came in the 82nd minute when Phil Jones intercepted Kolarov's cut back cross. The ball struck his leg, spun up onto his outstretched arm and went behind for the awarded corner with City trailing 2-3. Again, opinions are probably split between neutrals and City fans who thought it was a penalty and United fans who saw no intent on Jones' behalf. The suspicion was that Foy was unable to make a decision because neither he nor his far touchline assistant had an unobstructed view of the incident. If he had been better placed, a penalty wouldn't have been too surprising an outcome. It was a potential verdict that Jones was sufficiently concerned about to indicate that the ball hit his chest (virtually the only part of his body the ball didn't hit).

Alternative Realities for the 82nd Minute in the Manchester Derby.

Whatever the rights or wrongs of the non decision, United were massive favourites immediately prior to Kolarov's cross. Despite introducing a ring rusty racehorse owner whose last competitive gallop had been eight months previously, United still led by a goal, still had a man advantage and the clock was ticking. A draw was unlikely and, for all their previous heroics, City still required a 1 in 200 series of events to record an astonishing come from behind victory.

Had Mr Foy seen Jones' arm come into contact with the ball and had he felt kindly disposed towards City, a United victory would still have remained the most likely of the three outcomes as Foy blew his whistle to award a spot kick, but the chances of a replay would have leaped to almost coin toss levels. To make the draw the favoured outcome, City needed to score the penalty, presumably through the boot of Milner as most of their regular penalty takers were either injured, on loan or on a beach. They would, however, have still remained the outsiders to take the game in the remaining eight minutes plus stoppage time, although from the situation of 3-3 the miracle turnaround was now merely unlikely rather than massively improbable.

All in all another game changing call of huge proportions with no definitively right or wrong decision,especially given the compromised sight lines.Graham Poll in his Daily Mail column thought it was a penalty,so I'm inclined to think this was one call Mr Foy got right.

Of course had Chris Foy awarded Manchester United a 62nd minute penalty when Kolarov bundled into Valencia,two minutes before City's second goal.........but there's a limit to the number of times even the most unfortunate of refs are put through the win probability wringer.

Sunday, 8 January 2012

Manchester City v Manchester United.FA Cup 3rd Round.Win Probability Graph.

Another match,another potentially busy day for Sir Chris Hoy's Twitter account.Rooney's 10th minute goal was enough to make United the favourite of the three possible outcomes,but it was another controversial decision by Chris Foy two minutes later that condemned Man City to an afternoon of valiant toil with little chance of reward.

Red cards on average are shown around the hour mark, so they're worth just over an extra half a goal to the recipient's opponents. The team receiving the card will score less and concede more, and the net benefit to their opponents will be an average of half a goal. When Foy decided that Kompany's mid pitch challenge on Nani was a reckless two footed lunge and not the clean tackle it appeared from various angles, the subsequent red card was worth around an extra 1.3 goals to Man Utd, because virtually the whole match was still to be played. So from the advantageous situation of leading by a goal against superior opponents, United were propelled to the almost unassailable position of leading 1-0 and also becoming the superior team on the pitch.
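One simple way to see where a figure like 1.3 goals might come from is to scale the half goal value of an hour mark red card by the playing time left. The linear scaling is my own simplifying assumption rather than the model's stated method, but it lands close to the values quoted for both Kompany's dismissal and Yakubu's at 23 minutes.

```python
def red_card_value(minute, reference_minute=60, reference_value=0.5, match_length=90):
    """Approximate goal value of a red card to the opposition.

    Assumes the value is proportional to the minutes remaining, anchored on
    a card at the hour mark being worth about half a goal.
    """
    remaining = match_length - minute
    reference_remaining = match_length - reference_minute
    return reference_value * remaining / reference_remaining


for player, minute in [("Kompany", 12), ("Yakubu", 23)]:
    print(f"{player}, {minute}': {red_card_value(minute):.2f} goals")
```

Under that assumption a 12th minute red card is worth 1.3 goals and a 23rd minute one just over a goal, in line with the descriptions of the two matches.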

Even when City reduced the arrears to a single goal,their visitors were still the most likely team to score next and the clock was increasingly becoming United's friend.A magnificent effort by City,but United would have had to have been incredibly careless to have failed to progress on the day.

0-1,Red Card,Kompany (Man City),12'

Mode of Scoring in the EPL 2010/11,Part 2.

Some more descriptive trivia from the 2010/11 season culled from the play data; the initial posts can be found here and here. This time we're concentrating on the minutiae of how the ball is actually put in the net and from where. Penalties have been excluded because they'll artificially inflate the percentage of goals from inside the box. They would also inflate the number of goals scored by right footed players because they took proportionally more spot kicks than lefties last season. Overall, left boots score fewer goals than right ones because the EPL contains more predominantly right footed players.

Left Foot and Match Ball.The second most popular scoring match up in 2010/11.

Proportional Method of Scoring in the EPL 2010/11.

TEAM. % of Goals from Headers. % of Goals from Shots. % of Goals with Left Foot. % of Goals with Right Foot. % of Goals Scored from Outside the Box. % of Goals Scored with Other Body Part.
Arsenal. 12.3 87.7 32.3 55.4 6.2 0
AVilla. 28.6 71.4 26.2 45.2 14.3 0
Birmingham. 25.7 74.3 17.1 57.1 8.6 0
Blackburn. 25.6 74.4 41.9 32.6 14.0 0
Blackpool. 17.6 82.4 27.5 54.9 15.7 0
Bolton. 22.2 75.6 42.2 33.3 15.2 2.2
Chelsea. 19.0 81.0 27.0 54.0 11.1 0
Everton. 27.7 72.3 27.7 44.7 19.1 0
Fulham. 31.3 66.7 22.9 43.8 15.2 2.1
Liverpool. 15.4 84.6 17.3 67.3 9.6 0
Man City. 7.8 88.2 39.2 49.0 15.7 3.9
Man Utd. 23.6 76.4 25.0 51.4 8.3 0
Newcastle. 28.0 70.0 32.0 38.0 12.0 2.0
Stoke. 29.3 63.4 22.0 41.5 7.3 7.3
Sunderland. 22.5 72.5 27.5 45.0 12.5 5.0
Spurs. 20.0 78.0 40.0 38.0 30.0 2.0
W Brom. 18.9 79.2 26.4 52.8 11.3 1.9
W Ham. 23.7 76.3 23.7 52.6 13.2 0
Wigan. 10.5 86.8 42.1 44.7 23.7 2.6
Wolves. 23.8 73.8 40.5 33.3 11.9 2.4
League Average. 21.3 77.1 29.8 47.3 13.5 1.5

Arsenal scored the vast majority of their goals on the floor and from inside the box,with only Man City and Wigan scoring a smaller proportion of their goals from headers.Leading scorer Van Persie netted just once from outside the box and strongly favoured his left foot.Walcott's entire goal haul came from inside the box,mostly with his right foot and almost all before the hour mark.

For Aston Villa,the versatile Bent did all his goalscoring inside the box and spread his strikes fairly evenly between his left foot,right foot and head.Winger,Downing was equally adept with either foot and six of his seven goals were the game's opening score.The team as a whole were well above average for the proportion of headed goals.

Tottenham scored the biggest proportion of their goals from outside the box in 2010/11, with 30% of non penalty goals coming in this way. Of Pavlyuchenko's double figure haul, just under half were from long range, Defoe didn't score a close range goal all season and more expected long range scorers such as Huddlestone also chipped in from distance. Spurs also scored half a dozen goals after the 90th minute, half of which broke a tie. All of van der Vaart's goals were important as they either enhanced a one goal lead, drew Spurs level or broke the deadlock.

Stoke netted almost 16% of their goals after the 90th minute, indicating a reliance and insistence on fitness and providing an added incentive to stay until the final whistle and risk the gridlocked post game car parks. The Potters were just pipped by Fulham as the team who scored the biggest proportion of their goals from headers and they also scored the smallest proportion of goals from shots. They weren't very adept at scoring from distance, as only Arsenal scored a smaller proportion of their goals from outside the box, so both teams favoured scoring from close range. But whereas Arsenal worked the ball into the box, Stoke launched it and picked up the pieces. A case of two polar opposite approaches producing very similar raw statistics. Jones was the only EPL striker to manage a goal with his head, his right foot, his left foot and finally with "other body part".

Kenwyne Jones.A Full House of Goals in 2010/11.
Man City relied on Tevez for over a third of their league goals,almost all of them from his right boot,so their continued surge to the top has been impressive.Perhaps more remarkable though was their virtual absence of headed goals,all bar one coming from Lescott either at corners or set plays.Silva and Johnson almost combined to get into double figures and both used their right foot exclusively for standing on.City were amongst the least active goal scorers in second half stoppage time.

Champions Man Utd came pretty close to splitting their goal scoring along league averages. Hernandez, Rooney and Berbatov did the majority of their goalscoring inside the box. Given Berbatov's reputation for being laid back, it was impressive that he scored 60+% of his goals after the break.

Liverpool, led by Kuyt, Torres and Gerrard, were the most right footed team in the EPL as well as being another team who were particularly effective once the ball was inside the box. Kuyt also scored the league's latest goal, after 102 minutes from the penalty spot, equalising a similar Arsenal effort four minutes earlier.

Similarly, Birmingham scored proportionally very few left footed goals and goals from outside the box were also rare. Gardner's eight goals from midfield were invariably scored in tight games and all came after the 50th minute.

Teams with a liking for headed goals also included Everton,mainly on the back of Cahill,who was much more dangerous with his head from open play than his customary threat from corners in 2010/11 and Fulham who shared the heading duties between Zamora,mainly from open play and Dempsey and Hangeland from set plays and corner kicks.

Wigan carved out an unusual set of numbers, but they illustrate why small sample sizes can be misleading as future predictors. They were behind just Tottenham when it came to long range efforts, but direct free kicks accounted for a significant proportion of these strikes. They won't be guaranteed to get the same opportunities this year and the predominantly left footed N'Zogbia definitely won't be taking them if they do. Rodallega was their short range goal poacher, especially late in tight games.

West Brom's Odemwingie was perhaps the most versatile of strikers.Not only did he bag double figure strikes,but he was equally adept with either foot.He scored from open play,counter attacks and set plays,but wasn't a player you needed to be overly concerned about when the Baggies swung in a high corner.

If anyone wants a more detailed breakdown for a particular team just use the comments box and I'll see what I can do.