A couple of weeks ago we had the "proof" that Andre Villas-Boas was the superior Chelsea manager because he had averaged 1.70 points per game compared to Roberto Di Matteo's inferior 1.50 ppg. Casting aside for a moment the legitimacy of ignoring RDiM's Champions League and FA Cup exploits, the major concerns with the methodology centered on the lack of any adjustment for the different strengths of schedule faced by the two managers and, more tellingly, the small sample sizes used in the study.
RDiM hadn't yet overseen a dozen Chelsea league games and therefore his points per game figure was liable to fairly large swings until his sample size increased. To take an extreme example, a manager who has been in charge for just one game will have taken either 0%, 33% or 100% of the available points, and only as he plays more games will other percentages become available to him.
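The one-game example can be checked by listing every share of the available points a manager can reach after a given number of matches. This is only a minimal sketch under the usual 3-1-0 scoring; the function name `attainable_shares` is made up for illustration.

```python
from fractions import Fraction


def attainable_shares(games: int) -> list[Fraction]:
    """Every share of the available points reachable after `games` matches
    under 3-1-0 scoring (win, draw, loss)."""
    totals = {0}
    for _ in range(games):
        # Each extra match adds 0, 1 or 3 points to every total so far.
        totals = {t + r for t in totals for r in (0, 1, 3)}
    return sorted(Fraction(t, 3 * games) for t in totals)


for n in (1, 2, 10):
    shares = attainable_shares(n)
    lowest = [f"{float(s):.0%}" for s in shares[:4]]
    print(f"{n} game(s): {len(shares)} possible shares, lowest few: {lowest}")
```

The list of possible values fills in as games are added, which is why an early points per game figure can only move in big jumps.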
To further reinforce the danger of making bold statements based on little evidence, by the time Chelsea had collected the Champions League trophy they had played one more league game, at home to Blackburn, which they won. If they win their opening game of the 2012/13 season and RDiM is still in charge, his Chelsea will have taken 21 points from 12 league games and will have a better points per game rate than AVB's Chelsea. So stand by for the "RDiM is a better manager than AVB" articles.
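The arithmetic behind that swing is short enough to write out. The sketch below uses the figures quoted above; the starting record of 15 points from 10 games is not stated directly but is inferred from the 1.50 ppg figure and the "21 points from 12 games" projection, so treat it as an assumption.

```python
AVB_PPG = 1.70  # figure quoted in the original comparison

# (label, points, games) for Di Matteo's league record.
# The first row is inferred, not quoted: 21 points from 12 games,
# less two wins, leaves 15 points from 10 games (1.50 ppg).
rdm_record = [
    ("at the time of the comparison", 15, 10),
    ("after the Blackburn win", 18, 11),
    ("if the 2012/13 opener is won", 21, 12),
]


def points_per_game(points: int, games: int) -> float:
    return points / games


for label, points, games in rdm_record:
    ppg = points_per_game(points, games)
    verdict = "above" if ppg > AVB_PPG else "below"
    print(f"{label}: {ppg:.2f} ppg ({verdict} AVB's {AVB_PPG:.2f})")
```

Two results are enough to flip the ranking, which says more about the size of the sample than about either manager.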
Facts like these are fun trivia, but the direction in which they are heading can switch back or soar simply because the data are so sparse. Report them, enjoy them, but don't draw hasty conclusions from them that rapidly become today's informed opinion. And if a conclusion is required from a small data set, then add as many caveats as you can.
Which brings us to the startling conclusions of one particular post that informs us that Barcelona are a better side when Xavi doesn't start. The methodology is strikingly similar to that used to show AVB as superior to RDiM. In the 12 league games where Xavi didn't start, Barcelona scored more goals per game, let in fewer goals per game and had a higher win ratio compared to the 26 league games in which he did start.
If we start with the data used, all of the objections that existed with the flawed Chelsea study are again present here. The sample sizes are small (12 games are being compared to 26) and no account is taken of the strength of schedule (Xavi actually played against a marginally better group of teams than those he was on the bench or rested for). In terms of goals scored and conceded, the batch of sides Xavi played against were more prolific than those where he didn't start and, similarly, the opposing defences he started against were more secure than those where he didn't. So even if we neglect issues around sample size, the data is still raw and skewed against Barcelona's number 6, especially as one of the goals credited to "Barcelona, without Xavi starting" appears to have been scored by a late substitute going by the name of... Xavi.
One of the most powerful thoughts a football blogger can carry, with a hat tip to Simon Gleave's tweet, is that "random chance exists. Season also not long enough to remove effect." So if you want to see how Barcelona play with and without Xavi, you should go back more than one season, so that the skill component of the individual results is given an opportunity to overwhelm the random element. Xavi has, after all, played for their first team since 1998, and if his declining influence on matches is real rather than just an artifact of data mining, then it is unlikely to have developed suddenly this season. Trends would have become apparent earlier, unless we have a readily identifiable incident, such as a player returning from a long-term injury.
If you search through a limited, season-long set of games covering countless teams and leagues, random patterns will emerge and often they will appear to implicate even great players as the causative agent. However, trawling the data and then assembling a story is rarely rewarded with the effect persisting in larger samples. We see patterns where none exist and single causes where many add to the effect.
To conclude, if we correct all of Barcelona's 2011/12 league stats for opponent strength and split them by Xavi starting or not, we do see the apparent underperformance. In the 12 games where Xavi was on the bench at kickoff, if not at full time, Barcelona would have expected to get just over 32 points from those particular games and they got 33. In the larger 26 game sample where he played from the start, they would have expected to get in the region of 66 points and they got just 58. So it looks bad for Xavi. However, if we start to pull out random 12 game samples from the 26 games where Xavi started, we soon get a group of matches where the "Xavi starts" Barca plays as well as, and sometimes better than, the "Xavi doesn't start" Barca. Randomness can produce outstanding results from within a sample of merely great ones, just as easily as Xavi can be made to appear as the agent of Barcelona's decline by a selection of games data mined from a 14 year career. This approach has as much legitimacy as suggesting that Manchester City are relegation certainties based on their five points from five EPL games during March and April 2012.
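The resampling idea can be sketched in a few lines. This is an illustration only, not the calculation described above: it uses raw points rather than opponent-adjusted expectations, and the win/draw/loss split in `XAVI_STARTS` is hypothetical, chosen as one of several splits that add up to 58 points from 26 games. The function name `subsample_points` is also made up for the example.

```python
import random
from statistics import mean

# HYPOTHETICAL split of the 26 "Xavi starts" games: 17 wins, 7 draws,
# 2 defeats. It is only one of several ways to reach 58 points.
XAVI_STARTS = [3] * 17 + [1] * 7 + [0] * 2


def subsample_points(results, size, draws, seed=0):
    """Points totals from `draws` random `size`-game subsets of `results`,
    each subset drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.sample(results, size)) for _ in range(draws)]


totals = subsample_points(XAVI_STARTS, size=12, draws=10_000)

print("full 26-game ppg:", round(sum(XAVI_STARTS) / len(XAVI_STARTS), 2))
print(
    "12-game subsample ppg (min / mean / max):",
    round(min(totals) / 12, 2),
    round(mean(totals) / 12, 2),
    round(max(totals) / 12, 2),
)
```

Even without any opponent adjustment, the gap between the weakest and strongest 12 game subsets shows how much a single dozen matches can flatter or damn the same team.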
In short, 12 games do not define a team's true ability and, even if they did, one player alone wouldn't account for all of the change. I think Xavi's worth is unlikely to take a hit this summer.