Tuesday, 10 July 2012

Spotting Genius is Easy, But What About the Rest?

Nearly forty years on and it is still a vivid memory. The blue-shirted Swedish defender appears to be in control of the situation. Despite his instant control of the ball, the attacker seems to have few options. He's got the ball wide on his team's left flank, about ten yards from his opponents' goal line, but he's facing back down the touchline towards his own half of the pitch. His only real choice is to lay the ball back to a supporting colleague or, at best, lift a hopeful inswinging cross with his right foot into the congested area. And for a split second he seems to have chosen the latter. The defender sticks out a half-hearted left leg to block the expected cross. He's induced the cross from the attacker, and from that angle and with no real intended target it's almost certain to be cleared or claimed by his keeper.

But then it happens. Instead of crossing, the attacker deftly caresses the ball back through his own legs with his right instep. Instantly he pivots on his standing left leg and sprints after his own pass towards the now undefended byline, showing only the black number 14 on his orange shirt and a blur of attacking intent to his bemused marker.

Ladies and gentlemen, the Cruyff Turn.

A moment of supreme skill as well as theatre, and recognisable as such to anyone who saw it. Unfortunately, it's not always as easy to separate the skilful from the merely mundane in football, because talent usually shows itself in minute improvements in passing efficiency, quickness of thought or variation in power and placement.

Increasingly, analysis is turning to rate statistics in an attempt to establish a pecking order for talent-based footballing actions. Every transfer target now comes with his numbers attached: shot conversion rate, cross conversion rate, pass conversion rate; the list goes on. So it's vitally important that we have some level of confidence in these types of figures.

The most obvious place to start an investigation would appear to be how, why and even whether the quoted statistics are causally linked to match outcomes. That should be followed by how much weight we should attach to players who have recorded impressive, or not so impressive, figures from limited sample sizes. However, there is one stage we need to evaluate before these processes are even applied to the raw numbers.

The most fundamental question, and one that is rarely asked, is "Is the factor we are measuring even a skill?" Or are we simply seeing random fluctuations in performance rates that are entirely down to chance? Once we have confirmed our intuition and found evidence that we are indeed looking at a skill, we next need to know how much of a skill it is. Only then can we begin to know how many observations we need to make of a group of players before that skill begins to shine through.

Solving the "is it a skill, and how much of a skill is it?" problem would seem straightforward. The natural starting point is the player's raw conversion rate. So imagine you've identified an attribute that you've seen on a pitch and you want to purchase a player who displays that attribute. I don't want to confuse things by picking an obscure skill that may or may not be almost entirely luck based, so we'll just call it "The Attribute".

The success rate at completing our particular attribute in a group of similar players averages exactly 10%. Let's imagine you've got a list of 50 prospective purchases and their success rates over their last 100 attempts at performing your team's desired footballing task. Below I've listed the top 5 and bottom 5 performers.

How Many Successes Were Recorded by Players Performing "The Attribute".

Player          Successes in 100 Attempts    Price ?
1st Ranked      17 (17%)
2nd Ranked      16 (16%)
3rd Ranked      16 (16%)
4th Ranked      15 (15%)
5th Ranked      15 (15%)
46th Ranked     6 (6%)
47th Ranked     6 (6%)
48th Ranked     5 (5%)
49th Ranked     5 (5%)
50th Ranked     3 (3%)

So who do you buy? If you're Manchester City or Chelsea and money is no problem, you buy one of the top five, probably driving the price up to an unrealistically high level. If you're Stoke you pick up number 48 on a free transfer, possibly with a history of off-field problems.

The catch is that the group of which the above table is just a part was actually generated randomly. I set the success rate to 10% and simulated 100 attempts for each of 50 different players; this is the range of outcomes that appears solely through chance. If you broke the bank to buy the top ranked player, then you're out of luck, because he doesn't have any skill when it comes to excelling at the attribute. No one does. The table makes it appear that skill is involved because that is the kind of distribution of successes and failures that people expect to see if skill is a factor. Player 1's 17 successes came about entirely by chance, as did Player 50's mere three.
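For anyone who wants to repeat the experiment, here is a minimal Python sketch of that kind of simulation. It is not the code used for the table: the function name simulate_players and the seed value are purely illustrative, and the only figures taken from the post are the 50 players, 100 attempts and 10% success rate.

```python
import random


def simulate_players(n_players=50, n_attempts=100, success_rate=0.10, seed=None):
    """Return each player's success count, sorted from best to worst.

    Every player is given the same true success_rate, so any spread
    in the results is produced by chance alone.
    """
    rng = random.Random(seed)
    counts = [
        # One player: count how many of n_attempts come off.
        sum(rng.random() < success_rate for _ in range(n_attempts))
        for _ in range(n_players)
    ]
    return sorted(counts, reverse=True)


# The seed is an arbitrary, hypothetical choice; change it to get a new "squad list".
ranked = simulate_players(seed=2012)
print("Top 5:   ", ranked[:5])
print("Bottom 5:", ranked[-5:])
```

Run it with a few different seeds and the top and bottom of the list will change each time, but there will usually be a handful of apparent stars and apparent duds even though every player is identical.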

Fortunately, it's possible to take a mathematical look at the spread of successes and failures across a group of players and tell whether it has likely arisen through chance alone or has been skewed by external factors, which could be attributed to varying levels of skill within the group. Applying these methods to the numbers behind the table above confirms that the spread is consistent with the random process I used to generate it.
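The post doesn't spell out which method was used, so what follows is an assumed stand-in rather than the original calculation. One simple way to ask the question is to compare the observed variance of the players' success counts with the variance a binomial process would give if everyone shared the same underlying rate, and then to simulate many equal-skill groups to see how often chance alone produces a spread that wide. The function names and both example groups are hypothetical.

```python
import random
from statistics import pvariance


def pooled_rate(counts, n_attempts):
    """Group-wide success rate across all players and attempts."""
    return sum(counts) / (len(counts) * n_attempts)


def dispersion_ratio(counts, n_attempts):
    """Observed variance of success counts divided by the binomial variance
    expected if every player shared the pooled success rate.
    Around 1 suggests chance alone; well above 1 suggests real differences."""
    p = pooled_rate(counts, n_attempts)
    expected = n_attempts * p * (1 - p)
    if expected == 0:
        raise ValueError("pooled rate is 0 or 1, so there is no spread to compare")
    return pvariance(counts) / expected


def chance_p_value(counts, n_attempts, n_sims=1000, seed=0):
    """Share of simulated equal-skill groups whose spread is at least
    as wide as the observed one (a Monte Carlo estimate)."""
    rng = random.Random(seed)
    p = pooled_rate(counts, n_attempts)
    observed = pvariance(counts)
    hits = 0
    for _ in range(n_sims):
        sim = [sum(rng.random() < p for _ in range(n_attempts)) for _ in counts]
        if pvariance(sim) >= observed:
            hits += 1
    return hits / n_sims


# Two hypothetical groups of 50 players with 100 attempts each.
rng = random.Random(1)
equal_skill = [sum(rng.random() < 0.10 for _ in range(100)) for _ in range(50)]
# Assumed true rates spread evenly between 2% and 18%.
mixed_rates = [0.02 + 0.16 * i / 49 for i in range(50)]
mixed_skill = [sum(rng.random() < r for _ in range(100)) for r in mixed_rates]

for label, counts in (("equal skill", equal_skill), ("mixed skill", mixed_skill)):
    ratio = dispersion_ratio(counts, 100)
    p_val = chance_p_value(counts, 100, n_sims=500)
    print(f"{label}: dispersion ratio {ratio:.2f}, simulated p-value {p_val:.3f}")
```

A ratio close to 1 with a large p-value is what you would expect from a group like the one in the table. A ratio well above 1 with a very small p-value hints that something other than luck, possibly skill, is widening the spread.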

However, if I use the same methods to look at attributes such as a player's ability to create open-play chances that are converted, I find that the distribution of chances turned into goals doesn't resemble a random distribution. Another factor, such as a combination of skills spread between the provider and the scorer, is present. If we look further at the ability of players to provide clear-cut chances for their teammates, we find that not only does that appear to be a talent, but it is a much rarer talent than any other that I have looked at so far.

This was intended as an excuse to post a photo of my Cruyff shirt and discuss chance creation. Instead it's turned into a post on the steps needed to validate and analyse the static player rate statistics that are beginning to flood the blogosphere, and to make them reliable and credible. Knowing the spread of talent within a group of players is vital to judging how reliable your conclusions are over a limited number of observations. In short, you need to know whether you're likely buying a lucky player or a talented one, and how much of your desired skill is down to player talent.

I'll deal with chance creation later, but at least I kept in the Cruyff reference.

