As they are wont to do, PaddyPower have recently published some rubbish. This is hardly groundbreaking, and normally it wouldn't even be worth mentioning. But I thought that we could take apart the rubbish article a bit for an instructive take on "What Not to do with Statistics." Every once in a while, I think that it's good to have a reminder to use numbers responsibly, because it's really easy to do otherwise.
The article in question promises to let you "Find out why Aston Villa, Norwich and Bournemouth look set for relegation, while Crystal Palace are certain to be safe." Well that certainly sounds of import to Villa fans, and there's no way it's merely a poorly-disguised excuse to run some comic photoshops of Tim Sherwood.
First order of business here: it's July. The season is more than two weeks away. No one is certain to be safe, and even if we're willing to make a few exemptions (we shouldn't be) Crystal Palace certainly wouldn't be afforded one. Whatever.
Semantic quibble aside, it's the way PaddyPower are using their numbers that bothers me. The methodology is basically as follows: the Premier League went to 20 teams in the 1995-96 season. From that point on, PaddyPower looked at how teams finishing in a particular place did in the following season. For instance, Aston Villa finished 15th in the 2013-14 season and finished 17th in the 2014-15 season. They then looked at these trends to reveal some big news!
Namely, and most relevantly to Villa fans: since the Premier League went to twenty teams, 42.1% of 17th-place teams have been relegated in the following season. That rate is tied for second-highest with the Championship winners and is beaten only by the Championship playoff winners (at 52.6%).
If you'd like to see the data, I've gone ahead and done all of the legwork myself and compiled it into this nifty chart:
If you click on "1st" up in the bubbles, you'll see how each team that finished first in one season performed in the next, and so on. CW, CRU, and CPW are Championship winner, Championship runner-up, and Championship playoff winner, respectively. The data is arranged such that worse positions are lower on the chart.
What can we take away from this data? Let's make this a multiple choice question:
1. Aston Villa are DOOMED
2. Aston Villa have a 42.1% chance of relegation
3. ABSOLUTELY NOTHING
If you answered "3," congratulations, you've passed Professor Lintott's Maths Quiz.
But why do they mean absolutely nothing? I mean, it's 19 years' worth of data, and that's not exactly insubstantial. The problem is the simple fact that these year-to-year numbers aren't well-connected. While a particular team's performance from one year to the next can be a strong indicator of what may come, the teams who happened to finish in a given spot (17th, for instance) have little in common besides that finish, and their results the following year vary far too much to support a strong prediction.
We can illustrate this in two ways. First, let's talk basic math(s). Standard deviation is, to oversimplify a bit, a measure of how numbers cluster around an average, or said differently, how much numbers "agree" with each other. If all numbers in a set are equal, their standard deviation is zero and they are in total agreement. In this set of data, the lowest standard deviation comes from first place. Rather unsurprisingly, teams who finish first in one year tend to do very well the next year (though 2013-14 Manchester United are a PERFECT example of why these numbers shouldn't be used predictively, as they are the only 1st place team to follow up with a performance outside of the top three). The standard deviation of those numbers is 1.31, which is pretty small. For 17th place (since that's all we care about)? A standard deviation of 4.95, the second largest of any spot on the table. 17th place teams have followed up their performances by finishing from anywhere between 4th and 20th, inclusive, in the following year.
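For anyone who wants to see that "agreement" idea in action, here is a minimal Python sketch. The three lists are made up purely for illustration and are not the real finishing positions behind the chart, and the function name `spread` is my own. I'm also assuming a population standard deviation (`pstdev`); a sample standard deviation would give slightly larger numbers.

```python
from statistics import mean, pstdev


def spread(finishes):
    """Return (mean, population standard deviation) for a list of finishes."""
    return mean(finishes), pstdev(finishes)


# Made-up lists for illustration only, NOT real league data.
identical = [2, 2, 2, 2, 2]       # every number agrees
clustered = [1, 2, 1, 3, 2]       # numbers sit close to the average
scattered = [4, 20, 12, 17, 9]    # numbers are all over the table

for label, finishes in [
    ("identical", identical),
    ("clustered", clustered),
    ("scattered", scattered),
]:
    avg, sd = spread(finishes)
    print(f"{label:>9}: mean {avg:.2f}, standard deviation {sd:.2f}")
```

The identical list comes out at exactly zero, and the scattered list comes out far higher than the clustered one, which is the same contrast as 1st place versus 17th place in the real data.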
Let's debunk these numbers another way, though. I think we all can agree that one year of football is, largely, an exercise in randomness. Great teams can have bad years and terrible teams may overperform. So, if we simply remove one year from this math the results change drastically. To refresh our memories, 17th place, Championship winners, runners-up, and playoff winners have been relegated 42.1%, 42.1%, 36.8%, and 52.6% of the time respectively.
If we remove just the 2001-02 season (in which, of those four, only the previous year's 17th-place team was relegated), those numbers change to 38.9%, 44.4%, 38.9%, and 55.6%. One year (which is as likely to be aberrant as any other) drastically changes our perception of this set of data. If one small tweak can change our perception so strongly, we probably shouldn't be using that perception.
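The arithmetic behind dropping a season is easy to check for yourself. This is a rough sketch rather than anything official: the relegation counts are the ones implied by the quoted percentages over 19 seasons (8 of 19 is 42.1%, 7 of 19 is 36.8%, 10 of 19 is 52.6%), and the variable names are my own.

```python
SEASONS = 19

# Relegation counts implied by the quoted percentages (assumption:
# each percentage is a whole number of relegations out of 19 seasons).
relegated = {"17th": 8, "CW": 8, "CRU": 7, "CPW": 10}

# In the dropped season (2001-02), only the previous year's
# 17th-place team went down among these four groups.
relegated_in_dropped = {"17th": 1, "CW": 0, "CRU": 0, "CPW": 0}


def rate(count, seasons):
    """Relegation rate as a percentage, rounded to one decimal place."""
    return round(100 * count / seasons, 1)


full = {group: rate(count, SEASONS) for group, count in relegated.items()}
without = {
    group: rate(relegated[group] - relegated_in_dropped[group], SEASONS - 1)
    for group in relegated
}

for group in relegated:
    print(f"{group:>4}: {full[group]}% over 19 seasons, "
          f"{without[group]}% with one season dropped")
```

Swap in a different season for `relegated_in_dropped` and the percentages shift again, which is rather the point.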
So should Aston Villa fans be worried about relegation? Maybe! Should they be worried about relegation because 42.1% of 17th-place teams were relegated in the next season? Absolutely not.
That said, can we have some fun with the numbers? Definitely! So long as we don't draw sweeping conclusions from a lot of unconnected seasons we can point out some statistical peculiarities that are fun to look at. For instance:
- There have been 76 teams in the past 19 seasons who have finished in the top four. Of those, only two have finished the next season in the lower half of the table.
- Teams that have won the league since 1996-97 have never finished worse than third in the previous year. But teams who won the league in a previous year have finished as badly as seventh. That was the 2013-14 Manchester United team under David Moyes. Get rid of that and teams who won the Premier League have never done worse than 3rd in the following year.
- You can't get rid of that, though. That's the whole point of this column. Stop getting rid of things. Stop abusing numbers. Numbers have never hurt you.
- The biggest change from one year to the next came when Ipswich Town went from being the Championship playoff winner (6th place) in the 1999-00 season to 5th in the Premier League in the 2000-01 season. The second biggest change? Well, it's a tie between Blackburn (who finished 6th in 1997-98 and 19th in the following season) and... Ipswich. They followed their amazing worst-to-fifth jump with a plummet from 5th to 18th. They're the only team to be relegated from fifth.
- We're pausing from the numbers for a second, but think about that. In May 2000, Ipswich fans were unsure what league their team would be in the next season because they had done well but not wonderfully in the Championship. 12 months later, they had secured a place in the UEFA Cup and felt like they were on top of the world. And 12 months after that, they were in the Championship again (though rather hilariously, back in the UEFA Cup thanks to the Fair Play slot).
- 10th place has never been relegated. 10th place teams have finished between 5th and 17th in the following year.