FO Basics: Regression Towards The Mean
by Bill Barnwell
Other articles in the "FO Basics" series:
- August 30: Where our stats come from, and the difference between charting stats and play-by-play stats.
- August 31: A summary of research from our first seven years.
- September 1: Our college stats, how they differ from our NFL stats and from each other.
- September 6: The importance (and limitations) of watching games on tape.
Regression towards the mean. It's a phrase we use a lot at Football Outsiders, describing a concept that's probably easier to explain in a sentence than in two paragraphs. It's most often employed when looking forward at how a team, unit, or player will do in an upcoming season, which makes it a wonderful little lightning rod for controversy: What happened last time is unlikely to happen again.
In this latest entry in our "FO Basics" series, I'm going to take a step back and review the concept of regression towards the mean, explaining what it means and how we apply the concept to football. I'll also try to address some of the reader comments I've seen about regression towards the mean and how it's discussed and implemented at Football Outsiders.
Let's start with the phrase itself. Regression towards the mean implies that an extreme data point is unlikely to be repeated, and that the next instance of whatever that data point represents is likely to exhibit a level of performance closer to (regress towards) average (the mean).
The simplest example of regression to the mean is flipping a coin. Go flip a coin 10 times. Let's say you get one head and nine tails. Perhaps you got tails 90 percent of the time because no one believed in you, you worked extra hard implementing your coin flipping strategy in the offseason, or your hand wanted tails more than the other one.
Now, go flip a coin 10 times again. Maybe you'll get one head and nine tails again. Chances are, though, that you'll get something closer to what we know is the true mean of flipping an unweighted coin 10 times -- five heads and five tails. You might flip five and five, or three and seven, or eight and two.
Something to clarify here is the difference between regression towards the mean and regression to the mean. It's a subtle difference, but one that we have too often been guilty of misstating. (We're going to try to fix that in the future.) Regression to the mean in a sample of 10 coin flips would suggest that you're likely to pull up five heads and five tails on that second set of flips. That's not very likely at all. Using the binomial distribution, we know that the probability of getting exactly five heads and five tails in 10 flips is just 24.6 percent. The probability of getting more than one coin heads-up in 10 flips, though, is better than 98 percent. Regression towards the mean is extremely likely; regression to the exact mean is pretty unlikely.
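Both figures can be checked directly. The sketch below is only an illustration of the binomial arithmetic; the helper name `binom_pmf` and the variable names are made up for this example, and the coin is assumed to be fair.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent flips (p = chance of heads)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 10  # flips per set

# Regression *to* the mean: landing on exactly five heads and five tails.
p_exactly_five = binom_pmf(5, n)

# Regression *towards* the mean after a one-head set: more than one head this time.
p_more_than_one = 1 - binom_pmf(0, n) - binom_pmf(1, n)

print(f"P(exactly 5 heads)  = {p_exactly_five:.3f}")   # 0.246
print(f"P(more than 1 head) = {p_more_than_one:.3f}")  # 0.989
```

The second number is just one minus the chance of zero or one head, which is 11 of the 1,024 equally likely sequences.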
In football, the concept is more complex. Very few things are a 50-50 shot, the way that flipping a coin is. Fumble recoveries are one of the few that come close. While what the league refers to as aborted snaps are recovered by the offensive team about 80 percent of the time, the vast majority of fumbles have been proven to be about a 50-50 proposition. Take sacks: In 2009, 94 of the fumbles caused by sacks were recovered by the offense and 93 were recovered by the defense. On running plays, the offense recovered 73 fumbles, while the defense got to 108. That seems like the defense is recovering a larger portion of the fumbles than expected, but the year before, the offense recovered 88 fumbles on running plays and the defense picked up 91.
That randomness also extends to teams. Although there are myriad stories about teams that place an emphasis on flowing to the football (notably the Lovie Smith defenses in Chicago before they started to suck), teams that recover a particularly small or large percentage of the fumbles that hit the ground in their games in a given season will often see that figure regress toward the mean in the subsequent season.
The past three years provide plenty of examples. In 2007, the Cincinnati Bengals recovered a league-high 70.6 percent of the fumbles in their games (24 of 34). In last were the Baltimore Ravens, who only recovered nine of the 25 fumbles in their games, for a dismal recovery rate of 26.5 percent. (Note that these numbers are based on standard plays, so we're not counting the occasional special teams fumble.)
A year later? The Bengals did regress towards the mean, albeit very slightly. They recovered 57.5 percent of their fumbles, which was good for third in the league. No. 1, though? Those Baltimore Ravens. They got 63.3 percent of their fumbles, nabbing 19 of the 30 loose balls. In 2009, Baltimore was 12th (15 of 28, 53.6 percent), while Cincinnati was 17th with an even 50 percent (15 of 30). Last year, Tampa Bay led the league by recovering a staggering 78.8 percent of fumbles, while the Bills recovered a league-low 35.5 percent. The year before, Tampa was 30th in the league and Buffalo was 13th. The year-to-year correlation for fumble recovery rate in 2008-09 was -0.01; in 2007-08, it was -0.07. That suggests that last year's fumble recovery rate has absolutely no predictive value whatsoever.
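To see why correlations that close to zero are what pure luck looks like, here is a rough simulation. It uses no real NFL data: it assumes 32 teams, about 30 loose balls per team per season (in the neighborhood of the counts quoted above), and a true 50-50 chance on every recovery. The function names are invented for this sketch.

```python
import random

def simulate_rates(n_teams, fumbles_per_team, rng):
    """Recovery rate for each team when every loose ball is a fair coin flip."""
    return [
        sum(rng.random() < 0.5 for _ in range(fumbles_per_team)) / fumbles_per_team
        for _ in range(n_teams)
    ]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2009)  # fixed seed so the run is repeatable

# One simulated pair of back-to-back seasons.
season_one = simulate_rates(32, 30, rng)
season_two = simulate_rates(32, 30, rng)
r = pearson(season_one, season_two)
print(f"one simulated year-to-year correlation: {r:+.2f}")

# Many simulated pairs: individual values bounce around, but they center on zero.
rs = [
    pearson(simulate_rates(32, 30, rng), simulate_rates(32, 30, rng))
    for _ in range(500)
]
mean_r = sum(rs) / len(rs)
print(f"average over 500 simulated pairs: {mean_r:+.3f}")
```

Any single pair of simulated seasons can land a little above or below zero, just as the real 2007-08 and 2008-09 figures do, while the long-run average sits near zero.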
That leads to another point of clarification: The difference between "regression towards the mean" and what's known as the Gambler's Fallacy, the idea that something is "due." When reading comments about different aspects of our projection system that rely on regression towards the mean, I've seen this come up as a criticism of such a model. I'd like to point out the difference.
As an example, take those 2009 Buccaneers that led the league in fumble recovery rate. When we say that the Buccaneers' fumble recovery rate will regress towards the mean, that doesn't mean that they're due to finish at the bottom of the league this season. We just established above that a team's fumble recovery rate is random from year to year, so there's no way to predict what the Buccaneers' actual recovery rate will be. We do know, though, that teams with that sort of recovery rate in the past have had no ability to maintain it, and that the average rate of fumble recoveries for a team is 50 percent. So we can safely say that a team with an outlying rate of fumble recoveries is very likely to regress towards the mean in the subsequent season.
This is an example of how difficult it is to build predictive formulas for football. Although the odds of Tampa maintaining that 78.8 percent fumble recovery rate are remarkably slim, we can't project them for anything beyond a league-average fumble recovery rate of 50 percent. They might actually recover 65 percent of the fumbles on the ground and gain an extra two wins because of it. They could also recover 35 percent of the fumbles that hit the turf and lose two extra games.
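One way to picture why the projection lands on 50 percent is the textbook regression estimate, which shrinks last year's figure toward the league mean in proportion to the year-to-year correlation. This is a generic sketch, not the actual Football Outsiders projection system; `project_rate` is a made-up helper, and it assumes recovery rates are about equally spread out in both seasons.

```python
def project_rate(observed: float, league_mean: float, year_to_year_r: float) -> float:
    """Shrink an observed rate toward the league mean by the year-to-year correlation.

    With r = 1 the projection equals last year's rate; with r = 0 it is the mean.
    """
    return league_mean + year_to_year_r * (observed - league_mean)

tampa_2009 = 0.788   # recovery rate quoted in the article
league_mean = 0.50   # average team recovery rate
r_2008_09 = -0.01    # year-to-year correlation quoted in the article

projection = project_rate(tampa_2009, league_mean, r_2008_09)
print(f"projected recovery rate: {projection:.3f}")  # 0.497
```

With a correlation this close to zero, nearly all of the 28.8-point gap between Tampa and the league average disappears from the estimate, even though the actual result could still fall well above or below it.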
If this seems too abstract an example, consider the coin flip example again. Let's say you're going to flip a coin 10 times. Each time you flip the coin and it comes up heads, I give you $100. Each time the coin comes up tails, you give me $100. In the first run, you flip nine heads! You get $800, and I'm left checking the coin.
Now, let's say that we're going to run the same bet on a second test of 10 tosses. As you might suspect, you shouldn't be counting on $800 again. Although the chance exists that you might flip nine heads again, it's very slim. By the same binomial math as before, the chance of nine or more heads is less than two percent, so there's better than a 98 percent chance that the coin flipping will regress towards the mean. The most likely scenario -- which happens 24.6 percent of the time -- is that you'll flip five heads and five tails, and we'll each get zero dollars. In the other 75.4 percent of trials, even though you should expect nothing, you may end up getting $200, losing $400, or losing $1,000. Weight each possible payout by its probability, add them all up, and your expectation is ... zero dollars.
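The bet is small enough to tally outcome by outcome. The sketch below assumes a fair coin and the $100-per-flip stakes described above; the names `payout` and `probabilities` are just labels for this example.

```python
from math import comb

STAKE = 100    # dollars won per head, lost per tail
N_FLIPS = 10

def payout(heads: int) -> int:
    """Net winnings for the flipper, given the number of heads in N_FLIPS tosses."""
    tails = N_FLIPS - heads
    return STAKE * heads - STAKE * tails

# Probability of each possible head count with a fair coin.
probabilities = {h: comb(N_FLIPS, h) / 2**N_FLIPS for h in range(N_FLIPS + 1)}

# Expected value: every payout weighted by how often it happens.
expected_winnings = sum(payout(h) * p for h, p in probabilities.items())

# Chance of matching or beating the nine-head first run.
p_nine_or_more = probabilities[9] + probabilities[10]

print(payout(9))                                            # 800
print(f"expected winnings: ${expected_winnings:.2f}")       # $0.00
print(f"P(nine or more heads) = {p_nine_or_more:.4f}")      # 0.0107
print(f"P(exactly five heads) = {probabilities[5]:.3f}")    # 0.246
```

Every winning outcome is balanced by an equally likely losing one (six heads against four, nine against one, and so on), which is why the weighted sum cancels to zero.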
Another somewhat controversial way that we apply the concept of regression towards the mean to our projection system is with injuries. Research we've done on injury rates has suggested that injuries play a dramatically important role in team success, but that the injury rate of teams from year to year is random. Recently, though, we've seen teams like Tennessee and Kansas City (and to a lesser extent, Dallas) stay towards the top of the health charts, while teams like Detroit, Cleveland, and St. Louis wallow in injury pity. How should our projection system view the injury rates of these teams going forward?
One way to check this is to see if the year-to-year correlation for injuries has changed. That is to say, do teams that were hurt in a given year stay hurt in the following season? Do they regress towards the mean?
In early 2007, when we started publishing injury research on what later became the Patriots Daily site, the answer was that injuries were almost unquestionably random. The year-to-year correlation for Adjusted Games Lost (our injury metric) by a team's starters from 2002-2006 had ranged from 0.14 (2004-05) to -0.04 (2002-03). This suggests that there's virtually no year-to-year consistency for AGL. If we plug in the actual number of games missed by a team's starters, the range goes from 0.08 to -0.02.
Since then, things have been mostly similar. The table below charts the correlation of starter AGL from year to year, including those subsequent seasons.
Table 1: Injury Correlation
| Years | AGL | True Starter GL |
Whoa. There's virtually no predictive ability with the previous year's injury rate ... until 2009. We've written about the remarkable weirdness of 2009 before, but here's another example of why it was so strange: For the first time since 2002, injury rates actually stayed somewhat consistent from year to year.
(For the statistically skeptical, I chose 2002 as a cutoff point because it was the year Houston joined the league, not because the 2001-02 numbers did anything for or against this analysis. And if you're confused about the difference between Adjusted Games Lost and True Starter Games Lost, AGL is a measure that incorporates players who are listed as Probable, Questionable, and Doubtful and their position's historical rate of missing time with those injury designations.)
I don't believe that individual teams are throwing off the research. Dallas is the example most commonly given since we talked about their unlikely run of health in Pro Football Prospectus 2008 and then in Football Outsiders Almanac 2010, but Dallas has really just been a hit-or-miss team as far as health goes. They finished second in AGL in 2009, one of four finishes in the top two over the eight-year span ... but in the other four seasons, they ranked 25th, 21st, 18th (2007) and 17th (2008). Tennessee has ranked in the top three for three consecutive seasons, but they ranged from 11th to dead last over the five previous years. Although the chances are slim that Tennessee's health would be a statistical fluke, it's certainly possible. Detroit had back-to-back top 10 finishes in team health before injury-riddled seasons in 2008 and 2009. Before a legendary FOMBC afflicted them in 2007, the Rams' health had been consistently middle of the pack.
Although we'll never be able to tell Tom Brady to watch out for Bernard Pollard in Week 1 of the 2008 season, we can employ this information at a more micro level. Take offensive linemen. Before the 2009 season, we suggested that the Giants and Jets would suffer more injuries up front. Both teams had just made it through two consecutive years where their five starting linemen had put up 80 starts. From 2001-2008, there had been 25 other instances of a team's five offensive linemen doing that. Only two teams (the 2002-03 Chiefs and Vikings) had been able to pull off two consecutive years of perfect offensive line health. Neither team had made it to three years.
In 2009, the Giants' offensive line suffered injuries and cratered in value, causing a dramatic decline in their running game. Meanwhile, the Jets managed to pull off an unlikely third season of health, with their offensive linemen starting all 80 games for a third consecutive year. The result was an effective rushing attack (11th in DVOA) despite a DOA passing offense.
Does this mean that the Jets have broken the system and that we're expecting them to have zero offensive line injuries again this year? No. From 2001-2009, the year-to-year correlation for offensive line injuries is exactly -0.01. Again, that means that there's just no predictive value in using the previous year's injury rate for the next year's rate. Those 2004 Chiefs only missed one game up front, but the 2004 Vikings missed 18. In 2005, the Chiefs were up to six, and by 2006, they were at 19. On average, those teams that had a perfect season of offensive line health had nine games missed by their offensive line in the next season. The excuses that might come up for why the Jets were able to stay healthy won't pass the most basic of B.S. detectors -- it's not like they were the first offensive line that was nasty or had veterans that knew how to keep their bodies healthy.
Although we were wrong about the health of the 2009 Jets offensive line regressing towards the mean, the odds are very high that a Jets offensive lineman will miss time with an injury this year. Nothing is ever a sure thing when it comes to projecting football teams, but with regression towards the mean, we can say with a reasonably high amount of certainty that something is likely to happen. Or, more accurately, that something isn't likely to happen again.