Are NFL Careers Really Getting Shorter?
by Zach Binney
As those of you who have read my previous articles for FO know by now, my main beat is injuries. One related question that grabbed my interest recently is whether the apparent increase in injuries in recent years (which may itself be just an artifact of better data collection or more attention being paid to the issue of injuries) could be shortening NFL careers.
So, I started poking around to see if anyone had done an analysis like this. It turns out that in late February, the Wall Street Journal published an analysis of NFL career lengths with startling findings: the average career length of an NFL player had dropped from 4.99 to 2.66 years between 2008 and 2015. That's a drop of nearly 50 percent in just seven years, and most of the drop came from 2011 onwards! Making the findings even more interesting, the decline was as linear as it was precipitous, and it held across all positions. The article's graphic is reproduced below:
The article proposed several plausible theories for the decline. Injuries were a main one. It's also possible that teams are simply churning through more players than ever before (FO's data shows a gradual increase from 1,895 regular season players in 2007 to 1,962 in 2015, a 3.5 percent increase). The 2011 collective bargaining agreement (CBA) increased training camp rosters from 75 to 90 players and introduced a new, more team-favorable rookie wage scale, so perhaps teams made a conscious decision to construct younger rosters and jettison older players more quickly.
When I saw these results, I was intrigued (as were others). How could I have not known about this? How could the NFL Players Association not be up in arms about their players collecting paychecks for only half the length of time they used to? I set out to replicate the analysis using the article's data source of Pro-Football-Reference (PFR). The short version: I couldn't. When I used PFR data, I found virtually no decline in NFL career lengths in recent years.
I reached out to the WSJ's sports section on Twitter to try and figure out what was happening. They quickly replied and did what the best scientists and analysts do -- they shared their raw data and walked me through exactly what they did.
It turns out they had used PFR's Football Encyclopedia of Players (the pages listing players by last name) to extract player names and the first and last years they played. Good idea in theory, but in reality this Encyclopedia includes any player who ever played a regular season NFL game and -- this is very, very important -- whatever other players PFR felt merited tracking. Even more importantly, the cutoff for these "other players" seems to have changed dramatically over time, becoming much looser in recent years.
In 2011 -- right around when the career length drop really starts taking hold -- PFR appears to have decided to start more broadly tracking players who never played in regular season games. What's more, the group of players PFR included broadened every year: for example, in PFR's Player Season Finder query tool there were 379 "0-game" seasons in 2011, versus 398, 485, and 533 in 2012, 2013, and 2014, respectively. There were only two such player-seasons total from 2000-2010. Of note, since I first pulled my data back in March 2016, PFR has (mostly) cleaned this tool of these 0-game seasons for 2011, 2013, and 2014, but the 398 from 2012 are still visible -- they appear to have some sort of "NULL" value for games rather than a zero and were not caught by whatever cleaning PFR implemented. All these 0-game players, however, remain in the Encyclopedia.
This meant that, in the WSJ analysis of PFR Encyclopedia data, 2011 and later years had lower-quality players, on average, than previous years, and that the problem grew over time. Lower-quality players are going to have shorter careers. The trend the article found isn't because NFL players, as a whole, are having drastically shorter careers. It isn't even primarily that there are more lower-level players being given a (brief) shot at the NFL, though that may be happening to some degree. It's because beginning in 2011 PFR decided to start tracking more players at the bottom end of rosters who never played.
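As a minimal sketch of this composition effect, the Python snippet below uses made-up career lengths (not PFR or WSJ data) to show how the average falls once never-played entries are added, even though no individual career got shorter. The lists `played` and `zero_game` and the helper `average_career` are hypothetical names chosen for illustration.

```python
from statistics import mean

# Hypothetical career lengths (in years) for players who appeared
# in at least one regular season game. Not real data.
played = [1, 2, 3, 5, 6, 8, 10]

# Hypothetical entries for players who were tracked but never played;
# each is assumed to be listed with a single season.
zero_game = [1, 1, 1, 1]


def average_career(lengths):
    """Return the mean career length in years."""
    return mean(lengths)


avg_played_only = average_career(played)
avg_with_zero_game = average_career(played + zero_game)

# The second average is lower only because the population changed.
print(avg_played_only, avg_with_zero_game)
```

The players in `played` are identical in both calculations; only the definition of who counts as a "player" differs.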
I used a different method to mine only players who played in at least one regular season game -- a population with a simple definition that stays consistent over time and is also more appropriate for assessing NFL career lengths (see the Appendix at the end of this article for details). Our own Scott Kacsmar touched on the issue of including players who never played when calculating career lengths in his series on NFL draft production. With due respect to our colleague Ben Muth, I'm not sure I'd want to count his one training camp with the Chargers in calculating the average career length of NFL players (though, to be fair, that's one more training camp than I'll ever have).
In this consistent and more appropriate population you don't see anything near the cratering of career lengths reported in the WSJ. You could argue that career lengths in 2012 and 2013 were at their lowest since 2005, about half to three-quarters of a season lower than the average from 2007 to 2011. This might reflect the beginnings of a trend toward earlier retirements, but it's far from a rapid halving of career lengths. Looking at the chart more broadly, I'm inclined to think that the recent variation is just random and well within historical norms. The standard errors (basically, a metric for how precise our measurements are) bear this out, as well:
If you stratify by position, defensive linemen and running backs show larger recent declines (defensive linemen from 6.6 to 6.0 to 5.5 from 2011 to 2013; running backs from 6.5 to 4.5 to 4.6 over the same time), but no position shows the precipitous, sustained, linear decline shown in the WSJ article. In addition, retiring defensive backs and linebackers actually showed slight increases over this period.
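For readers unfamiliar with the standard errors mentioned above, here is a rough sketch of the usual formula: the sample standard deviation divided by the square root of the number of players. The values are invented for illustration, and `standard_error` is a hypothetical helper, not necessarily the exact calculation behind the chart.

```python
from math import sqrt
from statistics import stdev


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical career lengths (years) for a small group of retirees.
small_group = [2, 4, 6, 8]

# The same spread of values, but 25 times as many players.
large_group = small_group * 25

se_small = standard_error(small_group)
se_large = standard_error(large_group)

# More players with the same spread gives a smaller standard error.
print(se_small, se_large)
```

This is why the number of retiring players in a given year or position matters when judging whether a year-to-year change is meaningful: a small group's average can swing a lot by chance.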
Now, career lengths are tremendously right-skewed, so before my statistics professors yell at me that the averages tell us NOTHING, LEBOWSKI, here are some side-by-side boxplots showing the spread of the data. As a quick boxplot tutorial for the unfamiliar: the bottom and top "whisker" ends are the minimum and maximum career lengths; the bottom and top of the box are the 25th and 75th percentiles; and the middle of the box is the median. This means that the orange box is the 25th to 50th percentile of retiring player career lengths, while the gray box is for the 50th to 75th percentile of that same value. The pattern looks about the same -- median career lengths haven't changed since ticking up from four to five years in 2006 with the exception of a blip to six years in 2011. The 75th percentile did drop from nine to eight years in 2012, and the 25th percentile from three to two years in 2013, but those years' retirements really don't look out of whack historically:
So what can we conclude? Players who play -- the main population that we're interested in when it comes to career lengths -- were playing about as long as they used to through 2013. This may have changed over the last couple of years, but PFR does not have the retirement data to let us investigate that. The original WSJ analysis was flawed because it included large numbers of players who never played from 2011 onwards but not before, making average career length appear to shrink when we really were just looking at lower-quality players.
One other big lesson: this is a cautionary tale. It's really, really, really easy to make mistakes like this, even when you get data from a reputable source like PFR and go through a robust series of error checks like the WSJ author did. I would just caution everyone to dig deep and make sure your data don't have any strange patterns (like a massive rise in 0-game players). Always question, and always look for yourself. This exercise was especially terrifying for me because I could see myself making exactly the same mistake.
A couple last-minute notes: my numbers are higher than WSJ's across the board because, in addition to excluding players who never played, I defined career length inclusively (last year - first year + 1) instead of exclusively (last year - first year). These are in line with Scott's numbers in the draft series and those of the NFL from 2011. They are, of note, substantially higher than the roughly 3.5 years commonly quoted by the NFLPA, but that number is flawed because it reflects average experience at a cross-sectional point in time rather than the actual average length of a full career.
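The two definitions differ by exactly one year for every player, which the short Python sketch below makes explicit. The function names and the example years are hypothetical, chosen only to illustrate the formulas in the paragraph above.

```python
def career_length_inclusive(first_year, last_year):
    """Count every season from first to last, including both endpoints."""
    return last_year - first_year + 1


def career_length_exclusive(first_year, last_year):
    """Count only the gap between the first and last seasons."""
    return last_year - first_year


# Hypothetical player whose first season was 2008 and last was 2013.
inclusive = career_length_inclusive(2008, 2013)  # seasons 2008 through 2013
exclusive = career_length_exclusive(2008, 2013)

print(inclusive, exclusive)
```

Under the exclusive definition, a player whose first and last seasons are the same year gets a career length of zero, while the inclusive definition gives him one season.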
Also, I excluded 2014 because, in PFR data, a player's "last year" is listed as the last year in which they played a regular season game, even if they might still be active. So, for example, Jordy Nelson -- who tore his ACL and missed the entire 2015 season -- is right now listed as ending his career in 2014, which is likely false. This is likely to be a big problem for the 2014 season and a progressively smaller one as we go further back (since, for example, a still-active player would've had to miss the full 2014 and 2015 seasons due to injuries or other reasons to be erroneously listed as ending his career in 2013). I wouldn't be surprised if the modest downticks in 2012 and 2013 are partially driven by this or similar residual effects, though.
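To illustrate the problem, the sketch below assumes data that run through the 2015 season and flags any "last year" too close to the end of the data to be trusted as a retirement. The constant, the function names, and the player records are all hypothetical; this is not the query I actually ran.

```python
# Assumption: the most recent completed season in the data is 2015.
LAST_SEASON_IN_DATA = 2015


def apparent_last_year(seasons_played):
    """The 'last year' as the data would list it: the latest season with a game."""
    return max(seasons_played)


def is_reliable_retirement(last_year, buffer_seasons=1):
    """Trust a 'last year' only if more than buffer_seasons full seasons
    have passed since then without the player appearing again."""
    return last_year < LAST_SEASON_IN_DATA - buffer_seasons


# Hypothetical active player who missed all of 2015 with an injury.
injured_player = list(range(2008, 2015))  # played 2008 through 2014
# Hypothetical player who has not appeared since 2013.
retired_player = list(range(2005, 2014))  # played 2005 through 2013

print(is_reliable_retirement(apparent_last_year(injured_player)))
print(is_reliable_retirement(apparent_last_year(retired_player)))
```

With a one-season buffer, 2014 and 2015 are both excluded and 2013 is kept, which matches the cutoff used in this article. A larger buffer would be more conservative at the cost of dropping more recent years.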
Appendix: PFR Query Methods for Players Who Have Played in One or More Regular Season Career Games
First, I extracted data on drafted players from 1980 to 2015 who played at least one game in the NFL using the Draft Finder Query. Second, I supplemented that with data on undrafted players over the same period playing at least one regular season game using the Player Season Finder query.
Zach is a freelance injury analyst and a Ph.D. student in Epidemiology focusing on predictive modeling. He consults for an NFL team and loves Minor League Baseball. He lives in Atlanta. You can contact him on Twitter @zbinney_nflinj.
20 comments, Last at 31 Aug 2016, 2:15pm
#2 by billprudden // Aug 29, 2016 - 3:01pm
I'd chalk this up to the tremendous pressure news organizations, by every definition, are under to publish quickly and frequently these days. The rate at which even WSJ, NYT, and WaPo get so very much wrong is far greater than it used to be...
#3 by dmstorm22 // Aug 29, 2016 - 3:43pm
Forget that, people should sense check their results anyway. Despite concussion concerns and other anecdotal evidence, that in a five year span, careers would be cut in half should have set off serious alarm bells that something is up.
As the author of this post noted, that would have been such a serious problem it would have gotten far more traction within the NFLPA, probably even before the WSJ would write about it.
Really shoddy work, especially since the results screamed that there was some factor not being considered, and that it wasn't all too hard to find out what it was.
Good work by Zach to figure this out.
#5 by Zach Binney // Aug 29, 2016 - 8:06pm
Thanks for the comments, everyone! There are a few additional nuances to the story I didn't include because they involve private communications. I do want to say for the record that the author is an extremely smart guy and a capable analyst who did quite a bit of due diligence for this article, though in the end I do believe he and WSJ simply got it wrong.
The main reasons I wrote this article were 1. To set the record straight, and 2. To provide a cautionary tale of data analysis gone awry. On the first note, the original article got a ton of play on major sports outlets (SI, Deadspin, all kinds of sports radio, etc.) so while FO provides an amazing initial platform if you can get anyone elsewhere reading and commenting on this piece, we can help to broaden the correction. "NFL careers not actually dropping" isn't a tenth as sexy as "NFL careers cut in half," unfortunately. Thanks for reading!
#6 by TADontAsk // Aug 30, 2016 - 10:14am
It is a huge problem in the research field, that studies just don't get published unless there are statistically significant results. An article that shows nothing is happening can be just as important, yet it isn't going to get out there as easily.
#8 by travesty // Aug 30, 2016 - 7:37pm
I suppose an alternative analysis would be, for each year, consider, for each player who played in a game, how many seasons ago they first played in a game. If players were retiring earlier on average, you'd expect the average current experience to be decreasing from year-to-year.
#10 by Bright Blue Shorts // Aug 31, 2016 - 5:13am
The initial graphic is quite interesting in the order of career longevity we have ... OLs, DBs, QBs, LBs, DLs, TEs, RBs, WRs
I get that OLs are longest because generally they're very large people and relatively there just aren't that many very large people with talent in the whole of the United States.
And I can see that QBs will usually have lengthy careers (if they make it) because their skillset is the most complicated, so again relatively fewer contenders.
But DBs 2nd? Slightly surprised. I assume they generally suffer fewer injuries because on average they're involved in less contact so that helps with longevity.
#11 by Aaron Brooks G… // Aug 31, 2016 - 8:40am
If you look at the order as ranked by the initial year only, it makes more sense.
It just needs that observation that you lose speed before strength. Also, the skills guys tend to have a deeper pool of players relative to the number of starters (5 linemen have what, 2 backups total?) so they're more fungible. Makes more shorter careers, once you except the handful of mutants in the ranks.
#13 by Bright Blue Shorts // Aug 31, 2016 - 9:23am
I think I see what you mean ... that graphic is ranked by the 2014 data but you're suggesting look at year 2000?
For that time the order's QB, OL, DL & LB, DB, TE, RB & WR
Which is much more in line with what I'd expect.
Good spot, thanks.
#12 by brian30tw // Aug 31, 2016 - 9:07am
Great article and analysis, but I have one minor quibble that extends beyond just this article, so apologies for the rant. I don't know why people insist on talking about standard errors for their estimates when the dataset they are using is the entire population of interest, not a sample. There's no such thing as a standard error of average career length when you're examining the entire population of careers. You're computing the true average, not an estimate of the average, so there's no sampling error involved.
#14 by Zach Binney // Aug 31, 2016 - 10:37am
This is actually a really interesting point that brings us to a sort of statistical philosophy discussion. I've talked about this a lot with several of my professors, and I think there's a case to be made on both sides. You're 100% right that we're measuring the whole population here and thus there is (EDIT: not) error from what we commonly think of as sampling. But another way to think of it is this is just one universe of the possible set of different football players. Or, put another (more realistic) way, different sets of players COULD have retired in each of these years, they just didn't.
Another example: say you're measuring lung cancer incidence in an asbestos factory. Say it's a static population of workers and you measure every worker in that factory. Is there random error in this measurement? Does it matter whether there are no other asbestos factories in the world (like there being only one NFL), or 3 others, or 1,000 others?
I like still including standard errors as it helps to quantify what variation we might expect year to year from measuring a finite number of players (even if it is the whole population of interest in our universe). Like, if I'm looking across Figure 1 and trying to decide whether a drop of 0.6 years from 2012 to 2013 is substantial, it's important to know whether I measured 10 players or 1,000 players, regardless of what percentage of the players I measured.
Then again, I measured 100% of the players, so there's no error from what we typically think of as sampling. So from a theoretical and conceptual perspective, maybe including standard errors doesn't make sense. Like I said, I can see both sides!
#17 by brian30tw // Aug 31, 2016 - 12:45pm
Agreed, it's very philosophical and not really clear one way or another. I don't mean to detract from the overall analysis here, which is great. Just little things like that catch my eye!
I definitely agree that the standard error, or whatever you want to call it in this case, provides context and helps to determine whether a change in career length is "meaningful."
I run into similar issues as an economist. If you want to run some sort of cross-country regression model, and there are "only" 200 countries in the world, you run into "small sample size" issues, but it's not really a sample, it's the whole universe of countries! It's not clear to me the "usual" considerations need to be worried about in cases like that.
#20 by Bright Blue Shorts // Aug 31, 2016 - 2:15pm
Morten Andersen only played 25 seasons but he hung around into 2008 hoping that a team would need him as an injury pickup.
As his Wikipedia states "In the 2008 season, Andersen did not receive a contract offer from any team, but waited until December 8 to officially retire. Had he played on or after December 6, he would have been the oldest NFL player to play, breaking George Blanda's record."
But I think the graph shows him as playing his last game in 2007.