25 Sep 2015

NFL Injuries Part I: Overall View

by Zach Binney

Introduction and Motivation

When I first sat down to analyze injuries in the NFL a couple years ago, my goal was to immediately build a fancy model that would let teams see into the future, predicting exactly who would get hurt and when. It would refine the use of the broad "injury prone" tag, letting smart teams (read: those who would listen to me) find great value on guys who had been written off by other teams as made of Gatorade-soaked tissue paper. It would take into account player-level clustering, team-level effects, and time trends. It would tell us in a single color-coded number who was going to go down, allowing teams to approach free agency and the draft with robust historical data rather than the subjective opinions of (best case) a training staff and team physician or (worst case) a general manager with no medical training.

What I quickly realized was that most people can't quote the overall risk of an injury in the NFL. How many players is a team likely to lose in any given year? They had no idea. They certainly couldn't break it down by position. They might have a vague sense that injury risk rises with age, but how much? They probably know (if they read the Adjusted Games Lost column) that injuries have increased over time, but they don't know what sorts of injuries are driving that rise. They might have a sense that ACL tears are bad, and if they pay a bit more attention they might know to tear their hair out when their team announces any injury with the word "triceps." But how nervous should they really be about a hamstring, or a high ankle sprain?

My background is in epidemiology -- most people think we're skin doctors, and the rest imagine us as the "virus hunters" who go out into West Africa and search for Ebola. To be clear, I am way, way too cowardly for that. At a high level, epidemiologists want to do two things:

1. Describe the distribution of diseases (for example, injuries) in a population (football players) and,

2. When we see differences within populations (variation by position or team or year), ask and analyze why these differences exist.

On my first day I was gunning to ask and analyze why injuries happen -- and, by extension, predict them -- but skipped right over the describing injuries phase, mostly because I figured that had already been done. It really hasn't.

Describing injuries seems basic, and it is. This post is going to be nothing but simple counts and averages, but it's incredibly important work, and I hope it's interesting as well. It's exciting to have a deeper understanding of what's likely to happen when your star linebacker -- or your opponent's stud left tackle -- goes down with a "knee" or "ankle" injury. Also, with the (justified) focus on concussions lately, the attention on the toll professional football takes on the human body has never been higher. It's a great time to get smarter about that toll. So, read on… or not, I've already got your page view.

One more thing: none of the below would have been possible without FO's foresight in collecting detailed injury data for the past 15 seasons. It's a basic point, but in analytics we're always limited in the questions we can answer by the data that we have. FO has some amazing data, and the years of interns and data managers who collected it are the unsung heroes of this piece. It's good to remember that.

(Ed. Note: I should point out that we didn't start collecting this data until 2007 or so; we then went back and filled in past data going back to 2000. Proper gratitude goes to Bill Barnwell, who came up with the idea of starting the injury database in the first place back when he worked for Football Outsiders, and did a lot of the initial work to go back and fill in those older years. Since he left for Grantland, the injury collection has been managed first by Danny Tuccitto and then by Scott Kacsmar. -- Aaron Schatz)

Disclaimers

This data was gathered primarily from injury reports. Scott Kacsmar and others have written about this in AGL columns and elsewhere, but we rely on teams to truthfully and correctly report injuries. We know there's a lot of variation in how teams report -- both whether they even put a guy on the list, and the level of detail they're willing to give out (a true MCL sprain could be a "Knee," "Knee Sprain," "Knee -- MCL," or "Knee -- MCL sprain" in our database) -- so all these analyses are contingent on that data being valid. Garbage in, garbage out.

Additionally, there are some player injuries not included in these statistics because the players never played an NFL snap or were simply deemed too irrelevant. For example, an undrafted rookie who sprains his knee in training camp and gets cut from the 75-man roster isn't counted in these analyses. Thus, our total number of injuries is probably an underestimate of the true number of injuries and the toll professional football takes on the human body.

Finally, the week that an injury "occurred" is the first week the player is listed on the injury report. So if a player appears with a groin injury in Week 4 in our dataset, we do not know exactly when he suffered that injury: it could have been in the Week 3 game or in practices leading up to Week 4. We can only pinpoint injury timing to within about a week (two weeks if the player was on bye), and we can't tell you at all the situation in which it occurred.

For my money, while I admit this data is far from perfect, I think it captures the 80/20 and gets us pretty close to where we need to be.

Organizing and Categorizing NFL Injuries

Now that that's out of the way, how can we think about injuries in the NFL? With something as complex as injuries, one of the first hurdles you hit is how to even organize the data so we can get information and insights out of it. Even after data cleaning, there were almost 380 different injury types in our database. That's not pretty to look at. Mimicking injury reports, I've wrapped these up into higher-level locations in the body: head, foot, wrist, etc. For some of the most common injury locations -- knees, ankles, and shoulders among them -- I've made a few additional splits. I also made a handful of additional on-the-fly splits (for example, separating out broken legs and arms from other leg and arm injuries). These categorizations are subjective, but I've tried to strike a balance between using enough detail to capture subtle variations in injury types while not creating too many categories to visualize or cutting down to sample sizes that are too small for drawing reasonable conclusions. My final scheme includes 50 categories, listed below in Figure 2. (Fifty is still a lot of categories, so Figure 2 actually has two parts.)

If any experts out there have a problem with my categories, it's easy enough to change! This is just a start. Even if we had perfect categories, though, there would still be some significant misclassification of injuries due to variations in how teams put players on the injury report and the amount of information they give out.
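To make the roll-up concrete, here is a minimal sketch of how raw injury-report labels might be collapsed into coarser categories. The function name, the rules, and the category names are all hypothetical; they only mirror the knee splits described in this article and are not the actual scheme used on the FO database.

```python
# Hypothetical roll-up rules: the names and groupings below are illustrative
# assumptions, not the categories actually used in the FO injury database.
def categorize(raw_label: str) -> str:
    """Map a raw injury-report label to a coarser category."""
    text = raw_label.lower()
    # Keep only the body location, e.g. "Knee -- MCL sprain" -> "knee".
    location = text.split("--")[0].strip()
    if location.startswith("knee"):
        if "acl" in text:
            return "Knee - ACL"
        if "tear" in text:
            return "Knee - tear (non-ACL)"
        return "Knee - other"
    # Anything without a special rule falls back to its body location.
    return location.title() if location else "Unknown"


# The four ways a true MCL sprain might appear on an injury report.
labels = ["Knee", "Knee Sprain", "Knee -- MCL", "Knee -- MCL sprain"]
categories = [categorize(label) for label in labels]
```

Under these assumed rules all four MCL spellings land in the same general knee bucket, which is the point of the exercise: the categories absorb reporting differences between teams, at the cost of losing detail some teams did provide.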

Distribution of Weeks Missed due to Injury

A good first question for our data is: How many regular-season weeks do NFL players miss due to injury every year? (Note that we're ignoring playoffs in all these analyses.) Let's look at Figure 1 below. The number of player-seasons with one or more weeks missed comes from the FO injury database.

The first thing that jumps out of this chart is that a majority of players (61 percent) won't miss any time in a given year. Another thing that jumps out is that this data is heavily right-skewed (that is, there are way more players who miss little or no time than who miss extended periods). Sneak preview: this relationship holds pretty much no matter how you cut the data by age or position. Not rocket science, right? But this has important implications when trying to predict injuries:

  • Trying to give a projection for the average (mean) number of weeks a player will miss is pretty much meaningless with this kind of distribution since the data is horribly asymmetric. Plus, have you ever seen a coach's or trainer's eyes glaze over when you tell him his guard is likely to miss 0.73 weeks? It makes no sense.
  • WARNING: I am still going to use average weeks missed to compare the severity of different injury types. The average is still useful for that, don't yell at me (as previously noted, I'm a coward). I'm just saying it's not useful for projecting how much time a player is likely to miss.
  • If we try to use the median instead -- a typical approach when dealing with skewed data -- for the vast majority of players we'd say their median weeks missed is going to be 0 since way over half of similar players missed no time. But does that mean they're very unlikely to miss any time? Not at all. They might have a one-third chance of missing significant time, but because of this distribution neither the average nor median will capture that.
  • A better statistic might instead be the risk of missing any time vs. no time, or the risk of missing more than four weeks, vs. one to four weeks, vs. no weeks. That does a good job of dealing with the skewed data while providing useful projections.
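The points above can be sketched in a few lines of code. The sample below is entirely made up for illustration (it is not FO data); only the share of zeros is chosen to match the 61 percent figure, and the function name and bucket labels are my own assumptions.

```python
from statistics import mean, median


def summarize(weeks_missed):
    """Return the mean, median, and risk buckets for a list of weeks missed."""
    n = len(weeks_missed)
    return {
        "mean": mean(weeks_missed),
        "median": median(weeks_missed),
        "risk_none": sum(w == 0 for w in weeks_missed) / n,
        "risk_1_to_4": sum(1 <= w <= 4 for w in weeks_missed) / n,
        "risk_over_4": sum(w > 4 for w in weeks_missed) / n,
    }


# Made-up sample of 100 player-seasons, NOT real FO data. Only the share of
# zeros (61 of 100) is chosen to match the figure quoted in the article.
sample = [0] * 61 + [1] * 12 + [2] * 8 + [3] * 5 + [4] * 4 + [8] * 5 + [16] * 5
summary = summarize(sample)
```

In this toy sample the median is 0 and the mean is a fractional number of weeks that no player actually misses, while the three risk buckets say directly how likely a player is to miss no time, a little time, or a lot of time.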

Toll of Injuries -- Total Weeks Missed and Injury Severity

From 2000 to 2014 (15 seasons), 30,186 injury reports have been filed, leading to 51,596 regular-season weeks missed, an average of 1.71 weeks missed per injury. That's a pretty hefty toll!

I made a big deal about categories above, so let's look at which kinds of injuries cause the most damage in the NFL regular season. There are two dimensions to this question: damage at the population level (total weeks missed), and the individual level (injury severity).

Figure 2 is a busy chart -- actually, two of them -- but it conveys a ton of information. First a note on how to read it: the blue bars are the total player-weeks missed over the last 15 years for each injury type on the x-axis. The red bars are the average weeks a player was out with each type of injury (i.e., the severity). If you want to calculate the number of injuries of a given type that were sustained since 2000, you can divide the blue column by the red column. For example, ACLs led to about 4,800 weeks missed, at about 10.5 weeks per injury -- about 4,800 weeks/10.5 weeks/injury = 450 ACL injuries have occurred since 2000.
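For anyone who wants to check the arithmetic, here is a small sketch using only the rounded figures quoted in the text. The function names are hypothetical, and the ACL inputs are eyeballed from the chart, so the result is approximate.

```python
def average_severity(total_weeks_missed, injury_count):
    """Average weeks missed per injury (the red bars)."""
    return total_weeks_missed / injury_count


def implied_injury_count(total_weeks_missed, average_weeks_per_injury):
    """Number of injuries implied by dividing the blue bar by the red bar."""
    return total_weeks_missed / average_weeks_per_injury


# All injuries, 2000 to 2014: 51,596 weeks missed across 30,186 injury reports.
overall = average_severity(51_596, 30_186)

# ACL injuries, using the approximate values read off the chart.
acl_count = implied_injury_count(4_800, 10.5)
```

The first value rounds to 1.71 weeks per injury. The second comes out near 457, which is in line with "about 450" given that both inputs are rounded.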

Now, what can we learn from this chart?

Population level damage (Total weeks missed):

  • At the population level (all NFL players), general knee injuries (non-ACLs or other tears) do the most damage, by virtue of their sheer frequency (about 4,500 since the 2000 season). They have cost players almost 7,600 weeks over the last 15 years. Although their severity is about average (1.7 weeks missed per injury), they happen so often their damage builds up quickly. No surprise here.
  • Some other common culprits round out the top 10 in all-player damage: ACL, hamstring, shoulder (non-tears), ankle (non-breaks or sprains), foot (non-breaks or Lisfranc injuries), groin, and Achilles injuries.
  • I was a little surprised to see back injuries at No. 8. I didn't think of them as that common until looking at the data.
  • Coming in at No. 10? Concussions. Yes, even Marcia Brady knows they are a pretty big deal.
  • Most of the top 10 cause so much damage due to their frequency, but ACL and Achilles injuries appear high up due to a mix of their frequency and severity (about 10.5 weeks out for an ACL, about 7.0 for an Achilles).

We all know ACL tears are season-enders, which brings up an important point: "severity" here is not necessarily a return time estimate. ACL tears occurring in Week 16 are coded as two weeks missed in this analysis, while an ACL tear in training camp is 16 weeks. It might be interesting to re-do this chart with severity as the percent of time left in the season the player missed, but we'll save that for another time. Figure 2 still provides a good comparison of the average seriousness of injuries, since (with some exceptions) injuries don't grow more or less common relative to each other as the season progresses.
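As a rough sketch of that alternative severity measure, one could divide weeks missed by the number of games the team still had left when the player first appeared on the injury report. The function and parameter names here are my own assumptions, and this is not a calculation that was run on the FO data.

```python
def share_of_remaining_season_missed(weeks_missed, games_remaining):
    """Weeks missed as a fraction of the games the team still had to play,
    counting from the first week the player appeared on the injury report."""
    if games_remaining <= 0:
        raise ValueError("games_remaining must be positive")
    if not 0 <= weeks_missed <= games_remaining:
        raise ValueError("weeks_missed must be between 0 and games_remaining")
    return weeks_missed / games_remaining


# An ACL tear first listed in Week 16 (two games left) and one suffered in
# training camp (all 16 games left) both cost the entire remaining season.
late_acl = share_of_remaining_season_missed(2, 2)
camp_acl = share_of_remaining_season_missed(16, 16)

# A one-week absence at the start of the season is a small share.
minor = share_of_remaining_season_missed(1, 16)
```

On this scale the two ACL tears look identical even though they are coded as 2 and 16 weeks missed, which is the distortion the percent-of-season idea is meant to remove.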

All this is a nice segue into…

Individual-level damage (Severity):

If I read about one of my team's key players on the injury report, what words should make my heart sink? Let's consult Figure 2, shall we?

  • There are some no-brainers here: ACLs, spine injuries, and "the dreaded" (I always see it prefixed with these two words) Lisfranc injury in the foot. All result in an average of nine to 11 weeks out.
  • Other things with the word "tear" in them generally aren't good: knee (non-ACL) and shoulder tears (labrum, rotator cuff) mean on average six to nine weeks missed (again, not a return timetable).
  • Speaking of tears: "Pectoral," "triceps," and "biceps" are very bad words to see on an injury report. They often, but not always, refer to a tear in one of these muscles, which at least in the case of triceps and biceps often scratches a player for the rest of the year. Luckily, these injuries aren't more frequent: there have been about 100 of each since 2000.
  • Even less frequent but hugely concerning are heart problems -- clots or arrhythmias, most frequently. If you pop up with one of those you're probably toast, but fortunately we have only logged 10 such injury-report appearances in the last 15 years.
  • Anybody who has broken a bone can tell you that fractures are not good: foot, leg, or arm fractures in particular are going to sideline you for a while, leading to an average of six to eight weeks missed.
  • Achilles injuries are typically nasty business, too, as noted above.
  • On the slightly-less-severe end, high ankle sprains lead to an average of 3.5 weeks missed. That's much worse than regular old ankle sprains, which clock in at about 1 week missed on average.

One thing that seems strange to me is what on earth "Unknown" is doing up so high here. I wish I had more for you, but as the category name implies, we simply don't know much about these injuries. My suspicion is that most of these represent lingering offseason or prior-year problems that stuck with players all year and often caused them to miss significant time or even get placed on IR during training camp or preseason for "undisclosed" reasons. Fortunately, of the thousands of injuries in our database we only have 77 unknowns, a testament to the great work of the interns and data managers over the last 15 seasons.

Next Steps and Comments

We have looked at some very basic data about NFL injuries. In epidemiology, after we take an overall look at a topic, we like to describe how the data varies in different dimensions. One classic framework is to inspect variation by person, place, and time. That seems good, so let's roll with it. In future articles, we'll take a look at how injuries vary by time, and then by person (position and age). Place is already partially covered by the AGL columns, so we'll leave that be for now.

There's a ton of data here, and there are a zillion ways to cut and inspect it. In the subsequent articles to be run the next few Fridays, we’ll be publishing information split up by calendar year, age, position, and week of season. In the comments below, though, I'd love to hear the readers' thoughts on the next direction.

Zach Binney is a freelance injury analyst and a Ph.D. student in epidemiology focusing on predictive modeling. He consults for an NFL team and loves Minor League Baseball. He lives in Atlanta.

Posted by: Zach Binney on 25 Sep 2015

23 comments, Last at 29 Sep 2015, 1:38pm by Eddo

Comments

1
by Parmenides :: Fri, 09/25/2015 - 3:51pm

Good stuff, this will be fascinating.

Will there be any work on team- and turf-specific injuries? I don't know if turf information is tagged. But it would be interesting to know which field each injury was suffered on, either to compare artificial and natural turf or to see if particular stadiums have some statistical relationship to injuries. Also, team-specific or coaching-specific breakdowns could be interesting. I don't think there would be any way to tag the data to specific trainers, as the work necessary would be prohibitive, but that could also be interesting.

7
by Zach Binney :: Sat, 09/26/2015 - 10:11am

I would definitely like to look at different turf types - that's something we should be able to link in with a little extra work. One problem I can foresee is that we can't isolate whether injuries occurred in games or practices, so if teams practice on different surface types than they play on that could bias things a bit.

The AGL column has dealt with team-specific injury rates (adjusted for strength of the player), but we can certainly look later at just crude injury rates without those adjustments. Coaching staff-specific would be really great to get at, too, but it would require some more work and I'm not sure immediately how to define it other than just by head coach.

Thanks for the great thoughts!

9
by jtr :: Sat, 09/26/2015 - 10:52am

Team- or coaching-correlated stats would certainly be interesting. One thing that comes to mind is the Giants' insanely bad injury luck the past few years--are there any particular injuries that occur more often for them that perhaps other training staffs do a better job of preventing? I would also be curious to see if Bruce Arians' policy of abolishing static stretching has increased his team's rates of muscle injuries.
Field could also be an interesting one to look at. If your data includes the home team when the injury occurs, you could look at the effect of the lousy surfaces of Houston, Washington, and Pittsburgh on injury rates--off the top of my head, ankle and knee injuries seem likely to increase on a bad surface. It's a small sample, but you could also look at the London games, which have drawn complaints about the surface.
Another idea that just occurred to me is to see if teams suffer more injuries when they're several time zones away from home. It's well established that west coast teams perform worse in east coast 1 pm games. Perhaps injury rates increase as well?

It may have been a bad idea to solicit ideas from the FO comments section, we'll come up with enough ideas to work you ragged!

2
by nickbradley :: Fri, 09/25/2015 - 4:53pm

Great Job Zach!

3
by Jerry :: Fri, 09/25/2015 - 5:59pm

Thanks for doing this, Zach. It looks fascinating.

It's been a decade since I read this, so I don't know how useful it is, but a one-time poster here named Carl Prine wrote up some research back then:

http://www.footballoutsiders.com/extra-points/2005/bloody-sundays

4
by crosseyedlemon :: Fri, 09/25/2015 - 6:19pm

This is certainly a topic that warrants study. Weeks missed due to injury would depend on several factors not discussed here, such as type of treatment, frequency and duration of treatment sessions, and drugs used in treatment. First-time vs. repeated occurrences of the injury would be an influential factor as well.
The fact that 61% of players are likely to miss no time in a sport that continues to evolve into a faster and more physical game is pretty encouraging and with continued education about injuries and player safety along with advances in medicine in general I think that percentage is likely to increase.

5
by jtr :: Fri, 09/25/2015 - 7:40pm

One thing I would be curious to see is the "nagging" factor for injuries. Hamstring injuries were computed here to average only a 1 week absence, but at least anecdotally they have a reputation for slowing a player down for weeks after his return to the field. I would be curious to see if there is a significant dropoff in a player's DYAR after his return from that injury, and how long that dropoff sticks around. This could be compared between the positions we have DYAR for--we might expect it to affect a WR more than a QB, for instance.
It would also be interesting to compare with other injuries--how well does a guy play after coming back from a brief absence with a sprained ankle, for instance?

6
by PirateFreedom :: Fri, 09/25/2015 - 11:04pm

That's a great idea. Anecdotally I hear things like players being better in the second year back after returning from an ACL injury. I wonder if various types of injuries do have characteristic periods of reduced effectiveness even after a player is back on the field and how proportional such a period would be to time missed or if some injuries required a predictable amount of time for full recovery even if some players return earlier than others.

8
by Zach Binney :: Sat, 09/26/2015 - 10:12am

One of the first things I tried to do with injuries was a sort of "Nagging Index." I'd love to see if I can do that again (better) with FO's data, so it's certainly something I'll look into.

10
by ChrisLong :: Sat, 09/26/2015 - 1:04pm

My logic:
-ACL injuries can be assumed to end a person's season.
-The average games missed from an ACL injury is between 10 and 11.
-Therefore, ACL injuries are more likely to occur towards the beginning of the season.

"Knee-ACL" also likely includes some injuries that aren't full tears, so it might be even more skewed towards the beginning of the season.

I've seen some articles suggesting that NFL offseason conditioning programs might be leading to fatigue early in the season and in the preseason, leading to more severe injuries.

13
by Dan :: Sat, 09/26/2015 - 10:52pm

Preseason ACL injuries (like Jordy Nelson's) will lead to 16 missed games. So we can't tell from this data if week 1 ACL injuries are more common than week 14 ACL injuries.

16
by Scott C :: Sun, 09/27/2015 - 1:07pm

(16 + 1) / 2 = 8.5

Any season-ending injury with an average time lost longer than 8.5 games cannot be evenly distributed through the year.

Although, we may have to adjust for the extra time in the preseason before game 1, and inspect / analyze the start-of-season data more.

17
by Zach Binney :: Sun, 09/27/2015 - 2:57pm

I was gonna address this next week, but any ACL in the offseason, training camp, or preseason is 16 weeks missed. So we can't do the calculations some are proposing to determine whether the distribution of ACLs is non-uniform over the season. I will, however, be happy to show the actual distribution in next week's post!

23
by Eddo :: Tue, 09/29/2015 - 1:38pm

That's true, but you could throw out the 16-missed-week data point and still see if 15>14>13>12... to try to determine if it's more likely for ACL injuries to occur earlier in the year - or at least determine they aren't uniform.

11
by theslothook :: Sat, 09/26/2015 - 1:38pm

Terrific article. One of the first thoughts I've had about injuries was about their distribution over time. Namely, do injuries by and large occur in the beginning of the year and then slowly peter out over time (i.e., following a kind of AR(1), corrected for time trend)? Are there repeating patterns over time?

14
by Zach Binney :: Sun, 09/27/2015 - 7:38am

Dan is exactly right re: ACL injuries and their distribution over time.

If these are some of the first thoughts you've had about injuries, boy are you gonna love Part II next Friday. This is exactly what we're gonna look at.

12
by Grendel13G :: Sat, 09/26/2015 - 4:54pm

I love this kind of thoughtful, robust analysis. Thanks for the hard work, and thanks for sharing!

15
by Tomlin_Is_Infallible :: Sun, 09/27/2015 - 11:01am

1) for the love of all that is important in life, please ditch Excel

2) it's not readily clear how the data in Fig1 and Fig2 are not in direct conflict

3) it would be interesting to see how the claim of 61% breaks down on a per-player basis (the Sam Bradfords vs the never-hurts, etc.)

--------------------------------------
The standard is the standard!

18
by bush-did-wine-i... :: Sun, 09/27/2015 - 4:05pm

Fascinating subject, and hard to believe it's taken so long to compile and analyze this data. Any indication on whether more teams than the one you're working for have taken a serious look at their injury trends and adjusted accordingly?

19
by Willsy :: Sun, 09/27/2015 - 9:59pm

Zach,

Great work, really interesting.
At a school I know in Australia they have commenced a multi-year project to look at the potential causality between injury types and the body type of the individual.
The sport is netball, which is akin to basketball and involves a lot of vertical leaping, which leads to lots of knee injuries. The long-held tenet was that girls have weaker adductor and glute muscles than boys and that this is why they suffer a lot of knee injuries. However, I always thought that this was correlation, not causation: if you leap up and down a lot, you are more likely to hurt your knee than someone who doesn't. So the analysis has done several things.
1. Measured everything imaginable about each girl so that there was a really good way to see if body type (tall and gangly versus broad and thick-set) or physical condition (weaker or less flexible muscles) were the dominant factors in injury and, if they were, which one dominated.
2. What range of activities that could minimise injury exist?
3. Tried to identify if any position is more liable to generate injuries.

The end goal is to establish if the sport really does generate a high number of knee injuries and try and define what a high number means. Assuming that it does then see what factors are contributing to the injuries and if any preventable processes are working to reduce the number and severity of them.
If an NFL team isn't doing this type of analysis and conducting it over a multi year period then I am amazed.
On the preventative side, one area of paramount focus is this: if the playing group had a proclivity for X number of knee injuries for a given number of games played, and some form of training reduced the number to Y, then what was the optimal amount of that training, should it occur all year, and at what age should the girls start doing it?
This is fascinating stuff.
Thanks,
Willsy

21
by Parmenides :: Mon, 09/28/2015 - 3:16pm

My understanding for women is that the wider pelvis creates a running motion that puts a lot more stress on the ACL. Or at least that was the explanation given after my sister tore both of her ACLs playing ultimate Frisbee.

20
by leviramsey :: Mon, 09/28/2015 - 9:59am

The first graph seems to indicate a Poisson-ish distribution for weeks missed per player-season, at least among those who miss at least 1 week.

  1. 2795
  2. 1855 (66% of 1)
  3. 1183 (64% of 2)
  4. 890 (75% of 3)
  5. 568 (64% of 4)
  6. 483 (85% of 5)
  7. 322 (67% of 6)

Past 7 weeks, the Poisson-ishness fades away. Some of this is accounted for by an observation that as weeks missed increase, there's downward pressure on weeks missed: bye weeks and then the post-/off-/pre-season (which in some sense combine to form 35 bye weeks for this data set) won't get counted as weeks missed: as the number of weeks missed increase, the chances of catching a bye week or something not in the regular season increase.

Also: 10502 players missed 1-16 weeks, so about 60% of players miss zero weeks, which is close to the 65%-ish week-to-week factor.

22
by MEVK :: Mon, 09/28/2015 - 7:17pm

The estimate of injury severity would be more accurate if you based it only on injuries reported in, say, the first 4 weeks. (Obviously, you would do this only for the severity measure and would want to use the full season to estimate total games missed.) As it is, the range of severity is artificially constrained because the most severe injuries that occur later in the season are considered artificially mild because there are fewer games to be missed. In contrast, the severity of minor injuries that rarely cost more than a game or two are less likely to be biased by occurring late in the season.

Constraining the time-frame will also dramatically reduce your sample size and add a lot of noise in your estimates for less common injuries. I wonder if you could use the most common injuries to estimate a correction factor for the true games lost? You could then apply that correction factor to present more accurate estimates. Come to think of it, the correction factor might be something one could work out purely mathematically if you assumed equal injury probability across the whole season.