01 Jun 2009
Our friend Judy Battista of the New York Times has a story on the report the NFL sent to owners about injury rates on a week-by-week basis.
Battista notes in her piece:
The study used data from injury reports from the 2003 season to the 2007 season — the number of players who missed games each week — to form a line graph with the intent of showing whether more players missed games as the season wore on. The graph indicated that the high point of players missing games with injuries — an average of little more than three players out with injuries per team — was in Week 10. The low point of the regular season was when an average of just over one player per team missed the Week 17 game — the final game of the regular season.
In measuring how many players miss games, the study did not take into account the importance of a game. If a player has a moderate injury in the early or middle part of the season, his team may rest him for two or three weeks. But if the injury occurs just before the start of the playoffs, the team may tell him they need him back on the field quickly.
I haven't read the report, so I can't speak to the veracity of the NFL's data, which was compiled from injury reports and, according to Mike Reiss, "information from team trainers". I know that in compiling our injury database, we use the same sources of data. Our data scrubs out players who never had a serious shot of playing for the team or weren't expected to play (e.g. Willis McGahee in 2003, Kenechi Udeze last year) to try to get a measure of how teams are affected by injury in a given season.
Quickly, though, I discovered how the NFL had likely laid out its report and presented its data. I took the data from 2003 through 2007 and calculated, for each week, the number of players per game who were listed on the injury report and did not play. (We'll get to the injury report excuse in a second, but in general, the only guys who don't get listed on the injury report and then don't actually play are usually the third-string quarterbacks.) I didn't include players listed on IR or PUP, since the data was compiled from injury reports and those reports don't list guys who aren't on the active roster (IR and PUP are separate lists).
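The tally described above can be sketched roughly as follows. This is a minimal Python sketch, not the actual Football Outsiders code: the record layout (`season`, `week`, `team`, `listed`, `played`) and the function name are assumptions made for illustration, and "per game" is taken here to mean per team-game.

```python
from collections import defaultdict


def avg_players_out_by_week(records):
    """Average number of listed-but-inactive players per team-game, by week.

    `records` is a hypothetical layout: one dict per active-roster player per
    game, with keys season, week, team, listed (on the injury report) and
    played. Players on IR or PUP are assumed to be absent from `records`.
    """
    players_out = defaultdict(int)   # week -> players listed who did not play
    team_games = defaultdict(set)    # week -> distinct (season, team) pairs
    for r in records:
        team_games[r["week"]].add((r["season"], r["team"]))
        if r["listed"] and not r["played"]:
            players_out[r["week"]] += 1
    return {
        week: players_out[week] / len(team_games[week])
        for week in sorted(team_games)
    }
```

Plotting the returned values against week number would give the kind of line graph the NFL's report apparently used.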
Most of the results matched up well with the NFL's data; the line peaked right around Week 10, with a drop-off right after that until Week 17, when it peaked again, owing to an increase in games missed by players on teams with nothing to play for with regard to playoff positioning.
Of course, the IR and PUP lists do matter -- those players are just as unavailable to the team as a player listed on the active roster and out, albeit whilst not taking up a roster spot in the process. If we include them in the analysis, the data looks totally different, and in a bad way for the NFL:
|2003-2007||w/o IR, PUP||w/ IR, PUP||IR/PUP Totals|
When you include the absence of players on either IR or PUP, teams suffer steadily more injuries as the season goes along. Perhaps not coincidentally, players need to be taken off of PUP by Week 10, or else they get placed onto IR. When the considerations of those two lists are factored in, there's nothing to suggest that the actual health of players hits its nadir at Week 10. It does so at Week 17, after experiencing a steady climb throughout the season.
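To see why adding IR changes the shape of the curve, here is a small Python sketch. It assumes that a player placed on IR stays there for the rest of the season, and the placement weeks in the usage note are made up for illustration, not real data. PUP is left out to keep the sketch short.

```python
def players_on_ir_by_week(placement_weeks, n_weeks=17):
    """Count players on IR in each week, given the week each was placed there.

    Assumes (for this sketch) that nobody comes off IR during the season, so
    the count can only stay flat or rise.
    """
    return [
        sum(1 for placed in placement_weeks if placed <= week)
        for week in range(1, n_weeks + 1)
    ]
```

With hypothetical placements in Weeks 1, 3, 3 and 10, `players_on_ir_by_week([1, 3, 3, 10])` starts at 1, 1, 3, 3 and reaches 4 from Week 10 onward. It never falls, which is the cumulative pattern described above.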
The issue of whether players are more likely to play in "important games" is harder to decipher. The biggest problem, of course, is defining what an "important game" is -- is it a game with direct playoff implications? A divisional game? A game in the second half of the season? Does a team at 0-2 consider its Week 3 game a "must-win," and do we have to include that? Realistically, in the NFL, there are only 16 games; every one is important. Considering that the number of players missing rises as the season goes along, even as the games they miss become more important, I'm not inclined to believe that players are more likely to play in "important" games, under any definition.
Reiss raises the issue of whether the injury report is a reliable source of data, which Mike Florio also discusses in his piece at Pro Football Talk. It's something we've discussed in the media before, with Alan Schwarz in a December New York Times piece that pointed out Eric Mangini's reluctance to use "hamstring" as an item on the injury report.
Now, at Football Outsiders, one of the things I've developed is AGL (Adjusted Games Lost) -- a statistic which calculates the effects of injury on a team based, primarily, upon that very same injury report. Looking at the history of teams in the past, we calculate the likelihood of a player participating in a given game based upon his status as listed on the injury report and his relative role on the team (e.g. is he a starter, a situational player, or a reserve).
It's exactly that perspective that needs to be taken into account when analyzing the injury report. If you truly believe that players listed as Probable are going to play 75% of the time, Questionable 50% of the time, and Doubtful 25% of the time, well, you haven't been paying attention. If you actually look at the historical likelihood of players with a given injury status and team role playing, the percentages are totally different.
In reality, when you look at the data for starting-caliber players (no one is saying there's going to be a betting scandal because the Jets listed Tim Dwight with an ankle injury instead of a hamstring injury), Probable players make it to the lineup 19 out of every 20 times, while Doubtful guys show up about one out of every 10 games.
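A rough way to check those rates against history is sketched below in Python. The tuple layout `(status, role, played)` and the function name are assumptions for illustration, and this is not the actual AGL calculation, only the kind of lookup it builds on.

```python
from collections import defaultdict


def play_rates(history):
    """Share of listed players who actually played, per (status, role).

    `history` is a hypothetical iterable of (status, role, played) tuples,
    e.g. ("Probable", "starter", True), one per player per game.
    """
    played = defaultdict(int)
    total = defaultdict(int)
    for status, role, did_play in history:
        key = (status, role)
        total[key] += 1
        played[key] += int(did_play)
    return {key: played[key] / total[key] for key in total}
```

Fed 20 hypothetical Probable starters of whom 19 played, it returns 0.95 for `("Probable", "starter")`, which is the 19-out-of-20 figure above expressed as a rate.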
Since we've found that AGL has a significant correlation with the change in a team's performance from year to year, we build that methodology into our team predictions each year. If the injury report really bore no resemblance to reality, merely looking at the number of games missed by a team's players (or strictly a team's starters) in a given year would have a relationship with team performance similar or superior to that of AGL.
In reality, the difference is virtually nonexistent. When looking strictly at starters, AGL has a .32 correlation with wins in a given season, while games missed by starters has a correlation of .30. (If you include reserves, those figures fall to .19 and .16, respectively.) In other words, Reiss and Florio have a point -- the injury report is, after the fact, about as useful as a binary report indicating "played" and "did not play".
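For readers who want to run the same kind of comparison on their own numbers, here is a minimal Python sketch of a Pearson correlation. The function name and inputs are illustrative assumptions; the figures above are quoted without naming the correlation measure used, so treat Pearson as a stand-in.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

One would call it once with per-team AGL and wins, and once with per-team games missed and wins, then compare the two results. Any list names used for those inputs are hypothetical.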
As for the 18-game schedule, though? Regardless of how information is reported or how the NFL spins their data, players are more likely to be injured and miss time in an 18-game season than they are in a 16-game season. The effects of injuries don't rise or fall at an artificial point in time of the season; they are cumulative, and will only continue to accumulate with each additional regular season week added to the schedule.