Oddities of 2009
by Aaron Schatz
It's no secret that 2009 was not the best year for projections around here. All around the Internet, I'm guessing that fans on various message boards will ask if they should buy Football Outsiders Almanac 2010 based on last year's projections, which is a problem for us since a) we would really like people to buy the book because it is interesting, informative, and funny, not because they think they will find perfect projections, and b) we would very much prefer to be judged on our projections from 2007 and 2008, thank you very much.
However, let's be honest, saying "well, gee, everyone has a bad year, cut us some slack" sounds pretty damn whiny. It's probably a lot better to go look at how teams changed (and did not change) between 2008 and 2009 to see if that can teach us any lessons as to what went wrong with last year's projections.
With this in mind, I did go and look at 2009 compared to other years. What I found was somewhat odd, and I write about it in the introduction to this year's book:
Certainly, very few people went into last season expecting New Orleans to emerge holding the Lombardi Trophy. However, everybody knew that the Saints had a powerful offense, and that ties into the strange trend that defined the 2009 season. We're all so used to the NFL standings changing from year to year that it was hard to notice that the average team's change in performance was only about half the size it usually is.
The most dramatic issue was offense. From 2002 through 2008, only 15 teams per season came within 50 points of their total points scored from the previous year. Last year, 23 teams scored within 50 points of their total from the previous year.
Want a more extreme example? From 2002 through 2008, 6.2 teams per year saw their total of points scored either rise or fall by more than 100 points. Last year, only two teams had points scored change by over 100 points: Cincinnati, which scored 101 more points, and Tampa Bay, which scored 117 fewer.
We use a method called "Pythagorean wins" to estimate how many games teams should win based solely on points scored and allowed. The year-to-year correlation of Pythagorean wins from 2008 to 2009 was nearly twice as strong as any other two-year span in recent NFL history. And yet, the year-to-year change in each team's points allowed was actually no more consistent than in any other recent offseason.
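For readers who have not seen a Pythagorean win estimate before, here is a minimal Python sketch of the idea. The exact formula is not spelled out above, so the form used here (points scored raised to an exponent, divided by the sum of points scored and points allowed each raised to that exponent) and the 2.37 exponent are assumptions based on a commonly cited NFL version. The function name and the sample point totals are hypothetical.

```python
def pythagorean_wins(points_for, points_against, games=16, exponent=2.37):
    """Estimate wins from points scored and allowed.

    Assumed form: games * PF^x / (PF^x + PA^x). The exponent 2.37 is a
    commonly cited NFL value, used here as an assumption.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)


if __name__ == "__main__":
    # Hypothetical team: 400 points scored, 320 allowed over 16 games.
    print(round(pythagorean_wins(400, 320), 1))
```

A team that scores exactly as many points as it allows comes out at 8 estimated wins in a 16-game season, and the estimate rises or falls from there as the point differential grows.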
There isn't time for a lot of graphs in a short introduction piece, but I wanted to look at the year-to-year correlations that I wrote about in the book, and show folks exactly what I mean. First, let's take a look at the year-to-year correlation coefficients for offense, for each two-year span going back to 2001-2002. This chart shows you the year-to-year consistency of two stats, DVOA and good ol' points scored.
As noted above in my quote from FOA 2010, offensive numbers last year were absurdly similar to the year before, especially compared to the way things usually go in the NFL. Yes, offensive DVOA was generally more consistent earlier in the decade, but the correlation from 2008 to 2009 was still higher than in any of those two-year spans, and the year-to-year correlation for points scored was off the charts.
Now, as you probably know, there wasn't a corresponding rise in consistency on defense. In fact, early in the season we were writing a lot about how there had rarely been a season with more change on defense, with a lot of teams bringing in new coordinators who dramatically upgraded performance. By the end of the season, that had settled down a bit -- Denver's run defense went in the tank in the second half, Tennessee ended up following a good 2008 with a bad 2009 instead of a ridiculously horrifying 2009, etc. -- so the year-to-year correlation of defense is not historically low. However, it still was lower than the average over the past decade, as you can see from this chart:
The year-to-year correlation stands out more with DVOA than it does with points allowed. As for the third element of football, well, we don't have a "points scored/points allowed" equivalent for special teams, but I need a place to toss in a similar graph showing year-to-year correlation of special teams DVOA. It turns out the year-to-year correlation of special teams DVOA from 2008-2009 was the lowest ever, nearly zero and definitely lower than the rest of the decade. This will look especially odd when we get to the chart that shows you how well last year's projections did in various categories.
Now that we've looked at offense, defense, and special teams, we can look at total performance, and it looks like the extreme offensive consistency from 2008 to 2009 overpowered the usual level of year-to-year change in the NFL. Both total DVOA and Pythagorean wins were more consistent from 2008 to 2009 than in any other two-year span since 2001. As I note in the book, the correlation coefficient for Pythagorean wins is twice as high as the average two-year span this decade.
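The year-to-year correlations discussed above can be reproduced with a short script. What follows is a minimal sketch, assuming a standard Pearson correlation coefficient and team-by-team values stored in dictionaries. The function names and the data layout are hypothetical, not a description of how the numbers in this article were actually produced.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)


def year_to_year_correlation(prev_season, next_season):
    """Correlate one stat across two seasons, matching teams by name.

    Both arguments are dicts mapping team -> value (points scored, DVOA, ...).
    Only teams present in both seasons are used.
    """
    teams = sorted(set(prev_season) & set(next_season))
    return pearson([prev_season[t] for t in teams],
                   [next_season[t] for t in teams])
```

A value near 1 means teams finished in much the same order, and by similar margins, as the year before; a value near 0 means last year's numbers told you almost nothing. The same `pearson` function works for comparing preseason projections to actual results.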
Now let's bring in our projections from previous seasons. 2004 was the first year we attempted to project DVOA before the season, although that was only on the website, not in a book. In the middle of my self-flagellation over the inferior quality of our 2009 projections, I went back and looked at the correlation of our projections to actual DVOA in each season, using the projections as we published them at the time -- not the projections that would result from retroactively applying our current projection system. Here's a look at the result.
In retrospect, our 2009 projections were nice and strong on offense and special teams. The only year where we were more accurate when it came to offensive DVOA was 2007, and the 2009 special teams projections were the most accurate we've ever done. That seems a little screwy since actual special teams DVOA differed so heavily from what it had been in 2008, but apparently, our system spotted a lot of the trends that were going to fuel that change.
On defense, however, this did not happen.
I wrote about it a couple times earlier in 2010, but this chart fully shows how much our 2009 defensive projections sucked. The correlation between our projections and teams' actual defensive DVOA was pretty much zero -- in fact, it was on the negative side of zero. A dartboard would have been just as accurate. When we added up offense, defense, and special teams to get our total DVOA projections, we ended up with the opposite of what actually happened with NFL teams: our poor defensive projections overwhelmed our quality offense and special teams projections, leading to the worst projections we've done in six seasons.
There are two explanations for what happened last year (and I refer more to the overall NFL offensive numbers than I do to our projections). The first explanation is that something in 2009 dramatically changed in the way NFL teams build their rosters and turn over their talent from season to season. Because of this change, most NFL offenses in the near future will barely decline or improve from year to year. In addition, old trends that indicated when teams might improve or decline no longer apply, which means that projections based on previous data (such as ours) are now useless.
Or, 2009 was a bit of a fluke year.
Occam's razor points to the second explanation, and I'm inclined to agree. Of course, that doesn't mean I wasn't crazy busy this offseason, trying to rework the projection system and identify whatever caused changes from 2008 to 2009 without changing anything in the system based on the trends that indicated improvement and decline in previous seasons. However, I think it is safe to say that 2010 NFL offenses will not be as similar to 2009 NFL offenses as the 2009 NFL offenses were to 2008 NFL offenses -- and the projections in Football Outsiders Almanac 2010 will come closer to matching the accuracy we saw in 2007 and 2008.
Postscript: I had said earlier that we would announce in this space when we had a new version of FOA up with more typos fixed. There's now a PDF version online that fixes all typos found through Monday afternoon.
52 comments, Last at 31 Jul 2010, 11:03pm
#1 by vinyltoupee (not verified) // Jul 12, 2010 - 3:01pm
I've said this before, but I believe treating injuries like you do is a mistake. Injury rates are not a random variable. Some teams are consistently at the top in terms of player games lost, and some near the bottom. For example, you partly attribute the low 2010 projection for the Cowboys based on a return to the mean for injuries, yet you say this every year and every year the Cowboys are consistently among the least injured teams. Who knows what it is - the medical staff, the facilities, the climate - but injuries are not random.
#2 by JasonK // Jul 12, 2010 - 3:19pm
I'm guessing that it's the same voodoo that keeps Jerry Jones' skull from leaping out of his forehead as it is so plainly trying to.
#3 by Aaron Schatz // Jul 12, 2010 - 3:23pm
We don't treat them as random. Like a lot of other things you'll read in the discussion threads, that's shorthand for a much more complicated reality. Injuries at different positions affect the projections in various ways, and there are variables in there for specific teams that do seem to outperform or underperform the projections on a regular basis -- but only when those variables can be made statistically significant.
As for the Cowboys, our warning about depth and injuries was absolutely correct in 2008. The Brad Johnson games were not a happy experience.
(Edit:) I should also point out that this criticism has absolutely nothing to do with the article above. Let us imagine that the concept "Dallas has the best training staff in the league and we can't expect injuries to regress to the mean" is true. Does that statement explain why league-wide offensive figures were so much more consistent from 2008 to 2009? No. Does that statement explain why special teams DVOA was so inconsistent between 2008 and 2009? No. Does that statement explain why FO projections were much less accurate in 2009 than in years previous? Considering that we made the same assumptions in years previous, I don't think so.
As for specific complaints about specific teams, we have often stated, and I will state again, that we take complaints a lot more seriously when they are complaints about multiple teams and league-wide trends, not about specific franchises.
#11 by Key19 // Jul 12, 2010 - 4:18pm
The team still won 9 games in your "absolutely correct" 2008 and damn near made the Playoffs.
So they had a big gaffe at backup QB in 2008. I get that. But honestly, I think it's a legitimate question to ask how you guys continue to say they have poor depth when you haven't seen any of their depth in real action in almost two years. They had 12 picks in the 09 draft and 6 in this year's (plus UFAs in both). Not all 18+ are still with the team, but that's a lot of depth guys we haven't seen in real action yet. But somehow, you guys are confident that they all are bad. If even half those guys are solid backups, the Cowboys have pretty darn good depth. I haven't read FOA 2010 yet (waiting on printed version) but I hope if you guys don't address this in there you'll address it here. Looking forward to the book but as a fan I am a bit disappointed that my team continues to get projected to be much worse than they turn out to be (better than projected in each of the last three years, with two Division titles included).
#20 by Tim Wilson // Jul 12, 2010 - 8:14pm
Agreed. Combine that with the projection system consistently falling in love with the Eagles in a disproportionate way, and this site can be torture for a Cowboys fan.
#22 by silentrat // Jul 12, 2010 - 9:23pm
... Yet another reason to love the crew at FO!
#41 by bingo762 // Jul 13, 2010 - 11:42am
But the Eagles consistently make the playoffs and usually go deep. So isn't their "love" of the Eagles warranted?
#43 by Key19 // Jul 13, 2010 - 1:01pm
I'm not saying the Eagles are a bad team. In fact they are really good. But we've been arguably just as good (aside from their one NFC Championship appearance of recent years, which we haven't reached in that same timeframe). We have two Division titles in the last three years. They have zero. And yet we're projected to finish at-or-below .500 every year in FOA while the Eagles are projected to just wipe the floor with us. It never happens, and it's frustrating as a fan to have to listen to a site I greatly respect just hate on my team's chances every year for (in my opinion) no legitimate reason. The fact that they're down on us every year and up on the Eagles every year just makes it more frustrating.
#46 by Jerry // Jul 13, 2010 - 6:32pm
Maybe the Cowboys should try to figure out what drives Aaron's projections and make appropriate changes, like the Eagles seem to have done. Then their outlook will be better. If it adversely affects their record, so what? The FO projections are obviously more important.
#34 by Thomas_beardown // Jul 13, 2010 - 12:25am
I think you could argue that if the depth was any good, we would see them play from time to time. Except for positions that don't rotate like QB and O-line.
#44 by Key19 // Jul 13, 2010 - 1:14pm
But those two positions you just named are the ones that we get harped on for depth every year!!! They bring up the Brad Johnson games and say "Cowboys have no depth." Well, since then we've signed Jon Kitna and drafted a QB. Yet we still somehow have no depth according to them.
As far as offensive line goes, they say "Cory Procter sucks as a C/G backup, the Cowboys have no depth!" That's great because he does suck, but the only reason he was even playing LG when our starter Kosier went down in 2008 was because our FIRST LG backup (Montrae Holland, former Broncos starter during Shanahan days) was injured as well. How many teams have a great guy at 3rd-string LG? I bet none. So how are we worse-off than any other team based on Procter's horrid starts?
Last year, RT Marc Colombo went down. Doug Free stepped in and played brilliantly (well enough to win the starting LT job this year). That's a GLARING example of OL depth. But somehow, we still have no OL depth. We drafted a LT in 2008 in Robert Brewster. He got put on IR before the 2009 season. He's back now. We drafted a RT in this year's draft. No one knows how good he'll be, but to assume he sucks is just ridiculous. We still have Holland to fill in at either Guard, and we even traded for Alex Barron to be a backup LT. Our starting LG, Kosier, will play C and Holland will play LG if starting C Andre Gurode goes down. I don't really know where the supposed depth problem is on the line. If we lost both Guards, we would probably be in trouble. But so would most teams.
FO always says our line is aging, yet now that Free's starting at LT, our average starter's age across the line is lower this year than last year (and that's accounting for the other four guys all being a year older!).
As for depth at other positions, I don't see a problem. We have four starting-caliber DEs, two young NTs behind Ratliff, a 2009 3rd rounder and a 2010 2nd rounder at the ILB spots behind Brooking and James, a solid backup and a decent backup behind the two OLB spots, roughly four safeties that are in the mix to win the two starting jobs, and if Alan Ball loses the FS job, he's great depth at #4 CB. If he wins the FS job, then maybe we have a little bit of concern at #4 CB. But he could always rotate down to #4 CB and have one of the other potential FS-es fill in his spot if need be for a certain package.
Oh and everybody knows all about our RB depth. Our WRs are even stronger this year with Dez.
So where's the glaring depth concern here? I just don't see it.
#5 by GoPackGo (not verified) // Jul 12, 2010 - 3:27pm
Meanwhile teams like the Bills and Seahawks always seem to be towards the top of the list in terms of injuries and games lost.
#7 by Bill Barnwell // Jul 12, 2010 - 3:35pm
Just for reference, the year-by-year ranks since 2000 of these teams in question that are supposed to be exclusively healthy or injured:
Bills: 16, 28, 19, 2, 5, 8, 5, 27, 13, 32 (average: 15)
Seahawks: 1, 12, 31, 13, 14, 11, 19, 13, 29, 27 (average: 17)
Cowboys: 25, 19, 25, 1, 21, 2, 1, 18, 17, 3 (average: 13)
#17 by jebmak // Jul 12, 2010 - 6:16pm
#23 by prophetik (not verified) // Jul 12, 2010 - 10:03pm
how were the bills ranked 13th in '08 with so many people on the IR? are you accounting a sprained ankle and a season-ending concussion on the same list? because the bills usually are in the top five over the last three or four seasons with players on the IR.
#24 by BigDerf // Jul 12, 2010 - 10:50pm
It doesn't just account for people on IR. I believe it's starter games lost that they use to rank injuries.
#33 by Key19 // Jul 13, 2010 - 12:19am
Correct me if I'm wrong, but I see two average-health years in there for Dallas in the last three years. However, they have two Division titles in three years. So how can you guys continue to argue that if "they even had average health, they'd likely miss the Playoffs"?
I also see three top-three years in the last five, with two average seasons rounding out the group. That's pretty damn good, and MUCH better than the other two teams. In fact, the last five years have not even been close between the Cowboys and the other two teams.
#40 by Noahrk // Jul 13, 2010 - 11:19am
That's what the TV networks do. They do arbitrary cutoffs to find fake -- or maybe only more striking in some cases -- correlations and records. "In the last 17 games team X has done Y they have won". "In the last 23 games...", "In the last 22 quarters..."
So unless somebody can point to something that changed exactly 5 years ago that might explain why the Cowboys started having fewer injuries and the Seahawks more (the Bills had 2 top 10 years), we should look at the whole data set.
I would be curious, in fact, to see injury correlations between HCs. As in the difference in injuries when Wade took over compared to before and the league average. Maybe there's something there.
#4 by andrew // Jul 12, 2010 - 3:26pm
I know you have set deadlines by which time you need to go to press, but would the projections have differed significantly if you could have run them with all the info available just prior to kickoff of the 2009 season? There were still some unresolved issues (e.g., Favre was still unsigned and the write-up for the Vikings in the 2009 book seemed to lean towards Jackson/Rosenfels, though the projections for the team were pretty good regardless)... trying to think of other factors that might apply...
I'm still waiting for my hardcover copy (I always get hardcover, not PDF, as I end up keeping all my old Prospecti/Almanacs in my bathroom reading shelf for some reason and typically grab one at random and flip to a random page every time... if it ends up the 2007 Bills offensive line, so be it....)
#6 by Aaron Schatz // Jul 12, 2010 - 3:27pm
Sorry I didn't make that clear above: the projections that I'm using for correlations are the ones published on the website in early September each season, not the ones published in the books.
#8 by silentrat // Jul 12, 2010 - 3:50pm
Could the past year's low correlation in Special Teams DVOA be explained by the elimination of the wedge, and ensuing changes in kickoff strategy, or is it more complicated than that?
#9 by spenczar // Jul 12, 2010 - 4:09pm
Erm, isn't there a third possibility beyond "The NFL has totally changed" and "This was a fluke" - that your model isn't completely accurate? That actually seems like Occam's pick to me. It looks like the defensive correlation data from this year is about two standard deviations away from the mean which isn't crazy I guess, but isn't usually just a fluke.
I guess I just felt while reading the article that you were appropriately humble most of the way, until the very end when you seemed to heavily downplay the possibility that the projection system is flawed.
#10 by Aaron Schatz // Jul 12, 2010 - 4:18pm
Well, of course the projection system is flawed! Otherwise I wouldn't spend so many hours trying to improve it every year!
However, the point of this article isn't just about the projections. We seem to have this strange thing around here -- if we write an article where we talk about multiple things, and one of them has to do with us, people only focus on the part that is about us. Whether or not our projection system is flawed (which it is, of course) doesn't have anything to do with the consistency of offensive numbers from 2008 to 2009.
#13 by Bobman // Jul 12, 2010 - 5:17pm
Well, duh! Then just fix them right this time and you won't have to spend all off-season fixing them next year. There, that was easy. I just cleared a few weeks for you to hit the beach next July. Glad to help.
Seriously, though, this was a helpful article for me at least. I have long appreciated your view (as I read it) that your system is not perfect, but is still pretty damn good and constantly evolving to get better. You guys take more reader comments seriously than anybody I know.
What, do people seriously think outlier years do not happen? Now if you put a string of those together... well, I'll just quote Tony Dungy on that: Once is an occurrence. Twice is a coincidence. Three times is a trend.
#14 by Independent George // Jul 12, 2010 - 5:31pm
In keeping with the traditions of Loser League, what if you tried to tweak your projections to get the biggest negative correlation possible? Call it the COSTANZA system, where you try your best to be absolutely wrong so that you can pick the opposite?
#19 by Aaron Schatz // Jul 12, 2010 - 8:11pm
Let's see... start with a binary variable that adds -50% to the DVOA projection of any team with Peyton Manning at quarterback...
#28 by Lance // Jul 12, 2010 - 11:10pm
You know -- I once went through a year of football picks (each game each week straight up) and about halfway through it I was dead last. But moreover, I was so bad that a person following our pool would have done better had she or he taken my picks and picked the opposite rather than simply following the pool leader.
On those grounds, I argued that I was in fact the winner and should have claimed the prize. No one else seemed to get this logic and I didn't win the pool. For a variety of reasons, I haven't participated in such a thing since.
#35 by Big Johnson // Jul 13, 2010 - 1:28am
haha thats pretty awesome. While ur opinion would be more valuable than the winners opinion, u still didnt meet the goal of predicting the most games right. If u knew your own logic, you would have wrote your picks on paper and then turned in all the opposite picks!
#16 by spenczar // Jul 12, 2010 - 5:57pm
Yes, I know I didn't comment on everything you wrote in the article. Sorry.
But the inaccuracy of the projection system is frankly much more interesting to me than the consistency of offenses. The consistency doesn't seem to point to some bigger issue in the NFL, and really does look just like a fluke. But the inaccuracy of the model seems to indicate that there are deeper things going on with defenses which aren't accounted for and are not well understood. To me, that's much more intriguing and noteworthy, and I'd genuinely love to read about what you think those issues are. I would be fascinated to read about where you think the model should be tweaked or what you think is going wrong - I'm not complaining that your model isn't a perfect predictor. Not everyone who criticizes is attacking your project or your methods, Aaron.
#18 by Joseph // Jul 12, 2010 - 7:37pm
I too would be interested to hear what you guys changed between 08 & 09 regarding defensive projections, and what you plan to change (or UNchange) for this year. Or could it be that, because offenses were SO consistent compared with years past, that this threw off the defensive projections? In other words, by predicting that certain offenses would get worse/better, when they didn't, this completely changed the defensive projections because certain factors that you use to predict defense were dramatically different than what was expected.
#25 by BigDerf // Jul 12, 2010 - 10:55pm
Wait.... So the consistency of offenses is a fluke but the inconsistency of their defensive projection points at there being something wrong with the projection system?
#29 by spenczar // Jul 12, 2010 - 11:25pm
There are two different things noted in the article above. One is the consistency of offense, defense, and special teams. This is just "how much do teams look like they did last year?" The second is the predictive ability of FO's model. This is "how well did we predict teams based on last year's data?"
A spike in consistency on offense - teams looking very similar two years in a row - has nothing to do with the projection system. I'm talking about the plunge in the accuracy of defensive predictions.
#36 by BigDerf // Jul 13, 2010 - 2:34am
I know they are two different things but look at those two graphs. The difference between the spike of offense and the prior years' average correlations was about the same as the difference between the drop in correlation of DVOA's accuracy. On a statistical level you can't assume one is a fluke and one is a sign of a problem with their projection system. Until there is a second year of data matching the one badly correlating year we have there is no reason to throw the baby out with the bath water. I'm sure there were minor changes this year like every year but if the one year caused a drastic system overhaul it would be a stupid knee-jerk reaction.
Should they totally overhaul the way they calculate the offensive projections based on the one year of high correlation?
The system has worked for the past 5 years... usually getting better as they tweak it. There was nothing that suggested to me based on what I saw last year in the NFL to say that something fundamental about the game has changed on offense or defense.
#39 by spenczar // Jul 13, 2010 - 3:45am
"The difference between the spike of offense and the prior years' average correlations was about the same as the difference between the drop in correlation of DVOA's accuracy. On a statistical level you can't assume one is a fluke and one is a sign of a problem with their projection system."
Right. The drop in predictive accuracy might be a fluke. I know. It might also be a flaw in the way FO extrapolates data to make predictions. But with the offensive consistency spike, there is no extrapolation which could be flawed. The offensive consistency is a descriptive statement rather than an extrapolation from data. The point isn't that the defensive inaccuracy thing MUST be a problem with the model, but that it could be, and that's more interesting and more solvable. It seems like a good opportunity to learn about football instead of throwing it away as a fluke.
(EDIT:) Florida Danny below at post 37 makes some far more statistically-based (and perhaps more convincing) arguments than I to argue the same point.
"I'm sure there were minor changes this year like every year but if the one year caused a drastic system overhaul it would be a stupid knee-jerk reaction."
I don't recall ever demanding an overhaul; I said that another plausible explanation beyond "fluke" and "change in the NFL" is "flawed model." Is that really so crazy? I agree, the system has worked, and tweaking helps. I get the feeling you are reading my posts like I am some dude coming from ESPN who hates stats and believes chemistry wins playoff games. That's not me. I bought the prospectus in 2005 and have been a fan since, but I'm a fan because of the constant revisions and insight, not because of blind faith.
#15 by jimbohead // Jul 12, 2010 - 5:37pm
Just so we're clear about the statistics here, say there's a true mean correlation (m), and every year, the correlation varies around that mean according to a normal distribution, with standard deviation "s". So, assuming s is exactly known with infinite degrees of freedom, a point 2s away from m is rejected as a part of your population with probability 0.95. That's the definition of a t-test, which is what you're implicitly doing.
The thing is, if you randomly select 8 samples from your population, there's a 33.7% chance that at least one will be more than 2s away from m. That's just simple binomial math. All that is to say, fluky years, and data points, happen. Just because it can be rejected from your t-test doesn't mean that the method for producing the number is flawed.
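The 33.7% figure is easy to check. The sketch below assumes, as the comment does, that each year's correlation is an independent draw and that 5% of draws fall more than 2s from m; it also computes the same chance using the exact normal tail beyond two standard deviations, which comes out slightly lower. The function and variable names are hypothetical.

```python
import math


def prob_at_least_one_outlier(n_samples, tail_prob):
    """Chance that at least one of n independent draws lands in the tail."""
    return 1 - (1 - tail_prob) ** n_samples


# Rounded figure used in the comment: 5% of draws fall beyond 2 SD.
rounded = prob_at_least_one_outlier(8, 0.05)

# Exact two-sided tail of a normal distribution beyond 2 SD: erfc(2 / sqrt(2)).
exact_tail = math.erfc(2 / math.sqrt(2))
exact = prob_at_least_one_outlier(8, exact_tail)

print(f"{rounded:.3f} {exact:.3f}")
```

With the rounded 5% tail this gives roughly one chance in three, matching the figure quoted; with the exact tail it is a little under that. Either way, one fluky year in eight is not unusual.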
#12 by Aaron Schatz // Jul 12, 2010 - 4:28pm
SORRY TO INTERRUPT BY DELETING SOMEONE'S POST, BUT JUST A REMINDER THAT WE DO TRY TO KEEP THESE THREADS SOMEWHAT ON TOPIC. YOU CAN POST YOUR COMMENTS ABOUT THE BOOK IN GENERAL HERE:
#21 by speedegg // Jul 12, 2010 - 8:54pm
Hmmmmmm, statistically the mean for a process/system with a normal distribution will shift indicating there aren't enough data points, there is a process change, there is a defect, or it is a fluke.
The system will shift back when the defect is removed (spread offenses...or bad scouting), process controls kick in (defensive coordinators adapt), or regression sets in (can't get lucky all the time). There's a sizeable amount of data collected, so small sample size can be discounted. I'm inclined to think this year's forecasts might be more accurate than before, though we might see DEFENSES show a sharp correlation this year to the next. Cool.
#26 by wpolling (not verified) // Jul 12, 2010 - 11:00pm
I'm a little confused. I see your offensive and special team DVOA projections are very high. I also see your defensive projections very low. Why do they not offset one another to create a "normal" year for your projections?
Thanks for the article. I enjoyed reading it.
#27 by BigDerf // Jul 12, 2010 - 11:06pm
Knowing you tinkered with the projection system makes me believe it was kind of random. Also who saw some of these defenses performing like this even based on common sense....
Denver was a free agent piece together defense that looked old.
The Giants returned almost everyone on Defense after being the number one seed the year before and promoted the linebackers coach to coordinator in order to keep the system that worked. Ended up being one of the worst scoring defenses in the league.
The Jets went from middle of the road to the best defense in the league by bringing in two free agents (Leonhard and Scott) and Rex Ryan.
The year-to-year correlation stayed fairly middle of the road and that hurt you guys in a year where you were very wrong on some teams. Those teams were also the teams that swung a lot in one direction or another.
#31 by Thomas_beardown // Jul 12, 2010 - 11:55pm
They predicted the Bears would have a good defense, and then it turned out Urlacher is really important and they had no pass rush.
#30 by BenM (not verified) // Jul 12, 2010 - 11:48pm
This is cool and all (it really is) but...
Is it football season yet?
#32 by cormeagles // Jul 13, 2010 - 12:00am
Maybe it's as simple as putting more weight into changes in coordinators? The Jets, Giants & Broncos were good examples above and as an Eagles fan, I can tell you they got much worse going from Johnson to McDermott (esp after the week 1 domination, then a steady progression downward). It's like the Johnson magic wore off and teams just got a read on what McDermott was doing (a la Denver?).
The key would be how to quantify it as a positive or negative, but maybe you could look at prior years and see if there are any trends - e.g. years in league, past jobs, within or outside org, salary rank, etc.
#38 by BigDerf // Jul 13, 2010 - 2:45am
Aaron looked into this last year here http://www.footballoutsiders.com/dvoa-ratings/2009/week-5-dvoa-ratings
On average, teams with bad defenses that hire new coordinators do no better than teams that don't hire new coordinators. They regress to the mean at the same rate. So the Broncos' sudden turnaround with a new coordinator was very unexpected.
And with McDermott and Sheridan... They each were still running the same hyper-aggressive system as their predecessors (both ran more or less the same system in fact) however one (McDermott's in fact... just check the DVOA) was still productive while the other just fell apart at the seams. I don't know how DVOA is to be expected to account for what's going to happen with new first time coordinators. You can look at history and know what happens on average but it's hard to predict the outliers.
#42 by cjfarls // Jul 13, 2010 - 12:33pm
I was one of the biggest bashers of the Broncos projection last year, but was also completely surprised by the DEF turn around. I think the thing about Denver's DEF last year was there was basically no way for anyone to project it... they had 9 new starters, a completely different scheme, and new coaches. As Aaron said here, a dart board WOULD have been more accurate.... which in retrospect was true.
So where I disagreed with the projection was that they expected Denver to be a very poor team overall because of a bad defense, and I said that the DEF was more likely to be "average" because most teams' defenses are, by definition, average. In addition, many of the additions (Dawk, Hill, Goodman, Fields, Davis, etc.) were average-to-good players, so it wasn't like there was an obvious talent deficiency, with the exception of depth on the D-line... which also, in retrospect, probably predicted the 2nd half collapse of the D-line (easy to say now).
All that said, Denver was an exceptional case last year, and I NEVER advocated that FO should change their projection system to deal with such crazy outliers... that would've been counter-productive to the accuracy of the whole system. What I did say some of the FO staff should've done is back off some of their more strident personal/subjective comments about how crappy Denver was going to be (some predicting far worse than even the projection), because it was fairly obvious even going in that previous year's data wasn't going to be very informative.
#37 by Danny Tuccitto // Jul 13, 2010 - 2:38am
Very insightful post...I apologize in advance for the length of this comment, but I think the answers you're searching for have much more to do with statistics than football; and statistical methodology requires extreme long-windedness...
You offer 2 explanations for 2009, and spenczar offers a 3rd:
1) Structural shift in the NFL
2) Fluke year
3) Limitations of your projection model
I think your graphs tell the entire story, and the answer is a combination of explanations #2 and #3.
With respect to the "fluke year" explanation, keep in mind that all of these "year-to-year" correlations are -- I'm assuming -- based on n = 32. Also keep in mind that a correlation is simply an estimate of the true magnitude of performance carry-over between Year A and Year B. So, by calculating a year-to-year correlation, you're basically trying to estimate the true carry-over in a performance stat from only 32 data points. At that small a sample size, the true carry-over is going to fall somewhere within a wide, roughly normally distributed range around each correlation estimate. Given this, it could easily be that, for example, true carry-over on offense between 2008-2009 was overestimated, whereas true carry-over on offense between 2006-2007 was underestimated. Combine all these sample-size-driven potential over- and under-estimations (among only 8 estimations per performance category no less), and it becomes pretty obvious that what appears to be a statistical "oddity" is actually random variation in disguise (aka a "fluke year"). Check the standard errors of your correlation estimates and see how wide the "normal" range of each estimate is. I bet you'll find that, when looking at the ranges for the correlation estimates, most within each performance category will overlap.
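To make the "check the standard errors" suggestion concrete, here is a minimal sketch of how wide the plausible range around a correlation is when n = 32. It uses the Fisher z-transform, which is my assumption about a reasonable method (it is not something the article specifies) and which itself assumes roughly bivariate-normal data. The function name `correlation_interval` and the example correlations are hypothetical, chosen only for illustration.

```python
import math

def correlation_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3). z_crit=1.96 gives about 95% coverage.
    """
    z = math.atanh(r)                 # transform r onto the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    lo = math.tanh(z - z_crit * se)   # transform the bounds back to r
    hi = math.tanh(z + z_crit * se)
    return lo, hi

# Illustrative correlations only; these are not values from the article.
for r in (0.3, 0.5, 0.7):
    lo, hi = correlation_interval(r, 32)
    print(f"r = {r:.2f}, n = 32 -> roughly {lo:.2f} to {hi:.2f}")
```

With 32 teams, an observed correlation of .50 is consistent with a true value anywhere from about .18 to about .72, so intervals for different seasons can overlap heavily even when the point estimates look far apart.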
As far as the other part of your post, which tackles the whole "accuracy of our projection model" issue, I'm pretty confident that, again, the devil is in the variation details. Specifically, in your final graph, I notice that your single best projection from 2004-2009 (i.e., 2007 offense) resulted in a pre-post correlation of about .625 or so. Although that high of a correlation is commendable, when we square it, we find that your model only predicted about 40% of the variation in NFL offense that season. Or, conversely, 60% of the variation between NFL offenses in 2007 was totally unaccounted for in your projection model. And that's for your best projection! Apply that same principle to the accuracy of your average defense or special teams projection, and we're up to about 80%-90% of performance variation between teams being part of some nebulous "other variables" category that aren't being captured by the model.
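The squaring step above can be checked in a couple of lines. This is only a sketch of the arithmetic: the .625 figure is the approximate value read off the graph, and the helper names `variance_explained` and `correlation_for_unexplained` are hypothetical.

```python
import math

def variance_explained(r):
    """Share of variance accounted for by a correlation of r (r squared)."""
    return r ** 2

def correlation_for_unexplained(unexplained):
    """Correlation implied when a given share of variance is left unexplained."""
    return math.sqrt(1.0 - unexplained)

best = variance_explained(0.625)
print(f"explained: {best:.1%}, unexplained: {1 - best:.1%}")

# The 80%-90% unexplained range corresponds to these correlations.
for share in (0.80, 0.90):
    print(f"{share:.0%} unexplained -> r of about {correlation_for_unexplained(share):.2f}")
```

A correlation of .625 works out to about 39% explained and 61% unexplained, and leaving 80% to 90% unexplained corresponds to correlations of roughly .45 down to .32.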
Of course, my point here isn't to bash the model, obviously. Predicting 40% of NFL performance variation is an awesome feat given the state of the field, and I know how hard (and constantly) you guys work on improving the model. On the contrary, it's to say that, with that much unexplained performance variance, the yearly fluctuations in projection accuracy are entirely to be expected. In other words, when well over half of what you're trying to predict is unknown, there's a ton of room for random successes and failures in prediction. No one who understands this should give you guys any grief whatsoever about how much your prediction "sucked" in 2009.
In terms of a quick and easy fix to your explanation-of-variance "problem", I think there's one glaring methodological issue that might be a clue, and also is something I've been meaning to start working on because it really affects all NFL research. But that's outside the scope of this particular thread. E-mail me if you want to discuss.
#45 by Jeff Fogle // Jul 13, 2010 - 1:22pm
Okay, let's start with the generally accepted premise that the league liberalized offensive holding a few years ago. The obvious takedowns are called, the clutching and grabbing are basically okay as long as you don't pull the guy down on the way to a sack. Defenses were controlling the flow of the game a bit too much for the league's tastes, and they made this move a couple of years ago to create more of an equilibrium.
Now, let's add in the suppositions that:
*Quality quarterbacks can take advantage of this
*Poor quarterbacks can't
*There aren't 32 quality quarterbacks to go around
This creates a set of have's and have not's at the QB position...which leads to the same kind of thing for offensive production. And, barring future rules changes or evolutionary ticks in defensive strategies (which are actually likely but haven't had time to happen since the liberalization of the holding rules), these will remain relatively "stable." Meaning...the top QB's are going to continue having success until they get hurt or retire. The bad QB's will either learn the ropes and become good, stay bad and get replaced, or get replaced immediately by another "experimental" guy who may not be any better.
The rich stay rich. The poor try not to stay poor, but there aren't enough quality quarterbacks to go around so they typically flounder until striking oil with a draft pick or a youngster who matures into a star.
Aaron's offensive numbers from the most recent years could be used as evidence for this. The 2007-2008 "lack of correlation" from the first chart represents the league taking a big step forward with the new friendlier passing rules. The "extreme correlation" from 2008-2009 is the stability that followed the possible reality of the new dynamic.
*I went to nfl.com and counted these up. In 2007-2008, 13 teams made a jump of 50 or more points for the season, compared to just eight who went down 50 or more.
*In 2008-2009, only three teams made a jump of 50 points or more, while seven went down.
Small sample size obviously. But, we have a standard range where injuries or personnel changes lead to "going down." There were 7 and 8 of those (quick examples, NE went way down without Brady in 2008, Cincy went way down without Palmer in 2008, TB went way down with a switch to a question-mark rookie in 2009, the Jets dropped 57 points in the switch from veteran Favre to rookie Sanchez).
We have a big difference though in "going up," from 13 to just 3. The ratio is 8-2 using 80 points as the cut-off. It's 5-1 at 100 points or more (the only team jumping more than 100 points in 2009 was Cincinnati with the return of Palmer). It's at least reasonable to consider "big jump after rules change, stability afterward" as an interpretation of the data.
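For anyone who wants to repeat the count at different cut-offs, here is a minimal sketch of the tally. The function name `count_swings` is hypothetical, and the team labels and point totals are made-up placeholders, not the nfl.com figures.

```python
def count_swings(prev_points, curr_points, threshold):
    """Count teams whose scoring rose or fell by at least `threshold` points.

    prev_points and curr_points map team name -> season points scored.
    Returns (number_up, number_down).
    """
    up = sum(1 for team in prev_points
             if curr_points[team] - prev_points[team] >= threshold)
    down = sum(1 for team in prev_points
               if prev_points[team] - curr_points[team] >= threshold)
    return up, down

# Placeholder data for illustration only; not real team totals.
season_a = {"Team A": 300, "Team B": 400, "Team C": 350, "Team D": 280}
season_b = {"Team A": 410, "Team B": 340, "Team C": 370, "Team D": 275}

for cutoff in (50, 80, 100):
    up, down = count_swings(season_a, season_b, cutoff)
    print(f"cut-off {cutoff}: {up} up, {down} down")
```

Running the same function over each pair of real seasons would give the up/down counts at 50, 80, and 100 points in one pass.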
Now...let's assume for a second that this new dynamic allows the best quarterbacks/offenses to "impose their will" or "control their destiny" to a bigger degree than before. Maybe, in report card terms...offenses range from A down to D...but defenses only range from B to C in terms of their control.
Green Bay has a great defense, but:
Warner: 29-33-0-379 with 5 TD's in the playoffs
Roethlisberger: 29-46-0-503 with 3 TD's in December
Favre: 17-28-0-244 with 4 TD's in one game
Favre: 24-32-0-271 with 3 TD's in the other
GB grades out extremely well in any conceivable metric, but they're still capable of allowing a 15-0 TD/INT ratio and a zillion yards to quality QB's.
Kansas City has a lousy defense, but:
Oakland: 7-24-0-109 in one game
Oakland: 13-32-2-113 in the other
Washington: 15-30-1-164 with two Skins QB's
Fitzpatrick: 12-20-1-86 in a game with Buffalo
KC grades out extremely poorly in any conceivable metric, yet their defense can still shut down or slow down bad quarterbacks.
On defense (and I'll grant there are possible exceptions on the extreme ends), "great" means a B, and "lousy" means a C in terms of imposing their will on a game. On offense, you have a range from A to D (or even F for Jamarcus).
To me it's certainly conceivable that the new dynamic created by the liberalization of holding could wreak havoc with projection systems that don't acknowledge that the best defenses are more at the mercy of the best offenses than they used to be. In fact, what we perceive as "defense" may significantly be correlated to how often teams face good quarterbacks, and how often they face bad quarterbacks. A "B-minus" defense will have horrible metrics if they run into a lot of A and B quarterbacks. A "B-minus" defense may have quality metrics if they get to face a lot of offensive weaklings.
Is this behind the defensive "oddities" in Aaron's article? Not enough evidence to say "of course, that's it!" But, it would seem to be a logical contributing factor. And, it looks like Aaron is EXCLUDING the possible influence of liberalized holding with these comments:
"There are two explanations for what happened last year (and I refer more to the overall NFL offensive numbers than I do to our projections). The first explanation is that something in 2009 dramatically changed in the way NFL teams build their rosters and turn over their talent from season to season. Because of this change, most NFL offenses in the near future will barely decline or improve from year to year. In addition, old trends that indicated when teams might improve or decline no longer apply, which means that projections based on previous data (such as ours) are now useless.
Or, 2009 was a bit of a fluke year."
I would suggest that the impact of liberalized holding is a more logical influence than "something in 2009 dramatically changed in the way NFL teams build their rosters." And, that whole quoted section is a false dichotomy...really just a Larry King softball tossed to Occam...where Occam "has to choose" between an option with "dramatically," and an option with "a bit of." Occam goes with "a bit of" and Aaron bravely concurs.
Now, I'm not saying that the numbers are DEFINITELY a result of the rules change. The world is complicated. To the degree there's a mystery to be solved, we're likely to see a mix of influences rather than just "it's this or it's that." I do think this should be part of the discussion. It was in various comment sections last year. It's disappointing that it wasn't in the article above. How could the impact of liberalized holding be ruled out as an explanation, but "something in 2009 dramatically changed in the way NFL teams build their rosters" be given as the only non-randomized option to the guy with the razor?
Regarding "It's no secret that 2009 was not the best year for projections around here," I hope there's also some recognition that it wasn't the best year for "post-jections" either. After the games were in the books, Baltimore was ranked as the best team in the league despite an inability to beat quality opposition (using a simplification of the descriptions from above, Flacco was able to pound C defenses but couldn't impose his will on B defenses...which foreshadowed playoff efforts of 4-10-1-34 at New England and 20-35-2-189 at Indy...24-45-3-223 for the postseason). Three of the top four teams couldn't win their division. The "imposing their will" offenses of Indy and New Orleans didn't crack the top five in the regular season despite earning top seeds, then ultimately winning their way through their conference brackets.
The lead paragraphs in the article are basically saying "2009 stunk for FO." The summation is literally "2009 was a bit of a fluke year." The world may have changed with the liberalization of holding (and a variety of other influences). The evidence is at least in line with that possibility if not confirming it. Hope you'll consider possibilities beyond roster construction.
#47 by Jeff Fogle // Jul 14, 2010 - 1:45pm
Aaron, you made these comments in earlier posts within this thread:
"We seem to have this strange thing around here -- if we write an article where we talk about multiple things, and one of them has to do with us, people only focus on the part that is about us. Whether or not our projection system is flawed (which it is, of course) doesn't have anything to do with the consistency of offensive numbers from 2008 to 2009."
"Does that statement explain why league-wide offensive figures were so much more consistent from 2008 to 2009? No."
If you don't want people to focus on the part about you, don't lead with it in the first sentence of the article(!). If you're concerned about the consistency of offensive numbers from 2008, my post above made an attempt to explain it. When people WEREN'T talking about that, you jumped right in to make them focus. When a post does offer up something, it's the sound of crickets for a day.
The stability doesn't look to be all that mysterious. There's a stability in schematics and quarterbacks that SHOULD stay consistent if the league has found an equilibrium with the liberalized holding that allows QB's a little extra time. Good QB's/smart schematics can make use of that time. Bad QB's/ill-advised schematics can't.
Here's a list of teams that stayed within 50 points of their 2008 totals in 2009:
365 POINTS OR MORE BOTH YEARS (many 400's)
(I'm typing these in divisional order starting with the AFC because I logged everything in divisional order when I was taking notes. Note the consistency in QB's and schematics from year to year for everyone. NE did switch from Cassel to Brady...but it was still a similar schematic. Should probably add in to the "theme" that in addition to having an extra second or whatever to make decisions, this is helping QB's stay healthy by avoiding hits...or there's been a simultaneous realization from teams about how to keep their QB's healthy by emphasizing quick strikes or something---a combination of liberalized holding AND keeping QB's healthy is in play).
MID 300'S GIVE OR TAKE
(Not much in terms of coaching changes here (Denver). Denver and Chicago flipped QB's but didn't flip production...actually they dropped from 370 and 375 to 326 and 327 respectively...kind of funny). For the group, not consistently elite, but consistently acceptable.
302 OR LESS
(Teams who couldn't get things figured out).
Only two teams jumped up more than 50 (I mentioned 3 in the earlier post. I had a logging error, sorry my bad).
*Cincinnati went up 101 as Palmer returned to take over for Fitzpatrick
*Minnesota went up 91 as Brett Favre took over for the prior tandem
These seven teams went down:
NY Jets (from Favre to a rookie)
Tampa Bay (from Garcia and others to a rookie)
Not 100% consistency, but more than usual as your graph made very clear.
So, in the "Does that statement explain why league-wide offensive figures were so much more consistent from 2008 to 2009?" category, I'll offer up liberalized holding plus improvements in efforts by teams to keep their QB's healthy (which is, of course, easier to do if you're allowed to hold). I think it's reasonable. The fact that there was a clear liberalization of holding means we have to be more careful about assuming random influences were the main cause.
The NBA made changes to the hand check rule and it opened things up. The first years of "opening things up" and "stability in success for teams with scoring stars" weren't random; they were a sign of the new reality. MLB cleaning out steroids led to fewer home runs and lower scoring. Those decreases weren't random. They were a sign of the new reality. Swimmers with those bodysuits reduced their times. The reduced times weren't a random thing that was about to regress back (unless they outlawed the suits).
#49 by Dan // Jul 14, 2010 - 4:23pm
What other data could we look at to test if this story is correct?
I looked at leaguewide passing stats over the past decade (2000-2009) to see if there are any notable trends. Three stats have shown significant improvement for the league: sack rate, interception rate, and completion percentage. All three correlate with the year at a magnitude of r > .65. Interception rate and completion percentage show relatively steady improvement, but sack rate seems to show a pretty sharp improvement from 2006 to 2007 (although it also shows more random year-to-year variation, so it could just be that steady improvement is masked by a bunch of random variation).
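A minimal sketch of the kind of check described above: correlate a leaguewide rate with the year and see how strong the trend is. The `pearson` helper is hand-rolled to keep the example self-contained, and the yearly values are made-up placeholders, not the actual leaguewide numbers.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

years = list(range(2000, 2010))

# Placeholder completion percentages for illustration only; not real data.
completion_pct = [58.0, 58.9, 59.4, 58.6, 59.8, 59.5, 59.8, 61.2, 61.0, 60.9]

r = pearson(years, completion_pct)
print(f"correlation with year: r = {r:.2f}")
```

With only ten seasons the estimate is noisy, so a single sharp step (like the 2006 to 2007 sack-rate change) and a steady trend buried in noise can produce similar correlations.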
#50 by Jeff Fogle // Jul 14, 2010 - 5:05pm
Not completely positive what you're saying here, Dan. Are you saying that offenses have been gradually trending better in avoiding sacks, avoiding interceptions, and completing passes...and that this trend is continuing through last year in a way that suggests it will continue? Or, is it just that those categories were better for offenses in 2009 than they were in 2000....but random variation masks elements of the trend?
I do think we've all seen with our own eyes that teams are doing much more short stuff off quick reads (or bailouts to the running back) as the march of time has progressed through the decade. Your data seems consistent with the eyeball test.
Aaron was talking more about the EXTREME offensive correlation that 2009 to 2008 had in a way that was abnormal for the decade. So, the mystery to be solved is why was 2009 so similar to 2008 in a lot of offensive categories (scoring in particular). A lot more teams "stayed the same" than we've been accustomed to.
#48 by Aaron Schatz // Jul 14, 2010 - 4:05pm
Jeff, there are a couple of problems with your thesis.
First, it seems strange for a rule change from a couple years ago to explain something that happened in 2009, but not in the years immediately previous. Why would it take so long for teams to react to a change in how holding is called, and why would 2009 specifically be the year when they figured out what to do?
Second, holding calls did not go down last year. They went up. Here are the totals on offensive holding the last four years:
#51 by Jeff Fogle // Jul 14, 2010 - 5:08pm
Aaron, I really do appreciate the response, so I'll try not to be too sarcastic. It looks like you just skimmed the post rather than reading it...
The thesis is that liberalized holding led to:
*A very poor correlation in the 2007-2008 numbers, as 13 teams had jumps of 50 points or more in their scoring once QB's who knew what they were doing started taking advantage of it (the next to last bar in your points graph that ran at the top of the article).
*A very strong correlation in the 2008-2009 numbers as so many teams stood pat after their jump that only 2 teams had jumps of 50 points or more in their scoring (the last bar in your points graph that ran at the top of the article).
The thesis is that the liberalized holding has created an equilibrium...leading to a set of have's and have not's at the QB position that will continue to produce at consistent rates until some new development happens (another rules change, an evolutionary tick from defenses who finally figure out a way to slow down Manning, Brees, Brady, Rivers, Rodgers, et al).
So...your comment that:
"it seems strange for a rule change from a couple years ago to explain something that happened in 2009, but not in the years immediately previous. Why would it take so long for teams to react to a change in how holding is called, and why would 2009 specifically be the year when they figured out what to do?"
Has me asking...WHAT???????
The big increase happened soon after the change in how holding is called. 2008 was the year of the big jump. 2009 was the year that caused you to state "the offensive numbers last year were absurdly similar to the year before." Why would 2009 specifically be the year when they figured it out when we already know that 2009 was extremely similar to 2008?
Regarding holding calls. Nobody said the liberalization of holding meant a reduction of holding calls. It meant that the clutching and grabbing was now allowed...but only the pulldowns were called. It's certainly possible for offensive linemen to INCREASE their aggressiveness in a way that the number of holding calls stays the same (more pulldowns to a degree that offsets past numbers of calls on clutches and grabs), while the players now clutch and grab right off the snap.
So, I don't think your points are problems with the thesis. The first one is a confusing misreading of the thesis. The second one actually provides more evidence that QB's are being protected better than ever from the volume of big hits (a lot of holding calls plus the clutching and grabbing). If QB's stay healthy, and have an extra second or whatever to make their passes...that could lead to resolving the stability "mystery" that you were encouraging posters to discuss.
#52 by Roh (not verified) // Jul 31, 2010 - 11:03pm
I agree with your thesis. Here's how I can best summarize Aaron's problem: he's reading too much into this. He can't take a step back, clear his head and understand what you're trying to say.