18 Nov 2005
by Aaron Schatz (and the FO staff)
Time for another look at the Football Outsiders mailbag. Like last week, I'll note that our e-mail traffic has grown substantially in the last few weeks, as have the comments on the discussion threads, so I apologize if your question doesn't get answered. There simply are too many good questions that require well thought out answers. The best way to get your question answered at this point, if it is a question not related to the DVOA stats, is to send it to one of the other writers via the contact form.
Be aware that we reference plenty of our innovative FO stats here, not to mention their unfamiliar terminology, so if you are a recent addition to the readership you might want to read this first.
It's been a strange and difficult week and while I don't want to dedicate this entire mailbag to the "Indianapolis anomaly," I should address things somewhat. Let me see if I can answer the most popular criticisms, both on the website and in my e-mail.
You only changed things because of FOX, right?
Yes and no. I truly felt there was something wrong with this week's numbers. I wanted to be honest about that. If we were still doing the ratings only here on FO, I know I could write a big long explanation and most readers would understand. But that's simply not the case with the FOX ratings, where many people skip all my commentary and only are interested in the order of the teams. I didn't want this to become "That Schatz guy thinks the Colts are worse than the Jaguars and Bengals" because a) that makes me look stupid to the 95% of people who won't read the long, detailed commentary and b) I don't believe the Colts are worse than the Jaguars and Bengals, even if that is what the numbers say.
Are you just going to fiddle with the numbers whenever something happens that you don't like?
If by "fiddle" you mean change numbers subjectively so they look the way I prefer, the answer is no. Let me remind people: I didn't even fiddle with the numbers this week; I only fiddled with the geography.
But if by "fiddle" you mean "are you going to try to change your formulas just because they kick out an answer you don't agree with" the answer is yes. This is how the scientific method works, and this is how DVOA has been developed. Every step of the process was taken because I thought the numbers as they stood at the time looked strange. When the site launched in 2003, it was obvious we needed a special teams measure because Dante Hall was going nuts. Then it became obvious that special teams had to be adjusted for weather. Heck, the entire concept of comparing each play to similar plays in similar situations came because when I was first doing this, the equations made Mike Alstott and Zach Crockett look like the greatest things since sliced bread. After some thought, I realized those guys were in situations where achieving success was easy and therefore I had to compare them to other players in those situations. Eureka, and so forth.
The "official" scientific method has four steps:
1. Observation and description of a phenomenon or group of phenomena.
2. Formulation of an hypothesis to explain the phenomena. In physics, the hypothesis often takes the form of a causal mechanism or a mathematical relation.
3. Use of the hypothesis to predict the existence of other phenomena, or to predict quantitatively the results of new observations.
4. Performance of experimental tests of the predictions by several independent experimenters and properly performed experiments.
The problem created by the Indianapolis situation is that the FO Scientific Method has an additional step:
3.5. If said phenomenon takes place during the season, particularly during a period with extra writing responsibilities (i.e. midseason), create a stalling tactic that tries to explain why the phenomenon is questionable without changing previously established formulas. Delay experimentation until time permits, whether that be two weeks from now or February.
Right? In case it wasn't clear, I would have preferred to have spent a ton of time testing ways to change the strength of schedule adjustment, and then comparing year-to-year and first half-to-second half correlations to see which of those changes improved upon the current formula. And there are a ton of good ideas in that Week 11 DVOA thread, some of which I just have not had time to try, others of which I had not even thought of.
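For readers wondering what that kind of test looks like, here is a minimal sketch of a first half-to-second half correlation check. It is not the actual DVOA testing code. The function names, the two formula labels, and every rating in the example are made up for illustration, and it assumes each candidate formula produces one first-half and one second-half rating per team.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def compare_formulas(ratings_by_formula):
    """Score each formula by how well its first-half ratings track its
    second-half ratings, and return the best name plus all the scores.

    ratings_by_formula maps a formula name to a pair of lists:
    (first_half_ratings, second_half_ratings), one entry per team.
    """
    scores = {
        name: pearson(first_half, second_half)
        for name, (first_half, second_half) in ratings_by_formula.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical ratings for six teams under two hypothetical formulas.
ratings_by_formula = {
    "current": ([30.0, 12.0, -5.0, 8.0, -20.0, 1.0],
                [18.0, 16.0, 0.0, 2.0, -15.0, 5.0]),
    "candidate": ([24.0, 14.0, -3.0, 5.0, -18.0, 3.0],
                  [19.0, 15.0, -1.0, 3.0, -16.0, 4.0]),
}

best, scores = compare_formulas(ratings_by_formula)
print(best, scores)
```

A change would only be worth keeping if its correlation beat the current formula's across many seasons, not just one. The same function works for a year-to-year check by passing consecutive seasons instead of half-seasons.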
I've always said DVOA is a work in progress, not a perfect system.
By the way, this is not the first time this year that I have looked at the numbers and said, "This does not look right." I quickly mentioned it a couple weeks ago, and you may not have noticed, but I spent a few hours testing the coefficients that I use in weighted DVOA specifically to see if there was a way to make the Miami upset less important to Denver's rating while at the same time improving the overall correlation of weighted DVOA with future performance. In that case, I could not prove my hypothesis, and so the change made to the weighting system was minimal and didn't help Denver much at all.
Would you have done this if we were talking about a 5-4 team, not an undefeated team on top of every other power ranking out there?
Here's what drew my attention: the Colts were first for the last few weeks. Suddenly this week they dropped to seventh in DVOA (sixth in weighted DVOA) despite winning by two touchdowns. It seemed crazy, so I investigated, and it became clear it was tied to a schedule that was a historical outlier, far from anything I had ever measured previously.
I'm not afraid to list an undefeated team lower than first place. If this had dropped the Colts from first to second place, I would not have done the same thing. If the Colts had been third the entire season, I would have been defending the rating the entire season.
If this was a 5-4 team, I might not have changed the ranking. But you better believe I would write a long commentary about it. This was wacko. When I see wacko, I write long essays about it.
How good a game did Indianapolis need to play against Houston in order to stay number one?
On offense, the Colts have had the third and fourth best games against Houston. Seattle had the best game, Pittsburgh is second. I suppose Peyton Manning could have not thrown an interception.
On defense, the Colts needed to get turnovers. This was the first game all year where Houston did not fumble or throw an interception (although there have been games where Houston recovered all its own fumbles). The VOA (unadjusted) was 2.4%, which is the third-highest for Houston's offense all year. Of course, the first Indy-Houston game was one of Houston's two worst offensive games of the year. The other, and the game which is really skewing Houston's rating, was that first week stomping by Buffalo where the Texans gained 120 yards and turned the ball over five times.
It is possible that the gradual increase in the strength of the opponent adjustments is an issue here, but I can't figure out why that would suddenly hit the Colts this week when they had basically the same DVOA for three weeks before this.
What if you adjusted the Colts' DVOA, replacing San Francisco and Houston with a team that wasn't an outlier, like the team ranked number 30?
That's an interesting one. It's not really a one-for-one replacement, because the opponent adjustments are split between passing and rushing as well as offense and defense.
Since the 49ers are 32nd on (unadjusted) offense, I replaced those adjustments with the New York Jets, who are ranked 31st. I left their defense as is.
The Texans are 32nd on (unadjusted) defense, but I couldn't replace them with the team ranked 31st (Patriots) because the Patriots are actually average against the run. So I replaced the Houston pass defense with the New England pass defense, and then went to replace the Houston run defense with the run defense ranked 31st, except that's Buffalo and I wanted a team that like Houston was poor in both areas of defense, so I replaced the Houston run defense with the St. Louis run defense (30th).
Then I re-ran the entire league.
The result: 1) Jacksonville; 2) Cincinnati; 3) Indianapolis; 4) Denver. Making these changes also moves the Jags and Bengals because those teams also have each played Houston once. But the Colts move up far, far more than anyone else in the league.
I think this is a more accurate portrayal of where Indy should be than either first or seventh. But since this isn't a system that has been tested sufficiently, I would never consider changing the actual numbers to match this "blunt the outliers" idea.
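The "blunt the outliers" idea can be sketched roughly as follows. This is only an illustration of the general technique (capping extreme opponents at the level of a less extreme one), not the DVOA opponent adjustment itself, which is not spelled out here. The function name, the team labels, and all the ratings are hypothetical, and the sketch assumes a single rating where higher is better. In the manual experiment above, the swap was done separately for passing and rushing and for offense and defense, so a real version would run once per split.

```python
def blunt_outliers(opponent_ratings, floor_rank):
    """Cap the worst opponents at the rating of the team ranked floor_rank.

    opponent_ratings: dict of team -> unadjusted rating (higher is better).
    floor_rank: 1-based rank; every team ranked below it is treated as if
                it had the rating of the team at this rank.
    """
    ranked = sorted(opponent_ratings.values(), reverse=True)
    floor_value = ranked[floor_rank - 1]
    return {
        team: max(rating, floor_value)
        for team, rating in opponent_ratings.items()
    }


# Hypothetical five-team league with one extreme outlier at the bottom.
ratings = {"A": 20.0, "B": 5.0, "C": -10.0, "D": -15.0, "E": -45.0}

# Treat the last-place team as if it were only as bad as the 4th-ranked team.
blunted = blunt_outliers(ratings, floor_rank=4)
print(blunted)
```

Any team that played "E" would then be adjusted against a -15.0 opponent rather than a -45.0 one, which shrinks the schedule penalty without touching anyone else's rating.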
Are you going to be changing things in next year's book too?
No. First of all, this problem is temporary. There are six different teams that are playing Houston and San Francisco three times combined. For some reason, the Colts are the only team that had more than one of those games in the first ten weeks. The Colts' schedule will even out, and so will the ratings for Houston and San Francisco. Houston in particular has been better in recent weeks.
Second, we're going to be trying all kinds of new things to improve DVOA before we publish next year's book. So even if this anomaly affected all the teams at the end of the season, we'd figure out if it was messing with the quality of the ratings and have the best possible system in PFP 2006.
Haven't you opened the door to criticism from fans of every other team who want the same treatment?
and its sister question
All that garbage about how you wouldn't move Denver up because of your numbers, but now you'll move the Colts up? You're just another a$$hole Raiders fan. Your system is worthless crap and I hope FOX fires you.
I tried to explain this at the bottom of the commentary, so either I didn't do a good enough job or people didn't read the whole thing. I reacted differently to this situation because it was an anomaly, i.e. different from anything else. The low ratings for Washington, Denver, and the NFC South teams were similar to situations from last year, and the year before that. I did try to explain each of those issues in the commentary for the people who actually read my words.
I'm going to get e-mail from people telling me to move their favorite team no matter what I do. If we ever have something as ridiculous happen again, I might do it. What would that something ridiculous be? I don't know, that's why it would be something ridiculous that we had never even thought possible. But for the usual stuff like winning close games, fluke upsets, and games where backup players sucked? No.
Your system said Team X wasn't one of the ten best teams in the NFL and yet they keep winning. Stats can't predict the future so they're a stupid way to do power ratings.
Well, I'm not sure Peter King or Dr. Z can predict the future either. If the Browns suddenly reeled off seven straight wins, would anyone be able to say they predicted it? You also have to remember that people use stats all the time to make their arguments. I'm only saying that our stats are more accurate, not that they are perfect. (Every single power rankings article has this sentence right before the rankings start: "Remember, of course, that any statistical formula is not a replacement for your own judgment, just a tool to use in analyzing performance.")
Why do power ratings based on just numbers? Because that's my gig. That's what I do for a living. I can't do Dr. Z's thing better than he can so I do my thing. I'll be featured on next year's NFL Films compilation of guys saying "We just gotta do what we do."
Hmm, that's more about Indy than I wanted to write. I hope this covers most of the general questions, and there's a limit to how much I can go through and answer the specific ones. Let's talk about some other stuff instead.
Mitch Wojcik: The 2005 San Diego Chargers strike me as being very similar to the 2003 Kansas City Chiefs: solid-to-very good QB, top players at RB & TE, serviceable WRs, and a porous defense. Though the Chargers don't have the kick return game (a.k.a. Dante Hall) that the Chiefs did that year. And if I am not mistaken, Chiefs lost (at home) in 2003 in a 1st round game to the Indianapolis Colts in the playoffs. Just curious if the DVOA numbers support my observations?
Aaron Schatz: First, the Chargers are a more balanced mediocre defense compared to the Chiefs. The Chiefs were 28th against the run. Second, the defining attribute of the 2003 Chiefs is the way they folded in the second half of the season on defense. It's not just a schedule thing; their adjusted numbers plummeted also. That's why leading the league in DVOA shouldn't always be translated into being favored to win the Super Bowl. It was pretty clear by the end of 2003 that, although the Chiefs finished the year on top, they built up most of that value in the first half of the year.
Randy Smith: Be warned, it's another question about Denver. Specifically, I'm wondering about Denver's DBs. I'm having a hard time figuring out if they are good or not. They aren't giving up big plays for the most part but they, like the rest of the team, kind of disappear late in games. Can you shed some light on what's going on?
Aaron Schatz: I asked Randy if he wanted a scouting answer or a statistical answer. The statistical answer seems to be that they don't disappear late in games except against the Giants. Denver's pass defense DVOA is -15.3% in the fourth quarter, better than its overall pass defense DVOA. You also might find this guest column interesting. For a scouting answer, I turn you over to Mr. Tanier.
Mike Tanier: The Broncos secondary has been playing extremely well. Champ Bailey has been playing hurt, and he's clearly not having a great year despite four interceptions. But rookies Darrent Williams and Domonique Foxworth have played like veterans. Williams looks like a Rookie of the Year candidate; Broncos fans may worry that he's another Deltha O'Neal, but he looks like another Aaron Glenn. And while it seems silly to call John Lynch a surprise player, he has been outstanding in coverage at a time in his career when he should be too slow to do anything but play in the box.
Coordinator Larry Coyer deserves a lot of credit for fitting the rookies into his scheme and making sure that they are protected. And LBs Ian Gold and Al Wilson are outstanding in coverage, freeing safeties to drop into deep zones or double cover receivers instead of worrying about running backs.
Opponents throw the ball a lot against the Broncos; many of them (the Raiders, Eagles, Patriots) are either pass-first teams or teams that were trailing for much of the game. But opposing quarterbacks complete just 54% of their passes against Denver.
Bockman: How do Eli Manning's similarity scores stack up now that he's started for a full season? Also, for reference, could you include some other QB's similarity scores for their first 16 games? Such as Peyton, McNabb, Palmer, etc?
Aaron Schatz: I'll have to wait for another time to do a comparison of various quarterbacks, although that's a keen idea. But I promised I would do the Eli Manning similarity score thing again after he hit 16 games, so here we go. We can do this two different ways. First, here are similarity scores as if Manning had played all 16 of his games in the same season:
Oh, that's not good. Now let's try similarity scores that pro-rate this season to 16 games, and then look at two-year trends. I think this is a more accurate way to look at things. I've also removed any players who were far more similar to Eli in year one than year two, since we're trying to measure his maturation process. There aren't that many players similar in both years. Stats here are second year only, and you'll see one player stands head and shoulders above the rest:
*Stats pro-rated to 16 games.
That combination of interception rate and completion percentage is still extreme, even after the Vikings game. And now some more Giants goodness...
Conor Lyons: I'm a diehard Giants fan and have been very impressed with the special teams play this year, especially as compared to the past. (Trey Junkin, anyone?) Anyway, it was great to see them so far ahead of everyone else in the special teams rankings, only to blow it all with the awful performance last week. The question I had is, first of all, is that the worst special teams game ever, and secondly, what are the odds of the team with by far the best special teams in the league having such a bad game? Obviously, the second part is more of a rhetorical question and me venting more than anything else, but I would like to know how that performance ranks, at least this year.
Aaron Schatz: Well, the question about the odds would probably take a good deal of playing around with math; perhaps that's a good one for next year's book.
But I bet we could have a lot of fun with the worst special teams games ever (at least, since 1998) so I went looking through the data. Remember, we're talking about DVOA ratings here, so everything is adjusted based on weather and opponent, and we're not including extremely rare plays like blocked field goals or field goal returns for touchdowns.
Last week, the Giants scored a single-game DVOA for special teams of -36.6%. That translates to -12.7 points of field position compared to the average NFL team in the same situations. Yikes. But that's actually not the worst rating of the year. The worst rating of the year was -39.8% put up by the Arizona Cardinals in Week 1 against -- yes, you guessed right -- the New York Giants. The Giants returned both a kickoff and a punt for a touchdown, just like the Vikings last week. Their other kickoff returns were great as well -- including the touchdown, the Giants began their average drive after a kickoff at their 46-yard line -- plus they punted six times and the combined punt return total of the Cardinals was -5 yards.
This game is why the Giants still lead the league in special teams even after last week, although if you remove both this game and the Vikings game the Giants would still lead the league in special teams at 10.8%.
But these games are actually nowhere near the lowest special teams game we've ever tracked. Let's look at the five worst games ever:
5) 2000 Buffalo Bills at Tampa Bay, Week 13: -44.3%. The 2000 Bills may have had the worst special teams in NFL history, as I detailed when I first broke down the 2000 DVOA ratings. They also hold the sixth and seventh spots for Weeks 3 and 17 in that same season. The highlights of this game include a 73-yard punt return by Tampa's Karl Williams and Tampa kick returns of 24, 35, and 45 yards. Meanwhile, Buffalo's Chris Watson had a grand total of -3 punt return yards on seven Tampa punts, and for just a little extra negative value, Steve Christie honked a 42-yard field goal.
4) 2004 St. Louis Rams at Buffalo, Week 11: -46.2%. Nate Clements returned one punt for an 86-yard touchdown, while Jonathan Smith returned another for 53 yards. The Rams also fumbled away a kickoff return, while Buffalo's four kickoff returns ended, on average, at the 31-yard line.
3) 2004 Jacksonville Jaguars vs. Detroit, Week 10: -47.1%. This is the game where Eddie Drummond had two punt return touchdowns of 55 and 83 yards, as well as punt returns of 23 and 24 yards. He also had a 35-yard kickoff return on a short kick that landed on Detroit's 18-yard line. The other Lions forgot to show up that day, so Jacksonville actually pulled this game out in overtime.
2) 2002 Buffalo Bills vs. Jets, Week 1: -48.6%. You all probably remember this one, the game with the two Chad Morton kickoff return touchdowns, a 98-yarder and then a 96-yarder that won the game on the first play of overtime. Buffalo also returned New York's three punts for just 0, 2, and 10 yards. Everyone's kickoffs and punts get a little penalized because things are easier in September. Mike Hollis missed a 50-yard field goal but he hit a 52-yarder, which ends up being a net positive value.
But the worst special teams game of the past eight years was:
1) 2002 Cincinnati Bengals at Carolina, Week 14: -55.1%. If you are playing at home, that translates to -19.1 points worth of field position. Steve Smith had two punt return touchdowns (87 and 61 yards). Cincinnati punter Travis Dorsch also managed just 40 yards on a free kick after a safety, and if that's not embarrassing enough, his final punt of the game went a grand total of 10 yards from the Cincinnati 10-yard line to the Cincinnati 20-yard line. The Bengals could not return a single one of Todd Sauerbrun's punts, and for a little extra negative value, Neil Rackers missed a 52-yard field goal. (We think of those as long, but they're hit about half the time.)
See Giants fans, it could have been worse.
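A side note on the arithmetic: the two percentage-to-points conversions quoted above (-36.6% as -12.7 points, -55.1% as -19.1 points) work out to almost exactly the same ratio, which suggests a simple linear scale. The sketch below only checks that. The constant it derives is inferred from those two quoted figures and is an assumption, not a published conversion factor, and the second figure was given for a team playing at home, so the scale may not hold everywhere.

```python
# (special teams DVOA in percent, points of field position) as quoted above.
quoted = [(-36.6, -12.7), (-55.1, -19.1)]

# Points of field position per percentage point of DVOA, for each quote.
ratios = [points / pct for pct, points in quoted]

# Assumed scale: the average of the two implied ratios.
implied_ratio = sum(ratios) / len(ratios)


def implied_points(dvoa_pct, ratio=implied_ratio):
    """Estimate points of field position from a single-game special teams DVOA."""
    return dvoa_pct * ratio


for pct, points in quoted:
    print(pct, points, round(implied_points(pct), 1))
```

Both quoted figures are reproduced to one decimal place by the same ratio, roughly a third of a point per percentage point.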