by Aaron Schatz (and the FO staff)
Time for another look at the Football Outsiders mailbag. We get a lot of e-mail, and there are a lot of comments on the discussion threads, so I apologize if your question doesn't get answered. There simply are too many good questions that require well-thought-out answers. The best way to get your question answered at this point is to use the contact form [1]. If it is a question not related to the DVOA stats, it is more likely to be answered if you send it to one of the other writers, not me.
Be aware that we reference plenty of our innovative FO stats here, not to mention their unfamiliar terminology, so if you are a recent addition to the readership you might want to read this first [2]. All numbers below are through Week 16.
There were a number of questions about how DVOA works in the discussion thread for DVOA ratings two weeks ago [3], so I wanted to respond to them. Let's start with this post, where reader MME is responding to another reader who felt the DVOA system was subjective, not objective, because at some point we had to make subjective decisions about the values of certain plays in the formula.
MME [4]: Aaron already explained himself, but I thought I’d point out what I believe are misunderstandings on your part. (Aaron, of course, can correct me if I’m wrong.)
“Offense, defense, and special teams are all weighed equally in total team DVOA. A reasonable fan could argue that the three should be weighed 40/40/20, 45/45/10, or any other combination. Making them each a third of the total is a subjective assessment of how much each contributes to winning.”
Take a look at the numbers on the DVOA table [5]. If you notice, the DVOA numbers for off/def get much larger than the ones for special teams (which generally stay less than 10%). I believe Aaron has stated that this is because the formula for ST ratings takes into account that there are far fewer ST plays than Off/Def plays, and thus the weighting you speak of is in the numbers, thus it makes sense to add them up equally in the overall rating. (Also, I believe Aaron has stated that ST ratings are about 1/7 of total weight, with Off/Def getting 3/7 each.)
“It’s subjective to assess success rates based on individual plays rather than each series. For example, a set of downs starting from first-and-ten that results in gain of six, gain of two, no gain, punt earns a success rate of 66% (the punt is excluded, yes?) while a set that results in no gain, no gain, gain of 12 produces a success rate of 33%. Obviously in that case you’d rather have the “less successful” series. Again, I’m not making a case against DVOA here, just saying these are decisions someone has to make that aren’t entirely objective.”
Yes, the choice of what constitutes "success" is subjective, but is mostly based on fans’ understanding of football. However, I believe you are confusing Aaron’s Running Back Success Rate [6] stat with DVOA:
In the example you gave, the first series would be broken down so that the first play gets a little bit more than one "success point" because it got more than 45% of yards needed on first down (say, 1.5 points). The second play would get less than one, as it failed to make 60% of yards needed on second down (say, .75 points). The third down play of course gets nothing. Thus, the total "success" for the series is about 2.25.
The second series is different. The first two plays obviously get 0 success points. The third play actually gets a little over 3 success points (Aaron, somewhere, pointed out a first down is worth 3 success points regardless of what down it is gained on). So this series has 3 points worth of ’success’, which is more than the first series.
And just to make it clear, the RB Success Rate stat for both series (assuming all plays are running plays) would be equal, as only the first down play of the first series is considered "successful" and only the third down play of the second drive is considered "successful."
I hope that makes the math a little more clear. Also, I kind of posted this so Aaron could confirm if my understanding is correct. Knowing how DVOA works is what makes me believe in its legitimacy.
Aaron Schatz: That about explains it. MME is pretty much right on both of these issues.
The special teams numbers are based on actual points and yards, not a system of "success points," because they don't need to take first downs into account. That total of points needs to be multiplied by some coefficient and added to offense and defense to produce total DVOA. To figure out what this coefficient was, I simply ran a regression analysis. It's been re-run several times as DVOA gets tweaked and we have more years of data, so it has changed every so often, and it may change again. I'll fully admit that I do not have complete confidence in the multiplier for special teams. And yes, in the regression a team's total rating ends up being three parts offense, three parts defense, and one part special teams.
I get a lot of questions about the system that ask something along the lines of, "Doesn't your system penalize teams that throw incomplete passes on first and second down and then get a big gain on third down?" As MME points out, the answer is usually no, because the value of a first down conversion is always very high. It's not three points, but it is very close to three -- since 80 percent of needed yards is two points, and 11+ yards is three points, a play that gets 100 percent of needed yards but isn't 11+ yards is something like 2.8 points. The consistent team will usually have a little more value, but that makes sense. The value of three straight four-yard plays should be higher than the value of two incompletes followed by a 12-yard pass, because you can't build an offense around being terrible two-thirds of the time and hoping that the good plays you have every so often just happen to come when you desperately need them to keep a drive going. The best teams gain yards consistently and stop the other team from gaining yards consistently.
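MME's point about the two example series can be checked with a short sketch. The Python below is only an illustration, not the actual DVOA or Success Rate code: the 45 and 60 percent thresholds come from MME's post, the 100 percent threshold on third down is an assumption consistent with his example, and the function names are invented here.

```python
# Share of needed yards a run must gain to count as a "success".
# 45% and 60% are from MME's post; 100% on third down is an assumption.
THRESHOLDS = {1: 0.45, 2: 0.60, 3: 1.00}


def is_success(down, to_go, gained):
    """Return True if the play gained enough of the needed yards."""
    return gained >= THRESHOLDS[down] * to_go


def grade_series(gains, to_go=10):
    """Grade each play of a series that starts on first down."""
    results = []
    for down, gained in enumerate(gains, start=1):
        results.append(is_success(down, to_go, gained))
        to_go -= gained  # yards still needed for the first down
    return results


def success_rate(gains):
    """Fraction of plays in the series graded as successful."""
    results = grade_series(gains)
    return sum(results) / len(results)


series_a = [6, 2, 0]   # gain of six, gain of two, no gain
series_b = [0, 0, 12]  # no gain, no gain, gain of 12

print(grade_series(series_a))  # [True, False, False]
print(grade_series(series_b))  # [False, False, True]
print(success_rate(series_a) == success_rate(series_b))  # True
```

Under these assumptions both series come out at one success in three plays, which matches MME's reading. The "success points" used in DVOA are a separate scale, and that scale is where the second series comes out ahead.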
Nuk: I noticed that among the worst games of the season, there were ratings below -100%. Are these percentages at all? Percentage of what? How can the percentage be below -100%?
Aaron Schatz: First of all, no, those are not percentages. That's because the rating for a team for the entire season or an individual game is actually three ratings combined: OFFENSE - DEFENSE + SPECIAL TEAMS. So a team with a bad offense (very negative) and bad defense (very positive) will end up with a rating below -100% for the game.
However, I should point out that you can have a rating below -100% for just OFFENSE or DEFENSE, though it is very rare. That's because some plays, like losses of yardage or turnovers, have negative values. What DVOA does is add up the value of every play that a team has in the game and, separately, add up the NFL average value for each situation that the team faced in the game. Then you get this formula:
(TEAM VALUE - NFL AVERAGE VALUE) / NFL AVERAGE VALUE
If the team's value for all the plays is LESS than the average NFL value, you end up with a negative number. And if a team has had a game SO BAD that the team value for the game is actually BELOW ZERO, you end up with a number below -100%. If a team has a game so strong that they end up with more than twice the average NFL value for the situations in the game, you end up with a number above 100%. (The opponent adjustments also play into this, allowing games to move up and down a bit.)
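For readers who like to see the arithmetic, here is a minimal sketch of the ratio described above and of the OFFENSE - DEFENSE + SPECIAL TEAMS combination. It ignores opponent adjustments entirely, the function names are hypothetical, and every input number is invented purely to show how a rating can fall below -100% or rise above 100%.

```python
def rating_pct(team_value, average_value):
    """Percent above or below the NFL average value for the same situations."""
    return 100.0 * (team_value - average_value) / average_value


def total_rating(offense, defense, special_teams):
    """Combine unit ratings; defense is subtracted because positive defense is bad."""
    return offense - defense + special_teams


# Invented values: the league average for these situations is 100.
print(rating_pct(40, 100))   # -60.0  (less value than average)
print(rating_pct(-10, 100))  # -110.0 (team value below zero)
print(rating_pct(210, 100))  # 110.0  (more than twice the average)

# A bad offense and a bad defense together drop the game total below -100%.
print(total_rating(offense=-60.0, defense=45.0, special_teams=-2.0))  # -107.0
```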
Only three offensive performances this year have been so bad that they register below -100% before considering defense and special teams: Houston's loss to Buffalo in Week 1, San Francisco's loss to Indianapolis in Week 5, and Philadelphia in their ridiculous 17-16 win over St. Louis in Week 15, the game that offense forgot (for both teams). One game has offense above 100%, San Diego in their Week 3 win over the Giants.
On the defensive side there are no games above 100% and only one game below -100%, the Giants when they obliterated the Redskins in Week 8.
Chucky [7]: Is there an FO-approved adjustment from DVOA differential to actual NFL points? VarlosZ (comment #44 in the linked thread) seems to think so, but I don’t remember seeing it.
Aaron Schatz: There is and there isn't. I've worked a couple times on creating a multiplier to turn my system into actual NFL points, but these attempts have never been about DVOA. That's because DVOA is a percentage that represents value per play, and to measure points, you need to take into consideration how many plays a team is going to run in an average game. That's why I always tell people who ask about how much DVOA equals an actual NFL point, "it's more complicated than that." This became an issue in the off-season when I came up with a method that would turn DVOA into "adjusted points per game." We put this in the book, thinking we were going to switch to this new format, but then I became unsure of my math, and when we brought it up, a majority of the readers didn't seem to want to switch from "percentage" DVOA to "adjusted points per game" DVOA. I guess I'll have to re-calculate and make a final decision on this in the off-season. In the meantime, a very, very, very loose formula says that every 1% of DVOA is worth .264 points. But this isn't quite definite and I am not responsible for anyone who uses this formula to bet a point spread and then loses their money.
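As a rough illustration of that loose formula, the sketch below multiplies a DVOA figure (in percentage points) by .264. The function name is made up, applying it to a differential between two teams is an assumption based on Chucky's question, and the example inputs are arbitrary; treat the output as loosely as the multiplier itself.

```python
# Loose multiplier quoted above: points per 1% of DVOA.
POINTS_PER_DVOA_PCT = 0.264


def dvoa_to_points(dvoa_pct):
    """Very rough conversion from DVOA percentage points to NFL points."""
    return dvoa_pct * POINTS_PER_DVOA_PCT


# Arbitrary example inputs, not real team ratings.
for diff in (10.0, 25.0, 50.0):
    print(f"{diff:.0f}% of DVOA is roughly {dvoa_to_points(diff):.1f} points")
```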
David Brude [8]: I noticed that only about 10% points separates Manning from Palmer in DVOA yet the total team pass offenses are about 20% apart. What causes the difference in the team DVOA since those guys have basically played almost all of the snaps?
Aaron Schatz: First of all, let's look at the numbers after Week 16:
| | IND | CIN |
| Manning/Palmer passing DVOA | 43.1% | 35.1% |
| Team passing DVOA | 56.2% | 33.0% |
| Team passing "1st order" DVOA | 51.4% | 32.5% |
As many readers know, back in Week 12 I changed the opponent adjustment system for team DVOA so that the opponent adjustment is based on not just your opponent but their other opponents. But I won't have a chance to do this for individual stats until the off-season. As you can see, that's part of the reason why the gap between Manning and Palmer is smaller than the gap between Indianapolis and Cincinnati. The other reason is fumbles. Cincinnati receivers have fumbled nine times. Indianapolis receivers have fumbled just once. For those of you paying attention, Indianapolis receivers fumbled more times in last year's playoff loss to New England than they have in the entire 2005 regular season. Receiver fumbles show up in team passing numbers, but individual quarterbacks do not get penalized for them.
Bencoder [9]: Do you use a macro to generate your final tables or do you have to plug the numbers in by hand?
Aaron Schatz: Once upon a time, I did everything by hand. That changed in the middle of last season. I figure once a year I should publicly thank John Argentiero, the man who has put in an absurd amount of time building complicated Excel macros that build all the stat tables and spit them out as HTML. For all this work, he gets the colossal payment of one free copy of the book. He's the man. (Just so folks know, the code to spit out HTML for adjusted line yards wasn't built until last week, which is why those pages were updated less frequently than others.)
Craig Birkemeier: In the last mailbag [10], you mentioned the record for teams taking the wind in overtime. Out of curiosity, what is the record for teams taking the ball in overtime? I'd assume it is above .500, but an actual number would be helpful in seeing just how dumb it is to take the wind.
Michael David Smith: In the history of NFL overtime, the team that wins the toss wins the game 52.5% of the time, the team that loses the toss wins the game 43.4% of the time, and the game ends tied the rest of the time.
Both teams have at least one possession 69.2% of the time.
The team that receives the opening kickoff drives down the field and scores on the first possession 27.7% of the time.
The team that has to kick off wins the game without ever being on offense 2.1% of the time (that has happened eight times -- five times on interception returns for touchdowns, and once each on a fumble recovery, a blocked punt and a blocked field goal).
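Two numbers that are not stated outright can be backed out of these percentages: how often overtime ends in a tie, and roughly how many overtime games the sample covers. The sketch below is back-of-the-envelope arithmetic only. The game count is an estimate implied by "eight times" being 2.1%, and since that percentage is rounded, the result is given as a range.

```python
toss_winner_pct = 52.5   # toss winner wins the game
toss_loser_pct = 43.4    # toss loser wins the game
kicking_team_wins = 8    # wins without ever being on offense
kicking_team_pct = 2.1   # the same eight games, as a rounded percentage

# Whatever is left over after both win percentages is the tie rate.
tie_pct = round(100.0 - toss_winner_pct - toss_loser_pct, 1)


def implied_games(count, pct):
    """Estimate the sample size if `count` games make up `pct` percent of it."""
    return count / (pct / 100.0)


estimate = implied_games(kicking_team_wins, kicking_team_pct)
low = implied_games(kicking_team_wins, kicking_team_pct + 0.05)
high = implied_games(kicking_team_wins, kicking_team_pct - 0.05)

print(f"Ties: {tie_pct}% of overtime games")
print(f"Implied overtime games: about {estimate:.0f} ({low:.0f} to {high:.0f})")
```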
If you're interested in a better way of doing NFL overtime, here's my suggestion [11], which happens to be the first article I ever wrote for FO.
Chris Brose: I have a question about Week 14's power rankings. Actually, it's not about the rankings as much as it is something you did with them. In the commentary, you included a list of the 10 worst games and the 10 best games. Among the 10 best games, you included Seattle over San Francisco 41-3. However, Seattle beat Philly 42-0, and the Eagles are a better team than the 49ers. Why is the Seattle-San Francisco game on the 10 best list, while the Seattle-Philly game didn't even rate a mention?
Aaron Schatz: I actually got this question from a few different people. The DVOA metrics give a large penalty for fumbles and interceptions but they don't consider the length of fumble or interception returns. The length of a turnover return is in large part random, based not on the defender's ability to pick off a pass but instead various other factors: where the offensive linemen happened to be situated when the pass was thrown, how well the defensive players can block, how well the offensive players can suddenly turn into defensive tacklers, and so forth. I don't think, for example, that the ability to tackle a cornerback after a pick is really demonstrative of the quality of an offense.
Anyway, 28 of the 42 points Seattle scored in that game were the direct result of turnovers. I'm not even talking about turnovers that give the offense a short field -- even if the offense gets the ball on the opponent's 25-yard line, it still takes some skill to put the ball in the end zone. But Seattle had the 72-yard Dyson interception return for a touchdown, the 38-yard Tatupu interception return for a touchdown, the 32-yard Boulware interception return that was practically a touchdown (teams almost never fail to score a touchdown from first-and-goal on the two-yard line), and the 25-yard Dyson fumble return for a touchdown.
So when you consider just the turnovers, not the touchdown returns, Seattle ends up with a DVOA for the game of 67.9%, while Philadelphia's DVOA is -52.0%.
Someone else asked in one of the threads (I can't remember which one) where the Cleveland-Pittsburgh game would fall on the list of the year's best and worst games. Based on current opponent adjustments, Cleveland had the 11th-worst game of the year that day, while Pittsburgh had the sixth-best game of the year at 118.5% DVOA.
Steve [12]: The explanation of DVOA on this site clearly identifies that each play has a different value based on its situation (e.g. gaining eight yards on third-and-10 is not as valuable as gaining two yards on third-and-1). DVOA tries to standardize this situational valuation - I think that is good. However, is the valuation comprehensive enough? And should it (could it) vary from team (or head coach) to team? Should it vary from game to game (interconference worth less than intradivision)? Should it vary based on a mathematical probability of whether winning/losing the game will affect the likelihood of making the playoffs, gaining HFA, getting a bye? For example one head coach may be more inclined to hide his team’s true strengths than another coach. And in doing so, the true power of the team may need a different calculation than other teams. I believe this to be the case, but have no idea how one would quantify such an attribute.
Aaron Schatz: The answer to this question is that my head hurts. Listen, there are a lot of places in the DVOA system where the edges get smoothed in an attempt to simplify things a little bit. You can't build a system where the baseline for each play is based on a sample size of two or three plays. And I'm not trying to measure the impact of a play on winning, per se. I'm trying to balance this with an attempt to also give DVOA some predictive value by softening the impact of very rare, unrepeatable plays. (This is why, for example, interceptions returned for touchdowns count the same as any other interceptions.)
By the way, this is the answer to another popular question, "Why don't you just measure every play by the amount that it changes the probability of each team winning the game?" I don't know if a player's performance in close games is really more predictive than his performance in blowouts. Remember, normal DVOA is slightly more predictive than estimated wins, which takes these things into account. A system like this would give Kyle Orton more value because his defense could keep games close in the fourth quarter, and it would make the LaDainian Tomlinson of 2002-2003 look like a terrible player because he gained all those yards in totally meaningless situations where the Chargers were behind by three touchdowns. What, Larry Fitzgerald sucks because the Cardinals have no defense so he's never in close games? (By the way, check out what a good year [13] Larry Fitzgerald is having.)
As for the question of changing the valuation of plays based on the personal strategies of each head coach, what am I, Professor X? I don't have his kind of wheelchair budget.
Charles Jake: We all know that 1000-yard seasons don't mean much anymore thanks to longer seasons and inflated offensive numbers. Would it be possible to come up with a number of yards for today that would be as significant as 1000 used to be?
Benjy Rose: I believe the correct answer is: total yards don’t matter and aren’t a very good measuring stick for a running back. Total yards are more of an endorsement (or indictment) of a rushing offense, consisting primarily of an offensive scheme/plan (e.g. Denver, Atlanta, last year’s Jets), a quality offensive line for the running game (same), and a running back(s) that fit the system.
Mike Tanier: The hard-headed answer is to say that the 1000-yard standard goes back to the 12-game season, so a quick 1000 times 16 divided by 12 yields 1333 yards.
Of course, there are other factors, the most significant being that two- and three-back systems were much more common up until the early 80s. Offensive levels used to be lower, but the percent of rushing plays was much higher. And of course Benjy is right about total yards as a measure of a running back's performance.
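The schedule-length scaling in that answer is simple enough to write down. This sketch only reproduces the 1000 times 16 divided by 12 arithmetic, with a hypothetical helper name, and it deliberately ignores the other factors mentioned (committee backfields, run/pass ratios, overall offensive levels).

```python
def scale_to_season(yards, old_games, new_games):
    """Scale a season yardage benchmark to a different schedule length."""
    return yards * new_games / old_games


benchmark = scale_to_season(1000, old_games=12, new_games=16)
per_game = 1000 / 12

print(f"1000 yards in 12 games is {per_game:.1f} yards per game")
print(f"Over 16 games that pace comes to about {benchmark:.0f} yards")
```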
David Hess: Totally off topic, but you know what would be cool? A version of DVOA that only includes plays against teams that are in the top half (or third or whatever) of the normal DVOA rankings. Kind of like when they show college basketball teams' records against teams in the top 50 of the RPI. I know STOMPS (big wins over bad teams) are supposed to be a better predictor of playoff success than GUTS (close wins over good teams). But I would think that teams that performed well (regardless of win/loss) against good teams would tend to do well in the playoffs. I could be way off, of course.
Aaron Schatz: OK, let's try it. This is actually a very good year to do it because there is such a clear gap between the good or mediocre teams and a bunch of very bad teams. The gap isn't really between 16 and 17, though, it is between 21 (Minnesota) and 22 (Cleveland). The DVOA gap there is the biggest one between any two teams except for the gap between San Francisco and everyone else. Above the gap, every team has a winning record except for a team that was good early (Philadelphia), a team that was good late (Baltimore), and a team with a huge schedule adjustment (Oakland). Below the gap, every team is 5-11 or worse.
So, I took out games against the 11 worst teams and here's what we get. I hope nobody minds that to save time, I only re-did offense and defense; the special teams rating here is the normal one. The three teams with losing records aren't listed, but games against them are included in the ratings. Black is weighted DVOA, blue is full-season DVOA. Ready to be surprised?
| TEAM | WEIGHTED DVOA | RK | TOTAL DVOA | RK | TEAM | WEIGHTED DVOA | RK | TOTAL DVOA | RK |
| JAC | 53.4% | 1 | 43.7% | 2 | SD | 14.0% | 10 | 23.7% | 7 |
| DEN | 39.6% | 2 | 30.7% | 5 | PIT | 9.8% | 11 | 9.8% | 12 |
| SEA | 37.8% | 3 | 32.5% | 4 | NE | 8.1% | 12 | 8.1% | 15 |
| IND | 37.6% | 4 | 39.2% | 3 | TB | 2.5% | 13 | 8.4% | 14 |
| CIN | 36.0% | 5 | 45.3% | 1 | DAL | 2.1% | 14 | 9.8% | 13 |
| KC | 35.1% | 6 | 26.0% | 6 | MIA | -1.3% | 15 | 11.3% | 11 |
| WAS | 23.3% | 7 | 17.1% | 9 | CAR | -2.6% | 16 | 3.0% | 17 |
| CHI | 22.5% | 8 | 12.5% | 10 | ATL | -4.6% | 17 | 3.1% | 16 |
| NYG | 20.5% | 9 | 22.1% | 8 | MIN | -26.2% | 18 | -42.6% | 18 |
The Jaguars demonstrate why this type of exercise can often be a bit silly. Can you guess how many games Jacksonville has played this year against teams in the top 21 of DVOA? Seven. That's all. Only two of the Jaguars' last nine games count on the chart above: a 30-3 whipping of Baltimore and a 26-18 loss to Indianapolis. Otherwise, they've played the Rams, Cardinals, Browns, 49ers, Titans, and the Texans twice.
The Jaguars are one team that looks better when you only consider games against good opponents. The Bears are the other, moving from 13th to 8th in weighted DVOA. You can also see here how Minnesota's midseason surge was just a bunch of wins over bad teams.
Anthony Coleman: In your opinion, where does Marvin Harrison rank among history's greatest receivers?
Aaron Schatz: I wonder if Harrison will suffer from being connected to Peyton Manning, since there will be a bit of an assumption that Manning made Harrison. You don't have that with Jerry Rice, because he played so absurdly long and with three well-known quarterbacks. But by the end of his career, Harrison will be second or third in all-time receiving yards [14] (he's tenth, 1700 yards from third) and second in touchdowns [15] (he's already third). I'm guessing that a majority of people will consider him the third-greatest receiver of all time, behind Rice and Don Hutson, but there will be a sizeable minority who disagree, favoring Cris Carter or Tim Brown or Lance Alworth, or maybe even Rod Smith. (I'm deliberately not doing a huge research project here, because I think this should spawn some fun discussion.)
Links:
[1] http://www.footballoutsiders.com/formmailer/contact.php
[2] http://www.footballoutsiders.com/info/methods
[3] http://www.footballoutsiders.com/2005/12/20/ramblings/dvoa-rankings/3393
[4] http://www.footballoutsiders.com/2005/12/20/ramblings/dvoa-rankings/3393/#comment-106889
[5] http://www.footballoutsiders.com/stats/teameff.php
[6] http://www.footballoutsiders.com/info/glossary#rb_success
[7] http://www.footballoutsiders.com/2005/12/20/ramblings/dvoa-rankings/3393/#comment-107205
[8] http://www.footballoutsiders.com/2005/12/20/ramblings/dvoa-rankings/3393/#comment-106655
[9] http://www.footballoutsiders.com/2005/12/20/ramblings/dvoa-rankings/3393/#comment-106644
[10] http://www.footballoutsiders.com/2005/12/10/ramblings/fo-mailbag/3349/
[11] http://www.footballoutsiders.com/ramblings.php?p=87&cat=1
[12] http://www.footballoutsiders.com/2005/12/20/ramblings/dvoa-rankings/3393/#comment-106921
[13] http://www.footballoutsiders.com/stats/wr.php
[14] http://www.pro-football-reference.com/misc/rcy.htm
[15] http://www.pro-football-reference.com/misc/rct.htm