31 Aug 2010
by Aaron Schatz
Over the next couple weeks, we're going to run a series of articles we're calling FO Basics. We get a lot of questions about our work, but there are also a lot of readers who don't ask questions. We hope this series will help answer some questions and clarify some confusing things for even those readers who don't respond on the message boards.
While reading Football Outsiders or the Football Outsiders Almanac, new readers will often come across an offhand comment about, for example, the idea that fumble recovery is not a skill, and wonder what in the heck we are talking about. In Football Outsiders Almanac 2010, we include an essay called "Pregame Show" which gives a basic look at some of the most important precepts that have emerged during seven years of Football Outsiders research. Today we are republishing that essay here as part of our preseason "FO Basics" series. You will also find links to the original research online when possible, or mentions of where that research appeared in print. (Some research was developed over time and therefore doesn't have specific articles to be linked.)
This essay is always featured on the website if you scroll to "About" in the green bar above and then scroll down and click on "FO Basics."
The first article ever written for Football Outsiders was devoted to debunking the myth of "establishing the run." There is no correlation whatsoever between giving your running backs a lot of carries early in the game and winning the game. Just running the ball is not going to help a team score; it has to run successfully.
There are two reasons why nearly every beat writer and television analyst still repeats the tired old-school mantra that "establishing the run" is the secret to winning football games. The first problem is confusing cause and effect. There are exceptions, usually when the opponent is strong in every area except run defense, like last year's New Orleans Saints. However, in general, winning teams have a lot of carries because their running backs are running out the clock at the end of wins, not because they are running wild early in games.
The second problem is history. Most of the current crop of NFL analysts came of age or actually played the game during the 1970s. They believe that the run-heavy game of that decade is how football is meant to be, and today's pass-first game is an aberration. As we addressed in an essay in Pro Football Prospectus 2007 about the history of NFL stats, it was actually the game of the 1970s that was the aberration. The seventies were far more slanted towards the run than any era since the arrival of Paul Brown, Otto Graham, and the Cleveland Browns in 1946. Optimal strategies from 1974 are not optimal strategies for today's game.
A sister statement to "you have to establish the run" is "team X is 5-1 when running back John Doe runs for at least 100 yards." Unless John Doe is ripping off six-yard gains Chris Johnson-style, the team isn't winning because of his 100-yard games. He's putting up 100-yard games because his team is winning.
This is a corollary to the absurdity of "establish the run." If you don't believe us, meet our good friends the 2006-2007 Minnesota Vikings. With rare exceptions, teams win or lose with the passing game more than the running game -- and by stopping the passing game more than the running game. The reason why teams need a strong run defense in the playoffs is not to shut the run down early, it's to keep the other team from icing the clock if they get a lead. You can't mount a comeback if you can't stop the run.
Note that "good pass defense" may mean "good pass rush" rather than "good defensive backs."
On average, passing will always gain more yardage than running, with one very important exception: when a team is just one or two yards away from a new set of downs or the goal line. On third-and-1, a run will convert for a new set of downs 36 percent more often than a pass. Expand that to all third or fourth downs with 1-2 yards to go, and the run is successful 40 percent more often. With these percentages, the possibility of a long gain with a pass is not worth the tradeoff of an incomplete that kills a drive.
This is one reason why teams have to be able to both run and pass. The offense also has to keep some semblance of balance so they can use their play-action fakes, and so the defense doesn't just run their nickel and dime packages all game. Balance also means that teams do need to pass occasionally in short-yardage situations; they just need to do it less than they do now. Teams pass roughly 60 percent of the time on third-and-2 even though runs in that situation convert 20 percent more often than passes. They pass 68 percent of the time on fourth-and-2 even though runs in that situation convert twice as often as passes.
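To see how much the current play-calling mix costs, here is a minimal sketch of the arithmetic on third-and-2. The 60 percent pass share and the 20 percent run advantage come from the numbers above; the 50 percent pass conversion rate is purely an assumed baseline for illustration, and the function name is our own. The sketch also assumes conversion rates stay fixed as the mix changes, which ignores the way defenses would adjust to a run-heavier offense.

```python
def blended_conversion_rate(pass_share, pass_rate, run_advantage):
    """Overall conversion rate for a given pass/run play-calling mix.

    pass_share    -- fraction of plays that are passes (0.0 to 1.0)
    pass_rate     -- conversion rate on passes (ASSUMED, not a measured figure)
    run_advantage -- how much more often runs convert, e.g. 0.20 for 20 percent
    """
    run_rate = pass_rate * (1 + run_advantage)
    return pass_share * pass_rate + (1 - pass_share) * run_rate


ASSUMED_PASS_RATE = 0.50  # hypothetical baseline, chosen only for illustration

# Third-and-2: teams currently pass about 60 percent of the time.
current = blended_conversion_rate(0.60, ASSUMED_PASS_RATE, 0.20)
# The same situation if the run/pass split were flipped.
run_heavier = blended_conversion_rate(0.40, ASSUMED_PASS_RATE, 0.20)

print(f"Passing 60 percent of the time: {current:.1%}")
print(f"Passing 40 percent of the time: {run_heavier:.1%}")
```

Under that assumed baseline, flipping the split is worth about two percentage points of conversion rate. The exact size of the gain depends entirely on the baseline, but its direction does not.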
When you open your newspaper on Sunday morning, you'll see that the little agate-type previews of each game list team rankings by total yardage. That is still how the NFL "officially" ranks teams, but these rankings rarely match up with common sense. That is because total team yardage may be the most context-dependent number in football.
It starts with the basic concept that rate stats are generally more valuable than cumulative stats. Yards per carry says more about a running back's quality than total yardage, and completion percentage says more than just a quarterback's total number of completions. The same thing is true for teams; in fact, it is even more important because of the way football strategy influences the number of runs and passes in the game plan. Poor teams will give up fewer passing yards and more rushing yards because opponents will stop passing once they have a late-game lead and will run out the clock instead. For winning teams, the opposite is true. Did Detroit really have a better passing game than San Diego or New England in 2006, or did the Lions have more passing yards because they went 3-13 and thus threw the ball more than any team except for Green Bay, while the Chargers and Patriots were a combined 26-6 and spent a lot of time killing the clock with the running game?
Total yardage rankings are also skewed because some teams play at a faster pace than other teams. In 2009, Houston had 320 more total yards than Indianapolis, but that's not because Houston had the better offense. The Colts ran only 164 offensive drives that year, compared to 180 offensive drives for the Texans.
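The pace effect is easy to check with yards per drive. The sketch below uses the drive counts and the 320-yard gap quoted above; the Colts' season total is a hypothetical placeholder, since the actual totals are not listed here, and the function name is our own.

```python
def yards_per_drive(total_yards, drives):
    """Rate stat: average yards gained per offensive drive."""
    return total_yards / drives


# From the text: 2009 drive counts and the gap in total yardage.
COLTS_DRIVES = 164
TEXANS_DRIVES = 180
YARDAGE_GAP = 320  # Houston's total minus Indianapolis's total

# HYPOTHETICAL season total, used only to make the arithmetic concrete.
ASSUMED_COLTS_YARDS = 5800
assumed_texans_yards = ASSUMED_COLTS_YARDS + YARDAGE_GAP

colts_ypd = yards_per_drive(ASSUMED_COLTS_YARDS, COLTS_DRIVES)
texans_ypd = yards_per_drive(assumed_texans_yards, TEXANS_DRIVES)

print(f"Colts:  {colts_ypd:.1f} yards per drive")
print(f"Texans: {texans_ypd:.1f} yards per drive")
```

The placeholder does not drive the conclusion. With these drive counts and this gap, the Colts come out ahead per drive for any Colts season total above 3,280 yards, which covers any realistic NFL offense.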
This sounds absurdly basic, but when people consider team and player stats without looking at strength of schedule, they are ignoring this. In 2004, Carson Palmer and Byron Leftwich had very similar numbers, but Palmer faced a much tougher schedule than Leftwich did. Palmer was better that year, and better in the long run.
In 2007, Oakland running back Justin Fargas had four games with at least 115 rushing yards. Those games came against the teams ranked 31st (Miami), 30th (New York Jets), 29th (Houston), and 26th (Denver) in defensive DVOA against the run. On the other hand, he gained only 41 yards on 15 carries against Green Bay, ranked eighth, and only 58 yards on 22 carries against Minnesota, ranked first.
Because players and teams don't give the exact same performance every week, this is more of a general law, and it doesn't necessarily apply in the short term. Sometimes the short term lasts a whole year -- for example, if you are the 2006 Jacksonville Jaguars.
Our brethren at Baseball Prospectus believe that the most precious commodity in baseball is outs. Teams only get 27 of them per game, and you can't afford to give one up for very little return. So imagine if there was a new rule in baseball that gave a team a way to earn another three outs in the middle of the inning. That would be pretty useful, right?
That's the way football works. You may start a drive 80 yards away from scoring, but as long as you can earn 10 yards in four chances, you get another four chances. Long gains have plenty of value, but if those long gains are mixed with a lot of short gains, you are going to put the quarterback in a lot of difficult third-and-long situations. That means more punts and more giving the ball back to the other team rather than moving the chains and giving the offense four more plays to work with.
The running back who gains consistent yardage is also going to do a lot more for you late in the game, when the goal of running the ball is not just to gain yardage but to eat clock time. If you are a Chicago Bears fan watching your team with a late lead, you don't want to see three straight Matt Forte stuffs at the line followed by a punt. You want to see a game-icing first down.
A common historical misconception is that our preference for consistent running backs means that "Football Outsiders believes that Barry Sanders was overrated." Sanders wasn't just any boom-and-bust running back, though; he was the greatest boom-and-bust runner of all time, with bigger booms and fewer busts. Our play-by-play database only goes back to 1993, but Sanders led the league in rushing DYAR for 1996 and was second behind Terrell Davis in 1997.
Some readers complain that this idea contradicts the previous one. Aren't those consistent running backs just the product of good offensive lines? The truth is somewhere in between. There are certainly good running backs who suffer because their offensive lines cannot create consistent holes (Frank Gore in 2009, for example). Most boom-and-bust running backs, however, contribute to their own problems by hesitating behind the line whenever the hole is unclear, looking for the home run instead of charging forward for the four-yard gain that keeps the offense moving.
As for pass protection, some quarterbacks have better instincts for the rush than others, and are thus better at getting out of trouble by moving around in the pocket or throwing the ball away. Others will hesitate, hold onto the ball too long, and lose yardage over and over.
Note that "moving around in the pocket" does not necessarily mean "scrambling." In fact, a scrambling quarterback will often take more sacks than a pocket quarterback, because while he's running around trying to make something happen, a defensive lineman will catch up with him.
Over the past three seasons, offenses have averaged 5.9 yards per play from Shotgun, but just 5.1 yards per play with the quarterback under center. This wide split exists even if you analyze the data to try to weed out biases like teams using Shotgun more often on third-and-long, or against prevent defenses in the fourth quarter. Shotgun offense is more efficient if you only look at the first half, on every down, and even if you only look at running back carries rather than passes and scrambles.
Clearly, NFL teams have figured the importance of the Shotgun out for themselves. Over the past four seasons, the average team has gone from using Shotgun 19 percent of the time to 36 percent of the time, not even counting the Wildcat and other college-style option plays that have become popular in recent years. Before 2007, no team had ever used Shotgun on more than half its offensive plays. In the past two seasons, five different teams have used Shotgun over half the time. It is likely that if teams continue to increase their usage of the Shotgun, defenses will adapt and the benefit of the formation will become less pronounced.
Terrell Davis, Jamal Anderson, and Edgerrin James all blew out their knees. Larry Johnson broke his foot. Earl Campbell and Eddie George went from legendary powerhouses to plodding, replacement-level players. Shaun Alexander broke his foot and became a plodding, replacement-level player. This is what happens when a running back is overworked to the point of having at least 370 carries during the regular season.
The "Curse of 370" was expanded in Pro Football Prospectus 2006 to include seasons with 390 or more carries in the regular season and postseason combined. Research also shows that receptions don't cause a problem, only workload on the ground.
Plenty of running backs get injured without hitting 370 carries in a season, but there is a clear difference. On average, running backs with 300 to 369 carries and no postseason appearance will see their total rushing yardage decline by 15 percent the following year and their yards per carry decline by two percent. The average running back with 370 or more regular-season carries, or 390 including the postseason, will see their rushing yardage decline by 35 percent, and their yards per carry decline by eight percent. However, the Curse of 370 is not a hard and fast line where running backs suddenly become injury risks. It is more of a concept where 370 carries is roughly the point at which additional carries start to become more and more of a problem.
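As a rough illustration, the averages above can be written as a small lookup. This is a sketch, not our projection system: the function name is ours, treating zero postseason carries as "no postseason appearance" is a simplifying assumption, and the hard cutoffs in the code are exactly the kind of bright line the paragraph above warns against reading too literally.

```python
def average_decline(regular_carries, postseason_carries=0):
    """Return (rushing yardage decline, yards-per-carry decline) as fractions.

    Uses only the group averages quoted in the text. Returns None for
    workloads the text gives no average for.
    """
    total_carries = regular_carries + postseason_carries
    # 370+ in the regular season, or 390+ including the postseason.
    if regular_carries >= 370 or total_carries >= 390:
        return (0.35, 0.08)
    # 300 to 369 carries with no postseason (approximated as zero carries).
    if 300 <= regular_carries <= 369 and postseason_carries == 0:
        return (0.15, 0.02)
    return None


# HYPOTHETICAL back: 1,600 yards on 380 regular-season carries.
yards = 1600
yardage_drop, ypc_drop = average_decline(380)
print(f"Average next-season yardage: {yards * (1 - yardage_drop):.0f}")
```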
Research in Pro Football Prospectus 2008 suggests that overuse in college does not create a problem for top prospects, but also shows that players chosen after the first round rarely have a successful NFL career after a college season over 330 carries.
We don't yet know enough to precisely parse the blame for incomplete passes, but we know that wide receiver catch rates are as consistent from year to year as quarterback completion percentages. Since 2001, Hines Ward has never caught fewer than 59 percent of intended passes, whether from Kordell Stewart, Tommy Maddox, or Ben Roethlisberger. Plaxico Burress, playing with the same quarterbacks as well as with Eli Manning, never caught more than 58 percent of intended passes, and in three different years had a catch rate below 50 percent. However, it is also important to look at catch rate in the context of the types of routes each receiver runs. We recently expanded on this idea with a new plus/minus metric.
There are three units on a football team, but they are not of equal importance. Our DVOA ratings provide good evidence for this. The special teams ratings are turned into DVOA by comparing how often field position on special teams leads to scoring compared to field position and first downs on offense. After figuring out these numbers, the top ratings for special teams are roughly one-third as high as the top ratings for offense or defense.
Nobody in the NFL understands this concept better than Indianapolis Colts general manager Bill Polian. Both the Super Bowl champion Colts and the four-time AFC champion Buffalo Bills of the early 1990s were built around the idea that if you put together an offense that can dominate the league year after year, eventually you will luck into a year where good health and a few smart decisions will give you a defense good enough to win a championship. (As the Colts learned in January 2007, you don't even need a year, just four weeks.) Even the New England Patriots, who are led by a defense-first head coach in Bill Belichick, have been more consistent on offense than on defense since they began their run of success in 2001.
Defensive penalties often represent strong defensive play that goes just over the line between legal and illegal. As long as penalties are only called every so often, this kind of close play leads to successful defense.
Connected to this precept: The penalty that correlates highest with losses is the False Start, and the penalty that teams will have called most consistently from year to year is the False Start.
This theory, which originally appeared in the New York Times in October 2006, is one of our most controversial, but it is hard to argue against the evidence. Measuring every kicker from 1999 to 2006 who had at least ten field goal attempts in each of two consecutive years, the year-to-year correlation coefficient for field-goal percentage was an insignificant .05. Mike Vanderjagt didn't miss a single field goal in 2003, but his percentage was a below-average 74 percent the year before and 80 percent the year after. Adam Vinatieri, supposedly the best kicker in the game, has never had two straight seasons with accuracy better than last year's NFL average of 83 percent.
On the other hand, the year-to-year correlation coefficient for kickoff distance, over the same period as our measurement of field-goal percentage and with the same minimum of ten kicks per year, is .61. The same players consistently lead the league in kickoff distance, particularly Rhys Lloyd, Olindo Mare, and Stephen Gostkowski.
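For readers who want to run this kind of check themselves, here is a minimal sketch of a year-to-year correlation coefficient (Pearson's r) in plain Python. Each pair is one kicker's figure in consecutive seasons. The sample values are hypothetical and are not taken from our database; only the method is the point.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / sqrt(spread_x * spread_y)


# HYPOTHETICAL field-goal percentages for six kickers in consecutive years.
year_one_fg_pct = [0.74, 1.00, 0.80, 0.85, 0.78, 0.90]
year_two_fg_pct = [0.88, 0.80, 0.76, 0.83, 0.91, 0.79]

print(f"Year-to-year r: {pearson_r(year_one_fg_pct, year_two_fg_pct):.2f}")
```

A value near zero, like the .05 for field-goal percentage, means last year's number tells you almost nothing about this year's. A value like .61 for kickoff distance means the same names keep showing up at the top of the list.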
Stripping the ball is a skill. Holding onto the ball is a skill. Pouncing on the ball as it is bouncing all over the place is not a skill. There is no correlation whatsoever between the percentage of fumbles recovered by a team in one year and the percentage they recover in the next year. The odds of recovery are based solely on the type of play involved, not the teams or any of their players.
Fans like to insist that specific coaches can teach their teams to recover more fumbles by swarming to the ball. Chicago's Lovie Smith, in particular, is supposed to have this ability. However, since Smith took over the Bears, their rate of fumble recovery on defense went from a league-best 76 percent to a league-worst 33 percent in 2005, then back to 67 percent in 2006. Last year, they recovered 57 percent of fumbles, close to the league average.
Fumble recovery is equally erratic on offense. In 2008, the Bears fumbled 12 times on offense and recovered only three of them. In 2009, the Bears fumbled 18 times on offense, but recovered 13 of them.
Fumble recovery is a major reason why the general public overestimates or underestimates certain teams. Fumbles are huge, turning-point plays that dramatically impact wins and losses in the past, while fumble recovery percentage says absolutely nothing about a team's chances of winning games in the future. With this in mind, Football Outsiders stats treat all fumbles as equal, penalizing them based on the likelihood of each type of fumble (run, pass, sack, etc.) being recovered by the defense.
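The idea of treating all fumbles as equal can be sketched in a few lines. Every fumble is charged the chance that the defense recovers a fumble of that type, no matter who actually fell on the ball. The recovery rates below are hypothetical placeholders and not our actual figures, and the function name is our own.

```python
# HYPOTHETICAL chance that the defense recovers each type of fumble.
ASSUMED_DEFENSIVE_RECOVERY_RATE = {
    "run": 0.50,
    "pass": 0.45,
    "sack": 0.60,
}


def expected_fumbles_lost(fumble_types, recovery_rates=ASSUMED_DEFENSIVE_RECOVERY_RATE):
    """Sum the defensive recovery chance over every fumble.

    The actual recovery outcome is deliberately ignored, because it does
    not predict future performance.
    """
    return sum(recovery_rates[kind] for kind in fumble_types)


# Two hypothetical offenses with identical fumbles but opposite luck.
fumbles = ["run", "run", "sack", "pass"]
lucky_team_actually_lost = 0
unlucky_team_actually_lost = 4

# Both teams receive the same penalty despite the different outcomes.
penalty = expected_fumbles_lost(fumbles)
print(f"Expected fumbles lost for either team: {penalty:.2f}")
```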
Other plays that qualify as "non-predictive events" include blocked kicks and touchdowns during turnover returns. These plays are not "lucky," per se, but they have no value whatsoever for predicting future performance.
Every yard line on the field has a value based on how likely a team is to score from that location on the field as opposed to from a yard further back. The change in value from one yard to the next is the same whether the team has the ball or not. The goal of a defense is not just to prevent scoring, but to hold the opposition so that the offense can get the ball back in the best possible field position. A bad offense will score as many points as a good offense if it starts each drive five yards closer to the goal line.
A corollary to this precept: The most underrated aspect of an NFL team's performance is the field position gained or lost on kickoffs and punts. This is part of why players like Devin Hester and Josh Cribbs can have such an impact on the game, even when they aren't taking a kickoff or punt all the way back for a touchdown.
Although play in the red zone has a disproportionately high importance to the outcome of games relative to plays on the rest of the field, NFL teams do not exhibit a level of performance in the red zone that is consistently better or worse than their performance elsewhere, year after year. The simplest explanation why is a small(er) sample size and the inherent variance of football, with contributing factors like injuries and changes in personnel.
We discovered this when creating our first team projection system in 2004. It said that the lowly San Diego Chargers would have one of the best offenses in the league, which seemed a little ridiculous. But looking closer, our projection system treated the previous year's performance on different downs as different variables, and the 2003 Chargers were actually good on first and second down, but terrible on third.
Teams get fewer opportunities on third down, so third-down performance is more volatile -- but it's also a bigger part of a team's overall performance than first or second down, because the result is usually either very good (four more downs) or very bad (losing the ball to the other team with a punt). Over time, a team will play as well in those situations as it does in other situations, which will bring the overall offense or defense in line with the offense and defense on first and second down.
This trend is even stronger between seasons. Struggles on third down are a pretty obvious problem, and teams will generally target their off-season moves at improving their third-down performance ... which often leads to an improvement in third-down performance.
However, we have discovered something surprising over the past three years: The third-down rebound effect seems to have disappeared on offense, as we explain in the Philadelphia chapter of Football Outsiders Almanac 2010. We don't know yet if this change is temporary or permanent, and there is no such change on defense.
There are no doubt teams with streaks of good or bad health over multiple years. However, teams who were especially healthy or especially unhealthy, as measured by our Adjusted Games Lost (AGL) metric, almost always head towards league average in the subsequent season. Furthermore, injury -- or the absence thereof -- has a huge correlation with wins, and a significant impact on a team's success. In 2008, six of the seven least-injured teams in the league made the playoffs. The teams with the biggest drop in injuries between 2008 and 2009 were Baltimore and Dallas, a pair of playoff teams. Teams with a high number of injuries are a good bet to improve the following season.
This is connected to the previous statement, because teams need to go into the season expecting that they will suffer an average number of injuries no matter how healthy they were the previous year. The Redskins went into 2006 with a Super Bowl-quality starting lineup, and finished 5-11 because they had no depth. You cannot concentrate your salaries on a handful of star players because there is no such thing as avoiding injuries in the NFL. Every team will suffer injuries; the only question is how many. The game is too fast and the players too strong to build a team based around the idea that "if we can avoid all injuries this year, we'll win."
This research was originally done by Doug Drinen (editor of Pro-Football-Reference.com). In recent years, a few players have had huge seasons above these general age limits (most notably Tiki Barber, Tony Gonzalez, and Terrell Owens), but the peak ages Drinen found a few years ago still apply to the majority of players.
During the summer of 2007, ESPN The Magazine asked us to research when players decline at "non-skill" positions. This research was not as rigorous as our usual work, and needs a little more attention before we're ready to stand by it. For the curious, however, the preliminary results said that defensive ends and defensive backs generally begin to decline after age 29, linebackers and offensive linemen after age 30, and defensive tackles after age 31.
This theory was introduced in Pro Football Prospectus 2006 and further refined in Pro Football Prospectus 2007. The projection created by these stats is known as the Lewin Career Forecast, after the creator of the theory, David Lewin, who now works for the Cleveland Cavaliers.
Scouts expected players such as Kyle Boller (48 percent), Jim Druckenmiller (54 percent) and Ryan Leaf (54 percent) to suddenly figure out how to complete passes once they hit the NFL. It isn't surprising that it didn't happen. Having a high completion percentage (above 60 percent or so) is no guarantee of success, especially if it was done in a small number of games in a fluky system (Tim Couch being a strong example), but it is a prerequisite for it. Games started are important because the more film that exists of a player in game conditions, the easier it is to find weaknesses that might come out against different opponents or different schemes. When scouts don't get sufficient information, they place too much weight on "measurables" and off-field workouts, and make mistakes like Couch (26 starts), Leaf (24 starts) or Akili Smith (19 starts).
The Lewin Career Forecast only applies to the first two rounds because it assumes that with enough game film to judge, scouts can accurately identify players who are "system quarterbacks" and will not succeed in the NFL, and those players appropriately fall on draft day (Texas Tech quarterbacks like Graham Harrell are a good historical example).
From 1996-2005, the worst quarterback drafted in the top two rounds who had 37 or more college starts and a completion rate above 60 percent was Eli Manning. When the worst projection belongs to a quarterback who just led a two-minute drill to finish off a historic Super Bowl upset, that's a good projection system. However, the Lewin system has mixed successes (Kevin Kolb) with failures (Brady Quinn, Matt Leinart, Brian Brohm) in recent years, and we're likely to revisit it in the near future.
Football Outsiders Almanac 2009 introduced a new metric called Playmaker Score, which measures rookie wide receivers by simply multiplying average yards per reception and total career touchdowns in college. Players who score high in this metric do not necessarily become stars in the NFL, but no first- or second-round pick with a score below 8.0 has yet lived up to his draft position. Like the Lewin Career Forecast, Playmaker Score is far more accurate with receivers chosen in the first two rounds, and it doesn't seem to work for hybrid slot receiver/running backs such as Percy Harvin and Dexter McCluster.
Football games are often decided by just one or two plays -- a missed field goal, a bouncing fumble, the subjective spot of an official on fourth-and-1. One missed assignment by a cornerback, or one slightly askew pass that bounces off a receiver's hands and into those of a defensive back five yards away and the game could be over. In a blowout, however, one lucky bounce isn't going to change things.
Championship teams beat their good opponents convincingly and destroy the cupcakes on the schedule. Certainly there are exceptions to this rule, including the past two Super Bowl champions. Unless this becomes a trend that lasts four or five years, it is hard to say this rule no longer exists.