Writers of Pro Football Prospectus 2008

22 Oct 2012

What Happened to Football's Next Great Stat?

ESPN's Total QBR was supposed to be the ultimate rating that quantified the most complicated position in sports. Instead, it more or less disappeared except on ESPN's website. What happened? Really good article by Aaron Gordon, and I'm not just saying that because I'm quoted a lot in it, although it does feature my "proprietary stats as ham sandwich" metaphor, which I think I've used in five or six different conversations since I did the interview for this article.

Posted by: Aaron Schatz on 22 Oct 2012

58 comments, Last at 24 Oct 2012, 11:23pm by Bowl Game Anomaly


by JIPanick :: Mon, 10/22/2012 - 2:16pm

No offense, but Total QBR was like JaMarcus Russell; ESPN was hyping it up, but most folks knew it was going to be useless, except possibly in a hilarious train-wreck sort of way.

It failed because there's no market for it, sure, but there was no market for it because it's a bad stat.

by Theo :: Mon, 10/22/2012 - 2:25pm

I loved that stat because with that stat I could prove that my guy was better than their guy.

by Moin (not verified) :: Mon, 10/22/2012 - 2:26pm

Is it me or is there a slight misrepresentation of what the Central Limit Theorem is? There is a nuanced difference between taking a sample of the same random distribution over and over again (Law of Large Numbers) and saying that the addition of a large number of finitely defined random distributions will converge to a normal distribution (Central Limit Theorem).
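The distinction can be seen in a small simulation. This is only an illustrative sketch: it assumes uniform draws on [0, 1), and the function names, seed, and sample sizes are made up for the example. The first function shows the Law of Large Numbers (the average of one long sample settles near the true mean). The second shows the Central Limit Theorem (standardized averages of many separate samples spread out like a standard normal, so roughly 68% should land within one standard deviation).

```python
import math
import random

def running_mean(n, rng):
    """Law of Large Numbers: the average of n draws approaches the true mean (0.5)."""
    return sum(rng.random() for _ in range(n)) / n

def share_within_one_sd(num_samples, sample_size, rng):
    """Central Limit Theorem: standardized sample means behave like a standard normal."""
    mu = 0.5                          # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12.0)     # standard deviation of Uniform(0, 1)
    std_error = sigma / math.sqrt(sample_size)
    hits = 0
    for _ in range(num_samples):
        mean = sum(rng.random() for _ in range(sample_size)) / sample_size
        z = (mean - mu) / std_error   # standardize the sample mean
        if abs(z) <= 1.0:
            hits += 1
    return hits / num_samples

rng = random.Random(2012)             # fixed seed so the run is repeatable
lln_estimate = running_mean(100_000, rng)
clt_share = share_within_one_sd(5_000, 50, rng)
print(lln_estimate, clt_share)
```

The first number is a statement about a single average; the second is a statement about the shape of the distribution of many averages, which is the part the Law of Large Numbers says nothing about.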

by Jeff M. (not verified) :: Mon, 10/22/2012 - 4:10pm

Not slight. Burke gets it out-and-out wrong and the article takes him at his word.

The most charitable interpretation is, however, that he just meant to invoke the Law of Large Numbers and misspoke (as you point out).

by Moin (not verified) :: Tue, 10/23/2012 - 2:21am

Well, I was trying to be charitable. That is a pretty fundamental statistical (yet understandably confusing) concept to get mixed up.

by JasonK :: Mon, 10/22/2012 - 2:29pm

To extend the metaphor, QBR's "clutchiness" adjustments are the mayonnaise on the ham sandwich.

by evenchunkiermonkey :: Mon, 10/22/2012 - 3:14pm

Berman wants to hear some more about this sandwich.

by Marko :: Mon, 10/22/2012 - 5:00pm

I doubt that Aaron would use that metaphor. Mayonnaise on a ham sandwich? No way.

by JasonK :: Mon, 10/22/2012 - 6:32pm


by Marko :: Mon, 10/22/2012 - 6:44pm

Aha. Good one.

by The Ninjalectual :: Tue, 10/23/2012 - 2:52am

IS mayo on a ham sandwich unusual or something? It's perfectly common where I come from.

by Marko :: Tue, 10/23/2012 - 2:35pm

It's a Jewish thing. Some Jews don't eat ham. Some don't use mayo. I don't know any (even if they are not religious) who would eat ham with mayo.

by Podge (not verified) :: Tue, 10/23/2012 - 7:08am

But then you could argue that DVOA's inclusion of things like red zone performance is the tomato, or inclusion of sacks is lettuce. Adding in opponent adjustments and down and distance context could be considered some sort of artisan bread in comparison to the original QB Rating's plain white bread, and some people would prefer plain white bread instead of some sort of Tuscan Olive Loaf.

Hmmm. I'm hungry now.

by Guido Merkens :: Mon, 10/22/2012 - 2:29pm

I never understood what QBR and its attempt to measure clutch performance was trying to add to the discussion.

Anyone who cares about stats enough to care about QBR knows that at best "clutch performance" is a poor predictor of future success, and at worst has no correlation with future success. So QBR is a primarily backward-looking stat. Fine. But as such, it didn't provide much insight, because "clutch" performances are easy to spot without the help of advanced stats.

And anyway, most of the reason people look at statistics is to predict future performance, which QBR was far worse at than most other advanced metrics. Andrew Luck has a QBR of 73.0 this season. Does that mean that he looks like he will be a successful quarterback in the league... or that he led a second-half comeback against a crappy team and has otherwise been ordinary? This rating creates more questions than it answers.

by QCIC (not verified) :: Mon, 10/22/2012 - 2:59pm

What happened was that they made the classic mistake of trying to be all things to all people. You cannot placate both the irrational fans and the rational ones by making something that is "semi-rational". QBR is the "Terra Nova" of stats. No one dies, there are never any consequences to anyone's actions, and everything is market tested within an inch of its life. It is also terrible.

I suspect that they valued input from players and coaches too highly. Players and coaches are really poor at the numerical analysis of football generally.

by sundown (not verified) :: Mon, 10/22/2012 - 3:05pm

Agreed on QBR. But most statistics do a lousy job predicting future performance, including DVOA and every stat on this site. Interestingly enough, in the early days of Football Outsiders, Aaron used to mention fairly often that the intent wasn't to be predictive but just to better understand what had happened. But I can't recall the last time an Outsider actually made that point, either because their focus changed to believing it could be predictive or simply because the question maybe doesn't come up as much any more.

by Sophandros :: Mon, 10/22/2012 - 3:16pm

Along with the other flaws voiced here, I think that ESPN failed when they tried to create a statistic that is both predictive and explanatory while failing to perform either role well. Or, as others have said, they're trying to please as many people as possible.

I also have a problem with their apparent refusal to refine and update the algorithms behind the metric, as FO has done with DVOA and DYAR.

Sports talk radio and sports message boards are the killing fields of intellectual discourse.

by RickD :: Mon, 10/22/2012 - 6:31pm

I would disagree with this:
"most of the reason people look at statistics is to predict future performance"

People look at statistics because they constitute an objective way to quantify judgments that are otherwise subject to biases and various conflicting interpretations. If you have stats that can predict future performance, great!

Edit: I would agree with your comment to the extent that the people who generate a stat insist on keeping its method of generation secret. If you don't explain how a stat is generated, and if it fails to have any predictive value, the stat you produce is of little interest.

by Brent :: Tue, 10/23/2012 - 4:07pm

I think it can actually be extended: I think all human intellectual endeavors are an attempt to predict the future.

That's probably not quite true as broadly as I stated it, but I think it's very close. We don't like uncertainty. We study things in an effort to remove that uncertainty.

by Anonymous Reader (not verified) :: Tue, 10/23/2012 - 1:43pm

I think the real issue is that they never justified that performance in high-leverage situations (i.e. 'clutch' performance) makes the stat better overall. I am willing to believe that some quarterbacks could wilt when needing to drive for a quick touchdown to tie / win (mostly because I believe the inverse - Peyton Manning for example is extremely adept at 2 minute drills, why couldn't someone be extremely inept at them?)

But that belief needs to be fed numbers. Not platitudes from ESPN talking heads. If they can show that the stat better correlates to future performance, then it ought be included. If including leverage makes it a worse stat, it should be pulled.

And that's what a lot of this comes down to - they've never really gone into how the stat works, done comparisons and justifications for why components are included, etc. They should have when they unveiled it, but it's not really what their target audience wants, so they didn't.

by mavajo (not verified) :: Mon, 10/22/2012 - 2:48pm

"Clutch-factor" doomed QBR from the start. The big mistake was trying to quantify things that really have no business being quantified. I mean, c'mon, you're trying to tell me Charlie Batch's game against Tampa back in 2010 was the greatest quarterback performance of all time? Ridiculous. I only have to go back a week to find a game that absolutely blows Charlie's out of the water -- Rodgers' 6 touchdown game, on the road, against one of the best defenses in the NFL. Any formula that puts Batch's game over that is immediately discredited.

Don't try to quantify things that aren't quantifiable. Leave that up to the observer.

by Stats are for losers (not verified) :: Mon, 10/22/2012 - 3:24pm

Well, half the fun of this website and others is the attempt to quantify the unquantifiable!

It's just EPA adjusted for WPA. PAR makes an appearance, too, which the old-timers will remember from the days before DYAR.

If they'd adjusted for defense, and not used the word "clutch," it'd probably be a perfectly cromulent rating system for us to argue about incessantly instead of dismissing summarily.

by DGL :: Mon, 10/22/2012 - 2:50pm

"ESPN's Total QBR was supposed to be the ultimate rating that quantified the most complicated position in sports."

See, I thought ESPN's Total QBR was supposed to be a proprietary, made-up stat with no analytic basis that ESPN used to hype up its football coverage.

by QCIC (not verified) :: Mon, 10/22/2012 - 3:00pm

Well said.

by Arkaein :: Mon, 10/22/2012 - 3:30pm

If there is a need to replace passer rating (and I think there is), I wish PFR's ANY/A could get more traction. It really is a better passer rating.

It's no more complex than official passer rating, and it's easier to calculate because it starts with a well-understood statistic in YPA and modifies it, with fewer magic numbers than passer rating requires. Most importantly, it's simply better, being more balanced in favor of passing plays that actually contribute to winning than passer rating is.

It's also non-proprietary, and due to its relative simplicity can do a good job without annual tweaking, factors that I think make it more palatable for a broad audience than stats like passing DVOA.

The main drawbacks are lack of opponent adjustments and lack of incorporation of QB running plays, but even these issues could be solved easily.

For opponent adjustments it would be easy to calculate ANY/A for defenses and compare a QB's per-game ANY/A to the average of the defenses faced over the course of a season. The difference between the two values could produce a "DANY/A" value.

QB runs could be incorporated simply by counting each run as an "attempt" and adding rushing yardage to passing yardage.

The end result would be a simple, non-proprietary, but very useful rate stat. It could even be converted into a cumulative stat like YAR or DYAR by multiplying by the total number of attempts (passes + sacks + QB runs), at least if a replacement level baseline could be selected.
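For anyone who wants to try this, here is a rough sketch of the pieces described above. The base formula uses the weights ANY/A is commonly published with (20 yards per passing touchdown, 45 yards per interception, sack yards subtracted, sacks counted as attempts). Everything else is an assumption made for illustration: the function names, the reading of "DANY/A" as a simple difference, the cumulative conversion as (rate minus baseline) times plays, and the stat line, which is invented.

```python
def any_a(pass_yds, pass_tds, ints, sack_yds, attempts, sacks):
    """Adjusted net yards per attempt with the commonly published weights."""
    return (pass_yds + 20 * pass_tds - 45 * ints - sack_yds) / (attempts + sacks)

def any_a_with_runs(pass_yds, pass_tds, ints, sack_yds, attempts, sacks,
                    rush_yds, rushes):
    """Variant from the comment: each QB run is an attempt, rushing yards are added."""
    adjusted = pass_yds + rush_yds + 20 * pass_tds - 45 * ints - sack_yds
    return adjusted / (attempts + sacks + rushes)

def opponent_adjusted(qb_rate, rates_allowed_by_defenses):
    """Hypothetical 'DANY/A': QB rate minus the average rate allowed by defenses faced."""
    average_allowed = sum(rates_allowed_by_defenses) / len(rates_allowed_by_defenses)
    return qb_rate - average_allowed

def cumulative_value(rate, baseline, plays):
    """Hypothetical cumulative version: value over a chosen baseline, times plays."""
    return (rate - baseline) * plays

# Invented season line, for illustration only.
base = any_a(4000, 30, 10, 200, 500, 30)
with_runs = any_a_with_runs(4000, 30, 10, 200, 500, 30, rush_yds=250, rushes=50)
adjusted = opponent_adjusted(base, [6.0, 5.5, 6.5])
print(round(base, 2), round(with_runs, 2), round(adjusted, 2))
```

The baseline passed to `cumulative_value` is left to the caller, since picking a replacement level is exactly the open question raised above.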

by jimm (not verified) :: Mon, 10/22/2012 - 3:59pm

sounds like a very sensible approach to me.

by Chase (not verified) :: Tue, 10/23/2012 - 2:04pm

That's what I've been doing, first at the PFR Blog, and now at Football Perspective, for the last few years.


That gives you the sort of value over average you're looking for.

Re: SOS, I've done that a few times, too, although not in this latest version. It's certainly a worthwhile addition, but I was short on time to do SOS analysis for 80 years.

The reason we don't include rushing data is two-fold. One, kneel downs can be pretty significant, because we're talking about 3 plays for -3 yards to a guy who might average 8 ANY/A. If he's at 8 ANY/A on 30 passes before the kneels, he's at 7.18 ANY/A after. The other reason is because a lot of runs are short but for first downs, which makes them more valuable than they might appear.

To be fair, for games since 2000, this is something we can pretty easily fix due to PBP logs. But when I'm trying to analyze QBs on the same scale since 1940, it doesn't really work.
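The kneel-down example above works out as follows. This is just the arithmetic from the comment; the helper name is made up for illustration.

```python
def rate_after_extra_plays(rate, plays, extra_yards, extra_plays):
    """Recompute a per-play rate after folding in additional plays."""
    adjusted_yards = rate * plays            # 8.0 * 30 = 240 adjusted yards
    return (adjusted_yards + extra_yards) / (plays + extra_plays)

# 30 pass plays at 8 ANY/A, then 3 kneel-downs for -3 yards: 237 / 33
after_kneels = rate_after_extra_plays(8.0, 30, -3, 3)
print(round(after_kneels, 2))
```

Three kneel-downs cost this hypothetical quarterback about eight tenths of a yard per attempt, which is why they cannot simply be lumped in with other runs.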

by Arkaein :: Wed, 10/24/2012 - 9:13pm

Thanks for pointing this out, this is really good stuff.

One other thing I had thought about, regarding the value of first downs, would be to have a bonus for plays resulting in first downs (and possibly a penalty for plays that didn't achieve a first down, for sake of symmetry). This would require full play-by-play data to generate, but would help solve the issue of short but valuable QB rushes that you mentioned, while also rewarding QBs who completed passes in the same situations. Essentially it would serve as a proxy for QB success rate.

You also use an interesting approach in counting rushing yards past 4 and rushing TDs. Have you considered counting all rushes that don't lose yardage, both for yards and attempts? That would work around the kneel down issue while still capturing plays like most failed QB sneaks. Without actually looking at the data I'd think there are fairly few plays with a QB getting tackled for a loss that aren't already considered sacks or kneeldowns.

by rj (not verified) :: Mon, 10/22/2012 - 3:34pm

Dear God, what a horrible article, a person so convinced of his own intelligence that he considers all speaking to him beneath him and he spends half his time on that before even getting to the reason he's writing and even then it's only given a modicum of attention.

I think that is what the hipster snob culture applied to sports looks like.

by Bobman :: Mon, 10/22/2012 - 3:43pm

What? Wait, you mean that Luck is not (currently) elite? Aw crap!

I have to admit that the 1-100 scale is really, really comfy. Like a good chair, or pair of running shoes.

by dmstorm22 :: Mon, 10/22/2012 - 4:46pm

You bring up a good point with the 1-100 scale.

If whoever created passer rating had just added one extra line to the formula, "/1.583", and normalized it to 100 as the maximum, passer rating would've been much more accepted.
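To make the rescaling concrete, here is a sketch using the standard NFL passer rating formula (four components, each capped between 0 and 2.375, summed, divided by 6, and multiplied by 100). The 0-to-100 rescaling function is the hypothetical "extra line" and is not part of any official formula. The stat line is invented.

```python
def _clamp(x, lo=0.0, hi=2.375):
    """Each passer rating component is capped to the range [0, 2.375]."""
    return max(lo, min(hi, x))

def passer_rating(completions, attempts, yards, touchdowns, interceptions):
    """Standard NFL passer rating; the maximum is about 158.3."""
    a = _clamp((completions / attempts - 0.3) * 5)
    b = _clamp((yards / attempts - 3) * 0.25)
    c = _clamp((touchdowns / attempts) * 20)
    d = _clamp(2.375 - (interceptions / attempts) * 25)
    return (a + b + c + d) / 6 * 100

MAX_RATING = 2.375 * 4 / 6 * 100   # all four components at their cap

def rating_on_100_scale(rating):
    """Hypothetical rescaling so a perfect rating is 100 (roughly rating / 1.583)."""
    return rating / MAX_RATING * 100

# Invented "perfect" stat line: 20 of 20 for 400 yards, 4 TD, 0 INT.
perfect = passer_rating(20, 20, 400, 4, 0)
print(round(perfect, 1), round(rating_on_100_scale(perfect), 1))
```

Rescaling changes only the top of the scale; it does nothing about how the four components are weighted.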

by Jerry :: Tue, 10/23/2012 - 3:56am

As I recall, passer rating was designed so that 100 would be a REALLY good season.

by Dean :: Mon, 10/22/2012 - 4:49pm

Well, there's what, 7 billion people on the planet? And even if you have such an anti-Colt bias that you called him one of the top 40 QBs in the game and meant it as an insult, that's still kinda elite. Even if you factor out women and non-US citizens (as the rest of the world largely ignores football), and even if you narrowed it down to ages 18-40, he'd STILL be in the top 40 (and realistically smaller than that) of, say, 40 million or so.

Now granted if we were talking about Mark Sanchez, I'd say the same things and then say, "but he still sucks."

by rageon :: Mon, 10/22/2012 - 6:11pm

I'm not sure he's "elite." However, from what I can tell from my 10 minutes a day of ESPN radio, apparently 15 quarterbacks are "elite," another 15 are "average," and the rest are Jay Cutler and Tim Tebow.

by rj (not verified) :: Mon, 10/22/2012 - 3:45pm

"But most statistics do a lousy job predicting future performance, including DVOA and every stat on this site. Interestingly enough, in the early days of Football Outsiders, Aaron used to mention fairly often that the intent wasn't to be predictive but just to better understand what had happened."

I respectfully disagree. That's like an economist telling you his job is not to determine when recessions are likely to happen but tell you why we entered a recession four years ago. Society hates those people for that.

If you are going to do advanced statistical analysis I expect said metric to win the weekly picks pool against a group of subjective pickers and to make a profit over time betting in Vegas because aforementioned advanced statistical analysis should be smarter than those people.

by tuluse :: Mon, 10/22/2012 - 4:06pm

Why? Do you think Vegas isn't using its own advanced stats?

I would expect DVOA to pick about 50% against the spread, and I think that's a solid achievement right there.

by The Powers That Be :: Mon, 10/22/2012 - 4:58pm

According to FO's Premium Picks, they're at .537 vs. the spread since 2008. Of course, there's the question of whether you can actually get the spreads they're using, so maybe you need to knock it down a bit.

by DisplacedPackerFan :: Mon, 10/22/2012 - 4:59pm

Well it does a bit better than 50% ATS. Looks like it gets them to about 53%.

2008 regular season: 133-114-8, .537
2009 regular season: 127-121-8, .512
2010 regular season: 141-108-7, .561
2011 regular season: 127-117-12, .520
2012 so far: 50-39-2, .560 (before this week)

This week wasn't good. My quick check was 3-8-1 so far, so it looks like 53-47-3 this year, which is right around the average.

Straight up picks DVOA will get you about 64%
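The percentages above can be checked with a few lines. This sketch assumes each push counts as half a win, which reproduces most of the quoted figures; that convention is a guess, not something stated in the comment. Under it, the 2010 line comes out nearer .564 than the quoted .561, so the original may rest on slightly different inputs.

```python
# (wins, losses, pushes) against the spread, as quoted in the comment.
records = {
    "2008": (133, 114, 8),
    "2009": (127, 121, 8),
    "2010": (141, 108, 7),
    "2011": (127, 117, 12),
    "2012 (after the 3-8-1 week)": (53, 47, 3),
}

def ats_pct(wins, losses, pushes):
    """Winning percentage with each push counted as half a win (assumed convention)."""
    return (wins + 0.5 * pushes) / (wins + losses + pushes)

by_season = {season: ats_pct(*rec) for season, rec in records.items()}

# Column-wise totals across all seasons: (total wins, total losses, total pushes).
totals = tuple(sum(column) for column in zip(*records.values()))
overall = ats_pct(*totals)

for season, pct in by_season.items():
    print(season, f"{pct:.3f}")
print("overall", totals, f"{overall:.3f}")
```

The combined figure lands at about 53%, in line with the estimate at the top of the comment.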

by Lance :: Mon, 10/22/2012 - 4:48pm

I disagree. For me, FO was a great way to think about things in a historical way. For instance-- How good was team X? I got tired of rating a defense by things like "yards allowed" and thinking that such things were a good way to measure defense across time.

I'm still not convinced that the metrics presented at FO are perfect measures of that (and the grand questions: who would win between the winners of Super Bowl Y and Super Bowl Z?), but they're better than what we've seen before. I don't much care about predictive powers of such stats only because I know that those efforts are futile.

by BDC :: Mon, 10/22/2012 - 3:52pm

"If you are going to do advanced statistical analysis I expect said metric to win the weekly picks pool against a group of subjective pickers and to make a profit over time betting in Vegas because aforementioned advanced statistical analysis should be smarter than those people."

Except that it doesn't, and it isn't really that hard (though a little time consuming) to run the numbers and see that the aforementioned advanced statistics aren't any better at predicting future results than subjective analysis.

by rj (not verified) :: Mon, 10/22/2012 - 4:44pm

"Except that it doesn't, and it isn't really that hard (though a little time consuming) to run the numbers and see that the aforementioned advanced statistics aren't any better at predicting future results than subjective analysis."

So what's the point of wasting your time doing this stuff then?

by 40oz to Freedom (not verified) :: Mon, 10/22/2012 - 4:58pm

Any QB stat that ranked Tebow at the top is bound to fail.

by andrew :: Mon, 10/22/2012 - 5:38pm

"The game-charting is a labor-intensive task that only ESPN could do, Schatz says"


by Arkaein :: Mon, 10/22/2012 - 5:40pm

FO takes a long time to turn around full charting stats, due to it being done largely by volunteers. ESPN has it done in something like a day (I presume).

by DisplacedPackerFan :: Mon, 10/22/2012 - 5:59pm

Actually, having looked at box scores, they are almost getting the data collected and compiled in "real time". They put that QBR stat in the box scores at halftime (or at least I've seen it there at halftime a couple of times), and then update it again, usually within 15 minutes of the game ending.

They can probably throw two or three interns at each game, following the charting formulas with access to the coaches' film pretty much in real time (I'm sure they pay the NFL for it).

FO is happy to get the charting data turned around in just a week, because as mentioned it's a volunteer force, and one that I'm hoping my scheduling will allow me to be a part of next season.

by zlionsfan :: Mon, 10/22/2012 - 11:36pm

And they can pay a raftload of developers to create and maintain a web-based tool (I'd guess) to enter that information, a database to manage the data alongside their usual NFL storage, and a distribution process to get the compiled information out to FO and others.

When we get the charting workbook on Tuesdays that contains the previous week's plays, it already includes all the ESPN stuff. We chart things that ESPN currently isn't looking at, and we might catch some things that they'll miss, most likely due to the volume of plays they chart, but they've done a considerable amount of work before we even get started ... and even with more volunteers now than in the past, it still takes more than a week to get everything in, and after that, FO still has to combine all of our charts into a single file and then run their processes on it.

by Karl Cuba :: Wed, 10/24/2012 - 5:49pm

This is what I don't get. There are at most 16 games per week, if it takes about four hours to chart what FO wants from a game then that would be 64 man hours per week. If I was Aaron I'd pay sixteen college kids ten bucks an hour or a flat 40 dollars to get it done by Sunday evening. That's $640 per week or just over ten thousand dollars for the regular season. That might sound like a hefty sum but then he could run all those charting numbers by Wednesday at the latest and have current charting data available on FO premium, which would probably either get more people to buy it or allow him to charge a bit more. Then they could run more content exploring and analysing the resultant data, resulting in more page views and probably more income from more articles on ESPN.

This should also allow FO to publish an end of season review book by March with all the stats and analysis of the past season. Not a replacement of FOA but the chance to get our grubby hands on their data months earlier, then publish another season preview at the regular time, after the dust of free agency and the draft has settled. Hey presto, another revenue stream!

If you really wanted to push the boat out I'd pay $10 an hour to a team of people to copy and paste the play by play for Aaron live so that they could publish DVOA updated drive by drive. Imagine what 'Vegas-based' gamblers would pay for that, real information that would pinpoint which teams are ahead as a result of unsustainable chance events and so are likely to fall back as the game goes on. This would also allow Aaron to publish conditional DVOA figures by Monday morning, when everyone is still digesting the previous day's games.

by JonFrum :: Mon, 10/22/2012 - 6:09pm

The reason stats in general fail to attract general acceptance is that no one likes to be told - with authority - that they are wrong. It's one thing to argue with your buddy, or disparage a columnist for his ideas. But Joe Public senses that a stat-generated rating is going to shut down discussion, with him the loser. This is just one more in a long line of populist vs expert battles, in which the expert always gets his lunch money taken away, and ends up stuffed in a locker.

by dbostedo :: Mon, 10/22/2012 - 6:28pm

I've always wondered if the main reason QBR didn't get more traction was because it was an ESPN stat. As such, it didn't get much play on CBS, or CNN, or FOX, etc. If it had gained popularity organically from, say, this site, or Pro-Football-Reference, maybe it would have been more adopted.

by Ben :: Tue, 10/23/2012 - 10:17am

I think there's something to this. Honestly, nothing is going to replace passer rating unless the NFL starts putting it in gamebooks and its official stats. That's not going to happen if the stat is proprietary to some other company.

by Gri34 (not verified) :: Mon, 10/22/2012 - 6:53pm

When you read the article, there are many references (by Aaron S) to how useful the stat is--I wonder if there's a way to gain access to all the ESPN data (since there's at least some connection b/t FO and ESPN) and tweak it

by Red (not verified) :: Mon, 10/22/2012 - 8:46pm

In addition to all the other issues mentioned here, the biggest problem I have with QBR is that it only goes back to 2008. What good is a stat if it doesn't allow for historical comparisons? For all its flaws, at least Passer Rating can be calculated for any QB in history, and can be further calibrated by adjusting for era, providing a decent historical comparison for QBs of any time period.

One other stupid feature of QBR - it apparently penalizes QBs for interception and fumble return yardage. For example, Kurt Warner's SB performance against the Steelers rated a lowly 27.0, despite him putting up great numbers against the best defense in the league, and doing so in "clutch" situations. The only explanation is that he was essentially penalized -13 EPA or so for the Harrison INT return, which we all know is totally idiotic.

by Aaron Schatz :: Tue, 10/23/2012 - 1:50pm

If things go as planned, QBR will soon go back to 2006-2007 using FO game charting data.

by andrew :: Wed, 10/24/2012 - 12:49am

Is the ESPN game charting roughly the same as the FO game charting, then?

by zlionsfan :: Mon, 10/22/2012 - 11:39pm

I actually think that BBQ sauce is a better analogy than a ham sandwich. Everyone can see what's in a ham sandwich - I mean, here's the ham, here's the cheese, here's the bacon, you know?

But sauce is different. I can tell you what goes into the sauce, but that doesn't tell you how much of each ingredient I put in, or anything else I might do to it ... for all you know, I'm just buying it off the shelf and calling it my own, or maybe I'm adding ingredients that don't add anything at all to the taste ... or maybe one or more ingredients I do add are what makes the difference.

And if I don't tell you what goes into the sauce, you might ignore it just because you don't like the way it looks.

by Not Verified (not verified) :: Tue, 10/23/2012 - 1:47pm

Ham sandwich is also apropos. What kind of ham, crappy store brand or real smoked ham, wonder bread or rye or ciabatta, mayo or miracle whip, brown, yellow, horseradish mustard, onion, tomato, lettuce, toasted, ..... I am eating a ham sandwich now. But I do like the thought of calling DVOA the "sauce".

by TomC :: Tue, 10/23/2012 - 2:35pm

Still, most people outside the comment sections of Football Outsiders do believe in the concept of clutch...

Hey, look, guys! We've been identified as a population and generalized about! This is so exciting---I've never had an oppressed subgroup identity before.

by Bowl Game Anomaly :: Wed, 10/24/2012 - 11:23pm

Must have been tough enjoying your white privilege.