DVOA Analysis
Football Outsiders' revolutionary metrics that break down every single play of the NFL season

Week 20 DVOA Ratings

As usual after the conference championship games, we're not going to bother with the full 32-team table of weighted DVOA ratings, since there are only two teams left and most teams haven't played for three weeks. We'll just take a quick look at both teams.

Tampa Bay is now No. 1 in both weighted and total DVOA after the Bucs' NFC Championship victory over Green Bay. The Buccaneers have ranked in the top three of DVOA ever since Week 4, including when they were just 7-5 and thought to be struggling. They always were playing better than their win-loss record and our ratings reflected this.

On the other hand, you have the Kansas City Chiefs, the team that conventional wisdom likes more than DVOA. The Chiefs rank only fifth in weighted and total DVOA despite their victory in the AFC Championship, but some of that is due to their Week 17 game where they sat their starters. The Chiefs would rank third if we remove that game.

TEAM TOT Rk OFF Rk DEF Rk ST Rk
TB weighted 38.7% 1 27.6% 2 -13.9% 5 -2.8% 27
KC weighted 27.4% 5 25.6% 3 1.7% 18 3.5% 8
KC weighted (No Week 17) 34.1% 3 27.3% 3 -3.3% 15 3.5% 8
TB total 33.7% 1 20.9% 3 -15.7% 5 -2.9% 26
KC total 25.2% 5 26.1% 2 1.5% 18 0.5% 17
KC total (No Week 17) 28.4% 3 27.1% 2 -1.0% 13 0.3% 17

For those who are curious, the team that is second in weighted DVOA between Tampa Bay and Kansas City without Week 17 is Green Bay. The team that is second in total DVOA between Tampa Bay and Kansas City without Week 17 is New Orleans.

Next, here are one-game ratings for the conference championships. As you can see, the Kansas City Chiefs were outstanding, putting up their best single-game DVOA of the year.

DVOA (with opponent adjustments)
TEAM TOT OFF DEF ST
TB 56% 13% -35% 8%
GB 26% 8% -17% 1%
BUF -48% -29% 31% 12%
KC 94% 60% -31% 3%
VOA (no opponent adjustments)
TEAM TOT OFF DEF ST
TB 29% 9% -12% 8%
GB -6% -8% 0% 1%
BUF -57% -23% 46% 12%
KC 76% 60% -13% 3%
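
One way to read the two tables together: subtracting VOA from DVOA gives the size of the opponent adjustment each team received for these games. The sketch below does that subtraction using only the rounded totals printed above. The function name and dictionary layout are illustrative assumptions, not part of how Football Outsiders computes anything.

```python
# Total single-game ratings copied from the two tables above, in percent.
DVOA = {"TB": 56, "GB": 26, "BUF": -48, "KC": 94}
VOA = {"TB": 29, "GB": -6, "BUF": -57, "KC": 76}


def opponent_adjustment(dvoa, voa):
    """Return DVOA minus VOA per team: the bump credited for opponent quality."""
    return {team: dvoa[team] - voa[team] for team in dvoa}


if __name__ == "__main__":
    for team, bump in opponent_adjustment(DVOA, VOA).items():
        print(f"{team}: {bump:+d} points")
```

All four adjustments come out positive, which is what one would expect when every opponent is a conference finalist.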

We've written numerous times about how DVOA was unimpressed by some of Kansas City's close wins this season. That was not the case in the AFC Championship Game, which was not as close as the final score of 38-24 suggests. For Buffalo to get even that close required a muffed punt recovered by the punt coverage team (a rare event) and an onside kick recovery (an even rarer event). Kansas City gained 1.7 more yards per play, and Buffalo had lower third-down efficiency and threw an interception.

Kansas City's single-game rating of 94.3% was the sixth-highest single game by any team this season. It was by far the highest single game for Kansas City, way past the 54.6% they had for Week 3's win against Baltimore. The Chiefs had their best single game on offense and their second-best game on defense behind that Baltimore win.

This was the latest piece of evidence for the "Kansas City flips the switch" theory. That's the theory that the Chiefs deliberately took it easy with big leads during the regular season, leading to a string of wins that ended up a lot closer than they should have been, but were always going to play at a higher level in the postseason when the games really mattered.

The week before, in the divisional round, Kansas City had 71.5% offensive DVOA and -23.2% defensive DVOA in the first half of the game, before Patrick Mahomes went out with an injury. So that's one and a half games in the playoffs where the Chiefs have been absolutely dominant. If they've truly "flipped the switch," they've done it on both offense and defense.

I was not a believer in the "flip the switch" theory because we've never really seen it happen in the NFL. We've had teams go on runs in the postseason, sure, but those were lesser teams in the regular season that got hot, not defending champions with the expectations that the Chiefs had this year. We've seen really good teams with high expectations win a ton of close games and then lose in the playoffs, suggesting that maybe they weren't as good as their win-loss record. There hasn't been a team that underperformed in the regular season and won a bunch of games anyway and then turned it up to a championship level in the playoffs.

Maybe the closest team to this year's Chiefs is the 2009 Indianapolis Colts. That Colts team did far outplay its Pythagorean projection and DVOA, winning a number of close games in the second half of the season before they sat their starters in the final game and a half of the regular season and took two losses. The 2009 Colts won their first two playoff games fairly easily. But in the Super Bowl, the switch flipped back, and they lost to New Orleans despite being 4.5-point favorites.

I went to take a look at the Chiefs compared to a number of other recent outstanding offensive teams, and the way the Chiefs played with a late lead does stand out. In the fourth quarter with a lead of more than eight points, the Chiefs' offensive DVOA dropped to -15.2% while their defensive DVOA ballooned to 23.6%. There's no question, the Chiefs simply let up with a big lead late. Past research suggests that this does give us information about how good the Chiefs will be in the future. But that research is always worth reexploring and questioning. I know there are other metrics that have been improved by lowering the weight of plays when a team has a very high or low win probability.

The Chiefs' decline when defending a big lead late is not unique, but it is distinctive. I went back and looked at ten other top teams from the last few years, from the 2007 Patriots to the 2019 Ravens. The only one of those teams that had a similar trend with a big lead late was the 2011 Green Bay Packers, a team I've already compared the Chiefs to in this space. Like the Chiefs, the 2011 Packers played a number of games that were only close because their opponents scored meaningless points late in the fourth quarter.

The aforementioned 2009 Colts also saw their offense go into a shell with a late lead, but unlike the Chiefs and the 2011 Packers, their defense was just as strong when defending late leads.

I also looked at the Chiefs themselves in 2018 and 2019 and, what do you know, the Chiefs show the same trend in 2018. The decline on offense was even stronger. And like this year's Chiefs, the 2018 Chiefs had some games where opponents scored late to make things closer than they were most of the game. In Week 1, a 31-12 lead against the Chargers ended up 38-28. In Week 8, the Broncos scored twice in the fourth quarter to make the final score 30-23. In Week 13, the Raiders scored 17 points in the fourth quarter to finish the game 40-33. And in Week 15, it actually cost the Chiefs when they allowed the Chargers to score two fourth-quarter touchdowns and a 2-point conversion to win 29-28.

The 2019 Chiefs also showed this trend, although not as strongly as the 2018 and 2020 teams.

Other teams I looked at kept the offense humming even with a late lead. As you probably can guess, the Patriots always kept their offenses going with a late two-score lead. So did the 2013 Broncos and recent Saints teams. Last year's Ravens were not as good on offense with a big fourth-quarter lead but their defense got even better.

Here's a look at these teams. It's by no means a scientific study, but I think it's interesting.

Top Offensive Teams and Performance Up 9+ Points in Q4, 2007-2020
Year  Team  W-L   Offense (Q4 Up 9+)  Offense (All)  Defense (Q4 Up 9+)  Defense (All)
2007  NE    16-0  36.8%               44.1%          0.6%                -5.7%
2009  NO    13-3  35.6%               24.7%          0.1%                -4.0%
2013  DEN   13-3  34.5%               34.2%          -2.9%               0.0%
2011  NE    13-3  28.6%               32.6%          31.4%               15.2%
2010  NE    14-2  22.4%               42.8%          27.6%               3.2%
2011  NO    13-3  20.9%               33.5%          8.4%                5.8%
2019  BAL   14-2  11.9%               28.2%          -16.1%              -11.5%
2018  LAR   13-3  10.0%               25.0%          15.6%               1.0%
2019  KC    12-4  8.2%                23.5%          11.1%               -2.6%
2011  GB    15-1  -15.0%              34.2%          31.9%               9.8%
2020  KC    14-2  -15.2%              23.9%          23.6%               4.9%
2009  IND   14-2  -16.9%              17.2%          -2.6%               -0.9%
2018  KC    12-4  -26.1%              35.4%          18.8%               7.7%
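
To make the late-lead effect easier to compare across teams, the sketch below subtracts each team's overall rating from its rating when up 9+ points in the fourth quarter, using the numbers in the table. The "combined swing" (offensive drop plus defensive rise) is a made-up summary for illustration, not an official Football Outsiders stat. Remember that positive defensive DVOA is bad, so a rise on defense is a decline in play.

```python
# Rows copied from the table above:
# (year, team, offense Q4 up 9+, offense all, defense Q4 up 9+, defense all)
ROWS = [
    (2007, "NE", 36.8, 44.1, 0.6, -5.7),
    (2009, "NO", 35.6, 24.7, 0.1, -4.0),
    (2013, "DEN", 34.5, 34.2, -2.9, 0.0),
    (2011, "NE", 28.6, 32.6, 31.4, 15.2),
    (2010, "NE", 22.4, 42.8, 27.6, 3.2),
    (2011, "NO", 20.9, 33.5, 8.4, 5.8),
    (2019, "BAL", 11.9, 28.2, -16.1, -11.5),
    (2018, "LAR", 10.0, 25.0, 15.6, 1.0),
    (2019, "KC", 8.2, 23.5, 11.1, -2.6),
    (2011, "GB", -15.0, 34.2, 31.9, 9.8),
    (2020, "KC", -15.2, 23.9, 23.6, 4.9),
    (2009, "IND", -16.9, 17.2, -2.6, -0.9),
    (2018, "KC", -26.1, 35.4, 18.8, 7.7),
]


def late_lead_swings(rows):
    """Return (year, team, offensive drop, defensive rise), biggest swing first."""
    swings = []
    for year, team, off_q4, off_all, def_q4, def_all in rows:
        off_drop = round(off_all - off_q4, 1)  # how much worse the offense got
        def_rise = round(def_q4 - def_all, 1)  # how much worse the defense got
        swings.append((year, team, off_drop, def_rise))
    # Sort by the combined swing, an illustrative summary only.
    return sorted(swings, key=lambda s: s[2] + s[3], reverse=True)


if __name__ == "__main__":
    for year, team, off_drop, def_rise in late_lead_swings(ROWS):
        print(f"{year} {team:<3} offense -{off_drop:.1f}  defense +{def_rise:.1f}")
```

By this rough measure the 2018 Chiefs, 2011 Packers, and 2020 Chiefs sit at the top of the list, in that order.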

I have to admit that the Chiefs' performance over the last two weeks has convinced me that the "flip the switch" theory is more accurate than not, and that we need to correct for this tendency in our playoff odds. To try to do that, I created playoff odds based on giving Kansas City their final offensive DVOA from 2018. That's the highest of the three seasons for Kansas City's offense, tied for the third-highest offensive DVOA ever, and I think a better proxy for what to expect from the Kansas City offense for the long term. Of course, it's not as good as how the Chiefs have played in the last two games, but we've never done projections based strictly on a two-game sample.

Even with this adjustment, we still only come out with Kansas City winning the Super Bowl 53.2% of the time. That's lower than the odds we had for them in Super Bowl LIV, a year ago at this time. As I said at the start of this article, our numbers really like Tampa Bay this year. That's a very good team that the Chiefs have to beat in two weeks.

I know some of you are wondering, so: If we used the regular weighted DVOA -- adjusted to remove Week 17, like I've been doing the last two weeks -- we would have Tampa Bay as favorites to win the Super Bowl 54.2% of the time. Neither of these projections gives Tampa Bay any home-field advantage.

Voting for the 18th annual Football Outsiders reader awards should start later this week, so look for that banner on our front page and make sure to get in your votes!

Comments

66 comments, Last at 29 Jan 2021, 11:16am

1 The 2019 Chiefs also showed…

The 2019 Chiefs also showed this trend, although not as strongly as the 2018 and 2020 teams.

'course, that's the year where Mahomes missed a bunch of games and gimped through some more, so they had fewer late leads.

Is this a Reid thing? Did his good Eagles teams show this?

His pre-Mahomes Chiefs teams were infamous for coughing up big leads.

4 I was wondering how much of…

I was wondering how much of it is coaching, whether Reid or Bieniemy/Spagnuolo or all of the above. It's a hard thing to test, but it would make some sense to switch to calling basic/vanilla plays when you've got the game in-hand. May as well not give future opponents more looks at your best material, if you can afford to do so.

That might also explain why a couple of other teams, NE and NO (also having reputations for good coaching), show up a few times. Although not all of their seasons do. But then, they wouldn't consistently have top-performing offenses year-in/year-out. So... *shrug* Maybe someone could look into the role of coaching in this phenomenon in more depth.

2 Objectivity or subjectivity?

Interesting analysis to produce KC as a slight (very slight) favorite (to win via simulations) 53.2%. But this treats KC differently than all others, especially TB. Omitting week 17, TB comes out the marginal favorite? Under any scenario it ought to be necessary to treat both identically? Or did I miss something?

5 I don't believe the "flip…

I don't believe the "flip the switch" theory, exactly.  And I'm a Chiefs fan.  I do believe they get "up" for strong opponents and tend to take lesser foes too lightly.  I guess you could say that is some variant of the theory.  What I really believe is that the OL got healthy and the secondary came together in the playoffs.

I think the dip in the offense as the year went on was mostly due to decreased OL play.  Some of that was attrition as Osemele and Schwartz went out.  Some of it was injuries to Remmers and Fisher that diminished their ability.  The uptick in the playoffs was largely due to a two-week rest period that got Remmers and Fisher fully healthy and the unit getting its coordination together.  Now the loss of Fisher could hurt the team significantly.  Note that going into the season, the starting OL L-R was planned to be Fisher - Osemele - Reiter - Duvernay-Tardif - Schwartz.  It looks like the SB line will be Remmers - Allegretti - Reiter - Wisniewski - Wylie.  That is a significant vulnerability against a good TB front four.  Line coach Andy Heck has his work cut out for him.

The secondary has played much better in the playoffs, in part, I believe, because the top 3 corners have settled into their roles.  It wasn't until late in the year that Breeland, Ward, and Sneed played much at the same time.  And Sneed, a 4th round rookie*, started the year at outside corner before moving into the slot after he came back from injury.  I believe he really benefited also from the two weeks between Week 16 and the Divisional Round in learning to play that position.  Breeland being cleared from the concussion protocol was a key factor in the win on Sunday and the Chiefs will definitely need Sneed  to be cleared for the SB.  Also, I think Thornhill has finally gotten close to full speed after his ACL tear at the end of 2019 - this week was the first time he has really "showed up" to my untrained eye.  The position coaches Madison and Merritt deserve a lot of credit for this unit coming together.

*GM Veach said in a post-game press conference they were lucky to get Sneed and that in hindsight he should have been a first round pick.  But he moved to safety his final year in college after starting out as a CB and teams mis-graded him.

9 Starting O-Line

I'm fairly certain they picked up Osemele after Duvernay-Tardif opted out and were planning on going with Wylie as LG.

Of the five, two were 7th-round picks: Allegretti (2018) and Reiter (claimed by the Chiefs off waivers). Two were UDFAs: Wylie (2018) and Remmers (free-agent pickup in 2020). Stefen Wisniewski was a 2nd-rounder but was picked up as a free agent when he was waived by Pittsburgh in November.

This is a remarkable testament to GM Veach and OL Coach Heck.

44 I'm fairly certain they…

I'm fairly certain they picked up Osemele after Duvernay-Tardif opted out and were planning on going with Wylie as LG.

You are correct, I had thought they added Osemele earlier. Anyhow, they certainly didn't envision Wylie at RT in preseason.

6 Defense

How has the old adage "defense wins championships" held up statistically in recent years?

7 Denver in 2015 was the last…

In reply to by Gladiator of t…

Denver in 2015 was the last true “defensive team” to win a Super Bowl. There were also a couple balanced teams with great defenses and offenses (i.e. the 2013 Seahawks, 2010 Packers) over the past decade. Ultimately, per DVOA, championship teams are usually no worse than average on defense but there’s a much stronger correlation to offensive DVOA.

41 I'd argue that the late…

I'd argue that the late season 2018 Patriots team that beat the Rams was a lot more like the 2019 team than it was the early season 2018 team, and was a "true defensive team" - but yeah, defense wins championships is only a thing if you also have a good offense at this point. 

43 2017 in theory

In theory 2017 saw the #5 defense by DVOA beat the #1 offense.

 

Of course, it was probably decided by the #7 offense also running roughshod on the #31 defense.

8 I’m sorry but just because a…

I’m sorry but just because a good team had two really good games and there is a very subjective feeling that they sometimes coasted doesn’t mean you should change your projection system. A better way of checking would be to use their season long mean and variance to estimate how likely these games would be if their season long performance was their “true” distribution of performance. It should be a pretty simple MCMC analysis. Until you do that or something similar this is just subjective manipulation so that you don’t look bad if KC plays a third good game in the Super Bowl.
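
A rough version of the check described above can be sketched with a plain normal approximation, which is a simpler stand-in for the MCMC analysis the comment mentions. It assumes single-game ratings are independent draws from one normal distribution, which is a strong assumption. The function names are hypothetical and the demo ratings are placeholders, not Kansas City's actual game-by-game numbers (those are not listed in the article).

```python
from statistics import NormalDist, mean, stdev


def upper_tail_probability(game_ratings, observed):
    """P(single-game rating >= observed), treating games as draws from a
    normal distribution with the sample mean and standard deviation."""
    dist = NormalDist(mu=mean(game_ratings), sigma=stdev(game_ratings))
    return 1.0 - dist.cdf(observed)


def prob_at_least_one(p_single, n_games):
    """Chance of seeing at least one such game in n independent games."""
    return 1.0 - (1.0 - p_single) ** n_games


if __name__ == "__main__":
    # PLACEHOLDER values, not Kansas City's actual single-game ratings.
    placeholder_ratings = [10.0, 20.0, 30.0, 40.0]
    p = upper_tail_probability(placeholder_ratings, observed=35.0)
    print(f"single game: {p:.3f}; at least once in 18 games: {prob_at_least_one(p, 18):.3f}")
```

With real game-by-game ratings plugged in, a very small probability would suggest the playoff games are hard to explain as ordinary variation, while a moderate one would support the commenter's skepticism.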

10 It's important to note that…

It's important to note that even with a small sample issue (18 data points), it is still generally the case that more recent performances tell us more about a team than early-season performances. Another thing to remember is this is a performance-based projection system, so ultimately its utility is based on whether it accurately matches reality. Finally, the choice of what data is used and how much weight it's given is itself an arbitrary choice, so complaining about arbitrariness is a bit thick. I think their analysis is basically correct: the game could easily go either way, but given even odds, bet the Chiefs.

12 None of those things are…

None of those things are true.

-Aaron says all the time that full season DVOA is equally predictive of future performance as weighted DVOA

-It is a *statistical* performance-based projection system that has been tested and refined for almost two decades now. Its utility isn’t in predicting one single game’s outcome at all. It’s to describe previous play-based performance (by calculating DVOA) and make predictions of future play-based performance, which it does really well! To just see two games and go along with a popular story about what’s going on with Kansas City is the opposite of what FO stands for.

-The choice of what data to use is absolutely not arbitrary, or at least it hasn’t been in the past. https://www.footballoutsiders.com/dvoa-ratings/2020/introducing-dvoa-v73 That article details how painstaking the process was to improve the predictive power of weighted DVOA. It’s not arbitrary at all. In playoff odds they sometimes make a little more arbitrary moves like not considering games with a backup QB or something like that, but that isn’t the case here. I would, for example, be fine with excluding the offensive snaps in the Cleveland game after Mahomes got hurt.

Now instead of using the data they’ve collected from this season, for very arbitrary reasons, they’ve decided instead to use data from the last three seasons as a sample despite significant reasons to think that it’s not a good match. They don’t have Kareem Hunt any more, their OL has had a lot of injuries this year (I believe), and of course the scheme has evolved as well. This is a decision that frankly I was very surprised to see and doesn’t fit with normal FO standards.

14 It is a *statistical*…

It is a *statistical* performance-based projection system that has been tested and refined for almost two decades now. Its utility isn’t in predicting one single game’s outcome at all. It’s to describe previous play-based performance (by calculating DVOA) and make predictions of future play-based performance, which it does really well! To just see two games and go along with a popular story about what’s going on with Kansas City is the opposite of what FO stands for.

DVOA assumes teams go all out, all the time.

Aaron tends to find that garbage time still matters, but I have a suspicion what it's really seeing is confirmation bias of regression to the mean -- most teams aren't actually as good/bad as their best/worst performances, so garbage time returning closer to 0 values matches expected behavior -- most teams are not outliers.

But those assumptions make DVOA blind to teams that actually are slacking. This is relatively rare in football (it's more physically dangerous to loaf than other sports), but it's relatively common in the NBA. This isn't based on two games. Quite a number of posters have been suggesting all season the Chiefs were pulling a Milton Berle.

https://www.urbandictionary.com/define.php?term=Milton%20Berle

And then suddenly the Chiefs get to the playoffs and look like the juggernaut we were expecting all season. Now it might be two random games. But it's two random games from a team that's suspiciously good in close games and just as much better against great teams as they are against terrible teams. The Chiefs have been turtling teams all season -- no matter how good a team is, the Chiefs are 3-points better.

https://www.youtube.com/watch?v=cymQHdMEUrQ

Now this could be freakish odds luck. Or it could be the Chiefs aren't playing with a fair die.

16 suspiciously good in close games

The Chiefs beat 4 out of the final 8 teams in the playoffs too. Pack beat 2 out of 8 and lost to the other one they played (Bucs). That should also count for something.

17 Ok so taken at face value…

Ok so taken at face value that Chiefs were coasting somewhat in games that “didn’t matter to them” why not take a better approach? Why not lessen the weighting of the second halves of games where they “weren’t trying”? There’s no reason to think that they weren’t trying so much that they didn’t care about the first half, right? Or if you want to take an extreme position, lower the weighting on all of those games?
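
The down-weighting idea in this comment can be sketched as a weighted average in which flagged plays count for less. This is only an illustration of the arithmetic: it is not how DVOA is calculated, and the function name, the `late_weight` parameter, and the play values in the demo are all hypothetical.

```python
def weighted_rating(plays, late_weight=0.5):
    """Weighted mean of play values.

    plays: iterable of (value, is_late_big_lead) pairs, where value is a
    per-play rating in percent and the flag marks plays run while coasting.
    late_weight: weight given to flagged plays (1.0 means no adjustment).
    """
    total = 0.0
    weight_sum = 0.0
    for value, is_late_big_lead in plays:
        weight = late_weight if is_late_big_lead else 1.0
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no plays with positive weight")
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical play values: two strong early plays, two weak late ones.
    plays = [(40.0, False), (40.0, False), (-20.0, True), (-20.0, True)]
    for w in (1.0, 0.5, 0.0):
        print(f"late_weight={w}: rating {weighted_rating(plays, w):.1f}%")
```

The choice of `late_weight` is exactly the arbitrary knob the commenter is worried about: any value below 1.0 raises the rating of a team that plays worse with a late lead.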

But again we are getting into super duper arbitrary territory here. I don’t see any evidence or reason to buy that they weren’t trying but for the sake of argument; there are many other explanations that fit the data just as well. The assumption should be that they were trying and that their performance in the regular season was their true performance, and then you try to disprove that assumption. Two games wouldn’t do that no matter how you test it statistically, because of course it wouldn’t.

I disagree with tinkering with the formula in this scenario. All they have done is predict it as a 50-50 game, to cover their asses either way the game goes instead of following what the data say. If you want to make arbitrary decisions and pick and choose your data like this in a super bowl preview article or something, then fine. But explicitly including it in the playoff odds is a bad idea.

20 I’ll also just point out…

I’ll also just point out that the theory as described in the article is that the Chiefs took it easy with big leads in the regular season. They barely had any games that would qualify under this definition.

Week 1 (HOU)- they had a big lead (31-7, but 24-7 entering fourth) but if you’re seriously suggesting they decided to coast bc they couldn’t be bothered to try in *week 1* bc they knew they would make the playoffs then that’s ridiculous

Week 3 (Balt)- Up by two tds entering fourth, they gave up a TD 5 seconds in and kept trying, scoring another TD to ice it.

Their next big lead that they could have conceivably coasted with wasn’t until being up 27-10 on Tampa Bay in Week 12 and the game ended 27-24. Again, if you really want to seriously suggest they lost this lead because they coasted and not because the Bucs are also a great team, then you are on the wrong site.

Week 14 in Miami they were up 30-10 and ended up winning 33-27. If you want to pick any game as evidence of the switch flipping, then I guess this is it. But it’s still weak evidence and confirmation bias.

That’s it. That’s the extent of the big leads that they ended up closer in score by the end of the game.

If you want to try the other way, which is that they didn’t care as much in late season games where the opposition was weak, you have two games of evidence: 17-14 at ATL, 22-16 vs Denver. But the Falcons gave the Bucs a game in Week 17 before the Bucs pulled away late, and generally weren’t a total pushover (17th in full season DVOA). There’s no reason to think they couldn’t give the Chiefs a game either.

26 It's what I'm here for. …

It's what I'm here for.

\there have been a bunch of posters who have been posting long streams that are mostly replies to themselves
\\which quickly becomes impossible to read on your phone

28 While I also wouldn't assume…

While I also wouldn't assume that the "coasting" theory is correct based on two playoff games, I think your criticisms of it are a little too... rational?

If the Chiefs were indeed coasting, I doubt it was a formal strategy.  It's more likely it was a more emotional response; they just won a Super Bowl, they were extremely confident they'd make the playoffs no matter what(*), and football is a physically and emotionally taxing sport.  I'm not surprised that they would relax and/or lose focus a little bit when they already had the lead.

(*) This is my only argument with your specific points: I think in week one they did "know" they would make the playoffs, barring a serious injury to Mahomes.

36 Against the Texans

I have been trying furiously to point out this game in particular, as the failure of DVOA, as currently constructed, as a valid metric when measuring the Chiefs.

Aaron has mentioned before that the Chiefs had a negative VOA in this game (in hindsight Houston becomes a weaker opponent, but they were not only a 2019 playoff team but actually beat the upstart Bills).

Chiefs won the Turnover battle 1 - 0 which is not unusual since they have Patrick Mahomes at QB and their main RB is CEH who was known for ball security in college.

The Chiefs outgained the Texans 369 - 360 and 28 - 21 in First downs. Based on these numbers one could make an argument that the Chiefs got lucky with some high variance 3rd downs, etc. and the 14 point difference did not represent the true gap between the teams as DVOA would also argue.

But let's look at the stats at 11:25 of the 4th Quarter. The Chiefs led 31 - 7.

Yards: KC 347, HOU 215. That's right folks, Houston had 145 garbage-time yards.

First Downs: KC 27 Hou 12 (9 garbage time FDs)

Now DVOA rates these two teams as about equal. So does the traditional Yards Gained. But Andy Reid is known for taking his foot off the gas and easing up.

The prevent defense gets a bad rap because we generally never give it credit when it works, only blame when it fails.

Interestingly the Chiefs had played a better version of these Texans just a few months prior. In the playoffs, they famously fell behind 24 - 0 before winning 51 - 31. The yards gained in this game were also fairly equal (Houston +8), and the teams were even in turnovers.

So for whatever reason it is clear DVOA underrates the Chiefs. Now it is an interesting question of why. But the idea that this team, coming into last week at #6, was worse than New Orleans, Tampa and Buffalo, all of whom they had beaten on the road, and just slightly better than the Ravens, whom they beat by 14 on the road? Well, it is just not right.

The Chiefs outgained the Bills 439 - 363, but if we take out the action after the Chiefs went up 38 - 15 midway through the 4th Quarter that number becomes 408 - 272. For some odd reason DVOA agreed it was a dominant performance. But so was the performance against the Texans (watch the game.)

38 To be clear, I’m not arguing…

To be clear, I’m not arguing that teams don’t take their foot off the gas sometimes, including the Chiefs. Of course that happens. But what I object to is somehow saying that the Chiefs have some sort of systematic trend towards doing this that is so much greater than other teams that it somehow warrants special consideration and special tweaks to the numbers.

One game doesn’t mean anything, especially when what you’re really talking about is a small subset of the game. What Aaron is arguing is this: we take the 4 games I listed where they were up big in the 4th, and say their poor performance was all caused by this fault in the Chiefs that they take it easy of their own volition. (He’s actually saying all of the games where they had big leads are poor based on the table he shows in the article but in the other games they had a big lead they still won big.) Then he’s arguing that these 4th quarters are so biased and are dragging down their rating so much that instead a better estimate of their full season offensive DVOA is actually their DVOA from two years ago that was 10% higher when they had Kareem Hunt and a healthy OL!? It doesn’t make any sense.

46 In my opinion

It isn't that they systematically let their foot off the gas differently than everyone else (although they may do it a bit earlier) but they do it more often as a percentage of their games. There is also a human element to it. They are the defending champs and you could tell, they didn't get up for the "ho-hum" games they were big favorites in. But when they played a team that was considered in their tier (BAL, BUF, NO, TB, MIA, CLE, BUF) they turned up the intensity and focus. It's clear when you watch it, but that doesn't do any good from a modeling standpoint. You can't model that. I'm fine with not making any adjustments to DVOA and saying "DVOA says TB is the superior team." But if that doesn't line up with reality, then you take a look at your model, which is what Aaron seems to be attempting with adjustments. It's an unenviable position to try and model outliers, but KC seems to be one.

50 My last comment on this…

In reply to by nsheahon

My last comment on this thread bc I’m tired of debating it, but here’s the rub with this whole thing: everyone keeps saying they turn down the intensity against bad teams and turn it up against good teams, or they keep saying what Aaron is saying which is that they coast with big leads. But on your list of good teams they’ve played where they supposedly turned it on, three of those games are the big leads that they’ve supposedly coasted with: MIA, TB, and BAL. And on the list of bad teams they’ve played where they supposedly coasted is some of the games where they have done fine, blown out the opposition, and held their leads in the 4th (NYJ and 1st DEN game).

It’s all circular and it’s honestly kinda conspiracy theory-ish, seeking over and over to confirm the belief that KC is so unusually good or lackadaisical or something that they don’t care except against good teams that pique their interest (or against bad teams they only care enough to eke out a win, or...something?), despite loads of evidence to the contrary and no rational reason to believe they are actually doing the thing you think they are doing. People keep saying “anyone watching KC knows they do this” and I just disagree that we know anything at all. You’re fitting a story to the data that the data don’t support in any meaningful way.

This is all on top of the problem of tinkering with the numbers in a biased and arbitrary fashion, picking numbers from two years ago instead. They have literally taken the whole season’s worth of data on KC’s offense and thrown it out the window as not worth using because of some pet theory. What the hell kind of Football Outsiders in-depth analysis is that?

There are multiple layers of problems with this analysis, and to what end? Kansas City is a great team, you don’t have to massage the data to be able to say that is the case! Why change the data so that the SB odds flip from 55-45-ish in favor of Tampa to the reverse? Is that even meaningful? At the end of the day it’s a coin flip either way so why even bother. It doesn’t make any sense from any perspective whatsoever.

30 Milton Berle

I have to correct a terrible wrong that has been perpetrated here since you made this post.

That Milton Berle link/story is one of the funniest things I've ever heard. How long has this been a thing? I'm old, but not old enough to remember Uncle Miltie from his TV days (except for a disastrous SNL hosting gig in the 70s), and I've never heard of this before. That man was a god, not because of his physical gift, but because he Milton Berled--just enough to win, every time.

31 I remember Simmons making…

In reply to by Bobman

I remember Simmons making reference to it back when he was still on Page 2, and when ESPN still had a Page 2 (and some remnant of their soul).

35 Aaron tends to find that…

Aaron tends to find that garbage time still matters, but I have a suspicion what it's really seeing is confirmation bias of regression to the mean -- most teams aren't actually as good/bad as their best/worst performances, so garbage time returning closer to 0 values matches expected behavior -- most teams are not outliers.

This is a very interesting idea. 

42 ", but I have a suspicion…

", but I have a suspicion what it's really seeing is confirmation bias of regression to the mean"

This is ABSOLUTELY the case. DVOA finds garbage time predictive because what typically happened before garbage time is a significantly large DVOA value, and every team trends away from those values. 

It's the same sort of thing that causes all their preseason win projections to be overly conservative. 
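
The regression-to-the-mean argument in this sub-thread can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the skill and noise spreads, the blowout cutoff, and the function name `simulate` are all hypothetical, and nothing here reflects how DVOA is actually computed. Each simulated team plays exactly to its true level plus random noise both before and during garbage time, so nobody "lets up".

```python
import random
from statistics import mean

def simulate(n_games=20000, skill_sd=0.10, noise_sd=0.20,
             blowout_cutoff=0.30, seed=0):
    """Return (mean early rating, mean late rating) over simulated blowouts.

    All parameters are illustrative assumptions, not real DVOA values.
    """
    rng = random.Random(seed)
    early_vals, late_vals = [], []
    for _ in range(n_games):
        skill = rng.gauss(0.0, skill_sd)          # true per-play quality
        early = skill + rng.gauss(0.0, noise_sd)  # play before garbage time
        late = skill + rng.gauss(0.0, noise_sd)   # play during garbage time
        if early > blowout_cutoff:                # keep only games that became blowouts
            early_vals.append(early)
            late_vals.append(late)
    return mean(early_vals), mean(late_vals)

early_avg, late_avg = simulate()
print(f"before garbage time: {early_avg:+.3f}")
print(f"during garbage time: {late_avg:+.3f}")
```

In this setup the garbage-time average falls well below the pre-garbage-time average purely because blowouts were selected on an extreme early value. The late plays sit closer to the team's true level, which is one way they could look "predictive" without any change in effort.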

18 Quick note on weighted DVOA

>> Aaron says all the time that full season DVOA is equally predictive of future performance as weighted DVOA

With the new weighted DVOA -- I did research and changed the weights this year -- this should no longer be true. The difference is small but weighted DVOA should now be slightly more predictive.

21 Yes in the article…

Yes, in the article introducing the new DVOA you say that full season DVOA predicts future DVOA with 0.483 correlation, while the new weighted DVOA predicts it with 0.487 correlation. Is that technically better? Sure.

But 0.483 vs 0.487 is essentially the same in my book.

23 Yes

Turns out it's remarkably hard to get a weighted rating that turns out to be more predictive than a total rating!

49 It is potentially awkward…

It is potentially awkward that standard DVOA produces a prediction that is so wildly out of line with the betting markets. That doesn't justify such a blatant fudge. You accept and acknowledge that your model may be missing something, and trust that readers can think critically for themselves about why the numbers might be wrong. 

58 Betting markets can be…

Betting markets can be biased as well (they move with the money or open with the expected money). Look at the recent Conor McGregor fight: he was favored on the money line at -300, and no way any unbiased observer had him as a significant favorite.
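
For readers who don't follow betting lines, here is a minimal sketch of how an American moneyline such as -300 maps to an implied win probability. The function name `implied_probability` is made up for illustration, and the conversion ignores the bookmaker's margin, so real "fair" probabilities would be somewhat lower.

```python
def implied_probability(moneyline: int) -> float:
    """Convert an American moneyline to the win probability it implies.

    Ignores the bookmaker's margin (the "vig"), so this is an upper bound
    on what the market actually believes.
    """
    if moneyline == 0:
        raise ValueError("a moneyline of 0 is not a valid price")
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100.
        return -moneyline / (-moneyline + 100)
    # Underdog: risk 100 to win moneyline.
    return 100 / (moneyline + 100)

print(implied_probability(-300))  # 0.75
```

So a -300 favorite is being priced at roughly a 75% chance to win before the margin is taken out.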

60 We can argue about "unbiased…

We can argue about "unbiased", but McGregor seemed to be the favorite among MMA followers, too. -300 might be too high, but I don't think it's reasonably arguable that Poirier was favored.

He was White's favorite, but White is strongly biased towards McGregor's superior draw numbers.

63 I've worked in the betting…

I've worked in the betting industry, and I can assure you that honestly isn't true. These days, betting markets are highly efficient on the bigger sporting events (and it doesn't get much bigger than the Super Bowl).

I know nothing about MMA, but unless you had a proper bet yourself, you don't get to claim the market was wrong in hindsight (and if you did then congratulations, great call!).

11 Bucs or Chiefs

Does anyone think the Bucs played a great game Sunday? Yes, their defense is very good. Yes, they beat the Packers and Rodgers in Green Bay. However, Brady threw 3 picks and the Packers gave them a couple of turnovers too. Contrast that with how KC played after going down 9-0 to the Bills - Mahomes carved up a good defense and the Chiefs D harassed Josh Allen all day. I'm not buying into the "flip the switch" narrative, KC is a team that would usually jump out to a big lead then allow some garbage time points at the end. You look at both championship games - which team looks like the SB champs, and which one looks like they just barely won?

I don't care what defense is out there (see last year's supposedly dominant 49ers in the Super Bowl) Reid, Mahomes and company will find a way to score. I think the Chiefs D will also find a way to mess with Brady and will find a way to win. I don't know that it will be a multi score win, but I could see them win by a TD. 

15 Weird that you didn't mention that the Bucs were one too..

The Bucs were down 17-0, then 27-10 into the 4th and then got within 3 when the Chiefs easily wound down the clock with ~4+ minutes left. Garbage points where the Chiefs easily won that game.

 

I don't know how to adjust for this with DVOA - but is there any way to look at how a team usually plays vs. how they play against another team? Can we see how aggressive offenses are on average vs. how aggressive they are against the Chiefs? This would show that the Chiefs defense has to go against a 'harder' offense on avg.

25 Their offense is a little…

In reply to by barf

Their offense is a little better (as that disaster against NO recedes) and their defense is a little worse. KC appears to be about the same team, too.

34 This has always been my gripe

About garbage time. You don't know until the game is over and even then. The Falcons have blown a lot of games (and not just leads) where they had 99% WP or whatever. Was the first Dallas TD that dropped it garbage? Turned out it wasn't and is what started the comeback. 

37 Problematical Modelling

The problem with modeling is that if you ignore human nature (to let up with a big lead, and to try harder in order to not get embarrassed) and worry about the 1% of the time your model might fail, you end up with a model that doesn't do so well the other 99% of the time.

So in hindsight all the times Atlanta lost with a 99% chance weren't garbage, but that doesn't mean there is no such thing as garbage time.

I will point out again, as I did in my much longer post: Chiefs vs Texans, Game 1. With the Chiefs up by 3 TDs and 3 two-point conversions (31-7) with 11 minutes left, the Texans went on two long TD drives that ate up most of the remaining clock. They missed the 2 pointer on the first TD so they never got to a one score game.

That was garbage time, and despite the teams being relatively even in Yards Gained and VOA, the Chiefs went 13-2 the rest of the way and the Texans 4-11. If you ignore garbage time you realize after this game the Chiefs were superior; if you don't ignore it you are shocked the Texans didn't make the playoffs.

39 Oh I know it exists

But it's kinda hindsight. Like the Bears TD vs the Saints this month was for sure. But it's hard to tell during a game. Gotta start that comeback at some point. And if you start letting up you better close out real well as to not get Falcons'd.

I get why people/models do use em but also understand if they don't. 

40 There is an objective…

In reply to by ImNewAroundThe…

There is an objective garbage time -- the time in which a comeback is not mathematically possible in the time remaining.

This is a very small window. The original analysis was for basketball. The window for a 7 point lead is something like 3 seconds. If you're up 7 with 3 seconds left, you've won. Your opponent can't hit two threes plus a free throw in that amount of time. The 99th percentile window is quite a bit larger, though.

47 Nobody is doing that...

The filters are always something like 20/80 or, at best, 5/95. Yet we've seen multiple games where a team came back from 95%+. No one has made one that is .5/99.5 or greater that I've seen.

45 Worse than cherry-picking,…

Worse than cherry-picking, really.

To try to do that, I created playoff odds based giving Kansas City their final offensive DVOA from 2018.

You’re using two-year old data? That’s very close to “just making shit up”. I suspect you first tried several ways of excluding plays, quarters, games, etc, to get the result you wanted. Which would have been bogus, too.

The honorable thing would have been to post the playoff odds based on your system, and to add commentary to the effect that you think the Chiefs will try harder in the Super Bowl. Or maybe you could have computed the DVOA for first half of games only, and posted that in the commentary.

In the end, it doesn’t matter. Only the game matters. But it’s too bad you’ve pushed FO statistics further into irrelevance. You might as well call them “Power Rankings” and drop the statistical methods entirely.

Sigh.

48 I'm shocked they have done…

I'm shocked they have done this. Now, when a rating system spews out a result so significantly out of line with the betting market (which implies KC ~61%), on such a big event as the Super Bowl, there should indeed be a search for explanations before putting our trust in the numbers. But that isn't really the point. No rating system is perfect. And readers can be trusted to think for themselves. It is entirely obvious that the Chiefs' sustained success over 3+ seasons with this regime might be worth something additional on top of single season DVOA. 

As you say, write a commentary. Make it an off-season project to look for improvements. Don't just blatantly fudge the f...kin numbers. 

52 Not responding directly to…

Not responding directly to you, just prompted by your thoughts. I agree with what you say; this just feels like a good spot to elaborate a bit.

Not to mention that DVOA is a measure of PER PLAY EFFICIENCY. Just because you are more efficient on a per play basis does not necessarily mean you will have a better outcome for a drive or a game. It means it is more likely, yes, and that is the main predictive power of DVOA, but it isn't necessary.

I can't find it, but a few years ago, I think for a Jets game, they actually showed some of the play by play DVOA data. As I recall there are situations where the results are the same (yardage, number of plays, and score) but due to how the teams got there one team has a better DVOA. It's not a massive effect, but it is there. Some of it stems from longer plays getting "discounted". Basically getting 60 yards on 1 play is not worth as much DVOA as getting 15 yards on 4 plays in a row. 60 yard plays are less repeatable than 15 yard plays. They are both worth a lot, but since historically a 15 yard play is much more likely to work it gets more DVOA.

THIS IS NOT HOW IT WORKS JUST AN EXAMPLE.
I don't actually know if the weighting works this way, but I'm putting up an example that gives 3 results for 3 different 75 yard TD drives. I'm leaving out cases where a team gets negative value on a play because of a sack or something. Just 3 different sets of positive plays. Again, the values are arbitrary; they also ignore the fact that values are different on different parts of the field, etc.

Basically on first and 10 you'll get DYAR (which is not exactly DVOA but is way easier to show) like this:
Yards 1-2 are worth 0.25 each because while something is better than nothing 1 or 2 yards on first down is not great.
Yards 3-5 are worth 0.50 each
Yards 6-12 are worth 1 each because you have been successful (55% of needed yards on 1st down). You'll also get a little "extra" for yard 10 since it converted a first down, shown later.
Yards 13-20 are worth 0.75 each because they are still of great value but less predictive
Yards 21-40 are worth 0.65 each
Yards 41-60 are worth 0.55 each
Yards 61-100 are worth 0.50 each because it's just rare to consistently get 60+ yards but they are still about as valuable as those last few yards you needed for a successful play.
A FD is worth 2
A TD is worth 6
Getting a TD gets you bonus points just like getting the first down.

First, a 75 yard TD play on first and 10 from the 25 gets you:
1-2 = 0.5
3-5 = 1.5
6-12 = 7
13-20 = 6
21-40 = 13
41-60 = 11
61-75 = 7.5
FD = 2
TD = 6
Total = 54.5 DYAR

Second, let's see what three 25 yard plays get you (this isn't quite right, as 25 yards from the 25 isn't worth quite the same as 25 yards from the 50 or 25 yards from the opponent's 25).
1-2 = 0.5
3-5 = 1.5
6-12 = 7
13-20 = 6
21-25 = 3.25
FD = 2
Play = 20.25
3xplay = 60.75
TD = 6
Total = 66.75

Again that is purely made up by me kinda guessing at the weights based off memory. The 75 yard TD play is still significantly more valuable than the 25 yard FD play but not as valuable as three 25 yard plays in a row.

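Third, fifteen 5 yard plays: 7 sets of two plays, then one more from the 5. Each play gets
1-2 = 0.5
3-5 = 1.5
Play = 2
2x play plus the FD = 4 + 2 = 6
7 of those = 42, putting you at first and goal from the 5
2 more for the last 5 yards
6 for the TD and you get
Total = 50 DYAR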

It's very valuable to get 15 plays of 5 yards in a row, but that too is difficult to sustain so the 1 play of 75 yards is worth a bit more and it's not as valuable as 3 plays of 25 yards.

Again the 5 yard marching team would have some slightly different values because of how field position comes into things.
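
The three drives above can be re-scored with a short script. This is only a sketch of the made-up weights in this comment, not of how DVOA or DYAR actually work; the names `yardage_value` and `drive_value` are hypothetical. Like the example, it applies the first-and-10 weights to every play, ignores field position, and gives no first-down bonus on a goal-to-go series. Weights are stored in hundredths of a point so the sums stay exact. Applying the listed weights strictly, with yards 61-75 counted at 0.50 each, puts the single 75 yard play at 54.5.

```python
# Made-up weights from the comment, in hundredths of a point.
TIERS = [  # (first yard, last yard, value per yard)
    (1, 2, 25), (3, 5, 50), (6, 12, 100), (13, 20, 75),
    (21, 40, 65), (41, 60, 55), (61, 100, 50),
]
FIRST_DOWN_BONUS = 200
TOUCHDOWN_BONUS = 600

def yardage_value(yards):
    """Value of the yards gained on one play, before any bonuses."""
    total = 0
    for first, last, per_yard in TIERS:
        if yards >= first:
            total += (min(yards, last) - first + 1) * per_yard
    return total

def drive_value(gains, yards_to_goal=75):
    """Score a drive given as a list of positive gains, one per play."""
    remaining = yards_to_goal
    to_go = min(10, remaining)
    goal_to_go = remaining <= 10
    total = 0
    for gain in gains:
        total += yardage_value(gain)
        converted = gain >= to_go
        if converted and not goal_to_go:
            total += FIRST_DOWN_BONUS
        remaining -= gain
        if remaining <= 0:
            total += TOUCHDOWN_BONUS
            break
        if converted:
            # New series: 10 to go, or goal to go inside the 10.
            to_go, goal_to_go = min(10, remaining), remaining <= 10
        else:
            to_go -= gain
    return total / 100

print(drive_value([75]))      # 54.5
print(drive_value([25] * 3))  # 66.75
print(drive_value([5] * 15))  # 50.0
```

Changing the numbers in `TIERS` is a quick way to see how sensitive the ordering of the three drives is to the weights.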

 

The weighting system I have there is biased towards mildly explosive teams. It really loves a 6 - 12 yard play and likes everything up to 60 yards quite a bit. It is not a fan of 1-2 yard or 3-5 yard plays, and anything over 60 is good but too rare to get bonus points. 

The common complaint by the good statisticians on the boards
Nat has commonly complained about the black box nature of DVOA and DYAR precisely because we don't know if the system really favors something an inordinate amount. My system would give the most value to a team that got 10 yards a play because of how the FD bonus works (though a team that got 6-9 or 11-12 yards a play would be close assuming that my quick set-up to test was correct). I'm not going to do all the work but you could actually calculate the break even points and see what various combos looked like and match it to various real games and see if it makes sense and commentators could argue over the weights etc, etc. 

All we know of DVOA is the inputs and the outputs and we can reverse engineer a decent amount, but it's rough. I intentionally set my system up the way I did because it feels like that is how DVOA would rate things, it does seem to favor lots of 6-12 yard plays over a handful of 25 yard plays or one or two 75 yard plays or lots of 3-5 yard plays.

How all that works and how it relates to "eyeball tests" and W-L predictions is tricky.

What this article feels like
Well my eyes and standard stats and the last 3 years say if the team = KC and they are up by at least 22 points then only assign the opponent 75% of the DYAR for plays. That would then improve the KC defense and offense for the whole game and we'd get the "true" KC. Or better yet let's just use the KC offense from 3 years ago.

What we expect to see
Historically teams that get 6-12 yards a play with no negatives have the most success. That is what Tampa Bay does. KC tends to get bunches of 8-20 yard plays and then bunches of 2-5 yard plays. They are successful, but because of what we've historically seen that doesn't look quite as repeatable, so our numbers say it's not quite as efficient. We will look at our model and see if the situations in which they get the bunches of 2-5 yard plays and bunches of 8-20 yard plays merit an adjustment. Perhaps if we adjust things for every team in those situations we'll find that DVOA and W-L records have a better correlation. Of course that takes a lot of work. So let's just realize that our model does favor certain things a little more and TB does more of that than KC. Our model also says that both teams should be very successful based on what they do.

54 The Jets game you reference …

The Jets game you reference - if I recall, it wasn't so much "discounting" plays with long yardage, but that the team with the unexpectedly high DVOA clustered their bad plays.

It was something like they had 9 drives, 5 of which were three-and-outs, but the other 4 all had steady progression downfield, with mostly successful plays.  So it was something like 10 negative plays, 5 positives on the "bad" drives, and 30 positive, 10 negative on the "good" ones.  That would average out to quite positive per play, but negative per drive.

EDIT: I think this is the game you're thinking of.
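
The per-play versus per-drive split described above can be checked with a few lines of arithmetic. This is a sketch using the rough counts recalled in this comment; how the good and bad plays are spread across individual drives is an assumption made purely for illustration.

```python
# (successful plays, unsuccessful plays) for each of the 9 drives.
# Totals match the recollection above; the per-drive split is assumed.
bad_drives = [(1, 2)] * 5                       # five three-and-outs: 5 good, 10 bad
good_drives = [(8, 3), (7, 2), (8, 3), (7, 2)]  # four sustained drives: 30 good, 10 bad
drives = bad_drives + good_drives

good_plays = sum(good for good, _ in drives)
all_plays = sum(good + bad for good, bad in drives)
play_success_rate = good_plays / all_plays

# Call a drive "good" if it had more successful plays than unsuccessful ones.
good_drive_share = sum(good > bad for good, bad in drives) / len(drives)

print(f"successful plays: {good_plays}/{all_plays} = {play_success_rate:.0%}")
print(f"good drives: {good_drive_share:.0%}")
```

With these counts, well over half of the plays are successful while fewer than half of the drives are, which is the clustering effect being described.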

55 Here's the tail end of that…

Here's the tail end of that weird Jets-Patriots game controversy for anyone who is wondering what you are referring to:

https://www.footballoutsiders.com/dvoa-ratings/2011/absolute-final-word-jets-patriots

(although there may have been more discussion in later weeks)

Aaron let us peek into the black box of DVOA back then. I would say that over the years, he's been pretty good about explaining how DVOA and DYAR work. Officially, DVOA is a proprietary black box. But when you've been around here since the DPAR days, it's more like a glass box with tinted glass. We can see a lot, just not everything.

It's possible that KC's season was affected by the same thing as affected that old game: teams that play poorly with big leads are treated harshly by DVOA because they are compared to other teams that are dominating their game. Teams that play well when down multiple scores are doubly rewarded because they are compared to other teams that are getting walloped, which are usually bad teams to begin with. There is a lot of variation in the baselines that each play is compared to.

Me, I doubt that's all that is happening here. Teams do sometimes play above themselves. There is a lot of variance in how teams play week to week. And it's quite common to have two great weeks in a row without actually changing as a team. 

In other words, while it's possible that DVOA has gotten KC a bit wrong, it's also likely that they are on a hot streak and playing above themselves, too. The hot streak may continue. Who knows? That's why they play the game, instead of just exchanging stat sheets at the fifty yard line.

51 "I have to admit that the…

"I have to admit that the Chiefs' performance over the last two weeks has convinced me that the "flip the switch" theory is more accurate than not, and that we need to correct for this tendency in our playoff odds. To try to do that, I created playoff odds based giving Kansas City their final offensive DVOA from 2018."

As other posters have said, this was a very bad call. Point out the model's weaknesses, sure. Say your own prediction is different, sure. This is way too far.

53 When I read that, I assumed…

When I read that, I assumed Aaron meant he created a second set of playoff odds based on the 2018 Chiefs offense, to be published alongside the "standard" ones.  I thought that was fine.  I didn't realize it was going to be the only set of playoff odds published - that is something I also think is wrong.

56 Unless the score gets really…

Unless the score gets really out of hand, garbage time isn't something that can be cleanly defined. You can think about behavior changes from the team, maybe they go super soft on defense but that still tells us something about the defense and how it performs when playing soft. That's what makes it predictive.

 

In general though, this just feels like more fan grumbling and the desire for numbers to show that this is some all-time team. Believe me, I've read this all the way through that 2011 season with the Packers. Why can't fans just be happy that their team is in the Super Bowl and has a chance to win it?

59 I get the hand-wringing, but…

I get the hand-wringing, but fivethirtyeight did the same thing with their models after they kept predicting LeBron would lose in the first round/the Warriors would lose to the Rockets. There is evidence that teams hold things back/coast in the regular season when their overall position isn't threatened and it's beneficial to 'hide' their optimal strategies; we just don't see teams that can 'coast' to 14-2 in the NFL that often. As an aside, if the critics are correct, it'll turn out this season resembles 1997, when a previously dominant champion somewhat easily 'flips the switch' until they don't.

62 nah, I was thinking of the …

nah, I was thinking of the '97 Packers, heavy favorites all season (partly because the NFC had won something like 13 SBs in a row) and defending champs, but only 4th in dvoa coming into the playoffs (although very close to the top ranked Broncos), won their playoff games by multiple scores led by the acknowledged 'best qb in the world', but were shocked by a similarly great Broncos team  (guess the '98 Broncos fit too, except they closed the deal)

64 Whatever happens in the…

Whatever happens in the Super Bowl won't really prove/disprove this theory; it's just a one-off game, with all the associated variance that entails.

This is a question of objectivity. I want DVOA to be published without bias, so that I can then decide on its value - for myself.

65 On the flip side -- all…

On the flip side -- all models are wrong; some are useful.

It's reasonable to point out when you suspect the inputs into your model are flawed, and thus the output is suspected to be garbage.

66 And this isn’t modeling at…

And this isn’t modeling at all. It’s literally saying “I saw Kansas City be better than this two years ago so I’m using those numbers instead”. That is *literally* what they’re doing here. There is no methodological or rational basis for doing so.