
13 Dec 2011

FO Mailbag: DVOA Without Fourth Quarter

OK, it's time for a quiz.

Each of these teams sees its offensive DVOA drop in the fourth quarter: Green Bay, Houston, New England, and Pittsburgh. Can you put them in order from the team with the biggest drop to the team with the smallest drop?

While you work on that, here's the mailbag question from today's DVOA discussion thread.

Paul M.: If there was a "First Three Quarters" DVOA, I would think the Packers would dominate and begin to show they are more of a historically dominant team than they are currently credited for, but then again that would discount some teams (any one in particular that comes to mind??? Hmmmm..... maybe they play in the Mountain Time Zone??) and their ability to rally late.

Nat: Aaron, could you publish the rankings/numbers for "First Three Quarters" DVOA? ... In theory, this all-but-late DVOA should avoid the prevent-defense, garbage time, hail-mary, shut-the-offense down, play-the-backups issues -- while still being a large enough sample to characterize each team pretty well. Pretty please?

OK, so, first let's answer the quiz question above. The answer, in order from biggest drop to smallest, goes: Pittsburgh, Houston, New England, Green Bay.

Didn't expect that, I bet?

An idea that came up in the DVOA discussion thread today is that the Packers take their foot off the gas in the fourth quarter, and that's the biggest reason they don't have a historically dominant DVOA that compares with teams like the 2007 Patriots and 1998 Broncos. Well, if DVOA is to be believed, this is simply not true. In general, compared to this year's other top offenses, the Packers don't drop off much in the fourth quarter. This week against the Raiders was a dramatic exception, with the Packers putting up 33.4% DVOA in the first three quarters and then -155.4% DVOA in the fourth quarter (on only eight plays, compared to 53 plays in the first three quarters).

Actually, the offense which drops off the most in the fourth quarter is Miami, which is slightly above average for three quarters and then the worst offense in the league in the fourth quarter. Apparently, the Dolphins take their foot off the gas even when they are losing the race. And of course, we know which team improves the most in the fourth quarter.

TEAM Q1-3 OFF RK Q4 OFF RK DIF
MIA 7.1% 14 -38.4% 32 -45.5%
PIT 28.8% 4 -16.0% 27 -44.8%
HOU 25.7% 5 -4.2% 24 -29.9%
CHI -2.3% 21 -25.5% 29 -23.2%
NE 41.0% 1 19.1% 8 -21.9%
SD 19.7% 7 -0.7% 19 -20.5%
STL -22.1% 31 -36.9% 31 -14.9%
CIN 9.0% 10 -3.4% 23 -12.4%
KC -16.0% 29 -26.6% 30 -10.6%
GB 37.2% 2 26.9% 3 -10.4%
BAL 10.8% 8 4.5% 14 -6.3%
BUF 6.6% 15 0.6% 17 -6.0%
OAK 2.8% 18 -1.7% 20 -4.5%
DET 7.2% 13 3.3% 16 -3.8%
ATL 10.1% 9 7.0% 13 -3.1%
WAS -8.5% 24 -10.4% 26 -1.8%
SF -2.0% 20 -3.2% 22 -1.2%
CAR 19.8% 6 19.6% 7 -0.2%
MIN -2.5% 22 -1.8% 21 0.7%
JAC -22.1% 32 -19.0% 28 3.1%
PHI 6.3% 16 9.6% 11 3.3%
NO 32.4% 3 37.2% 1 4.9%
IND -15.7% 28 -9.9% 25 5.8%
CLE -9.9% 25 -0.1% 18 9.8%
DAL 8.8% 12 19.8% 6 11.0%
TEN 4.6% 17 18.9% 9 14.3%
TB -8.3% 23 7.6% 12 15.9%
NYG 8.9% 11 30.2% 2 21.3%
NYJ 0.7% 19 23.5% 4 22.8%
ARI -19.9% 30 3.8% 15 23.7%
SEA -12.6% 26 13.7% 10 26.2%
DEN -12.9% 27 22.8% 5 35.6%
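
The DIF column is simply the fourth-quarter rating minus the rating for the first three quarters. As a minimal sketch (the function and variable names here are illustrative assumptions, not anything FO uses), this recomputes the change for the four quiz teams from the table above and sorts them from biggest drop to smallest:

```python
# Offensive DVOA in percent, copied from the table above: (Q1-3, Q4).
offense = {
    "GB": (37.2, 26.9),
    "HOU": (25.7, -4.2),
    "NE": (41.0, 19.1),
    "PIT": (28.8, -16.0),
}

def fourth_quarter_change(q1_3, q4):
    """Q4 rating minus Q1-3 rating, rounded to one decimal like the table."""
    return round(q4 - q1_3, 1)

changes = {team: fourth_quarter_change(*split) for team, split in offense.items()}

# Most negative change first, i.e. biggest drop to smallest drop.
quiz_order = sorted(changes, key=changes.get)
print(quiz_order)
```

Because the inputs are already rounded, a recomputed difference can be off by 0.1 from the published DIF column (Green Bay comes out at -10.3 here versus -10.4 in the table).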

Actually, Green Bay seems to take its foot off the gas more on defense; its defense would rank 17th if we didn't include the fourth quarter. But San Francisco and New England see their defensive DVOA ratings decline even more in the fourth quarter than Green Bay's.

TEAM Q1-3 DEF RK Q4 DEF RK DIF
CAR 9.1% 24 42.7% 32 33.6%
SF -20.9% 2 9.9% 20 30.8%
PHI -3.0% 10 23.8% 29 26.8%
NE 8.0% 21 32.9% 31 24.9%
NYG 4.4% 16 28.0% 30 23.6%
MIA -5.6% 8 15.8% 23 21.4%
SD 8.6% 22 21.8% 28 13.2%
GB 5.7% 17 17.7% 25 12.1%
MIN 8.8% 23 20.2% 26 11.3%
DET -10.1% 5 -1.9% 13 8.3%
WAS -0.6% 12 7.2% 17 7.8%
ARI 6.7% 18 14.2% 22 7.5%
BUF 13.1% 29 20.4% 27 7.2%
DAL 1.6% 13 8.8% 19 7.1%
HOU -10.0% 6 -3.8% 11 6.2%
BAL -22.2% 1 -18.0% 2 4.2%
TB 13.0% 28 16.1% 24 3.1%
NYJ -12.9% 3 -10.5% 8 2.5%
OAK 7.0% 19 7.5% 18 0.5%
JAC -10.6% 4 -11.6% 7 -1.0%
TEN 2.4% 15 -0.4% 15 -2.8%
NO 17.6% 31 12.7% 21 -4.9%
STL 11.4% 27 5.0% 16 -6.4%
DEN 7.1% 20 -1.2% 14 -8.4%
CHI -7.1% 7 -16.9% 4 -9.8%
SEA 2.2% 14 -10.5% 9 -12.7%
PIT -0.8% 11 -17.6% 3 -16.8%
CLE 13.8% 30 -4.8% 10 -18.6%
ATL -3.1% 9 -22.6% 1 -19.5%
KC 10.7% 26 -13.9% 6 -24.6%
CIN 10.5% 25 -15.9% 5 -26.4%
IND 25.1% 32 -2.1% 12 -27.1%

Here is what the overall ratings would look like if we just included the first three quarters -- except in special teams, where frankly I'm too lazy right now to go do a whole new set of "first three quarters" special teams ratings.

RK TEAM Q1-3 O Q1-3 D ALL ST TOT
1 HOU 25.7% -10.0% 1.1% 36.8%
2 NE 41.0% 8.0% 3.0% 36.0%
3 GB 37.2% 5.7% 2.9% 34.4%
4 PIT 28.8% -0.8% 2.3% 31.9%
5 BAL 10.8% -22.2% -3.5% 29.5%
6 SF -2.0% -20.9% 8.4% 27.3%
7 NYJ 0.7% -12.9% 4.7% 18.4%
8 CHI -2.3% -7.1% 10.0% 14.7%
9 NO 32.4% 17.6% -0.4% 14.4%
10 MIA 7.1% -5.6% 1.2% 13.9%
11 ATL 10.1% -3.1% 0.4% 13.6%
12 DET 7.2% -10.1% -5.9% 11.3%
13 PHI 6.3% -3.0% 0.5% 9.8%
14 SD 19.7% 8.6% -2.4% 8.8%
15 TEN 4.6% 2.4% 5.1% 7.3%
16 DAL 8.8% 1.6% -1.3% 5.8%
17 NYG 8.9% 4.4% 1.2% 5.8%
18 CAR 19.8% 9.1% -6.0% 4.7%
19 CIN 9.0% 10.5% 2.0% 0.6%
20 OAK 2.8% 7.0% -1.0% -5.3%
21 BUF 6.6% 13.1% -1.8% -8.3%
22 WAS -8.5% -0.6% -0.5% -8.5%
23 SEA -12.6% 2.2% 1.0% -13.8%
24 MIN -2.5% 8.8% -3.0% -14.3%
25 JAC -22.1% -10.6% -2.9% -14.4%
26 DEN -12.9% 7.1% 3.7% -16.3%
27 TB -8.3% 13.0% 1.0% -20.3%
28 CLE -9.9% 13.8% -0.5% -24.2%
29 ARI -19.9% 6.7% 2.3% -24.3%
30 KC -16.0% 10.7% 0.7% -26.0%
31 STL -22.1% 11.4% -2.8% -36.3%
32 IND -15.7% 25.1% -5.6% -46.3%
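
The TOT column is consistent with offense minus defense plus special teams, since a lower defensive DVOA is better. A small sketch under that assumption (the helper name `total_dvoa` is hypothetical) reproduces a few rows of the table:

```python
def total_dvoa(offense, defense, special_teams):
    """Offense minus defense plus special teams; negative defense is good."""
    return round(offense - defense + special_teams, 1)

# (Q1-3 offense, Q1-3 defense, full-season special teams), in percent.
rows = {
    "HOU": (25.7, -10.0, 1.1),
    "NE": (41.0, 8.0, 3.0),
    "GB": (37.2, 5.7, 2.9),
    "BAL": (10.8, -22.2, -3.5),
}

totals = {team: total_dvoa(*parts) for team, parts in rows.items()}
print(totals)
```

A few other rows differ from the published total by 0.1 when recomputed this way, presumably because the published components are rounded before display.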

Posted by: Aaron Schatz on 13 Dec 2011

66 comments, Last at 20 Dec 2011, 1:41pm by Keir

Comments

1
by TMoney (not verified) :: Tue, 12/13/2011 - 7:48pm

Heh, look at Tebow Time in the 4th quarter!

8
by Karl Cuba :: Tue, 12/13/2011 - 8:29pm

Sanchise too.

27
by Aaron Brooks Go... :: Wed, 12/14/2011 - 9:13am

Who saw Tarvaris Time coming?

29
by Podge (not verified) :: Wed, 12/14/2011 - 9:45am

I'd be curious to see whether that's driven by pass offense or run offense? It feels more likely that it would be more driven by Marshawn Lynch being effective in the 4th quarter.

33
by ChaosOnion :: Wed, 12/14/2011 - 10:44am

Skelton time?

43
by jimm (not verified) :: Wed, 12/14/2011 - 3:51pm

For all the grief Jackson seems to have taken - he's certainly better than the typical QB taken in the 2nd or 3rd round (Jackson was taken with the last pick in the second round).

47
by Aaron Brooks Go... :: Wed, 12/14/2011 - 7:30pm

I was pleasantly surprised watching the Monday Night game. While I wouldn't exactly call Jackson good, I also wouldn't exactly call him bad. He's at least settled into mediocre NFL QB territory.

Which while not amazing, is still probably 99.95th percentile performance. =)

2
by merlinofchaos :: Tue, 12/13/2011 - 7:52pm

You know I knew there'd be a huge 4th quarter improvement for Denver, but somehow I still managed to be surprised by just how ridiculous it is.

What's worse is, there are 5 Kyle Orton games in there where they didn't improve in the 4th quarter dragging that number down.

9
by Stats are for losers (not verified) :: Tue, 12/13/2011 - 8:38pm

I was wondering about this too, since the Week 12 DVOA article noted an improvement in the run offense after switching from BT to AT. What I really wanted to see, though, was their 1-3/4 Defensive DVOA split.

Excluding special teams, Denver appears to improve in the fourth quarter (+44.0%) just as much as New England tanks (-46.8%). This could be a fun game to watch.
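
Those two figures follow from the DIF columns in the article's tables: the offensive change minus the defensive change, because a positive defensive change means the defense got worse. A quick sketch of the arithmetic (the name `net_swing` is just an illustrative label):

```python
def net_swing(off_dif, def_dif):
    """Offensive change minus defensive change, in percentage points."""
    return round(off_dif - def_dif, 1)

# DIF values taken from the offense and defense tables in the article.
denver = net_swing(35.6, -8.4)
new_england = net_swing(-21.9, 24.9)
print(denver, new_england)
```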

22
by PerlStalker :: Wed, 12/14/2011 - 1:03am

The offensive improvement is insane. I'm trying to decide if it's a sign of a good team waiting to break out or the sign of a fluky team waiting to crash back down to Earth.

As others have said, the Pats/Broncos game could be all sorts of fun if it's still close in the fourth quarter.

24
by tunesmith :: Wed, 12/14/2011 - 2:53am

It's because the 4th quarter is the only time the Broncos truly play the spread. They're scary good at it, and hopefully someday McCoy and Fox will start letting the team do it earlier in the game.

26
by Jimmy :: Wed, 12/14/2011 - 8:49am

This Bears fan's anecdotal impression is that the Tebowfence wears teams out really badly (well it did the Bears but part of that has to be a function of Hanie and Co. not moving the ball at all). Constantly having to work to maintain gap integrity on every single down and then chase Tebow around the field on third downs - and then the bugger runs over a crowd to pick up another third down - seems to really take it out of a team's legs. The variety in the running game combined with the spread looks and Tebow constantly trying to roll out to his left means that on every play defenders have extra responsibilities; extra stuff to counter against means more effort expended to ensure you are responding to the correct play and haven't missed your keys and gotten lost. Playing at Mile High can't hurt either (and not because it brings Tebow that bit closer to God).

28
by Aaron Brooks Go... :: Wed, 12/14/2011 - 9:15am

Welcome to the theory behind the option offense.

39
by Dima (not verified) :: Wed, 12/14/2011 - 12:03pm

What if teams just played their second-stringers to start the game on defense, then slowly started bringing out their starters more and more in the second quarter?

I think it might be worth a shot.

Patent pending.

40
by Jimmy :: Wed, 12/14/2011 - 12:08pm

My response was to a Broncos fan that thinks the reason for the Broncos doing better in the fourth quarter last week was down to the use of the spread. My point is that if Denver had played the spread all game I don't think they would have scored any more points. The issue was fatigue, I have watched pretty much every snap of Urlacher's career and rarely seen him looking so gassed.

3
by CraigoMc (not verified) :: Tue, 12/13/2011 - 7:52pm

Who was it in the NYT's Fifth Down blog who showed that, adjusting for score and time remaining, GB was the pass-happiest team in the league? The data was only for the first half of the season, but he certainly didn't find any conservative late game pattern.

4
by drobviousso :: Tue, 12/13/2011 - 7:56pm

"Didn't expect that, I bet?"
As a Pittsburgh fan, yes, I did expect that.

5
by Anonymous1 (not verified) :: Tue, 12/13/2011 - 8:03pm

Same here from a Patriots perspective.

38
by Ben Stuplisberger :: Wed, 12/14/2011 - 11:54am

Ditto, I would expect the same pre-Tomlin as well. Saving it for the playoffs/next week perhaps?

6
by CraigoMc (not verified) :: Tue, 12/13/2011 - 8:13pm

Lazy or not, it really seems like FO is making an effort at addressing these sorts of nagging questions more this season. We appreciate it.

41
by 0tarin :: Wed, 12/14/2011 - 2:53pm

I came in to post a similar sentiment. It may just be me (or just short-term memory), but it seems like this year stands out in FO's overall willingness to come out and address these types of complaints when they're voiced over several weeks and from several ... vocal commenters. I also appreciate this effort, and have found these posts very enlightening. Thanks to all behind the curtain.

7
by Karl Cuba :: Tue, 12/13/2011 - 8:28pm

That the 49ers decline in 4th quarter defence doesn't surprise me, they don't substitute at all on defence, partly because they don't have great depth and partly because the starting ends move to tackle to pretty good effect. I would also point out that the one pass rusher that they do substitute in, Aldon Smith, produces at a pretty high level. Increasing their defensive depth should be an offseason priority.

10
by zenbitz :: Tue, 12/13/2011 - 9:00pm

even simpler, their run defense is much better by DVOA than their pass defense, and they have been leading most games in the 4th quarter, so their opponents are not running.

15
by Karl Cuba :: Tue, 12/13/2011 - 10:34pm

Yeah, I was about to add something like that. By the 4th quarter most teams seem to have worked out that running the ball is pretty much a waste of a down against the niners.

11
by Yup (not verified) :: Tue, 12/13/2011 - 9:08pm

'Foot off the gas' implies conservative play calling to me. Does this address that question?

12
by nat :: Tue, 12/13/2011 - 9:22pm

Very, very cool stuff.

I tend to trust this more than full game DVOA as a measure of the teams' strength. But for some of these teams, the difference between the fourth quarter and the rest of the game is the story of the year.

As bad as the Patriots defense is, their prevent defense is far worse. Denver really does take off the training wheels in the fourth quarter.

14
by poboy :: Tue, 12/13/2011 - 10:26pm

I think you're overestimating the number of games that are out of hand in the 4th quarter (and therefore should have discounted 4th quarter performances).

18
by nat :: Tue, 12/13/2011 - 11:32pm

A game doesn't have to be out of hand. It just needs a large enough margin that teams no longer play for what DVOA thinks they play for - the best next score (on average) regardless of time consumed.

I wouldn't suggest cutting the fourth quarter out of DYAR - the plays really happen after all. But DVOA is an average of a measurement with known issues in the fourth quarter. Why not use a sample that doesn't suffer from those issues? Do you think teams are trying to play badly in the first three quarters? Do you think that the rules of the game change?

By the way, drives with two score leads or deficits are very common in the fourth quarter. But these are precisely the drives where coaches are tempted to alter their schemes to save or burn time. It varies from coach to coach. Some coaches go full-prevent way too early. Others wait too long. On the other side of the ball, some teams get desperate early and others aren't desperate soon enough. But it's only the fourth quarter that has this problem in a meaningful way.

21
by Intropy :: Wed, 12/14/2011 - 12:48am

If those scenarios are so common, and if coaches are so tempted to alter their schemes, wouldn't you expect the 4th quarter to have somewhat different outcomes than the others and wouldn't that imply that it's important to consider? Frequent and different seems like a bad reason to discount a situation.

31
by nat :: Wed, 12/14/2011 - 9:54am

True. But that argues for tracking two separate DVOAs: most-of-the-game and fourth-quarter.

But here's the rub. We don't know if fourth quarter DVOA even works. There are a number of known problems with DVOA that are either specific to the fourth quarter or much worse in the fourth quarter.

(1) DVOA's success formula doesn't model correct fourth quarter strategic goals in many games
(2) The baselines for different down-distance-time-margin situations suffer from selection bias. (that is, teams that are desperate are more likely to be bad - polluting the baseline used for comparison)
(3) Fourth quarter plays often include scrubs put in to "try out" for a higher spot in the depth chart
(4) Fourth quarter plays include many meaningless plays, where neither team has incentive to do more than just practice something, much like preseason games

To simplify and overstate the case about the fourth quarter:
DVOA values the wrong things, compares to the wrong mix of teams, both grades and compares to the wrong players, and mixes in meaningless noise.

It doesn't do this all the time, because many fourth quarters are competitive. But it's often enough to make fourth quarters look as different from the first three as one season looks compared to the next.
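
Point (2), the selection bias in the baselines, can be illustrated with a toy simulation. This is only a sketch under made-up assumptions (normally distributed team skill, a margin equal to the skill gap plus luck, arbitrary units); it is not DVOA and uses no real data. It shows that the teams found in a "trailing entering the fourth quarter" bucket are, on average, worse than the teams in the "leading" bucket:

```python
import random
from statistics import mean

random.seed(2011)  # fixed seed so the sketch is repeatable

def simulate_bucket_skill(num_games=20000):
    """Return mean true skill of teams trailing vs. leading entering Q4."""
    trailing, leading = [], []
    for _ in range(num_games):
        skill_a = random.gauss(0, 1)
        skill_b = random.gauss(0, 1)
        # Margin through three quarters: skill gap plus game-to-game luck.
        margin = (skill_a - skill_b) + random.gauss(0, 2)
        if margin > 0:
            leading.append(skill_a)
            trailing.append(skill_b)
        elif margin < 0:
            leading.append(skill_b)
            trailing.append(skill_a)
    return mean(trailing), mean(leading)

trailing_skill, leading_skill = simulate_bucket_skill()
print(trailing_skill, leading_skill)
```

If the baseline for "trailing late" situations is built from whoever happens to be in that bucket, it is built disproportionately from below-average teams, which is the pollution described in (2).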

32
by White Rose Duelist :: Wed, 12/14/2011 - 10:24am

Aaron has mentioned a number of times that he ran the numbers without the garbage time possessions, and the accuracy got worse. Certainly there can be improvements - he frequently admits to this as well - but it's not like the difference in the way the game is played late is being ignored.

35
by nat :: Wed, 12/14/2011 - 11:09am

Sure. Eliminating just garbage time by whatever criteria introduces more selection bias, and makes things worse. That's why it's better to just look at the quarters. That way you eliminate the known issues while maintaining an unbiased sample of plays.

The tests that Aaron needs to do are these:

(1) To what extent is DVOA in each quarter "predictive" of other quarters?
(2) To what extent is DVOA in each quarter predictive of DVOA in the same quarter in the next season?

I suspect that quarters 1-3 are predictive of each other pretty well, and reasonably predictive of their next seasons, but that fourth quarter DVOA is not predictive of other quarters, and not as good at predicting its next season.

I believe this because while the first three quarters DVOA is determined primarily by your team's skill, fourth quarter DVOA is dictated by both skill and the mix of game situations you face. I think the mix of fourth quarter situations is determined largely by factors that have nothing to do with skill (luck, small sample size, opponent strength to name three).
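
A rough version of test (1) can be run on the offensive numbers Aaron posted. The article only gives quarters 1-3 combined against the fourth quarter, not each quarter separately, so this sketch checks that one pairing. The `pearson` helper is a plain textbook correlation written out by hand, and the variable names are illustrative assumptions:

```python
from math import sqrt

# Offensive DVOA in percent from the article's table: team -> (Q1-3, Q4).
offense = {
    "MIA": (7.1, -38.4), "PIT": (28.8, -16.0), "HOU": (25.7, -4.2),
    "CHI": (-2.3, -25.5), "NE": (41.0, 19.1), "SD": (19.7, -0.7),
    "STL": (-22.1, -36.9), "CIN": (9.0, -3.4), "KC": (-16.0, -26.6),
    "GB": (37.2, 26.9), "BAL": (10.8, 4.5), "BUF": (6.6, 0.6),
    "OAK": (2.8, -1.7), "DET": (7.2, 3.3), "ATL": (10.1, 7.0),
    "WAS": (-8.5, -10.4), "SF": (-2.0, -3.2), "CAR": (19.8, 19.6),
    "MIN": (-2.5, -1.8), "JAC": (-22.1, -19.0), "PHI": (6.3, 9.6),
    "NO": (32.4, 37.2), "IND": (-15.7, -9.9), "CLE": (-9.9, -0.1),
    "DAL": (8.8, 19.8), "TEN": (4.6, 18.9), "TB": (-8.3, 7.6),
    "NYG": (8.9, 30.2), "NYJ": (0.7, 23.5), "ARI": (-19.9, 3.8),
    "SEA": (-12.6, 13.7), "DEN": (-12.9, 22.8),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

q1_3 = [split[0] for split in offense.values()]
q4 = [split[1] for split in offense.values()]
r_offense = pearson(q1_3, q4)
print(round(r_offense, 2))
```

One season of 32 teams is a small sample, so whatever number comes out should be read as suggestive at best. Swapping in the defensive table would give the matching check for defense.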

13
by RickD :: Tue, 12/13/2011 - 10:21pm

I really don't want to watch the 4th quarter of Pats@Broncos, do I?

16
by JIPanick :: Tue, 12/13/2011 - 10:35pm

Depends on 1. how predictive you think these numbers are and 2. how much you like the Broncos.

23
by merlinofchaos :: Wed, 12/14/2011 - 2:21am

Very possibly you ONLY want to watch the 4th quarter of Pats/Broncos.

25
by Stats are for losers (not verified) :: Wed, 12/14/2011 - 8:43am

Can't be much worse than the Bills game, right?

30
by Podge (not verified) :: Wed, 12/14/2011 - 9:50am

I think you want to watch the first 3 quarters so you can see the matchup of Mortal Tebow V Pats Secondary.

Extremely stoppable force V Readily moveable object.

17
by Paul M (not verified) :: Tue, 12/13/2011 - 10:44pm

Well, thanks Aaron. I appreciate the digging. I turned out to be half-right-- Packers are clearly a better 1-3 Quarter team and more dominant viewed that way, but they're not alone. So I am duly chastised. And as for the Broncos, Holy Crunch Time, Batman!! And the Giants get all silly at both ends of the field, don't they??

34
by CeeBee (not verified) :: Wed, 12/14/2011 - 10:55am

Eli being insane in the 4th quarter this year is largely due to the fact that the defense is usually collapsing at the same time.

I think this is the year that Giants/Eli most closely resemble the past couple years of Peyton/Colts. Both have shitty defenses and no running game. QB and WR play getting it done almost entirely by themselves.

42
by dmstorm22 :: Wed, 12/14/2011 - 3:12pm

Do people just not remember the 2009 season? Before rest-a-palooza, the Colts had the #9 DVOA defense, and #2 scoring defense in the league. Sure, in 2010 the defense was quite bad, but you have to go back to 2004 for the defense to be that bad again.

48
by Aaron Brooks Go... :: Wed, 12/14/2011 - 7:33pm

I think part of it is that the Colts became so infamous for pulling starters in the last two weeks and losing games to inferior competition that everyone started discounting them as a potential 16-0 team.

The Manning Colts could lose week 17 games to the Little Sisters of the Blind.
The Painter Colts could lose that game in weeks 1-16, too.

36
by NYMike :: Wed, 12/14/2011 - 11:21am

Maybe it's the lack of depth on defense that's killing the Packers (I think the Pats have the same problem). Injuries mean the d-line rotation is short; the last game was played with two middle linebackers with a total of one full and two partial seasons between them; Nick Collins being out means Charlie Peprah is playing all the time, and Pat Lee is missing, too; Matthews and Woodson don't practice a lot because they're dinged up.

The 4th quarter drop-off would not be unexpected in this case.

63
by LionInAZ (not verified) :: Fri, 12/16/2011 - 10:51pm

What you didn't guess was that the Packers can't afford to "take the foot off the pedal" because their defense is collapsing in the 4th quarter.

It's a joy to see the myths promulgated by Packer fans dispelled.

19
by Joseph :: Tue, 12/13/2011 - 11:44pm

Well, based on this, the Saints & Packers are going to have a 4Q shootout if they meet in the NFCCG. If the Saints can keep it close in the 1st 3 Q's, their D gets better in the 4th, & GB's goes down--enough that they pass each other in the rankings.

20
by johonny (not verified) :: Wed, 12/14/2011 - 12:08am

OK maybe Tony Sparano deserved to be fired after all.

37
by Arkaein :: Wed, 12/14/2011 - 11:32am

This is good stuff, but I have to say it leaves me wanting more.

What would really be enlightening is separating out garbage time vs. non-garbage time in the 4th quarter. This is the big question concerning the Packers because they have had more garbage time than probably every other team. Yet they have had some close 4th quarters as well, so this data doesn't show if the offense drops off uniformly across all 4th quarters, or if it drops off a lot in blowouts and is offset by strong play in the fewer close games.

50
by 'nonymous (not verified) :: Thu, 12/15/2011 - 12:25am

You already know the Packers are undefeated-- that means they've done well (enough) in the 4th quarter of all games close at the end. I'm not sure there's much more to learn.

Including selected 4th quarter plays would certainly boost the Packers' DVOA; but it won't give them a dominant first place DVOA ranking, if that's what you're looking for, since the Q1-3 DVOA puts them right in the midst of BAL, HOU, NE, and PIT.

44
by nat :: Wed, 12/14/2011 - 4:47pm

Here's a cool finding...

The correlation of this year's 3Quarter Offensive DVOA to last year's Offensive DVOA: 0.51

Same check of 4thQuarter Offensive DVOA to last year's OffDVOA: 0.14

It looks like 4th Quarter DVOA 'predicts' very little about a team's overall quality in an adjacent season, even when that quality factors in the fourth quarter.

Same check on the defense...
3Quarter correlation: 0.11
4thQuarter correlation: -0.04

Once again, 4th Quarter DVOA is worse at 'predicting' performance. In fact, it's essentially useless as a predictor. (3Quarter Def DVOA isn't that great a predictor either)

What's it mean? Either fourth quarter football requires a separate kind of football skill, or something other than football skill is being measured by fourth quarter DVOA. Either way, they should be reported and judged separately.

45
by Jimmy :: Wed, 12/14/2011 - 5:40pm

I would suggest fatigue on defense. Defensive coordinators seem to realise (and have been paraphrased as having said so) that they can only expect their defensive players to last 55 snaps or so (their number not mine - although FO reader Kal studied the Oregon offense and found the 55 number there as well so maybe there is something to it). It might be that some schemes suffer more than others either in terms of wearing down faster or the scheme not working as players slow down.

I would love to see if either offensive or defensive DVOA vary with the number of snaps played.

54
by RichC (not verified) :: Thu, 12/15/2011 - 5:41pm

DVOA has no concept of game time, which is what most of the issue comes from.

DVOA thinks a team, down by 30 points, going on a 10 minute, 15 play TD drive is a very strong drive. Everyone watching the game knows that it made that team less likely to win.

There are situations where points are only valuable if they can be had quickly, and DVOA has no way of understanding that. That's the primary reason the 4th quarter numbers are a bit silly.

58
by White Rose Duelist :: Fri, 12/16/2011 - 11:31am

From the "Our New Stats Explained" page:

Every single play run in the NFL gets a "success value" based on this system, and then that number gets compared to the average success values of plays in similar situations for all players, adjusted for a number of variables. These include down and distance, field location, time remaining in game, and current scoring lead or deficit.

DVOA certainly does know how much time is remaining. There is room for improvement, but a lot of people on this thread don't seem to understand that the system does not think 5 yards on 1st and 10 from the 20 on the first play of the game is the same as 5 yards on 1st and 10 from the 20 with 2 minutes to go and down 9.

59
by nat :: Fri, 12/16/2011 - 12:11pm

To clarify:

DVOA's success value concept has no idea of time remaining.

DVOA uses time remaining only to establish its buckets of "similar plays". This does nothing to fix the underlying issue that DVOA can mistake bad plays for good plays when the clock becomes a major factor. (See Barber, Marion: brain-fart thereof)

The use of time remaining in defining the buckets of similar plays also makes the selection bias in the average success values even worse, which biases both the success over average for each individual play and the percentage calculation used for VOA and DVOA.

If you didn't understand those two facts, count yourself among the "lot of people" who don't understand the implications of time remaining in DVOA.

60
by White Rose Duelist :: Fri, 12/16/2011 - 12:54pm

The bucket of plays run with 2 minutes left by teams up by 9 is not the same bucket as plays run with 2 minutes left by teams down by 9.

61
by tuluse :: Fri, 12/16/2011 - 1:06pm

It's still just comparing teams to average. Sure, with that little time remaining the average gain is probably longer, but DVOA still doesn't account for ending the play in or out of bounds. It doesn't know that an incomplete pass is better than a pass for a short gain. There are probably some other things too.

62
by nat :: Fri, 12/16/2011 - 1:36pm

Did you even read the post you responded to?

You've just shown why the success value used by DVOA is a problem in so many fourth quarters. Teams run different plays when clock becomes an issue because they have different goals when clock becomes an issue. Success value and thus DVOA doesn't know that, and thus mis-characterizes plays as good or bad or in between. In these situations, "success value" measures the wrong thing. No amount of adjustment will fix that problem. The necessary data is already lost.

Comparing to an average play result does not address this problem. It just adds a second problem: the "average" for such plays is heavily biased in terms of the quality of the teams in each bucket. Bad teams get in more bad situations; teams with leads are more likely to be good teams. That bias pollutes both the "success value over average" and the "total baseline" used to calculate VOA and DVOA.

In theory, these problems happen in every play of the game. In practice, they are a big problem in the fourth quarter, and a smaller one at other times... so much so that 3-quarter DVOA is a good predictor of future success, while 4th quarter DVOA is a bad predictor, and in the case of defensive DVOA, no predictor at all.

46
by Intropy :: Wed, 12/14/2011 - 6:22pm

How does total all four quarter DVOA correlate?

49
by nat :: Wed, 12/14/2011 - 9:31pm

Correlations of 2010 DVOAs to 2011 DVOAs (so far) using all four quarters....

Offense: 0.47
Defense: 0.08

So if you want to project next year's DVOAs, you're better off working from this year's three-quarters DVOA than from the DVOA for complete games.

Yet more evidence that you should be using DVOA for three quarters to project a team's strength. The fourth quarter seems to have issues that make it much less predictive. (It would be worth checking other years.)

51
by nat :: Thu, 12/15/2011 - 9:48am

What now?

What do we as consumers and fans of DVOA do this season? And what should Aaron and the rest of FO do to improve DVOA now or during the off-season?

Aaron is right to resist making mid-season changes to DVOA. DVOA is what it is for a season. It takes time and focus to propose, prototype, and test DVOA changes; time that Aaron does not have until February. Quick fixes are likely to cause more problems than they solve. But there are things which can be usefully done with little risk.

For DVOA, it might be nice to repeat this 3Quarters/4thQuarter DVOA report, perhaps as a lead-in to the playoffs.

For DYAR, it would be great if unusually divergent fourth quarters could be called out in the comments, as already happens sometimes.

For us fans, we just need to be aware that there is some noise in DVOA and DYAR due to the wonky fourth quarters. Some of that wonkiness is real differences on the field, and some is probably due to soft spots in DVOA that affect the fourth quarter more than other parts of the game.

Off-season, there's work to do. I have no idea how Aaron could include clock management in the DVOA success formula. But the issues regarding the baselines can be fixed.

(1) Correct for selection bias in the baselines, so the "A" in DVOA approximates an average team playing against an average team.
(2) Use a stable denominator when calculating VOA as a percentage, rather than summing the baselines for all plays in the game. That way each play's value over average would depend only on itself, and not be measured on a scale that varies by the mix of other situations in the game.

And if Aaron can't come up with a fix to the success formula for clock management time...
(3) Make fourth quarter DVOA and DYAR a regular part of the charts, so fans can judge for themselves whether garbage time plays or baselines are skewing the results.
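
To make suggestion (2) concrete, here is a toy sketch. It assumes, as the suggestion implies, that VOA is the summed value over baseline divided by the summed baselines; that formula, the function names, and the play values below are all assumptions made for illustration, not FO's published method or real data.

```python
def voa_summed_baseline(plays):
    """Percent over average with the denominator summed over the plays."""
    surplus = sum(value - baseline for value, baseline in plays)
    return 100 * surplus / sum(baseline for _, baseline in plays)

def voa_fixed_denominator(plays, per_play_scale):
    """Same surplus, but every play is scaled by one fixed constant."""
    surplus = sum(value - baseline for value, baseline in plays)
    return 100 * surplus / (per_play_scale * len(plays))

# Hypothetical (success value, baseline) pairs. Both games have the same
# total surplus (+1.0) but a different mix of situations, hence baselines.
game_low_baselines = [(1.5, 1.0), (1.5, 1.0)]
game_high_baselines = [(2.5, 2.0), (2.5, 2.0)]

print(voa_summed_baseline(game_low_baselines))   # 50.0
print(voa_summed_baseline(game_high_baselines))  # 25.0
print(voa_fixed_denominator(game_low_baselines, 1.5),
      voa_fixed_denominator(game_high_baselines, 1.5))
```

With the summed-baseline denominator, the identical surplus is worth 50% in one game and 25% in the other. With a fixed per-play scale, both games get the same rating.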

52
by c0rrections (not verified) :: Thu, 12/15/2011 - 4:16pm

Nat,

You seem to have several clear problems with your analysis. First, I have no idea why you compared DVOA for just the 4th quarter to DVOA for the first 3 quarters. I would imagine that DVOA for a randomly selected 3 quarters would correlate better than any individually selected quarter. Second, it is not clear to me why you are willing to make conclusions based on one season's worth of correlation data. Finally, it seems to me that the difference between all 4 quarter DVOA and first 3 quarter DVOA in correlation is so trivial (.04) as to be not particularly meaningful. I will also add that in order to test your thesis you'd also have to look at other 3 quarter selected periods and check their correlation.

56
by nat :: Thu, 12/15/2011 - 7:19pm

I used three quarters vs one quarter because that's what Aaron posted.

If all four quarters were equally predictive measures of football skill, I doubt the fourth quarter correlation would have been so much lower. But, yes, it would be easier to follow if I had checked each quarter separately. I don't have that data.

I did one season because that's what Aaron posted. I'd love to have Aaron run the same check in other years. But truly, there are enough other reasons to distrust fourth quarter DVOA, I wasn't at all surprised by the result.

Finally, if I mix one quarter of poorly correlated data with three quarters of better correlated data, it's no surprise that the final correlation is between the two, and closer to the three quarters portion of the data. If I mix a teaspoon of manure with two liters of Coca Cola, it's still mostly Coke. But that doesn't make it a good idea.

53
by Intropy :: Thu, 12/15/2011 - 4:26pm

Hold your horses there. You've shown that for two seasons 3Q DVOA correlates about as well as full game DVOA (the difference is negligible) and that 4thQ DVOA correlates less well. This is all interesting and good data, but it's not enough to go into wholesale change mode.

First, I'd expect one quarter to correlate less well. To me that result isn't the interesting one. It's the 3Q vs full game that's interesting. But the numbers are so close that it doesn't seem to matter. How would looking at more seasons affect the correlation? Most importantly, while good to know, DVOA isn't attempting to correlate current DVOA against previous season DVOA. Teams do get better or worse. DVOA is attempting to correlate to winning football games.

55
by nat :: Thu, 12/15/2011 - 7:00pm

I was just being brief. Elsewhere I included the usual "do more study" caveats. Please don't give me a hard time because I don't put them in every single post.

One quarter should correlate less well, I agree. But by this much? I doubt it. The good check to do is to see how the different quarters correlate with each other during a season. I'd bet the fourth quarter is the odd man out.

And lastly, you are wrong about DVOA.

It's VOA that is supposed to correlate with winning, although for offenses and defenses it's designed to correlate with maximizing the next score. That validates the success factor and baseline concepts.

DVOA itself is supposed to correlate with the next season's DVOA. That validates the use of opponent adjustments to arrive at a predictive measure of repeatable skill. I merely used last year's DVOA as if it were next year's DVOA. I did put 'predict' in quotes when I remembered to. But if DVOA is working well, it should correlate with the previous season as well as it correlates with the next. Correlations have no "arrow of time".

57
by Jerry P. :: Thu, 12/15/2011 - 10:21pm

Green Bay has had a decent lead over the entire league for the entire season in estimated wins. You have to go back to week 15 of last season before Green Bay drops out of the top 5 of estimated wins for the regular season.

Estimated wins uses a statistic known as "Forest Index" that emphasizes consistency as well as DVOA in the most important specific situations: red zone defense, first quarter offense, and performance in the second half when the score is close. It then projects a number of wins adjusted to a league-average schedule and a league-average rate of recovering fumbles. Teams that have had their bye week are projected as if they had played one game per week.

This seems like the statistic the Pack fans are looking for.

64
by LionInAZ (not verified) :: Fri, 12/16/2011 - 10:58pm

I think that *some* Packer fans here are simply badgering for adjustments that will magically turn the 2011 Packers into the undisputed best team this year, if not the best team of all time.

65
by JimmyJJ (not verified) :: Sun, 12/18/2011 - 6:14am

#1 4th quarter offense takes on the top 4th quarter defense in football boxing day!

66
by Keir (not verified) :: Tue, 12/20/2011 - 1:41pm

Looking at defense in the 4th vs 1-3rd quarters you see that the average change across all teams is about -2.5%. This makes me wonder what DVOA is doing. If the baselines were consistent across the whole game then one would be fine with a defense getting more tired than an offense and performing worse, but the baselines are supposed to move, in some mysterious way, with the progression of the game.