Formation Analysis: Number of WRs Part II
by Danny Tuccitto
Last Monday, we looked at how well offenses performed in 2010 with various numbers of wide receivers in the formation. Now it’s time to evaluate the defenses.
Last season, defense DVOA got worse, on average, as the number of wide receivers in the formation increased, going from -3.8% against 0/1WR-formations to 6.2% against 4/5WR-formations. On a team-by-team basis, 22 of the 32 NFL defenses followed this general pattern.
Of the two extremes, however, it was a team’s ability to defend the 4/5WR formation that seemed to separate winners from losers in 2010. Indeed, DVOA against this formation was highly associated with team wins (r = -.454) even though teams only had to defend it about eight percent of the time. Compare this to the correlation for 3WR sets (r = -.142) -- which defenses faced five times as often -- and it tells us that success (or failure) at defending the 4/5WR formation had a disproportionately large influence on winning or losing last season.
The same can be said if we compare playoff teams to non-playoff teams. Playoff defenses had an average DVOA against 4/5WR sets of -7.1%, and an average ranking of 12.0. Non-playoff defenses, on the other hand, had an average DVOA of 18.1%, and an average ranking of 19.2. These differences between playoff and non-playoff defenses (25.2 percentage points and 7.2 ranking spots) dwarfed the differences against the other wide receiver formation groups.
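For readers who want to see the mechanics, the following is a minimal sketch of the two calculations described above: a Pearson correlation between formation-specific DVOA and wins, and the playoff versus non-playoff gaps. The six-team `dvoa` and `wins` lists are hypothetical placeholders, not the charting data; only the four group averages come from the text. The function and variable names are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL six-team sample (placeholder numbers, not real charting data).
# Defensive DVOA is better when it is lower, so a negative r means that
# better defense against the formation goes along with more wins.
dvoa = [-20.0, -10.0, 0.0, 5.0, 15.0, 25.0]
wins = [12, 10, 9, 8, 6, 4]
r = pearson_r(dvoa, wins)

# Group averages quoted in the article for defense against 4/5WR sets.
playoff_dvoa, non_playoff_dvoa = -7.1, 18.1
playoff_rank, non_playoff_rank = 12.0, 19.2
dvoa_gap = round(non_playoff_dvoa - playoff_dvoa, 1)  # percentage points
rank_gap = round(non_playoff_rank - playoff_rank, 1)  # ranking spots

print(f"r = {r:.3f}, DVOA gap = {dvoa_gap}, rank gap = {rank_gap}")
```

With the real data, `dvoa` and `wins` would each hold 32 values, one per team.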
With that trend in mind, the Packers’ playoff run is a perfect case study. Essentially, in repeated battles of strength versus strength, Green Bay emerged victorious each time. Just to get to the Super Bowl, Green Bay’s defense, ranked seventh in DVOA against 4/5WR sets, defeated two of the top four offenses in DVOA on plays using 4/5WR sets (Chicago and Atlanta). Similarly, the Packers’ own offense, ranked sixth in DVOA on plays using 4/5WR sets, beat the second-best defense against 4/5WR sets in the NFC Championship game, and then followed that up with a win against the number one unit in the Super Bowl.
Finally, it should be pointed out that 2009 didn’t exhibit this pattern at all. Statistically speaking, everything made sense. Team wins and overall defensive efficiency tracked alongside efficiency against 2WR- and 3WR-sets because of the prevalence of these plays, and playoff teams weren’t particularly strong or weak against any specific formation in comparison to non-playoff teams; they were just better overall. Therefore, more game charting is definitely in order. Nevertheless, for the purposes of reviewing the 2010 season only, defense against 4/5WR formations seemed to play an important role in deciding who took home the Lombardi trophy.
In the table, I’ve ranked teams’ formation-specific DVOAs provided that they faced that formation on at least 5 percent of their defensive plays. Because there’s actually a trend this time, I’ve sorted the table according to DVOA against 4/5WR formations.
|TEAM||0-1 WR DVOA||Rk||2 WR DVOA||Rk||3 WR DVOA||Rk||4-5 WR DVOA||Rk||0-1 WR Freq.||2 WR Freq.||3 WR Freq.||4-5 WR Freq.|
34 comments, Last at 24 Jul 2011, 6:58pm
#1 by Phyrre564 (not verified) // Jul 18, 2011 - 3:20pm
"It was a team’s ability to defend the 4/5WR formation that seemed to separate winners from losers in 2010. Indeed, DVOA against this formation was highly associated with team wins (r = -.454)."
It seems logical that a losing team late in the game would employ a 4/5 WR strategy (hurry up offense, little attempt to bluff the run) and that the winning team could commit to pass defense against it. Wouldn't that skew these results?
I think to draw this conclusion, you'd have to look at DVOA against the 4/5 WR formation when the offense was not behind big or late in the game.
It's like the finding that teams are more likely to win when they call 25+ running plays. The truth is that once they establish a lead, they elect to run to kill the clock. Your inputs and your outputs may be backwards here as well.
#4 by trill // Jul 18, 2011 - 3:52pm
I could be wrong, but I think DVOA excludes "garbage time" plays when the game is out of reach for either team. Now, if someone has a one-possession lead in the 4th quarter, I don't think it's dishonest to include that data even though it reflects a particular strategy.
#7 by Jerry // Jul 18, 2011 - 4:55pm
There's still DVOA for garbage time plays; it's just compared to other garbage time plays.
#2 by Dean // Jul 18, 2011 - 3:48pm
So if we go back further than last year, which is the anomaly? 2009 or 2010?
#3 by trill // Jul 18, 2011 - 3:49pm
What a weird table. For the 4/5WR sets, we're almost definitely seeing some distortion from small sample sizes (ARI, DEN). What's most interesting to me is the difference in DVOA from 2WR to 3WR. This isn't a wholesale change in personnel like 23/32/10/00 packages; at most defenses are sending their nickel DB on the field. So what conclusions can we draw from teams like NE (40% difference), SF (24%), and IND (25%) sucking it up against 3WR? Lack of DB depth is probably the main factor.
SF's poor performance in sub packages must have shown up in the raw, non-opponent adjusted numbers, but opponents used 2WR just as often as 3. But I guess the Rams and Seahawks don't really have the WR's to take advantage?
#9 by Aaron Brooks' … (not verified) // Jul 18, 2011 - 5:13pm
I like how Arizona was totally backwards -- inept against goal line formations but totally shutting down 5 WRs sets. Practice effect, considering their own offense almost solely ran 4-5 receiver sets?
#5 by Kibbles // Jul 18, 2011 - 4:11pm
"With that trend in mind, the Packers’ playoff run is a perfect case study."
Funny, I never realized that one data point now qualifies as a trend. I had chicken for lunch today- if this trend holds, America's dairy farmer will be out of business by Friday. My neighbor is 6'2". If this trend holds, his grandchildren will be 18 feet tall.
Basically, you say that a statistic with an admittedly small sample size (8% of total plays faced, 1/5th as many plays as are run from 3-WR sets) just happens to have a disproportionately strong correlation with winning this season, and then later admit that it did not have a disproportionately strong correlation last season, and your conclusion isn't "hey, small sample sizes lead to strange results" or even "hey, correlation does not imply causation", it's "hey, defense against 4/5 receiver sets is somehow a disproportionately meaningful statistic".
This piece had some interesting data, and I would have enjoyed it if it was presented more from a "wow, look at this statistical quirk from last season" perspective, or even a "hey, here's an interesting way to frame the Packers' SB run in terms of defense against 4/5 receiver sets (you know, without actually checking to see the results of plays from 4/5 receiver sets during GB's superbowl run)". Instead, it was presented as... I don't know what, some sort of meaningful trend or harbinger of things to come. The tagline was particularly nauseating, suggesting (however cheekily) that we substitute "defense wins championships" with "defense against 4/5 receiver sets wins championships" while ignoring the fact that as poor as the evidential support for "defense wins championships" might be, it's miles better than the evidential support for "defense vs. 4/5 receiver sets wins championships".
Look, I like Football Outsiders. I really do. I've been reading and participating in discussions around here since the B.E. days (Before Easterbrook). I think that for the most part their analysis is forward thinking and probably the best that's available to the general public. At the same time, there's a large sentiment outside of the FO community that they're frequently guilty of overreaching the statistics, and I think this is a perfect example of the phenomenon.
#6 by Neoplatonist B… (not verified) // Jul 18, 2011 - 4:29pm
Meh. 8% of 55x16=880 plays is about 70 plays. That's not a great sample, and you'll get some silly results. But those results themselves are often informative. And you'll have 32 teams to compare.
For instance, see how some teams have really bad defenses against 3WR sets but do much better against 4WRs? New England and Denver had very different seasons, but they both had very bad DVOA against 3WRs and very good DVOA against 4WRs? In Denver's case, it seems to be that they just didn't face many 4WR sets, which could easily be attributed to the composition of the AFCW and AFCS opponents (lots of run game and a terrifying slate of receiving TEs). New England, not sure.
#8 by Thomas_beardown // Jul 18, 2011 - 5:01pm
Yeah, his characterization of this as a single data point is really odd.
All but 3 teams faced 4-5 receivers on at least 5% of plays, which is about 50 plays for the average team (teams ran 1010 plays on average per pfr).
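The play-count estimates in the two comments above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name `formation_plays` is made up for illustration, and the inputs are the commenters' own figures (55 plays per game over 16 games at an 8% share, and 1010 plays at the 5% cutoff).

```python
def formation_plays(total_plays, share):
    """Expected number of plays against a formation, given its share of all plays."""
    return total_plays * share

# Comment #6: 55 defensive plays per game over a 16-game season, 8% against 4/5WR.
plays_per_season = 55 * 16
estimate_8_percent = formation_plays(plays_per_season, 0.08)

# Comment #8: 1010 plays per team, evaluated at the 5% qualifying cutoff.
estimate_5_percent = formation_plays(1010, 0.05)

print(plays_per_season, round(estimate_8_percent, 1), round(estimate_5_percent, 1))
```

Either way, the per-team sample against 4/5WR sets is on the order of 50 to 70 plays.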
#20 by Kibbles // Jul 19, 2011 - 1:24pm
I'm characterizing the fact defense vs. 4/5 receiver sets was somehow disproportionately meaningful as a single data point, one which was admittedly unsupported by any other data from any other season. Saying "we don't have any reason to believe that defense vs. 4/5 receiver sets was particularly meaningful in any other year ever, but since they were correlated in 2010 then it must have magically become meaningful last season for wholly unexamined reasons" seems pretty weak to me. As I said, it seems like a clear case of overreaching the statistics.
#30 by poops (not verified) // Jul 20, 2011 - 1:57pm
Maybe it's because more teams were successfully running 4/5 WR sets?
#31 by Kibbles // Jul 20, 2011 - 4:53pm
If that's the working theory, then it should be relatively easy to verify (or discount). As I said in another post, other possibilities might include an increase in WR depth, a decrease in DB depth, a decrease in the quality of top receivers, an increase in the quality of top DBs, or an increase of 4/5WR usage in "high leverage" situations (situations that pretty much by definition have a disproportionate impact on winning, such as plays in the red zone, or plays in the final 5 minutes of a close game, etc). All of these theories are testable- and indeed, if someone wants to suggest that 4/5WR stats were disproportionately meaningful, it's on them to present a theory and test it. You can't just say "hey, here's a random correlation we found. It's strong, so it must be meaningful". You have to establish a causal relationship. Otherwise you wind up with situations like I mentioned elsewhere where you can say "hey, team nickname strongly correlated to team wins last season, so obviously team nickname played a strong role in determining the outcome of games".
#32 by trill // Jul 21, 2011 - 11:55am
"... an increase of 4/5WR usage in "high leverage" situations (situations that pretty much by definition have a disproportionate impact on winning, such as plays in the red zone, or plays in the final 5 minutes of a close game, etc)."
I am entirely too dumb and lazy to manipulate the data on this, but based on anecdotal observation from last year, I think it's a hypothesis worth investigating. Specifically, Jets/Steelers playoff game, ATL/GB from the regular season, and a couple NO games, I remember several situations of this type occurring.
Somebody crunch some numbers so we can get Kibbles' blood pressure down to a healthy level.
#10 by Danny Tuccitto // Jul 18, 2011 - 5:48pm
I'm a calmer, gentler Danny now that I'm over here, so this will be my only reply (and a sedate one at that):
It sounds like you would have preferred more of what I did do (i.e., frame stat in context of GB SB run, mention "quirkiness" in relation to 2009, offer sample size caveats), less of what I didn't do (i.e., describe the behavior of this stat in 2010 as a meaningful trend or harbinger of things to come outside of 2010), and less cheekiness in my tongue-in-cheek tag line.
#21 by Kibbles // Jul 19, 2011 - 1:55pm
I appreciate your restraint, and will try to reply in the least inflammatory manner possible. I understand that when someone criticizes my work, I immediately feel that it is instead a broader criticism of me, and therefore suspect that there's probably nothing I can say that will prevent you from taking this criticism personally. Despite that, I want to make it abundantly clear that this criticism is not personal- it is strictly about the conclusions you are drawing and the manner in which you are framing the data.
I would have preferred less of describing the behavior of this stat in 2010 as a meaningful trend or even strong descriptor of what happened in 2010 (as summed up in this sentence: success (or failure) at defending the 4/5WR formation had a disproportionately large influence on winning or losing last season.). As the old saw goes, correlation is not causation. If you had found that there was an equally strong correlation between the ordinal rank of the first letter of a team's nickname (a = 1, b = 2... z = 26) and the team's final record, would you have presented "team nickname" as a variable that had a "disproportionately large influence on winning or losing last season"? When there's no history of a statistic ever being particularly meaningful, and then one season it just happens to appear meaningful, the first reaction should not be to write an article about how meaningful it was... it should be to question whether that meaning was reality, or simply an illusion.
For more details, see: http://xkcd.com/882/
If you were really going to put forth "defense against 4/5 receiver sets" as a powerful causal element behind team success, I would have liked to have at least seen some reason why that might have been the case last year if it wasn't in years past. It doesn't seem that the quality or depth of WRs was significantly better in 2010 than in 2009. It doesn't seem that the quality or depth of DBs was significantly worse in 2010 than in 2009. Did teams run 4/5 receiver sets on more plays in 2010 than in 2009? Did teams run 4/5 receiver sets in higher leverage plays in 2010 than in 2009 (e.g. more 5 receiver sets in the red zone, or when holding a 1 score lead late in the 4th)? What explanation might exist for the dramatic jump in "significance" of 4/5 receiver performance from 2009 to 2010, other than the standard null hypothesis that it was just random fluctuation in the data?
Finally, if you're going to present Green Bay's SB run as a series of "repeated battles of strength versus strength" where "Green Bay emerged victorious each time", a really nice place to start would be actually looking at Green Bay's offensive and defensive DVOAs when 4/5 receiver sets were deployed in those games. Saying "Green Bay had a top offense and defense vs. 4/5 receiver sets, and they faced a lot of teams that could make the same claim" is a lot different than saying (or suggesting, or implying) that Green Bay won the SB because it outperformed its opposition in those areas. You presented enough evidence to make the former claim, but instead you made the latter claim. If you wanted to support that latter claim, you should have provided some evidence that Green Bay did, in fact, outperform its opposition when 4-5 receivers were on the field, and then provided further evidence that that performance was, in fact, somehow integral to Green Bay's victories (possibly by showing that Green Bay underperformed its competition in all other situations combined).
I realize that I'm ruffling feathers, and I promise that's not my goal here. My goal is to point out the difference between "good use of statistics" and "bad use of statistics". Much as I hate to turn to cliches, there's a grand lesson in that whole saying about drunks, statistics, and lamp posts. In this article, you're clearly using statistics to support rather than to illuminate. I like Football Outsiders, but as I said, there's a reason why intelligent, highly educated, mathematically inclined people outside of this community (and even a few within it) are so frequently accusing Football Outsiders of overreaching the statistics. In this article, you made several sweeping claims that you simply did not do enough to substantiate, starting with the joking "defense vs. 4/5 receiver sets wins championships" in the tagline, continuing on through the "defense vs. 4/5 receiver sets was disproportionately meaningful last season", and ending with "defense vs. 4/5 receiver sets was key to Green Bay's SB run".
#23 by jimbohead // Jul 19, 2011 - 5:59pm
I think you may be misreading the article, and specifically what was said about GB. As I read it, it seemed that the point he was making is, "hey, check it out, there's a relatively high R^2 value here. This almost certainly means nothing."
Then, with GB, he continues on that point by showing that, while they had a high-valued def. v 4/5, every team they played in the playoffs from div. series on had a better ranking in that category, and yet they won. GB was taken as a case study for exactly the point you're trying to make: def against rec. 4/5 is really not that significant.
#26 by Kibbles // Jul 19, 2011 - 9:20pm
I'm open to the possibility that I'm misreading the article, but I really don't think the text supports your interpretation. Consider two key sentences:
#1- "Essentially, in repeated battles of strength versus strength, Green Bay emerged victorious each time." This hardly seems like he's presenting the data in the context of Green Bay being outmatched in the 4/5 stats and winning anyway.
#2- "for the purposes of reviewing the 2010 season only, defense against 4/5WR formations seemed to play an important role in deciding who took home the Lombardi trophy." You claim that the author was presenting Green Bay to demonstrate the point that, since GB beat teams that ranked higher in 4/5WR stats, 4/5WR stats were therefore not significant in determining who took home the Lombardi trophy. This sentence directly contradicts that interpretation by stating that 4/5WR stats were, in fact, significant in determining who took home the Lombardi trophy.
#15 by Whatev // Jul 19, 2011 - 6:18am
You know what a case study is, right?
#16 by Anonymouse (not verified) // Jul 19, 2011 - 9:46am
a case study is an anecdote that supports your opinions;
where an anecdote is a case study that doesn't.
#24 by Whatev // Jul 19, 2011 - 6:12pm
My point is, whether by your definition or a less cynical one, it's neither a trend nor an assertion of a trend. In fact, given the context of their use in business and law school, one might think of them as a substitute for a trend in situations where the complexity of the individual cases and the small number of data points virtually guarantee that trends cannot be reliably identified.
#29 by Aaron Brooks' … (not verified) // Jul 20, 2011 - 9:59am
I think those fall into the category of "damned lies". The medical industry *loves* case studies, and it's a nasty, lazy habit. They are fine when they are presented as "we saw this, we did this, and it worked -- try it too" or "we saw this, we did this, and it didn't work -- and this is why." But when they're used as a surrogate for actual statistics or epidemiology, you get a lot of agenda-insertion, lazy analyses, and bad medicine.
#27 by Dan // Jul 19, 2011 - 10:25pm
Good posts, Kibbles. In particular, this sentence in the article is not correct:
"it tells us that success (or failure) at defending the 4/5WR formation had a disproportionately large influence on winning or losing last season"
If you run a bunch of correlations, some of them are going to be inflated just by chance, not because of any causal relationship (not even a temporary one). The noise in one variable just happens to line up with the noise in the other variable. And that's very likely to be what happened here. The true strength of the causal relationship between defense with 4/5 WRs and wins is probably around |r|=.1 or so, but this year the noise happened to match for these two variables (the teams that had good luck defending against 4/5 WRs just happened to be the better teams) so the correlation came out larger.
With 32 data points, about one in every twenty correlations will be inflated (in the expected direction) by .3. One out of every twenty may seem rare, but in preparing these two articles Danny ran at least 8 correlations (between wins and the 4 formations for offense & defense) and possibly more. Considering all of the other 32-data-point correlations that FO runs, correlations that are inflated by .3 or so are going to come up all the time. Looking at more years of data is a good next step to take, but if the correlation doesn't hold up over multiple years that probably means that it's just a statistical fluke, not a real causal influence that mattered for just that one season.
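The one-in-twenty figure can be sanity-checked with a quick simulation. What follows is a rough sketch under stated assumptions, not anything taken from the article: the data are drawn from a bivariate normal distribution, the true correlation is set to .1 (positive for simplicity, since only the magnitude matters here), there are 32 teams, and the 8 correlations are treated as independent. All names are illustrative.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def share_inflated(true_r=0.1, n_teams=32, inflation=0.3, trials=5000, seed=0):
    """Fraction of simulated seasons whose sample r exceeds true_r by `inflation`."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(1 - true_r ** 2)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n_teams)]
        # y shares a true correlation of `true_r` with x; the rest is noise.
        ys = [true_r * x + noise_scale * rng.gauss(0, 1) for x in xs]
        if pearson_r(xs, ys) >= true_r + inflation:
            hits += 1
    return hits / trials

share = share_inflated()
# Chance that at least one of 8 independent correlations is inflated this much.
at_least_one_of_8 = 1 - (1 - share) ** 8

print(f"inflated by .3: {share:.3f}; at least one of 8: {at_least_one_of_8:.3f}")
```

Under these assumptions the share should land in the neighborhood of one in twenty, and the chance that at least one of eight correlations is inflated that much is several times larger. Real formation splits are not independent of each other, so the second number is only a rough guide.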
#11 by thok // Jul 18, 2011 - 6:37pm
So New England was strong against even numbers of wide receivers and weak against odd numbers?
#12 by Stewart (not verified) // Jul 18, 2011 - 6:51pm
Maybe NE's coverage concepts work better with symmetry?
#13 by trill // Jul 18, 2011 - 7:27pm
Bill Belichick would like to know when Symmetry is reporting to training camp. He's currently #2CB on the depth chart.
#14 by thok // Jul 18, 2011 - 8:00pm
Wait, Symmetry beat out Hole in the Zone?
#28 by Neoplatonist B… (not verified) // Jul 20, 2011 - 1:29am
And Jason David. Where have you been?
#33 by Cyrus // Jul 24, 2011 - 6:36pm
I am really confused how NE can be that different depending on the number of WR's. I would understand if it meant that McCourty was really good, but this makes me think that our CB2 is really bad, but our CB3 is better than average. Or something like that.
Maybe with Bodden back from injury and Ras-I Dowling, our secondary will get better.
#34 by Thomas_beardown // Jul 24, 2011 - 6:58pm
It could mean that CB2 was bad, but Belichick had great blitzes or zone coverages that confused QBs when they tried more receivers.
Or it could just be random fluctuations that don't mean much.
#17 by Moosebreath (not verified) // Jul 19, 2011 - 10:20am
Something seems wrong here. The player-by-player charting for the Eagles suggested that they had an excellent 1st CB (Samuel), an excellent 3rd CB (Hanson), and terrible 2nd CB's (Hobbs and Patterson). Yet the only formation they were above average against was 2 WR's.
#18 by Aaron Brooks' … (not verified) // Jul 19, 2011 - 10:38am
Were they just rolling a lot of help at Hobbs on 2WR sets, but couldn't do that on 3-5WR sets when they needed the extra DB to cover a guy instead of bailing out their loser CB?
#19 by trill // Jul 19, 2011 - 11:43am
Might have something to do with their tendency to blitz A LOT on passing downs.
#22 by Kibbles // Jul 19, 2011 - 1:58pm
#1 CBs frequently aren't matched up against #1 WRs. Teams use a lot of zone, or they might have their #1 take one side of the field, or they might put the #1 against the opposing #2 WR while double-teaming the opposing #1. Lots of reasons why we shouldn't expect a team's #1 CB stats to be identical to their stats against #1 WRs.
#25 by Joseph // Jul 19, 2011 - 8:00pm
I'm wondering what the data points to for my Saints. #2 overall against 0-1 WR's, and pretty good against 3 WR's (#9); #25 & #19 against 2 and 4-5 WR's, respectively. IMO, it means that the LB's aren't good against the pass, nor is the SS Harper. And against 4-5 WR's, it means that 1st rounder Patrick Robinson was a rookie. (They also picked up a CB in the 4th this year for depth purposes). Anybody else have some ideas about what the data could relate to, at least for their favorite team?