Rushing Success and Play-Action Passing
Guest column by Ben Baldwin
Why do teams run the ball so often? The average pass play gained 6.2 yards in 2017, compared to 4.1 yards for the average rush play. And yet, on first-and-10, teams ran the ball 53 percent of the time. On these first-and-10 plays, 44 percent of rushes and 52 percent of passes were successful.
A common response to the question of why teams run so frequently is that teams need to run the ball in order to maintain the effectiveness of play-action passing. When play-action is based on a committed rushing attack, the argument goes, the pass rush has to slow down and the linebackers have to respect the threat of the run, pulling them towards the line of scrimmage and away from their coverage responsibilities. Examples of this argument abound: The Ringer's Robert Mays recently wrote that "as much as our understanding of the sport has shifted in recent years, the belief that a play-action game's effectiveness is linked to a strong, high-usage running offense has remained steadfast." Former NFL lineman Geoff Schwartz stated that "teams have to commit to running the ball first to open up [play-action]."
Despite the pervasiveness of this argument, I have not seen much evidence about the extent to which the effectiveness of play-action passing is dependent on a team's rushing attack. With thanks to play-by-play charting from ESPN Stats & Info (from 2011 to 2013) and Sports Info Solutions (from 2014 to 2017), this piece takes a deep dive into measuring the empirical relationship between rushing and play-action passing.
Before we move forward, a few notes on the data:
- This uses play-by-play charting of play-action passes from the 2011 through 2017 NFL seasons.
- I group all dropbacks together when discussing the efficiency of passing, meaning that "yards per passing play" counts pass attempts, sacks, and scrambles.
- I do not count fake end-arounds as play-action unless also accompanied by a fake handoff to a player who began the play in the backfield.
- I count a successful rush as one that gains at least 45 percent of the yardage needed on first down, 60 percent on second down, and 100 percent on third or fourth down.
- This piece looks at the specific question of whether a team's play-action effectiveness is related to its rushing. For information on play-action generally, such as how often teams run it and how good they are at it, check out the annual Football Outsiders play-action articles.
- And finally, while it cannot be run without access to the underlying data, I am posting the code I used to generate the figures in this piece here for transparency.
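The success definition in the notes above can be written as a short rule. What follows is a minimal Python sketch, not the code used for this piece; the function name `is_successful_rush` and the sample plays are made up for illustration.

```python
def is_successful_rush(down: int, yards_to_go: int, yards_gained: int) -> bool:
    """Apply the success thresholds from the notes: 45% / 60% / 100% by down."""
    # Percent of the needed yardage required for a "success", keyed by down.
    required_pct = {1: 45, 2: 60, 3: 100, 4: 100}
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    return 100 * yards_gained >= required_pct[down] * yards_to_go


# Hypothetical rushes: (down, yards to go, yards gained).
plays = [(1, 10, 5), (1, 10, 4), (2, 6, 4), (3, 2, 1), (4, 1, 1)]
results = [is_successful_rush(*p) for p in plays]
success_rate = sum(results) / len(results)
print(results, success_rate)
```

A 5-yard gain on first-and-10 clears the 45 percent bar, while a 4-yard gain does not.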
This section counts one team-season as the unit of observation and asks the question of whether teams that run frequently or successfully tend to be better at play-action passing. Every team from every season is one dot on the plots below. For example, there is one dot representing the 2011 Denver Broncos and another representing the 2012 Broncos:
|Denver Play-Action Passing, 2011-2012|
|Season||PA Yards/Dropback||Rushes||Rush%||Rush Success Rate|
Recall, as noted above, that these numbers differ slightly from the official totals because I have re-classified scrambles and sacks as pass plays. The massive improvement in play-action efficiency for Denver between 2011 and 2012 is largely due to the quarterback switch from Tim Tebow to Peyton Manning.
In each figure below, the horizontal axis is a measure of rushing data, and the vertical axis is the average yards per play-action dropback. In the upper left figure below, using (X,Y) notation, the point (515, 7.0) represents the 2011 Broncos and the point (446, 10.5) represents the 2012 Broncos. There are also points for every team's season from 2011 to 2017.
In each figure, the R^2 is printed in the lower right. In the upper left figure, for example, the R^2 of 0.02 means that only 2 percent of the variation in a team's play-action effectiveness (as measured by yards per play-action dropback) in a given season can be explained by the number of rushes in that season.
The three plots above show that whether looking at total number of rushes, the proportion of plays that are rushes, or the proportion of rushes that are successes, no R^2 exceeds 0.03. In other words, knowing a team's rushing frequency or effectiveness in a given season tells one almost nothing about how effective that team was at play-action passing.
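For readers who want to see how an R^2 like the ones quoted above is computed, here is a hedged sketch for a one-variable linear fit, where R^2 equals the squared correlation between the two columns. The team-season numbers below are invented for illustration and are not the charting data.

```python
def r_squared(x, y):
    """R^2 of a one-variable least-squares fit (equals the squared correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Made-up team-seasons: rush attempts and yards per play-action dropback.
rushes = [400, 420, 450, 480, 510]
pa_yards_per_dropback = [8.1, 6.9, 8.4, 7.2, 8.0]
print(round(r_squared(rushes, pa_yards_per_dropback), 3))
```

A value near zero means the fitted line explains almost none of the spread in the vertical variable; a value of one means every point falls exactly on the line.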
Because a team's season-long numbers can be influenced by game script (for example, a team might compile a bunch of rush attempts while salting a game away, when play-action is no longer relevant), I also checked to see whether there is any relationship between play-action effectiveness and these three rushing measures (total rushes, ratio of rushes, or rushing success rate) for plays that occurred while the game was within seven points. The results were similar.
The results above suggest that a team's rushing frequency and effectiveness in a season do not predict how successful that team was at play-action passing. But what about within an individual game? Are teams that have not run the ball recently or successfully able to take advantage of play-action passing?
The primary challenge here is constructing measures of rushing frequency and effectiveness for a given point in a game. There is no one number that captures everything about how well and how often a team runs the ball. Because a perfect measure doesn't exist, I tried a bunch of alternatives. As will be seen below, the good news is that they all tell the same story. Here is what I settled on:
1. The number of a team's rushes in the previous five plays (for plays six and beyond in a given game).
2. The number of rushes in the previous 10 plays (for plays 11 and beyond).
3. The number of successful rushes in the previous five plays (for plays six and beyond).
4. The number of successful rushes in the previous 10 plays (for plays 11 and beyond).
5. The ratio of rushes to total plays at that point in the game (for plays 11 and beyond).
6. The rushing success rate at that point in the game (for plays 11 and beyond).
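The six measures above are all running tallies over a team's earlier plays. Below is a rough Python sketch of how such tallies could be built. It assumes each play is reduced to a pair of true/false flags and that only plays before the current one are counted; the helper names `count_in_previous` and `share_to_date` are hypothetical.

```python
def count_in_previous(flags, window):
    """For each play, count True flags among the `window` plays before it.

    Plays without a full window of earlier plays get None, mirroring the
    "plays six and beyond" / "plays 11 and beyond" restriction.
    """
    return [
        sum(flags[i - window:i]) if i >= window else None
        for i in range(len(flags))
    ]


def share_to_date(flags, min_prior):
    """Share of True flags among all earlier plays, once min_prior plays exist."""
    return [
        sum(flags[:i]) / i if i >= min_prior else None
        for i in range(len(flags))
    ]


# Hypothetical drive: was each play a rush, and was it a successful rush?
is_rush = [True, False, False, True, True, False, True]
is_successful_rush = [True, False, False, False, True, False, True]

print(count_in_previous(is_rush, 5))             # rushes in previous five plays
print(count_in_previous(is_successful_rush, 5))  # successful rushes in previous five
print(share_to_date(is_rush, 5))                 # rush ratio to date (shortened window)
```

The rushing success rate to date would follow the same pattern, dividing successful rushes by rushes instead of by all plays.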
Here are the relative sample sizes of play-action passes given each distinct measure of rushing, with each measure labeled at the top of each individual plot.
To take one example, the far left bar in the upper left chart shows that there were 1,585 play-action dropbacks from 2011 to 2017 in which the offensive team had not run the ball in the previous five plays. Most play-action dropbacks happened when one to three of the previous five plays had been rushes. As we go through the results, keep in mind that some of the sample sizes are very small (for example, there were only 265 play-action dropbacks where the team had zero rushing attempts in the previous 10 plays, about 1 percent of the sample).
Let's begin by looking at whether previous rushing has any relationship to the likelihood that a team showing a handoff will actually drop back to pass rather than handing off the ball. Expressed as a formula, this is:
play-action dropbacks / (play-action dropbacks + rush attempts)
As background, about 25 percent of shown handoffs end up as pass plays on first and second down, compared to 20 percent on third down and 15 percent on fourth down. Fifty-seven percent of play-action passes are thrown on first down, 36 percent on second down, six percent on third down, and one percent on fourth down.
Here is how each of the six measures of rushing is related to whether a team rushes or uses a play-action pass:
From here forward, each graphic will contain six different figures that show how various characteristics of play-action passing vary with the six different measures of rushing. For example, the far left point in the upper left box shows that teams pass about 22 percent of the time when they show a handoff in the case where they have not rushed once in the past five plays (meaning that they hand the ball off 78 percent of the time).
Looking through the six individual plots, there is some evidence that teams with fewer recent successful rushes are more likely to pass rather than hand the ball off (see lower left box). Pass rate also modestly decreases with rush ratio (lower middle box).
Next, here is the percent of play-action dropbacks that are from a shotgun formation, broken down by previous rushing:
With the exception of teams that have run the ball very infrequently in the last five or 10 plays (who are more likely to operate out of shotgun), there is not much of a relationship. The increase in shotgun rate with very few recent rushes is partially a game-script effect; the median play-action pass where the team did not rush in the previous 10 plays came with the team trailing by 11 points, compared to a median of zero for all play-action passes. But even with the score within seven points, teams with no recent rushes were relatively more likely to run play-action from shotgun.
The next outcome measured is pressure rate on play-action dropbacks. As described above, one of the arguments for rushing affecting play-action passing is that if the defense has to respect the run, then it will affect their pass rush on play-action dropbacks. However, there does not appear to be any support for this hypothesis, as pressure rate is mostly constant regardless of previous rushing, hovering in a range of 27 to 29 percent. Note that for pressure rate alone, I exclude 2017 because I do not have data from that year.
Next, depth of target. For this and the subsequent chart, I show the median (lower edge of each bar), 75th percentile (upper edge), and mean (circle). This is a modification of a typical box plot where I do not show the 25th percentile because it is zero for the yards gained plot and thus uninformative.
The mean and median depth of target are mostly constant regardless of a team's previous rushing statistics.
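The three summary statistics used in these charts (median, 75th percentile, and mean) can be computed with Python's standard `statistics` module. This is only a sketch on invented yardage values, and the choice of the "inclusive" quantile method is an assumption, since the piece does not say which convention was used.

```python
import statistics

# Hypothetical yards gained on eight play-action dropbacks.
yards = [0, 0, 0, 4, 7, 9, 12, 25]

median = statistics.median(yards)
mean = statistics.mean(yards)
# quantiles(..., n=4) returns the three quartile cut points: 25th, 50th, 75th.
p25, p50, p75 = statistics.quantiles(yards, n=4, method="inclusive")

print(median, p75, mean, p25)
```

With incompletions and sacks in the mix, the 25th percentile of yards gained is often zero, which is why it carries little information in a plot like this.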
And finally, yards per play on play-action dropbacks:
This is the main relationship of interest. Regardless of which of the six measures of rushing one chooses, there is no meaningful relationship between the effectiveness of play-action passing and a team's rushing statistics in the game to that point. Aside from a couple extreme cases with very small sample sizes (zero rushes or eight rushes in the previous 10 plays), there is no relationship in the data between the median, mean, or 75th percentile of yards gained and a team's previous rush attempts. This is consistent with the scatterplots of team rushing versus team play-action passing for entire seasons that were displayed in the beginning of this article.
Here is the data in the upper left figure in table form:
|Rush Frequency and Play-Action Passing, 2011-2017|
|Rushes in Prev 5 Plays||Plays||Yards Per Play||Standard Deviation|
Between 2011 and 2017, 93 percent of play-action passes occurred when the offense had between one and four rushes in the previous five plays. In this range, yards per play and its standard deviation are remarkably similar for all values of previous rush attempts. Looking at the graph of rushes in the previous 10 plays, teams are at least as successful at play-action when they have rushed one time in the previous 10 plays as when they have run seven or eight times in the previous 10 plays.
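A table like the one above boils down to grouping play-action dropbacks by the number of recent rushes and summarizing yards within each group. Here is a hedged sketch with invented rows; it uses the sample standard deviation, which is an assumption because the piece does not specify the formula.

```python
import statistics
from collections import defaultdict

# Hypothetical dropbacks: (rushes in previous five plays, yards gained).
dropbacks = [(0, 5), (0, 9), (1, 0), (1, 12), (1, 6)]

# Collect yards gained for each value of the rushing measure.
yards_by_rushes = defaultdict(list)
for prior_rushes, yards in dropbacks:
    yards_by_rushes[prior_rushes].append(yards)

# For each group: number of plays, yards per play, sample standard deviation.
summary = {
    k: (len(v), statistics.mean(v), statistics.stdev(v))
    for k, v in sorted(yards_by_rushes.items())
}
for prior_rushes, (plays, ypp, sd) in summary.items():
    print(prior_rushes, plays, round(ypp, 2), round(sd, 2))
```

Comparing the yards-per-play column across groups, with the standard deviation as a guide to how noisy each average is, is the comparison the paragraph above describes.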
Putting this all together, I cannot find any support for the success of play-action passing being related in any way to a team's rushing statistics, whether measured by frequency or effectiveness.
Coming into this, I did not know what to expect. Since play-by-play data on play-action passing is not readily available, it was something I had long wondered about but never been able to look into. After measuring this every way I could think of, it appears that the conventional wisdom that running is necessary for play-action passes to be effective should be questioned. We have a lot of evidence that play-action passing is more effective than non-play-action passing, so the big question that remains is why teams run play-action so infrequently (the percentage of passes that are play-action has hovered around 20 percent since 2011). What would happen if teams started devoting a higher share of their plays to play-action passing? Would the advantage persist or would defenses adjust?
The recently-concluded 2017 playoffs may provide a glimpse into a future where play-action is more common. The Eagles attempted 21 play-action passes in the Super Bowl on 43 dropbacks (49 percent). Frequent use of play-action (33 percent of dropbacks against the Patriots and 54 percent against the Steelers) also helped the Jaguars score 65 points across two playoff games and nearly reach the Super Bowl. In the constant search for advantages in a competitive league, play-action passing appears to be an under-utilized edge.
An economist by trade, Ben Baldwin uses large datasets to try to learn about human behavior. His work can be found on Field Gulls and Grid Fe. Reach him on Twitter at @guga31bb.
31 comments, Last at 20 Feb 2018, 1:01pm
#2 by billprudden // Feb 12, 2018 - 2:59pm
Thanks for running these numbers every which way. This was as interesting as the injuries piece last year. Well done.
As to "why?" - Could it be as simple as LBs and SSs simply cannot help themselves?
And also that the QB ends up with more time b/c the would-be pass rushers are being blocked in ways that take them out of their pass rush mindset, spot, or momentum?
Is the question really something more like "How much more effective could our passing game be if all of the underneath defenders were distracted and delayed and the DLs took longer than normal to get to our QB?"
Because I think we know the answer...
#4 by guga31bb // Feb 12, 2018 - 3:57pm
Thanks for the nice words. This is hard to get a handle on, but I suspect it has more to do with coaching than LB/S not being able to help themselves. If a team practiced all week in order to, for example, take away the run game, it's hard to adjust mid-game even if the other team is using play action a lot.
#17 by jtr // Feb 13, 2018 - 9:43am
I suspect that they actually can't help themselves. Linebackers spend their entire lives, starting in middle or high school, training to seek out and tackle the tailback. It would take way more than one game plan and one week of practice to train that out of them.
I think if a few NFL teams really went all-in on showing play action on the majority of their dropbacks, we would eventually see rushing effectiveness creep upward as defenses really started to prioritize having their linebackers stay home in their zone rather than attacking the run. As it is right now, with only about 25% of shown handoffs resulting in play action, defenses aren't getting punished quite enough for coaches to start to move on from preaching the old "stop the run" cliches.
#5 by Aaron Brooks G… // Feb 12, 2018 - 4:05pm
The 3rd figure -- passing likelihood per show of handoff -- is interesting and might explain a couple of things.
1. Teams will mix in a run if all they do is pass. This probably accomplishes a couple of things: keeps defenses honest, gives the QB a break, gives the linemen a break.
2. Teams that haven't been stopped while running just keep running.
It's in the middle, where teams have had some rushing success, but not unstoppably (and some passing success in the same conditions) who go to play-action, because by making the defense guess which it is you might shake at least one of the two free.
RPOs largely accomplish the same thing, but might do it faster.
#3 by Pen // Feb 12, 2018 - 3:56pm
Did you look at QB pressures with play action vs non PA? Hits, sacks, etc. might all go down because the PA causes that momentary hesitation. This could explain why PA is more successful. It's not that the run game needs success to make PA work, it's that the play in question would be wildly successful if the defenders didn't respect the RB and the QB did indeed hand it off to him. So they HAVE to hesitate. In which case, it's irrelevant how successful the run game had been up to that point.
Which goes back to your initial question: why run so much? I think the answer is: if you don't run at all, they'll stop hesitating and you'll lose the PA advantage.
This article kinda doesn't focus on the question it posed initially, however. It winds up asking why don't teams use PA more often, not answering why don't they run less.
#6 by Aaron Brooks G… // Feb 12, 2018 - 4:09pm
I want to cry foul on your use of 2011/2012 Denver. Rushing/option QBs are known to increase yards per rush for an offense.
The implication of the table is that better running doesn't indicate better PA performance, but this is a hugely cherry-picked example. An option QB only rarely options to himself (it's so infrequent it's largely a trick play). Also, you went from a replacement level to a HOF QB. I would expect many parts of the offense suddenly performed at a higher level.
#10 by guga31bb // Feb 12, 2018 - 4:24pm
Yep, there are 224 points on those graphs (32 teams * 7 years). The point of that section is to look for a relationship among those 224 points collectively- you can't learn anything from 2 points (except that Peyton Manning was a better passer than Tim Tebow).
#7 by Pat // Feb 12, 2018 - 4:10pm
Are the labels in the first set of graphs blocking some points? There's definitely *some* correlation between rush success/play action yards per play. I mean, the r-squared's tiny, but the data is noisy as hell, and even with that, there's clearly a deficit in the bottom right and upper left quadrants. Virtually no teams with under-6 ypa and over 40%, and virtually no one over 10 (maybe ~9-ish) ypa and under 40%. Whereas the other two outer regions look populated fine. The other two graphs look basically uniformly populated.
Problem with R-squared in this situation is that you don't really have a dependent/independent situation. It's not like you can ask teams "excuse me, do you mind running a few play action plays even though you probably think you're going to suck at it?" So it's not terribly surprising that there's a lot of scatter.
It'd be interesting to see what'd happen if you just binned it in quadrants (below average/above average) and ran Fisher's exact test on it. I'd kinda guess that you might end up with a more significant result.
"Looking through the six individual plots, there is some evidence that teams with fewer recent successful rushes are more likely to pass rather than hand the ball off (see lower left box)."
So... this part I look at and say "situation." If you rush, and you're unsuccessful at it, that means you're more likely in a passing situation, rather than a rushing situation. That's the problem with looking at a rush/pass choice. It's only a neutral choice for a range of situations. I think it's really, really hard to measure a team's rush/pass choice, because you'd really have to correct for the situation somehow. And even then I'm not sure that's the right thing to do.
#14 by Mountain Time … // Feb 12, 2018 - 11:43pm
So you're saying that, among your other suggestions, it might be edifying to look at specific case studies? I agree in general, but how do you pick the teams to look at? Ones with the greatest difference between PA DVOA and straight passing DVOA?
#16 by Pat // Feb 13, 2018 - 9:41am
No, I'm mainly saying you need to find a way to correct for how likely the *average* team is to run or pass in a specific game situation. Looking at how likely a team is to pass on 3rd and 8 when they're down by 10 isn't the same as looking at how likely a team is to pass on 3rd and 4 when tied.
And of course we *know* that coaches think like this, because it's exactly what they say. They know that there are passing situations, rushing situations, and more neutral situations.
Still not a perfect comparison, though, because you're only seeing what teams *do*, not what teams originally *intended* to do. It's really not an easy thing to measure, because you're looking at the team through the filter of a game. But at least looking at run/pass neutral situations (or *nominal* run/pass situations) would be a good start.
#24 by LionInAZ // Feb 14, 2018 - 5:53pm
The correlations are statistically insignificant. It's a common human error to eyeball a correlation into existence -- and to rationalize it afterward. That's why we have statistical tests in the first place.
#25 by Pat // Feb 15, 2018 - 12:17am
Which is why I suggested one! R-squared doesn't test significance (just the power of the correlation) and really, testing for independence (via Fisher's exact test or similar) is a safer way to test for an unknown relationship than presuming the form of one.
Even if the p-value is high, that doesn't mean it isn't real. It just means you can't tell yet, which isn't surprising since the two measures are really crude.
#28 by LionInAZ // Feb 16, 2018 - 6:27pm
No, what you said was that there was definitely more high points on one end of the graph and low points on the other end. That's eyeballing a bias. That's why we have statistical measures, because the numbers at either end of the graph are insufficient to determine a regression in the presence of noisy data. Believe me, I've dealt with this enough in my career.
There are problems inherent to statistical tests, yes, but there aren't nearly enough data points in this particular situation to show that you're right, unless you can do a Monte Carlo analysis.
#29 by Pat // Feb 20, 2018 - 12:40pm
"No, what you said was that there was definitely more high points on one end of the graph and low points on the other end. That's eyeballing a bias. "
Exactly! And then I followed it up with suggesting using Fisher's exact test. Your eye can suggest biases fine. You just don't want to use it to confirm one. And note how I suggested testing it: I didn't say "use these cherry-picked values." I said just split it up into quadrants: above/below average in each direction.
It's not like you can trust statistical tests blindly, either. It's easy to construct situations where linear correlations fail horribly and you can end up seeing the problem by eye.
"but there aren't nearly enough data points in this particular situation to show that you're right,"
... which is why it makes sense to use a simple test to determine if the variables are independent. Part of the problem here is that you're trying to extract more than just "is there a relationship" - you're also saying that the relationship is *linear*.
If instead you abandon that and say "hey, are these independent" - you need less data to support that. If they're independent, then the spread in the "Y" variable at any point will be unrelated to the "X" variable. And you can quantify how big the deviation is pretty easily. Uncorrelated doesn't mean unrelated.
#9 by zenbitz // Feb 12, 2018 - 4:20pm
It would be interesting to see what the success rate of PA is (relative to dropback/no fake) when the baseline chance of running the ball is very low (3rd-and-long). Presumably the sample size is small though.
Is there a significant autocorrelation function between run/pass (ignoring PA), or can it be modeled as a Poisson distribution that is just drawn from the team or situation average R/P ratio?
#13 by Mountain Time … // Feb 12, 2018 - 11:35pm
As suggested by contrasting the 2011 to 2012 Broncos, it seems like the most likely way to have a strong play action game is to have a strong passing game in general.
I bet that if teams started using play action on most passes, it might slightly decrease PA effectiveness, but would also increase rushing effectiveness. Assuming the QB isn't overly distracted from his duties reading the coverage, which is quite possible.
How would the numbers look the other way around: does a good play action game open up rushing opportunities more than just a good passing game generally?
#18 by Dan_L // Feb 13, 2018 - 11:47pm
First of all, I second the thanks for all of the good work here.
As far as the why play action works, I think we should consider the possibility that most coaching concepts of "balance" are closer to the game theoretic concept than the football talking head concept of this term. I think it is important, for instance, to have some running plays in your 4 wide personnel in order to keep the defense honest (and a little can go a long way here). That these plays exist and have been used historically is much more important than your last 5-10 plays in this game (in this school of thought). What little I've seen of game planning by coaches seems to reflect this, part of defensive player book learning is how much they run and pass out of a given formation.
Furthermore, an important concept could be how well your running plays and passing plays look alike for the first 0.5 to 1.0 seconds. This is part of why I like the RPO based offense so much. Al Michaels was criticized here somewhat for calling every slant throw an RPO, but I think this criticism was half too harsh. A lot of the Eagles slants and RPOs looked like the same play, and they achieved balance by making it difficult to determine what to defend most.
Because coaches don't often talk in terms of GTO, we tend to assume they aren't near it, but it could be that they've evolved towards reacting to a more subtle concept of balanced offense and defense than you would expect in interviews.
#19 by jtr // Feb 14, 2018 - 9:29am
The evidence that coaches are not particularly close to a game-theoretic equilibrium is the fact that passing plays are on average much more successful than running plays. That means that offensive coaches are still putting too much emphasis on running the ball, and defenses are still putting too much emphasis on stopping the run. The Nash Equilibrium in game theory is only achieved if no player can increase their own utility by changing their chosen strategy. As it is, virtually all offenses can gain in probability of success by choosing more passes, and virtually all defenses can gain in probability of success by further emphasizing pass defense.
#20 by ChrisS // Feb 14, 2018 - 10:47am
I agree with what you say. I think running the ball can be the Platonic ideal strategy if it is successful (a big if). Running is simpler, less likely to result in a turnover, less likely to have penalties, blocking is not as complicated, uses up the clock, rests the defense. So I think the potential strategic benefits of running prod coaches to try running more often than would seem to be optimal, to see if this is the game where they will get it right and win with ease. Even though it is unlikely to succeed, coaches feel compelled to try it because it is perceived as a 'superior' strategy.
#21 by Pat // Feb 14, 2018 - 11:28am
You'll only reach a Nash equilibrium if each instance is independent from the others, which definitely isn't true in football (the fact that players do film study and identify tendencies demonstrably shows that it's not true). If certain plays become less effective the more often they're used (and raising the effectiveness of other types of plays), you can get an equilibrium point which doesn't seem to make sense from the raw effectiveness of each one.
It's kinda-sorta-similar to Braess's paradox, which is typically shown as a traffic-flow example where removing a road can lead to better traffic flow. The key to Braess's paradox is having a road ("play") whose efficiency depends on its usage (in our bad analogy, this is passing) and another road ("play") whose efficiency is constant (bad analogy=rushing). In situations like this, when you chain those roads ("plays") together, if you try to maximize the output of each chain of plays, you can decrease the *overall* efficiency of the whole thing.
edit: You can 'recover' that efficiency in Braess's paradox by everyone collectively doing something which is *against* the "game theory best option" - again, in the traffic example, everyone agrees to not use the new road in order to improve times for everyone. In our example, the coach decides to not use his best plays (passing) all the time in order to make them more effective when used.
I actually made a silly simulation of this a looong time ago (10+ years, so the interwebs have totally lost it). It's pretty simple - you just assume passing depends on the frequency of passing in the past few plays, and rushing depends on the frequency of rushing in the past few plays. If you screw around with the "penalty factors" on each one (how much each one goes down), you can end up with optimums roughly at where the league's at in terms of how effective a rush/pass is, and how often each are used.
The situation actually gets more complicated when you realize that, in fact, each *game* isn't independent. And your goal isn't to maximize the output of each *game*, but to win the most games possible. So, in fact, it can make sense to not even use your "best" plays very often at all, in order to maximize their usage in a game where it'll have the most utility.
And of course, that's *exactly* what happened with Philly Special in the Super Bowl.
#31 by Pat // Feb 20, 2018 - 1:01pm
Holy crap, I found the simulation. Different website than I thought it was on.
I apparently never posted an update on this, but I also modeled the cost functions and searched to try to find a case similar to the NFL, and it's pretty easy. The specific case just moves the unstable equilibrium there around, which leads to an interesting problem: teams might be too "run-heavy" because being optimally "pass-heavy" is an unstable equilibrium. That is, at the 'optimum' point, defenses can do better by becoming either more run or more pass heavy. But as soon as they do that, offenses can change to improve, but that improvement *doesn't* take them back to the optimal point.
But in any case, in any of these models where you've got more than just play-level optimization, you can easily end up with the run/pass payouts not being equal.
#27 by Aaron Brooks G… // Feb 15, 2018 - 5:24pm
How does Nash handle local vs global maxima problems? Nash's equilibrium is about one player using unilateral strategies. There's no guarantee that a given player can arrive at a global maximum if getting there requires another player to make a beneficial (either as part of mutualism or parasitism) counter-move.
#30 by Pat // Feb 20, 2018 - 12:44pm
It doesn't. At all. In traffic-flow problems, the "global" optimum you're talking about is called the "social optimum", and it absolutely does not have to be the Nash equilibrium (which is the 'player optimum').
In the football context, this is the idea that optimizing the result of a single *play* may not result in the optimum for a single *game* (or *season*). I have *never* understood why people think that play-level results should look game-theory optimal, when it's not what coaches are actually trying to maximize.
#26 by Vincent Verhei // Feb 15, 2018 - 4:02am
Al Michaels was criticized here somewhat for calling every slant throw an RPO, but I think this criticism was half too harsh. A lot of the Eagles slants and RPOs looked like the same play, and they achieved balance by making it difficult to determine what to defend most.
There were some plays where Foles would play-fake, hang in the pocket, go through his progressions, scramble, and then throw, and Michaels would say "another RPO." I was banging my head on the table.