Pete Carroll, Andy Reid, and the Psychology of Play Calling

Seattle Seahawks HC Pete Carroll and Kansas City Chiefs HC Andy Reid
Photo: USA Today Sports Images

NFL Offseason - Guest column by Cole Jacobson

As NFL offensive playcallers scramble to select their next play in the minimal timeframe they have to do so, there are a myriad of factors that influence which direction they go in. Down-and-distance are an obvious pair; a pass is far more likely on third-and-15 than it is on second-and-3. Game score is another one, as trailing teams have more incentive to pass than leading ones. Field position, game clock, quarterback pre-snap improvisations, and each team's timeouts remaining are also factors that even a casual football fan would acknowledge are significant influences in whether a team decides to run or pass the ball. The list of these factors could go on and on—but is the type of play the offense ran on the preceding snap one of them? I used R programming to find out.

I started with all of the play-by-play data available via NFLFastR for the past 16 completed seasons, including playoffs (2006 to 2021). I removed all pre-snap penalties, QB kneels, spikes, and special teams plays, as well as two-point conversions, resulting in a set called pbp_ProjectPlays (size of 559,019). I split this data set into plays that were (size = 95,452) and were not (size = 463,567) the first offensive play of a given drive, then augmented the latter subset with information about that team's preceding play. I performed some further manipulations as well, including isolating short-yardage plays or first-and-10 situations. I won't show full R code here to save space, but I can share with anyone interested.
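The "augment with the preceding play" step can be sketched in a few lines. The original work was done in R and that code is not shown, so the Python below is only an illustration on made-up rows; the tuple layout and the function name `add_prior_play` are assumptions, not the project's actual code.

```python
from itertools import groupby

# Toy play-by-play rows (made-up values): (game_id, drive, play_type),
# already sorted in the order the plays happened.
plays = [
    ("g1", 1, "pass"),
    ("g1", 1, "run"),
    ("g1", 1, "run"),
    ("g1", 2, "run"),
    ("g1", 2, "pass"),
]

def add_prior_play(rows):
    """Tag each play with the previous play type from the same drive.

    The first play of every drive gets prior=None.
    """
    out = []
    # groupby only merges adjacent rows, so the input must be sorted.
    for (game, drive), group in groupby(rows, key=lambda r: (r[0], r[1])):
        prior = None
        for _, _, play_type in group:
            out.append({"game": game, "drive": drive,
                        "play": play_type, "prior": prior})
            prior = play_type
    return out

tagged = add_prior_play(plays)
# Split into drive-opening plays and plays that have a preceding play.
first_plays = [p for p in tagged if p["prior"] is None]
later_plays = [p for p in tagged if p["prior"] is not None]
print(len(first_plays), len(later_plays))
```

Resetting `prior` at each drive boundary is what keeps a drive's opening play from inheriting the last call of the previous possession.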

The "too long, didn't read" summary of the project: when controlling for external factors such as down, distance, and score, pass plays have indeed been more common immediately following run plays, by a statistically significant margin. That qualifier is particularly important: if we looked strictly at preceding play type without considering any other factors, passing would instead appear more common immediately following pass plays, which demonstrates the dangers of disregarding lurking variables. When digging into more specific situations such as short-yardage plays or plays following a first-down conversion, the same concept holds true, with playcallers being inclined to stray from their previous decision. As a secondary conclusion, the success of a play also makes an impact on the ensuing play call. While it's not as big of a factor as "preceding run vs. pass," coaches also have been slightly more likely to switch their play call after a failed play than a successful one, after controlling for other variables.

Breakdown of All Plays/Intro of Pass Rate Over Expected

At its absolute core, the question of "are passes more common immediately after other passes?" is an extremely easy one to answer. The below table and graph show the rate at which teams passed on any given play, sorted by the immediately preceding play type. Keep in mind that I use NFLFastR's distinction between "pass" vs. "rush" for the entire project, which is based on the intent of a play rather than its result (i.e., QB scrambles and sacks still fall under the "pass" category, even though the ball wasn't thrown).

Table 1

 

Graph 2

Looks like a pretty open-and-shut case; passes are more common when the preceding play was also a pass, right? Unfortunately, as I alluded to in the introduction of this piece, it isn't that simple. There are several lurking variables at play here, with game clock being a particularly notable one because there is often autocorrelation involving consecutive passes from trailing teams in the second half. In simpler terms, suppose a losing team throws an incomplete pass on second-and-12, and then throws again on third-and-12. Obviously, the team did not throw the ball on the latter play because the prior play happened to be a pass; rather, it did so because third-and-12 while facing a deficit is an obvious passing situation, regardless of the prior play call.

How do we adjust our calculations to account for these variables such as down, distance, and score? Thanks to the work of The Athletic's Ben Baldwin, NFLFastR includes two variables called "xpass" (expected pass rate) and "pass_oe" (pass rate over expected) for every play since 2006 (which is why my project's data begins there). You can check out this link for a thorough description of what they entail, but the basic summary is that they give an estimation of how likely a pass is on any play based on several surrounding factors. For example, the average third-and-15 play from 2006 to 2021 had an "xpass" of 0.918, while the average second-and-3 play had an "xpass" of 0.449. Subsequently, "pass_oe" is calculated based on what the actual play call was, with a zero-to-100 scale instead of zero-to-1. If a play had an "xpass" of 0.918 and it turned out to be a pass, the "pass_oe" would be (100.0 - 91.8), or 8.2%. One key aspect to note is that NFLFastR does not account for coach tendencies. As a result, Pete Carroll's Seahawks and Andy Reid's Chiefs would have the same "xpass" on a first-and-10 from their own 25 on a game's opening play, even though we all know that one side is far more likely to throw than the other.
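The arithmetic behind "pass_oe" for a single play, as described above, fits in one small function. This is a Python sketch for illustration (the function name `pass_oe` and its arguments are chosen here, and the project itself used R); it simply rescales the 0-or-1 play call minus "xpass" to a zero-to-100 scale.

```python
def pass_oe(xpass, was_pass):
    """Pass rate over expected for one play, on a 0-100 scale."""
    actual = 1.0 if was_pass else 0.0
    return 100.0 * (actual - xpass)

# Third-and-15 example from the text: xpass of 0.918.
print(round(pass_oe(0.918, True), 1))   # 8.2
print(round(pass_oe(0.918, False), 1))  # -91.8
```

Averaging these per-play values over a group of plays gives that group's "pass rate over expected."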

With this formula in our back pocket, we can use it for the remainder of the project to reach some real conclusions of substance. How do the above table and graph look if we used "pass rate over expected" in place of pass rate?

Table 3

 

Graph 4

The black brackets represent 95% confidence intervals, which I did not include on the first graph because they were too negligible to see clearly. These intervals give us the answer that we are seeking. When controlling for external variables such as down, distance, game clock, score, etc., a pass is more common when following a run play than when following a pass play, by a statistically significant margin. For those with a real background in statistics, the following two-sample T-test gives us the same conclusion:

Equation 5

The "true difference in means" refers to the gap between the average "pass_oe" values for plays immediately following a pass and plays immediately following a run. The fact that our p-value is approximately 2 × 10^-16, a comically small number, means we can confidently say that the difference in those mean values is significant rather than being due to random chance. Don't worry about this jargon if statistics is not an interest of yours, as the point remains the same: passes are more likely to come after a run play when we control for all other factors at hand. Another way to exemplify this is by looking at how the accuracy of the "xpass" variable changes based on what the preceding play type was:

Equation 6

As a brief example of what these numbers mean: "mean(pbp_ProjectPlays_PriorRush$pass)" is the average pass rate for all plays that came after a run, which is approximately 61.4%. In contrast, NFLFastR's average "xpass" rate for those plays (i.e., how often NFLFastR expected a pass in those situations) was approximately 59.5%, which is roughly 2 percentage points too low. These numbers show that "xpass" is noticeably more accurate at predicting the actual pass rate for the opening play of a drive than it is for plays that came after a pass or after a run. Specifically, the "xpass" value is too high for plays that came after a pass, and too low for plays that came after a run, which exactly aligns with the rest of the project thus far. NFLFastR overestimates the likelihood of a pass when the preceding play was a pass, and underestimates the likelihood of a pass when the prior play was a run.
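The comparison described above is just "actual pass rate minus average xpass" within each group of plays. Here is a minimal Python sketch of that check; the rows are invented for illustration and the helper name `calibration_gap` is an assumption, not something from the project's R code.

```python
from statistics import mean

# Made-up rows: (prior play type, xpass, 1 if the play was a pass else 0)
plays = [
    ("run", 0.60, 1), ("run", 0.55, 1), ("run", 0.50, 0), ("run", 0.65, 1),
    ("pass", 0.70, 1), ("pass", 0.60, 0), ("pass", 0.55, 0), ("pass", 0.75, 1),
]

def calibration_gap(rows, prior):
    """Actual pass rate minus mean xpass for plays after `prior`.

    A positive gap means xpass was too low for that group.
    """
    subset = [r for r in rows if r[0] == prior]
    return mean(r[2] for r in subset) - mean(r[1] for r in subset)

for prior in ("run", "pass"):
    print(prior, round(calibration_gap(plays, prior), 3))
```

With the article's real figures, the gap after a run is about 61.4% minus 59.5%, or roughly 1.9 percentage points.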

One natural question emerges from this conclusion. If we have established that prior play type is a significant indicator of what the ensuing play call is, couldn't the "xpass" variable become more accurate if it accounted for prior play type? Turns out, the answer is yes. I built the following linear model, which I lazily named "ADJXPass," to predict the pass rate for any given play when accounting for both the play's "xpass" via NFLFastR and what the prior play type was:

Equation 7

When comparing the accuracy of this model to the accuracy of the actual "xpass" variable, it turned out that my model had a lower average error, and an evenly matched Root Mean Square Error (RMSE):

Equation 8
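For readers who want the general shape of such a model in symbols: one plausible form, given here as an assumption because the exact specification is not written out in the text, is a linear regression of the pass indicator on "xpass" plus an indicator for a prior pass. The coefficient names and the indicator notation are introduced only for illustration.

```latex
\widehat{\text{pass}}_i = \beta_0 + \beta_1\,\text{xpass}_i + \beta_2\,\mathbb{1}[\text{prior}_i = \text{pass}],
\qquad
\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\text{pass}_i - \widehat{\text{pass}}_i\bigr)^2}
```

Under this reading, the earlier finding that passes are less common than expected after a pass would show up as a negative value of the prior-pass coefficient.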

Apologies again if I have bored anyone with these statistical terms, but the conclusion that they led us to doesn't require any such lingo. When trying to predict the chances of a pass on any play, the accuracy of our prediction will be increased if we account for the prior play type.

Does Success of the Prior Play Matter?

We have answered our primary question of whether "run vs. pass" on the prior play makes an impact on the ensuing call, but there's still more digging to do. Does it matter how successful the prior play was? To look into this, we use a similar method, helped out by NFLFastR's "Expected Points Added" (EPA) feature. For all below charts, we use NFLFastR's definition of a "success," which refers to any offensive play with positive EPA. The below table and graph display the rate at which teams switched their play call on any given play, sorted by whether the prior play was successful or not. To clarify, a "switch" is exactly what it sounds like; if Play 11 of a drive is a pass and Play 12 is a run, then Play 12 gets assigned a "switch" value of 1, whereas if both plays were passes, Play 12 gets a "switch" value of 0.

Table 9

 

Graph 10

This graph (with confidence intervals omitted because they are extremely narrow) suggests that offenses are slightly more likely to switch their play call after a failed play. But, similar to the first graph of this piece, there are lurking variables at play such as down, distance, and score. As an example, suppose a team runs for 1 yard on first-and-10, then attempts a pass on second-and-9. The team didn't necessarily switch its play call because the prior play happened to be a failure; rather, it did so because second-and-9 is generally a pass-friendly situation regardless of what happened right before it.

To control for these external variables, we use a similar method to what we did in the first section of this article. Strictly deriving from NFLFastR's "xpass," I added a new variable called "Switch Rate Over Expected," which sounds far more complicated than it actually is. As a sample calculation, if Play 1 of a drive was a pass and Play 2 had an "xpass" of 0.450, this would mean that NFLFastR's "expected switch rate" for Play 2 would be 0.550 (because it's considered a switch if, and only if, it's a run play). If Play 2 then ends up being a pass, the "switch rate over expected" will be -55.0%, because NFLFastR expected a switch to happen 55.0% of the time, but a switch did not occur. With this variable at our disposal, we can now re-create the same table and graph while controlling for outside factors:

Table 11

 

Graph 12

There are a couple of key observations here. The first and most obvious is how much taller the left column is than the right, with 95% confidence intervals not very close to one another. This leads us to a very direct conclusion: when controlling for other variables, a switch in play type is significantly more likely after a failed play than a successful one. This is also reflected in the below two-sample t-test:

Equation 13
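For anyone curious what a two-sample t-test computes, here is a short Python sketch of Welch's version, which is the default behavior of R's `t.test`. Whether the project used that default is an assumption, and the two samples below are invented "switch rate over expected" values, not real data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up per-play values for two groups of plays.
after_failure = [45.0, -55.0, 60.0, 40.0, -50.0, 55.0]
after_success = [-45.0, 50.0, -60.0, -55.0, 45.0, -40.0]

t, df = welch_t(after_failure, after_success)
print(t > 0, round(df, 1))
```

The p-value then comes from comparing `t` against a t distribution with `df` degrees of freedom. With hundreds of thousands of plays per group, even a gap of a couple of percentage points produces a very large `t`.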

But if we dig a little deeper, there's a second-level interpretation that is also noteworthy. Notice that both columns have a positive "switch rate over expected," and recall that this "expectation" is based solely on NFLFastR's "xpass" formula. If the "xpass" formula were theoretically perfect, then the average "switch rate over expected" should be close to zero, because the "expected switch rate" is meant to be an approximation of the actual rate of play switching. Clearly, this is not the case. In other words, if we strictly use "xpass" to estimate the "expected switch rate," without accounting for the prior play type in any capacity, then we end up underestimating the actual switch rate by a statistically significant margin. As a result, we reach a two-level conclusion that builds on what we saw in the first portion of this project. NFLFastR generally underestimates the chances of a switch for all plays because coaches have a tendency to deviate from their prior play call, but this phenomenon is largest when the preceding play is a failure.

To help hammer this point home, check out how the average "expected switch rate" (calculated solely based on "xpass") compares to the actual switch rate for all plays from 2006 to 2021:

Equation 14

These numbers show us that from 2006 to 2021, the average switch rate (i.e., how often the type of a play was different than the one that came before it) was 46.6%, while the average "expected switch rate" (calculated solely based on "xpass") was 44.6%, which is roughly 2 percentage points too low. This is another way to demonstrate that, just by using "xpass," NFLFastR tends to underestimate the chances of a play switch. As such, "xpass" on its own isn't as accurate as it would be when incorporating prior play type.
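The "expected switch rate" and "switch rate over expected" used throughout this section follow directly from "xpass" and the prior play type. The Python sketch below reproduces the sample calculation given earlier in the section; the function names are chosen here for illustration and are not taken from the project's R code.

```python
def expected_switch(xpass, prior_was_pass):
    """Chance of a switch implied by xpass alone.

    After a pass, a switch means a run (1 - xpass); after a run,
    a switch means a pass (xpass).
    """
    return 1.0 - xpass if prior_was_pass else xpass

def switch_oe(xpass, prior_was_pass, was_pass):
    """Switch rate over expected for one play, on a 0-100 scale."""
    switched = 1.0 if was_pass != prior_was_pass else 0.0
    return 100.0 * (switched - expected_switch(xpass, prior_was_pass))

# Play 1 was a pass and Play 2 has an xpass of 0.450.
print(round(expected_switch(0.450, True), 3))    # 0.55
print(round(switch_oe(0.450, True, True), 1))    # -55.0 (passed again)
print(round(switch_oe(0.450, True, False), 1))   # 45.0 (switched to a run)
```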

If we want to get more in depth, we can stratify by both "success vs. failure" and "pass vs. run" simultaneously to see that these conclusions still hold true even in more cherry-picked scenarios.

Graph 15

This graph excellently lays out the full landscape of our findings. Both columns with preceding run plays have noticeably higher "pass rate over expected" values than the ones with preceding pass plays, which matches the main point of this project: passes being more common after runs than after other passes, relative to expectation. But at the same time, we also see that the success of the prior play matters, as the two biggest deviations from the red line (i.e., the two situations where a play switch was most common) were both after failed plays. Prior play type is still a stronger indicator of the ensuing play call than prior play success is, but we can see that both make an impact. As a side point, it's valuable to see that "pass rate over expected" is very accurate for the opening play of a drive, which makes sense given that there's no prior play call to skew the data.

For those who prefer a more traditional stat such as first downs compared to the relatively abstract concept of EPA, the same conclusion still holds. Even after a first-down conversion—which generally entails a successful play—"pass rate over expected" is lower on first-and-10 when the preceding play was a pass.

Graph 16

Deeper Dive into Short-Yardage Situations

Aside from first-and-10 situations, another interesting group of snaps to isolate is short-yardage plays. If the offense gets stuffed on third-and-goal from the 1, should we be more likely to expect a pass on fourth down than we would be if the offense had thrown incomplete on third down? While this shouldn't surprise you if you have read any other portion of this article, the answer is yes.

For the purpose of this section, I defined "short-yardage plays" as any scrimmage play with 2 or fewer yards to the opponent's end zone, or any third-/fourth-down play with 2 or fewer yards to gain a first down. In other words, second-and-goal from the 1 counts as "short yardage," but second-and-1 from the team's own 40 does not. Additionally, when I discuss "prior play type," it only includes the prior play if the prior play was also a short-yardage situation. This means a fourth-and-1 where the preceding play was an incompletion on third-and-1 would be considered to have the prior short-yardage play as a pass, but a fourth-and-1 where the preceding play was a 9-yard completion on third-and-10 would have "N/A". This is because we specifically want to see whether the play type of a failed short-yardage play impacts the play type of the next attempt. With those parameters out of the way, check out the following data:

Table 17

 

Table 18

Because the sample size is so much smaller with short-yardage plays, we do have confidence intervals that nearly overlap here for the "prior pass" and "prior rush" columns. But even with that caveat, we still see the same ultimate answer regarding short-yardage plays that we did for the earlier portions of this article. That conclusion being: In short-yardage situations, passes are more common following a rush than they are following a pass when controlling for external factors. For the sake of formality, here's a two-sample T-test that displays the same point:

Equation 19

Feel free to disregard that if statistics is not your forte, but the point remains clear. Even when we take a situation as specific as plays with 1-2 yards to go, prior play type is still a valuable indicator of what's to come next.
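The short-yardage definition and the "N/A" rule for the prior play used in this section can be written down compactly. This is a Python sketch under stated assumptions: the argument names `down`, `ydstogo`, and `yardline_100` (yards to the opponent's end zone) and the dictionary layout are chosen for illustration, and the project's own R code may be organized differently.

```python
def is_short_yardage(down, ydstogo, yardline_100):
    """True if the play meets this section's short-yardage definition."""
    near_goal_line = yardline_100 <= 2                   # any down
    late_down_short = down in (3, 4) and ydstogo <= 2    # third/fourth down only
    return near_goal_line or late_down_short

def prior_short_yardage_type(prev_play):
    """Prior play type if that play was also short yardage, else None ("N/A")."""
    if prev_play is None:
        return None
    if is_short_yardage(prev_play["down"], prev_play["ydstogo"],
                        prev_play["yardline_100"]):
        return prev_play["play_type"]
    return None

print(is_short_yardage(2, 1, 1))    # second-and-goal from the 1 -> True
print(is_short_yardage(2, 1, 60))   # second-and-1 from own 40 -> False

# Hypothetical prior plays matching the two fourth-and-1 examples in the text.
third_and_1 = {"down": 3, "ydstogo": 1, "yardline_100": 45, "play_type": "pass"}
third_and_10 = {"down": 3, "ydstogo": 10, "yardline_100": 54, "play_type": "pass"}
print(prior_short_yardage_type(third_and_1))    # pass
print(prior_short_yardage_type(third_and_10))   # None
```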

Conclusion/Possible Sources of Error

Like any football analytics project, this shouldn't be blindly obeyed in all possible contexts. Analytics are used properly when they're helping teams make informed decisions in the moment rather than forcing coaches to disregard all other factors at play. As it pertains to this project specifically, coaching personnel makes a major impact. This project looked at the NFL as a single entity rather than dissecting any individual teams or coaches, but just because a trend exists for the league collectively doesn't mean that it's true for each individual playcaller in that league. Additionally, general scouting plays a major role too. If one team has discerned that the opposing defensive coordinator almost exclusively plays Cover-0 when backed up inside his own 5-yard line, it might exploit that aggressiveness more often than "xpass" or any similar models would predict.

Another important trait to point out is the unfortunate necessity of classifying every play as either a run or pass. Needless to say, not every play call is that black-and-white, particularly with the explosion of RPOs in recent seasons. It's not fair to label every play as a pass or run as if the categories are fully binary, but we do the best we can with the information supplied to us.


Cole Jacobson is a Next Gen Stats Researcher at the NFL Media office in Los Angeles. He played varsity sprint football as a defensive lineman at the University of Pennsylvania, where he was a 2019 graduate as a mathematics major and statistics minor. With any questions, comments, or ideas, he can be contacted via email at jacole@alumni.upenn.edu and @ColeJacobson32 on Twitter.

Comments

40 comments, Last at 09 Aug 2022, 12:16am

1 Humans and randomness

Haven't had a chance to read the whole article beyond the TLDR version (looking forward to doing so later), but the conclusion seems to be that humans are very bad at being random. We see and create patterns everywhere we go and in everything we do.

6 They aren't necessarily…

They aren't necessarily trying to be random.

Swapping after a failure strongly resembles a tit-for-tat philosophy, which is strongly advocated in game theory for iterated prisoner's dilemmas, which is basically what NFL play calls are. The slight aberration (bias towards repeating success, at about a 2% level) is an occasionally-better model than pure tit-for-tat in repeated games.

Coaches may already be calling plays, in a binary sense, in a reasonably optimal way. A rational one, anyway.

10 Yeah, regarding MJK's…

Yeah, regarding MJK's original message, the idea of humans being imperfect at randomization was really the genesis of this project. My original draft had the following paragraph in the introduction, though I cut it to keep word count slimmer:

 

"In William Poundstone’s 2014 book “Rock Breaks Scissors”, named after the notion that “Rock, Paper, Scissors” players subconsciously tend to avoid repeating the same move on two consecutive turns, he briefly discussed a parallel concept as it relates to NFL play-calling. Specifically, he hypothesized that NFL coaches had the same psychological urge to avoid play repetition, meaning that pass plays were more likely when the preceding play was not a pass (and vice versa). But while the book provides a great starting point to this discussion, it simultaneously is limited; it only considered data from the 2001-2005 seasons, and it did not discuss a methodology for how to control for our aforementioned factors such as down, distance, score, etc. As a result, I used data from NFLFastR to dive into this idea further."

 

I agree with Poundstone's notion that, in general, humans still demonstrate patterns even when trying to be random. However, I also agree with the point brought up by both Pat and Aaron Brooks G., that football play-calls are far more nuanced than something like a coin flip. If you were calling heads or tails on a coin, or "odd or even" on a die, there should be no reason that the result of your first attempt should impact your decision on the second attempt (as long as we have true certainty that neither the coin nor the die is imbalanced in any way).

But football is way different in the sense that "Play X" and "Play X + 1" aren't fully independent like that. It's reasonable that "Play X" can display a defense's strengths, weaknesses, or play-calling tendencies, to the point where that information could slightly influence the offense's choice on "Play X + 1". Ultimately, the primary point of this project was to look at whether pass plays were more common (relative to expectation) immediately following run plays than following pass plays, instead of why that necessarily was the case. But the fact that switches were more common following failed plays than successful ones helps us dig into the "why" at least somewhat: while there's somewhat of a desire to be random, there's also at least a slight inclination to stray from what wasn't succeeding on the field.

16 I'm not really convinced …

I'm not really convinced "random" is really something you would want, anyway.

I mean, if you think about it, the flipside of more changes after failure is less changes after success. And from a "rock/paper/scissors" standpoint, you get why you would want to be random. Success on play X doesn't affect success on play X+1.

But it's even more than you're saying: it's not just that play X can *affect* play X+1: it's also just correlated with it because *it's a contest* and the players are the same.

There's a line in Invincible that comes to mind: "Let's run that play where Vince runs by everybody" (or close enough). If you're running plays they can't stop... of course you keep doing it.

29 Because regardless of…

Because regardless of strengths or weaknesses, you're always going to be better off if the defense doesn't know what's coming. You could call the same run play successfully 6 times in a row but on the seventh time, the defense still has to respect the rest of the field. If they don't, that play isn't going to work as well.

35 Because regardless of…

Because regardless of strengths or weaknesses, you're always going to be better off if the defense doesn't know what's coming.

It's a relative balance, though. Even though the defense might be less prepared for it, if you're less capable of executing it, the total chance of success is lower. 

So long as the plays are working in an absolute sense (as in... you're moving the ball) there's not really a strong need to change.

37 I think all of these ideas…

I think all of these ideas align with the mentality seen in some other comments under this article. We've established that play-calling isn't truly "random" on a play-to-play basis, but is there any reason to believe that true "randomness" is the optimal strategy? As I mentioned in Comments #13 and #17, I think the next step of research here would be to answer that question. And that's certainly something I hope to be able to dive into before the 2022 regular season starts.

15 If you see that the coin is…

If you see that the coin is a trick and only has tails, should you call heads just because golly gee, you have to be random?

Not all play calling is a zero-sum game. Sometimes people are dumb and learn.

23 In fencing, there are basic…

In fencing, there are basic strategies, counters based off those strategies, and secondary counters based off the first-level counters. (etc.) It was taught to me as The Wheel. Once your opponent caught up to a given strategy, or their reactions strongly indicated they had something in mind, you would turn the wheel to the next counter. Eventually this would return to the original strategy -- hence the wheel.

The trick was *when* to turn. Because your opponent probably knew about the wheel, too. The game then became -- do I counter now that he's set up for my simple move? Or do I assume he also knows what we've been doing and is going to strategy shift at the same time I do? If so, do I go to the 3rd option, or stay at the 1st? (This didn't work at all if your opponent was too dumb to fool.) Is he really left-handed?

You can see how this devolves into madness. Generally, you stayed at what worked until it didn't, and then shifted. You would sprinkle in a few change-ups or early shifts to a counter-strategy just in case they read your timing. It was basically modified tit-for-tat. Almost regardless of the specifics of the guessing game, that is a pretty robust strategy. It's not surprising we see it here, too.

The irony, of course, is that the best person I ever beat, I beat by obstinately sticking a single strategy that kept working. Which is still tit-for-tat, because my opponent never shifted, either.

27 (This didn't work at all…

 

(This didn't work at all if your opponent was too dumb to fool.)

This is what I'm saying. A good portion of NFL play-calling is strategic. However an equally significant part is "I found the guy that sucks, keep going after him until he screws up."

If your offensive line is just shoving people around, there's no reason to stop. You see this all the time: same types of plays, over and over, defense can't stop it. Why does it work? Because the defense isn't adapting or the offense is just that creative? No, it's because they can't adapt.

If you put me into an MLB baseball game as a pitcher, I can get as creative as I want with pitch selection. Won't matter in the slightest: I'm just outmatched. Obviously, in baseball the solution's simple: don't use me. But there are 11 guys on defense in the NFL, and every fan knows their team has at least one guy who sucks. Half of playcalling is "find the guy that sucks."

There's a flipside to this too: what strategy do you use in fencing if your opponent is just better than you?

The cynical person might say "you cheat" (OK... 'change the game' is probably a better phrase). You do everything you can to try to even the playing field. You do stupid, crazy things. In football, you might just go "eff it, cover-0, all out blitz." Stupid strategy? Might be - but better than everything else, because you're just worse than your opponent.

30 There's a flipside to this…

There's a flipside to this too: what strategy do you use in fencing if your opponent is just better than you?

The cynical person might say "you cheat" (OK... 'change the game' is probably a better phrase).

You lose. 

Fencing, like all combat sports, is really influenced by style. There are guys you are markedly better than who just own you because you can't stop their thing and conversely guys who are much better than you who are utterly fooled by your one weird trick. 
(Similar to the Bills against running teams or New England in late-season games in Miami)

I had the best success by sticking to what I did reasonably well and hoping a) they didn't take me seriously and would make sloppy mistakes or b) I might luck into finding their kryptonite.

B happened occasionally and I snuck a few wins against guys I had zero business beating. A would result in a respectable score in a loss against someone who should have curb-stomped me. If C happened, you took the occasional 15-0 loss in about a minute, like I did against an Olympian.

As for cheating -- 90% of the rulebook is about detecting cheaters. Everybody cheats.

32 You lose. No, that's the…

You lose. 

No, that's the result. I'm asking what do you do. You don't just concede immediately without even playing. What you described before are strategies for what you do when you're equally matched. To bring Princess Bride back in, it's Vizzini's problem - I know that you know that I know that... etc.

But there are also strategies for when you're just better than the opponent - in that case, your goal is to keep things simple and avoid any extra factors that might cause you to lose (again with Princess Bride, is he really left-handed). And there are also strategies for when you're just worse than your opponent - in that case your goal is to introduce as many extra factors as you can. To complete the analogy, this is Westley bluffing Prince Humperdinck into surrendering.

"Cheat" didn't actually mean "really cheat" there, I meant "change the game" (which is why I said it). "The game" that's being talked about here is play selection - targeting defenders close to the line versus targeting defenders far from the line. As a defense, do you put your defenders close or do you put your defenders far. Do you have them arranged this way versus that way, etc.

'Changing the game' there would be like "OK, screw this, get the QB. Disrupt everything." You're not doing something that's not allowed by the rules, you're just not playing the same game as your opponent. Many NFL plays actually have multiple plays going on at the same time. On offense, protection vs pass routes, formations/motion versus run blocking, etc. On defense, initial alignment versus assignment, etc. If you can't actually play coverage, you shift and stop trying.

33 It depends. Guys can be good…

It depends.

Guys can be good in two different ways -- you can be really athletic or you can be really technical. As you go up the ladder, it starts to converge into guys who are both. With those guys, it doesn't matter what you do. It's like trying to outbox Superman. I usually go full-offense, just trying to score a point, because I'm not defending myself into any points. That does tend to result in losing faster, though.

(Olympic caliber fencers are playing a completely different sport. They are so long and so fast and so precise that the game becomes two blurs trying to beat the other to the same 1/2" while launching attacks from 20 feet away. As a beginner, you didn't even see what they did. You may not have noticed that they were no longer 20 feet away until it's over.)

With the guys who are humans, I usually tried to appeal to the thing they were not -- technique against fast guys, speed against technical ones. I lucked into out-techniquing a very-good technique guy once, basically via naivety. He was an extremely crafty and precise fencer who was utterly unremarkable physically, so his style was based on duping guys into overcommitting, then murdering them. An ambush hunter. I had wonky rhythm and distance for that weapon, and it utterly defeated the reflexive responses he had trained. I was too stupid to even notice his bait. I basically drunken boxed him. That might have worked again, but would not have worked a third time.

34 I usually go full-offense,…

I usually go full-offense, just trying to score a point, because I'm not defending myself into any points. That does tend to result in losing faster, though.

Yup. Same goal in football: if you're outmatched, shorten the game. Fewer plays, more variance, higher chance that dumb luck plays a role. Onsides kick to start the second half. Patriots-Eagles in 2007, games vs the Colts in the mid-2000s. Good chance all that happens is you lose faster, but a loss is a loss anyway, so who cares.

Again, this is my point: part of playcalling is not necessarily "play strategic." If the other team can beat you in coverage each time, don't bother playing coverage. And if the defense lines up and says "we know we can't cover your guys, so we're not going to try, but we're just going to force you to execute" then... you'd be an idiot to be like "ha, I will defeat your cunning scheme to let me win if I execute by trying this crazy gambit!"

Talent disparity in the NFL is about half of the game, so you'd expect it to be about half of the playcalling, too.

36 Definitely learned a good…

Definitely learned a good amount about fencing from this thread, pretty entertaining stuff. But yeah, I agree with the overall main point here: personnel matters with play-calling, significantly. I mentioned in my "Possible Sources of Error" that general scouting is something that plays a major role but wasn't accounted for in the project. And while the example I gave centered around scheme (a DC's tendency to favor Cover 0), the same concept also applies to player personnel too.

While NFLFastR's data doesn't account for the specific 11 players on the field (at least as of the time this article was published) and what they are good or bad at, I imagine that the more advanced models used by specific NFL teams can do so. So it would be fun to toy around with data of that caliber and really see how much of an impact comes from talent disparities at certain positions.

19 In the real world, yes.  Or…

If you flip a coin and call heads, and are wrong -- should you call tails next time?

In the real world, yes. Or more accurately, you should call whatever result you've observed from that coin most frequently. Because we have no way of knowing just how "fair" any particular coin is without flipping it a large number of times, we should err on the side of going with whatever tendencies it's displayed so far.

38 I like the mentality here…

I like the mentality here. If I was super bored one day and flipped a coin 100 times, and it came up heads 70 times and tails 30 times, then yeah, I'm calling heads on attempt #101. I think we all agree on that. But if I had just flipped twice and they were both heads, should that impact the choice on attempt #3? If you're at a place like a carnival or casino where you suspect the house is trying to take advantage of you, then perhaps it should. So it's a fun way to think about how real life differs from the problems in your statistics course, where it's assumed that everything is completely fair.
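One way to put numbers on the "2 flips vs. 100 flips" question is a Beta prior on the coin's heads probability. This is a rough sketch, not anything from the project: the two priors below (a "trusting" one that strongly expects a fair coin, and a "suspicious" one with no opinion, as at a carnival) are assumptions chosen purely for illustration.

```python
def posterior_heads(heads: int, tails: int,
                    prior_heads: float, prior_tails: float) -> float:
    """Posterior mean of P(heads) under a Beta(prior_heads, prior_tails) prior."""
    return (heads + prior_heads) / (heads + tails + prior_heads + prior_tails)

TRUSTING = (50.0, 50.0)   # assumed: strong prior belief that the coin is fair
SUSPICIOUS = (1.0, 1.0)   # assumed: flat prior, no idea whether the coin is fair

for label, (h, t) in {"2 flips": (2, 0), "100 flips": (70, 30)}.items():
    print(label,
          "trusting:", round(posterior_heads(h, t, *TRUSTING), 3),
          "suspicious:", round(posterior_heads(h, t, *SUSPICIOUS), 3))
```

With the trusting prior, two heads barely move the estimate off 50 percent, while 70 heads in 100 move it a lot. With the suspicious prior, even two flips shift the estimate noticeably, which matches the carnival intuition.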

2 It took me forever to figure…

It took me forever to figure out what LCI and UCI were -- the lower and upper bounds of the confidence interval. I kept thinking "LCI" was some sort of variable handling last called play.

Did you explore whether changing or not after a prior success/failure was a beneficial approach?

13 Sorry for the lack of…

Sorry for the lack of clarification there. Just a by-product of the habit of keeping variable names reasonably short when writing the code. I was hoping the black brackets representing the confidence intervals would make it clear, but perhaps the written text could've used the definition as well.
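For anyone else tripped up by the labels, here is a small sketch of how a lower (LCI) and upper (UCI) bound can be computed for a pass rate. It uses the simple normal-approximation interval for a proportion, which is an assumption on my part for illustration and not necessarily how the intervals in the graphs were produced. The counts are invented.

```python
from math import sqrt

def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Normal-approximation interval for a proportion; returns (lci, uci).

    z = 1.96 corresponds to a 95% interval.
    """
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Invented example: 580 passes on 1,000 plays
lci, uci = proportion_ci(580, 1000)
print(f"pass rate 0.580, LCI {lci:.3f}, UCI {uci:.3f}")
```

The more plays in a bucket, the narrower the gap between LCI and UCI.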

As for your second question, I did not, but I actually really like the concept. Originally I was strictly focused on diagnosing what the play-calling tendencies were at all, but I think this is a great way to dig even deeper into the project's data. Perhaps I will dive into this more over the next couple of weeks, and see if the FO editors will allow me to add some type of "P.S." to the article if the findings are notable.

3 Another important trait to…

Another important trait to point out is the unfortunate necessity of classifying every play as either a run or pass. Needless to say, not every play call is that black-and-white, particularly with the explosion of RPOs in recent seasons. It's not fair to label every play as a pass or run as if the categories are fully binary, but we do the best we can with the information supplied to us.

It's not just RPOs, though - quarterbacks have been calling audibles for years, and it's extremely common for plays to have run audibles built into them based on looks.

In which case some of the "run/pass" sequencing isn't actually the play caller, it's the defense.

4 Good point, some of the…

Good point, some of the switching after unsuccess could be due to defenses. Though I would expect the converse effect... defenses would push offenses to switch more after successful plays. (I.e. if a defense just got burned on an 8-yard run, they're more likely to immediately overplay the run and therefore get caught by play action.)

5 Yeah, I was just more…

Yeah, I was just more thinking of the (stronger) tendency to just swap. The (weaker) tendency to swap after failing would logically come from the offensive side.

Another point that I often bring up is that we are talking about humans: WRs can't run 40-yard go routes play after play after play. So in some sense assuming it's a choice is a bit naive.

12 Good point. It could be that…

Good point. It could be that defenses tend to key on whatever they just saw, which would make it smart for the offense to switch more often than random.

Looking at the success of the second play (avg EPA) could shed some light on this. If switching is the offense being overly predictable, then EPA/play after switching should be worse than EPA/play when not switching. But if switching is a response to the defense keying on what they just saw, then EPA/play after switching should be at least as good as EPA/play when not switching. (Holding all else equal.)
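The comparison described above could be sketched roughly like this. The rows are toy values invented for illustration, not real play-by-play data, and the sketch skips the "holding all else equal" part entirely; a real version would need to control for down, distance, score, and so on.

```python
from statistics import mean

# Toy rows, invented for illustration:
# (previous play type, this play type, EPA of this play)
plays = [
    ("run", "pass", 0.35),
    ("run", "run", -0.10),
    ("pass", "run", 0.05),
    ("pass", "pass", 0.20),
    ("run", "pass", -0.15),
    ("pass", "pass", -0.40),
]

def epa_by_switch(rows):
    """Return (mean EPA after switching play type, mean EPA after repeating it)."""
    switched = [epa for prev, cur, epa in rows if prev != cur]
    repeated = [epa for prev, cur, epa in rows if prev == cur]
    return mean(switched), mean(repeated)

switch_epa, repeat_epa = epa_by_switch(plays)
print(f"switch: {switch_epa:+.3f}  repeat: {repeat_epa:+.3f}")
```

If the switch number came out clearly worse on real data (after proper controls), that would point toward offenses being predictable; if it came out equal or better, that would fit the "defense keys on what it just saw" story.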

17 Great points here, and they…

Great points here, and they align with what Aaron Brooks G. had to say in Comment #2 under this article (the one that begins with "It took me forever to figure…"). So I think the next frontier of research here is what I mentioned in Comment #13: seeing if being "un-random" has actually been a good strategy for offensive play-callers. I hope and expect that I can dive into this before the upcoming regular season starts, and that I can add any new findings as some type of "addendum" to this piece.

18 Unfortunately NFLFastR doesn…

Unfortunately NFLFastR doesn't have detailed enough info on each play to actually dive into when play-action passes are most common. My assumption (which could probably be verified by a website with human charting, like PFF or SIS) is that Dan's comment is right, if we're thinking about offensive play-callers' potential logic. Offense has success in the run game --> offense anticipates defense over-correcting to stop the run --> offense chooses to go with a play-action pass.

On the topic of HitchikersPie's comment, if we were to strictly base it off of my project's data, the optimal time for a play-action pass would theoretically be right after a failed pass. As we can see on the second-to-last graph before the "Short Yardage" section, "Pass Rate Over Expected" is lowest after a failed pass. This means that the defense should be most likely to expect a run play after a failed pass, making the time right for the offense to strike with play action.

However, there's a reason I used the word "theoretically" there, as it's definitely unfair to come to such a conclusion when I haven't done any specific studies of play-action attempts at all. Some good public research on the effectiveness of play action is already out there, though (https://www.footballoutsiders.com/stat-analysis/2018/further-research-play-action-passing, https://fivethirtyeight.com/features/can-nfl-coaches-overuse-play-action-they-havent-yet/).

25 My assumption (which could…

My assumption (which could probably be verified by a website with human charting, like PFF or SIS) is that Dan's comment is right, if we're thinking about offensive play-callers' potential logic. Offense has success in the run game --> offense anticipates defense over-correcting to stop the run --> offense chooses to go with a play-action pass.

The problem here is that you're thinking about it from a blind game perspective. Like rock/paper/scissors or odds/evens, where you have no idea what your opponent is going to do.

And football isn't a blind game: when you run a play, you see their reaction, and you also see how they're lining up pre-play. Obviously it's not a fully visible game - some of that information is muddled and hidden. But it's not purely based on an offense's beliefs. They can partly see the results of their actions. Same for the defense.

From a "math theory" perspective, when you've got an additional observable like that, you can get a phase lag - a delay - between inputs and outputs, and get correlations that don't make sense - that are in fact totally reversed - because you're missing the confounding variable. Anyone who's programmed a PID controller might know what I'm talking about here - you'd think thermostats would be easy, like "if cold, turn temperature up. If hot, turn temperature down." Go ahead and try to do that and watch the entire system go crazy.
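The thermostat point is easy to see in a toy simulation. To be clear, this is not a PID controller, just the naive "if cold, heat; if hot, stop" rule with a lag added; every number here (starting temperature, target, half-degree steps, a six-step sensor delay) is an assumption made up for illustration.

```python
def simulate(delay: int, steps: int = 200, target: float = 20.0) -> float:
    """Run a naive on/off thermostat that reads its sensor `delay` steps late.

    Returns the largest distance from the target seen in the second half
    of the run, after the start-up phase has passed.
    """
    temps = [15.0]
    for t in range(steps):
        seen = temps[max(0, t - delay)]   # stale reading from `delay` steps ago
        heater_on = seen < target         # "if cold, heat; if hot, stop"
        temps.append(temps[-1] + (0.5 if heater_on else -0.5))
    return max(abs(x - target) for x in temps[steps // 2:])

print("no lag:    ", simulate(delay=0))
print("6-step lag:", simulate(delay=6))
```

With no lag the temperature hugs the target; with a lag, the same rule keeps overshooting in both directions, because every decision is made on old information. Anyone looking only at "heater on" versus "room temperature" in the lagged case would see the heater running while the room is already too hot.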

OK, that was math wordy: but the point here is that you can actually get any correlation you want in this case because you don't know what information the offense is using to make the decisions. If the defense lines up favorable to play action... you just run play action. You don't go "hey, we need to have run success first!" You just do it. If a corner overplays a route you didn't run on a previous play, you just take advantage of it. You don't actually have to run the route.

This is the same criticism I've given every time for those play-action analyses: you don't know why they work, so you can't analyze when they're used. You have to start using information on what the defenses are doing and what they have been doing. That's the big thing missing from NFL play-calling analysis. You have to start using information on the defense, their alignment, and their reaction. (Which you don't have, obviously).

The entire existence of RPOs demands this, of course.

31 This is why I really like…

This is why I really like Ben Muth's column and I wish he wrote more and longer articles.

Ben discusses both what the line was doing and makes a pretty solid estimate of what they were trying to do, and therefore can figure out who was responsible for a bust or who did more than they needed to. It's super educational, because I can't do that.

40 I really like the points…

I really like the points made in Comment #25, and on a more philosophical scale, I think they are indicative of the state of analytics in football (and really, sports in general). Advanced metrics do a really good job of telling us what is going on -- in Pat's example, the success of play action. And while I think raw data is gradually doing a better job of explaining the why (e.g., new player tracking data that can see how far certain players moved during a play-action fake), it's always going to take some raw film study, and communication with players and coaches, to fully understand the why.

Too many people think that embracing analytics and merely watching the game are completely opposite viewpoints, but the reality is that if you're using either one correctly, you're pairing it with the other, instead of trying to prove to yourself that you only need one and can survive without the other. Sean Clement wrote a great piece about this for Football Outsiders around a year ago (https://www.footballoutsiders.com/ramblings/2021/pigskin-principia). Analytics should generally be supported by what shows up on tape, and play-calling is no exception. For example, a QB pulls the ball away from an Inside Zone handoff toward the field side and instead throws to a slant-flat combo on the boundary side, because he sees that the boundary-side LB crashed to support the run, leaving 2-on-2, no-help man coverage on the slant-flat combo. NFLFastR's data just says "pass", but the film shows who the conflict defender was and why the QB made the choice he did.

39 Good memory, yeah that was…

Good memory, yeah that was me. Ultimately it helps defenses somewhat, as there's no situation where new information is a bad thing. But still, in practice, traits like down, distance, and score will always be the main factors in how a defense chooses to play; you're gonna load the box on 3rd and 1, and you're gonna play with a heavy cushion on 3rd and 15. Still, if a project like this has even the slightest impact on how coordinators approach the game, that's worth something.