by William Krasker
Throughout the 2004 season, I analyzed coaching decisions at footballcommentary.com [1]. In this article I re-examine a couple of those decisions, and also analyze some decisions that I originally overlooked or lacked the tools to evaluate.
With 3:32 remaining in the Monday night game between Minnesota and Philadelphia [2] in Week 2, the Vikings scored a touchdown to close the deficit to 24-15 before the try. They had one timeout. Minnesota kicked the extra point to make the score 24-16, and then kicked off deep to the Eagles. I believe that every NFL coach would do the same thing. But is it the correct strategy?
The starting point for the analysis is the following result: If you score a touchdown to trail by 9 points prior to the try, and you have time for at most one more score, then it doesn't matter whether you kick the extra point or go for two. Your probability of winning is the same either way. I showed this formally in the appendix [3] to an article [4] that was reprinted [5] here at Football Outsiders last year, but the result is intuitively clear. You'll have to attempt (and make) a two-point conversion in order to tie, so it doesn't matter whether you go for it now or later. The only exception is if you have reason to believe that your success probability will be different later in the game. (In addition, if your success probability on a two-point conversion exceeds 0.5, you should go for two after both touchdowns.)
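A minimal sketch of why the two choices tie, under simplifying assumptions that are mine rather than taken from the appendix: let $s$ be the success probability on a two-point conversion, $x$ the success probability on an extra point, $t$ the probability of scoring exactly one more touchdown, and $w$ the probability of winning in overtime. All four symbols are hypothetical labels introduced here. With only one more score available, any failed try leaves a deficit that a single touchdown cannot erase, so each strategy wins only when both tries succeed:

```latex
\begin{align*}
P(\text{win} \mid \text{kick now, go for two later}) &= x \cdot t \cdot s \cdot w, \\
P(\text{win} \mid \text{go for two now, kick later}) &= s \cdot t \cdot x \cdot w.
\end{align*}
```

The two products contain the same factors, so the order of the tries does not matter as long as $s$ is the same at both points in the game.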
Of course, in the case we're considering, the key assumption for that result is false. It's possible for Minnesota to score twice. The problem with kicking the extra point at 24-15 is that the Vikings don't learn whether they need one more score or two, and if it turns out that they need two more scores, they probably won't find out until it's too late. Minnesota should attempt a two-point conversion, so they can tailor their strategy to the outcome of the try. If they're successful on the two-point conversion, they should kick off deep, but if the two-point conversion fails, Minnesota should attempt an onside kick. If the onside kick succeeds, Minnesota can win with a touchdown and a field goal.
One of the more heavily debated decisions of the season was Philadelphia's decision to kick onside with 1:48 left in the Super Bowl [6], after scoring to close to within three points. The Eagles had two timeouts. In my original analysis [7] of that game I concluded that Philadelphia made the correct decision, but the numbers that went into that analysis were very imprecise. In the spring I built a detailed model for the two-minute drill [8], and as an application, I re-examined Philadelphia's decision using better estimates for some of the win probabilities. However, the analysis still used a questionable estimate of the likelihood of recovering an onside kick, and failed to account for an effect called "convexity."
The basic facts are clear, and I'll repeat them here for convenience. If New England gets possession and makes a first down, the game is over. Therefore, if the Eagles kick deep, they have to use both of their timeouts and force a three-and-out. Conditional on a three-and-out, the central case would be for the Patriots to return the kickoff to their own 27-yard line, gain 5 yards on their first three downs, and punt the ball a net 38 yards. The Eagles would then get the ball at their own 30-yard line with about 0:45 remaining and no timeouts. From there, according to my model for the two-minute drill, Philadelphia's win probability is 0.109.
However, the 30-yard line is only the central case. There is substantial variability around that yard line, because of the variability of New England's kickoff return and their net punt. That variability works in Philadelphia's favor. The intuition is that Philadelphia would be helped by starting at the 40-yard line more than they would be hurt by starting at the 20-yard line. This effect is called convexity. (I first invoked it in a football context in my analysis [9] of the Indianapolis-Detroit game in Week 12.) Convexity effects are usually small enough to be ignored, but not near the end of the game. The effect of the convexity in this case is to raise Philadelphia's win probability if they kick deep by 0.019, to 0.128.
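The convexity effect can be illustrated with a toy calculation. The sketch below is not the two-minute-drill model: the exponential curve, its growth rate, and the three-point distribution of starting field position are all hypothetical choices made only to show the mechanism. The single number taken from the article is the 0.109 win probability at the 30-yard line.

```python
import math

# HYPOTHETICAL toy curve: win probability as a convex function of the
# starting yard line. Only the 0.109 value at the 30 comes from the article;
# the exponential shape and the growth rate are assumptions.
def toy_win_prob(yard_line: float, growth: float = 0.05) -> float:
    return 0.109 * math.exp(growth * (yard_line - 30.0))

# HYPOTHETICAL distribution of starting field position, centered on the 30.
start_distribution = {20.0: 0.25, 30.0: 0.50, 40.0: 0.25}

# Win probability evaluated at the average yard line (the "central case").
mean_yard_line = sum(y * w for y, w in start_distribution.items())
central_case = toy_win_prob(mean_yard_line)

# Average win probability over the whole distribution.
expected_value = sum(w * toy_win_prob(y) for y, w in start_distribution.items())

# For a convex curve this difference is positive (Jensen's inequality).
convexity_gain = expected_value - central_case

print(f"central case:   {central_case:.3f}")
print(f"expected value: {expected_value:.3f}")
print(f"convexity gain: {convexity_gain:.3f}")
```

Because the curve bends upward, the gain from starting ten yards closer outweighs the loss from starting ten yards farther back, so averaging over the distribution gives a higher number than plugging in the average yard line.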
If the Eagles attempt an onside kick and are successful, they get the ball at about their own 42-yard line, and have nearly 1:48 and two timeouts. In this case, the model for the two-minute drill says that Philadelphia's win probability is 0.306. (The intuition for such a large win probability is that the Eagles are not only in much better position for a field goal that would send the game to overtime, but also have time for a touchdown that would win outright.) Even if the Patriots recover the onside kick, Philadelphia has a chance if they can force a three-and-out. Calculations using the model for the two-minute drill show that New England would need to have less than a yard to go on 4th down in order for it to make sense to go for it, and that a field-goal attempt would be a mistake in any case. So conditional on the Patriots failing to make a first down on their first three plays, the overwhelmingly likely outcome is a pooch kick, following which Philadelphia takes over with about 0:45 left. Instead of using Philadelphia's average starting field position and correcting for convexity, it's better in this case to use the actual distribution for starting field position following a pooch kick. Combining that distribution with the model for the two-minute drill, I find that Philadelphia's win probability if the onside kick fails but they force a three-and-out is 0.046.
This leaves two key probabilities unspecified. Let p denote the probability of recovering the onside kick, and let q denote Philadelphia's probability of forcing a three-and-out. Then if the Eagles kick onside, their win probability is
$$p \cdot 0.306 + (1 - p) \cdot q \cdot 0.046,$$
whereas if they kick deep their win probability is $q \cdot 0.128$.
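The comparison between the two expressions can be sketched in a few lines of Python. The three win probabilities are the article's; the function names and the sample values of q are hypothetical placeholders chosen for illustration, not the author's estimates. Setting the two expressions equal and solving for p gives the break-even recovery probability p = q(0.128 − 0.046) / (0.306 − 0.046q).

```python
# Win probabilities quoted in the article (from the two-minute-drill model).
WIN_AFTER_RECOVERY = 0.306       # Eagles recover the onside kick
WIN_AFTER_FAILED_ONSIDE = 0.046  # onside kick fails, then a three-and-out
WIN_AFTER_DEEP_KICK = 0.128      # deep kick, then a three-and-out

def onside_win_prob(p: float, q: float) -> float:
    """Win probability if the Eagles kick onside."""
    return p * WIN_AFTER_RECOVERY + (1.0 - p) * q * WIN_AFTER_FAILED_ONSIDE

def deep_win_prob(q: float) -> float:
    """Win probability if the Eagles kick deep."""
    return q * WIN_AFTER_DEEP_KICK

def break_even_recovery_prob(q: float) -> float:
    """Value of p at which kicking onside and kicking deep are equally good."""
    return q * (WIN_AFTER_DEEP_KICK - WIN_AFTER_FAILED_ONSIDE) / (
        WIN_AFTER_RECOVERY - q * WIN_AFTER_FAILED_ONSIDE
    )

# HYPOTHETICAL three-and-out probabilities, for illustration only.
for q in (0.3, 0.5, 0.7):
    p_star = break_even_recovery_prob(q)
    print(f"q = {q:.1f}: onside kick is better when p exceeds {p_star:.3f}")
```

The break-even p rises with q: the more likely the defense is to force a three-and-out, the more attractive the deep kick becomes.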
Philadelphia's probability of forcing a three-and-out when it's clear that New England is going to run the ball is certainly larger than the usual probability. I'm still content to use
During the Monday Night Football broadcast in Week 1, ABC displayed a graphic that said that over the previous five seasons, 24% of anticipated onside kicks (and 61% of surprise onside kicks) were recovered by the kicking team. Since I couldn't vouch for the accuracy of those numbers, and 24% felt high to me, I used
In addition, it's not always clear which onside kicks were anticipated. Often the Gamebook contains language like "onside kick formation." However, no such language appears in the Gamebook [11] for Super Bowl XXXIX. Therefore, one has to look at the situation to try to determine if the receiving team was anticipating an onside kick.
Looking only at kickoffs in the 4th quarter, I came up with 133 presumed onside kicks that were, or at least should have been, anticipated by the receiving team. Of those, 20 (or 15%) were recovered by the kicking team. (One implication is that the 24% figure reported by ABC can't be right. It would imply that about 30% of anticipated onside kicks were recovered by the kicking team from 1999 through 2001, which is completely implausible.)
Of course, there are different degrees of "anticipated." So I also tried restricting the sample to cases in which (barring a turnover) it's impossible for the kicking team to win the game without recovering the kickoff. (For example, if there is less than 2:00 left and the kicking team has no timeouts.) This eliminates any uncertainty about what the kicking team is attempting, or what the receiving team anticipates. However, it's an extremely stringent filter. It excludes Philadelphia's Super Bowl onside kick, for example. In fact, one can argue that this filter excludes all the interesting cases, because it excludes every case in which there is a decision to be made. It also results in a hopelessly small sample: only 56 onside kicks, of which 7 (or 12.5%) were successful.
Even with a sample of 133 attempts, the sampling error is large. The standard deviation is about 0.03. Nevertheless, to estimate the success probability on an anticipated onside kick, we really have nothing to go on other than the data. Based on the data I collected, I intend to use
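The quoted sampling error can be checked with the usual standard-error formula for a sample proportion, sqrt(r(1 − r)/n). That this is the formula behind the 0.03 figure is an assumption on my part, although it does reproduce the number; the function name is hypothetical.

```python
import math

def proportion_standard_error(successes: int, attempts: int) -> float:
    """Binomial standard error of an observed success rate."""
    rate = successes / attempts
    return math.sqrt(rate * (1.0 - rate) / attempts)

# Full sample of presumed anticipated onside kicks: 20 recovered out of 133.
se_all = proportion_standard_error(20, 133)

# Stringent filter: 7 recovered out of 56.
se_strict = proportion_standard_error(7, 56)

print(f"133 attempts: rate {20 / 133:.3f}, standard error {se_all:.3f}")
print(f" 56 attempts: rate {7 / 56:.3f}, standard error {se_strict:.3f}")
```

The smaller sample has a noticeably larger standard error, which is consistent with the article's description of the 56-kick sample as too small to be useful.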
I have one final comment. Prior to 2003, if a kickoff went out of bounds before traveling 20 yards, there was a 5-yard penalty and a re-kick. But beginning with 2003, there is no re-kick in the last 5:00; the kick is treated like any other kickoff that goes OB. Strictly speaking, then, including years prior to 2003 introduces a bias. But the bias should be small, and in any case, there is so little data that one has no choice but to use every year available. (None of the successful onside kicks in my 2002 sample benefited from the old rule.)
With just 0:13 left in the Monday night game between Dallas and Washington [12] in Week 3, the Redskins trailed 21-18 and had the ball at their own 33-yard line. On the next play, quarterback Mark Brunell threw to Rod Gardner for a 46-yard completion. But Washington was out of timeouts, and the game ended when Gardner was unable to get out of bounds.
The television commentators pointed out repeatedly that the Redskins had squandered timeouts. Twice during Washington's second possession of the 3rd quarter, the Redskins were unable to get off a play in the time allowed. In both cases, Washington averted a delay-of-game penalty by calling timeout just before the play clock expired.
The first occurrence came with 5:38 left in the 3rd quarter, with Washington trailing 14-3 and facing 4th-and-1 at the Dallas 41-yard line. The second came with 3:42 left in the quarter, when the Redskins had 1st-and-10 at the Dallas 16-yard line.
In both cases, Washington's indisputable error was the poor execution that prevented them from getting off a play, and forced them to choose between two unpleasant alternatives: incurring a five-yard penalty, or expending a timeout. Were they right to call timeout?
To answer that question, we need to know the value of a timeout at any particular point in the game. In the spring I developed a model [13] for the most important component: the "clock-management" value, which is the value of the option to stop the game clock to conserve time. For the most part, the clock-management value of a timeout is small -- perhaps smaller than one would expect. To understand why, consider Washington's situation, trailing by 11 points with 5:38 left in the 3rd quarter. In many of the scenarios that could unfold from that point, Washington will still trail by more than one score late in the game. An extra timeout will then be helpful, but won't increase their probability of winning by very much. In some other scenarios Washington will take the lead, and won't even use a timeout for clock management. Even among scenarios in which Washington trails by one score late in the game, the timing has to be just right in order for a timeout to have a large impact. For example, along the scenario that actually occurred, if the Cowboys fail to make a first down on their final possession, the Redskins get the ball back with plenty of time to score; whereas if Dallas makes two first downs, they can run out the clock even if Washington has a timeout. Of course, as it happened, Dallas made exactly one first down on that possession, so that an extra timeout would have enabled the Redskins to gain possession with 1:01 left rather than 0:21. This is actually the timing for which the extra forty seconds have the maximum impact, and it raises Washington's win probability by 0.1, from 0.03 to 0.13. But if we take a weighted average of the value of a timeout over all the possible scenarios, weighted by the likelihood of occurrence, we get something much smaller.
According to the model, the clock-management value of Washington's first timeout, when they used it with 5:38 left in the 3rd quarter, was 0.007. By this I mean that Washington's probability of winning the game at that point is 0.007 higher if they have three timeouts rather than two. We have to compare this to the reduction in Washington's win probability if they incur a five-yard penalty. According to the footballcommentary.com Dynamic Programming Model [14], going for it on 4th-and-6 from the Dallas 46-yard line is as good as punting, so we can assume that the Redskins go for it whether they incur the penalty or not. Washington's win probability is about 0.07 higher if they make the first down than if they are stopped. Since the probability of making the first down on 4th-and-1 is about 0.7, compared to about 0.38 on 4th-and-6, a delay-of-game penalty lowers Washington's win probability by about $0.07 \times (0.7 - 0.38) \approx 0.022$. That is about three times the 0.007 clock-management value of the timeout, so on these numbers calling timeout was the better choice.
It's much harder to determine if the Redskins made the right choice when they used their second timeout. The problem isn't finding the clock-management value of the timeout. The clock-management value of Washington's second timeout, when they used it with 3:42 remaining in the 3rd quarter, was about 0.01. The problem is estimating the cost of a five-yard penalty when it's 1st-and-10. Here's the best I can do. A five-yard penalty reduces Washington's probability of scoring a touchdown on the possession by some amount, say δ, and increases their probability of settling for a field-goal attempt by about the same amount. It also increases the expected length of a field-goal attempt by about five yards. Calculations using the footballcommentary.com Dynamic Programming Model [14] then imply that a five-yard penalty would reduce Washington's win probability by 0.15δ. So it's right for Washington to call timeout if δ exceeds $0.01 / 0.15 \approx 0.067$.
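The two timeout-versus-penalty comparisons in the preceding paragraphs reduce to simple arithmetic, sketched below. All numeric inputs are the figures quoted in the text; the function and variable names are hypothetical, and the sketch considers only the clock-management value of a timeout, as the article does.

```python
def fourth_down_penalty_cost(win_gain_if_converted: float,
                             p_convert_without_penalty: float,
                             p_convert_with_penalty: float) -> float:
    """Win probability lost when a penalty lowers the conversion probability."""
    return win_gain_if_converted * (
        p_convert_without_penalty - p_convert_with_penalty
    )

# First timeout: 4th-and-1 would become 4th-and-6.
first_timeout_value = 0.007
first_penalty_cost = fourth_down_penalty_cost(0.07, 0.70, 0.38)

# Second timeout: the penalty costs 0.15 * delta, the timeout is worth 0.01.
second_timeout_value = 0.01
cost_per_unit_delta = 0.15
break_even_delta = second_timeout_value / cost_per_unit_delta

print(f"first penalty cost:  {first_penalty_cost:.4f} "
      f"vs timeout value {first_timeout_value}")
print(f"second timeout is right if delta exceeds {break_even_delta:.3f}")
```

In the first case the penalty costs more than the timeout is worth. In the second case the answer depends on whether five yards lowers the touchdown probability on the drive by more than the break-even δ.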
With 0:10 remaining in the first half of the Divisional Round playoff game between Minnesota and Philadelphia [15], the Eagles led 21-7 and had 2nd-and-goal at the Minnesota 9-yard line. Though the Eagles were out of timeouts, they elected to run another play rather than kick an immediate field goal. Donovan McNabb threw a short pass over the middle to Dorsey Levens, and when Levens was tackled immediately for a gain of 5 yards, time expired and Philadelphia came away with no points.
Later that same afternoon, in the game between Indianapolis and New England [16], the Colts trailed 6-0 near the end of the first half, but had the ball deep in New England territory. As I explained in my analysis [17] of that game, the Colts could have called timeout with 0:17 left, and if they had done so, they could have run two more plays instead of one before settling for a field goal. I interpreted their failure to call timeout as a clear error, but the analysis is actually more complicated, and comes down to the following hypothetical question. If it gets to 3rd-and-goal at the 5-yard line with 0:10 left, and Indianapolis has no timeouts, should they kick an immediate field goal or run another play? If it's right to run another play, the Colts erred by not calling timeout at 0:17. Otherwise, it doesn't matter.
The decisions Philadelphia and Indianapolis faced require a coach to weigh the benefits of a potential touchdown against the possibility of a sack, an interception, or (as in McNabb's case) a lapse in judgment that prevents a field-goal attempt. Notice that the probability of an incomplete pass is irrelevant, since an incompletion sets up a field-goal attempt. And a penalty just makes the subsequent field-goal attempt a little longer or shorter.
I examined a few such situations on an ad hoc basis during the 2004 season, beginning with my analysis [18] of the game between the Giants and Minnesota in Week 8. Here I'd like to do a more general analysis.
If the offense attempts to run another play, let pTD denote the probability of scoring a touchdown, and let p0 denote the probability of coming away without even a field-goal attempt (due to a sack or interception, for example). Let V7, V3, and V0 denote the offense's win probability if they get a touchdown, a field goal, or no additional points, respectively. These can be determined using a model like the footballcommentary.com Dynamic Programming Model [14]. Let pFG denote the success probability on a field goal from the current field position, so that the win probability associated with a field-goal attempt is
$$V_{FG} = p_{FG} V_3 + (1 - p_{FG}) V_0.$$
It's correct to run another play rather than attempt an immediate field goal if doing so gives a higher probability of winning the game:
$$p_{TD} V_7 + p_0 V_0 + (1 - p_{TD} - p_0) V_{FG} > V_{FG}.$$
With a little algebra, this inequality and the preceding equation can be reduced to the more convenient condition
$$\frac{p_{TD}}{p_{TD} + p_0} > p_{FG} \, \frac{V_3 - V_0}{V_7 - V_0}.$$
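The "little algebra" can be spelled out as follows. This is a reconstruction from the definitions above, not the author's own derivation, and the final step assumes that $V_7 > V_0$ and $p_{TD} + p_0 > 0$ so that dividing preserves the direction of the inequality.

```latex
\begin{align*}
p_{TD} V_7 + p_0 V_0 + (1 - p_{TD} - p_0) V_{FG} &> V_{FG} \\
p_{TD} V_7 + p_0 V_0 &> (p_{TD} + p_0) V_{FG} \\
p_{TD} V_7 + p_0 V_0 &> (p_{TD} + p_0) \bigl[ V_0 + p_{FG} (V_3 - V_0) \bigr] \\
p_{TD} (V_7 - V_0) &> (p_{TD} + p_0) \, p_{FG} (V_3 - V_0) \\
\frac{p_{TD}}{p_{TD} + p_0} &> p_{FG} \, \frac{V_3 - V_0}{V_7 - V_0}
\end{align*}
```

The third line substitutes $V_{FG} = p_{FG} V_3 + (1 - p_{FG}) V_0$, rewritten as $V_0 + p_{FG}(V_3 - V_0)$; the fourth subtracts $(p_{TD} + p_0) V_0$ from both sides.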
The factor
Since pFG is about 0.94 when the line of scrimmage is the 9-yard line, the right side of the inequality is about 0.5 in Philadelphia's situation. This implies that Philadelphia should run another play if the probability pTD of scoring a touchdown on that play exceeds the probability p0 of coming away without even a field-goal attempt.
Unfortunately, it's hard to learn about either pTD or p0 from data. For example, one is tempted to try to estimate pTD by looking at the results of 3rd-and-goal plays from the 9-yard line. However, those situations are quite different from the one Philadelphia faced. On a typical 3rd-and-goal play from the 9-yard line, a 5-yard gain is better than an incomplete pass. For Philadelphia, it was equivalent to an interception. In Philadelphia's situation, you either throw into the end zone or throw the ball away; and since the defenders know that, the probability of a touchdown becomes lower and the probability of an interception becomes higher. Still, I believe pTD exceeds p0 in that situation, so I'm content with the decision to run another play. But I'd be more confident about the correctness of the decision if Philadelphia had been trailing in the game.
For Indianapolis's hypothetical situation the conclusion is easier, because the Colts trail and are closer to the goal line. Even if pTD were only 0.2 from the 5-yard line, p0 would have to exceed 0.3 for an immediate field goal to be correct. That's not plausible. So I'll stick with my original conclusion: the Colts should have called timeout with 0:17 left in the half, and made use of the extra chance for a touchdown.
Analogous situations can also arise at the end of the 4th quarter, although they are only interesting when the offense trails by 3 points. The offense might then have to decide whether to attempt a tying field goal immediately, or run another play and risk a game-ending sack or interception. In this case, the condition for running another play becomes
$$\frac{p_{TD}}{p_{TD} + p_0} > 0.5 \, p_{FG}.$$
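The decision rule from this section can be sketched as a small Python helper. The function names are hypothetical. Two inputs below are my own assumptions rather than numbers quoted in the article: the 0.4 threshold for Indianapolis is back-solved from the statement that p0 would have to exceed 0.3 when pTD is 0.2, and the end-of-game values V7 = 1, V3 = 0.5, V0 = 0 are the natural reading of the 0.5 factor (a tying field goal leads to an even overtime).

```python
def threshold(p_fg: float, v7: float, v3: float, v0: float) -> float:
    """Right side of the condition: p_FG * (V3 - V0) / (V7 - V0)."""
    return p_fg * (v3 - v0) / (v7 - v0)

def should_run_play(p_td: float, p0: float, rhs: float) -> bool:
    """True if running another play beats an immediate field-goal attempt."""
    return p_td / (p_td + p0) > rhs

def break_even_p0(p_td: float, rhs: float) -> float:
    """Value of p0 above which the immediate field goal becomes correct."""
    return p_td * (1.0 - rhs) / rhs

def run_play_beats_kick(p_td, p0, p_fg, v7, v3, v0) -> bool:
    """Direct comparison of win probabilities, before any algebra."""
    v_fg = p_fg * v3 + (1.0 - p_fg) * v0
    run_value = p_td * v7 + p0 * v0 + (1.0 - p_td - p0) * v_fg
    return run_value > v_fg

# Philadelphia: the right side is about 0.5, so the break-even p0 equals pTD.
print(break_even_p0(0.2, 0.5))

# Indianapolis: ASSUMED threshold of 0.4, inferred from the text.
print(break_even_p0(0.2, 0.4))

# End of the 4th quarter, trailing by 3: ASSUMED V7 = 1, V3 = 0.5, V0 = 0.
print(threshold(0.94, 1.0, 0.5, 0.0))
```

The last helper evaluates the original inequality directly, which gives a way to check that the reduced condition agrees with it for any particular set of inputs.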
Links:
[1] http://www.footballcommentary.com/index.htm
[2] http://www.nfl.com/gamecenter/recap/NFL_20040920_MIN@PHI
[3] http://www.footballcommentary.com/coltcomeback.pdf
[4] http://www.footballcommentary.com/decisions2003.htm
[5] http://www.footballoutsiders.com/ramblings.php?p=208
[6] http://www.nfl.com/gamecenter/recap/NFL_20050206_NE@PHI
[7] http://www.footballcommentary.com/analysis2004sb39.htm
[8] http://www.footballcommentary.com/twomindrill.htm
[9] http://www.footballcommentary.com/analysis2004week12.htm#indiatdet
[10] http://www.nfl.com/gamecenter/gamebook/NFL_20040926_TB@OAK
[11] http://www.nfl.com/gamecenter/gamebook/NFL_20050206_NE@PHI
[12] http://www.nfl.com/gamecenter/recap/NFL_20040927_DAL@WAS
[13] http://www.footballcommentary.com/timeouts.htm
[14] http://www.footballcommentary.com/dynamicprogramming.htm
[15] http://www.nfl.com/gamecenter/recap/NFL_20050116_MIN@PHI
[16] http://www.nfl.com/gamecenter/recap/NFL_20050116_IND@NE
[17] http://www.footballcommentary.com/analysis2004wcanddiv.htm#halftime
[18] http://www.footballcommentary.com/analysis2004week8.htm#nygatmn