26 Jun 2006

Quarterbacks and Fourth Quarter Comebacks

Guest column by Jason McKinley

Competitive balance defines the NFL. Most games are still in doubt in the fourth quarter. Since 1996, 1,474 out of 2,598 regular season and postseason games have featured a team trailing by eight points or less in possession of the ball in the fourth quarter. In 603 of those 1,474 games, the trailing team won. Therefore, nearly a quarter of all victories in the last decade have been the result of late and dramatic rallies.
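The arithmetic behind "nearly a quarter" can be checked directly from the three counts quoted above. This is only a quick sketch; the variable names are illustrative, and it treats every game as producing one victory (ties are ignored).

```python
# Counts quoted in the paragraph above (1996 onward, regular season and postseason).
total_games = 2598
close_games = 1474     # a team trailing by eight or less had the ball in the fourth quarter
comeback_wins = 603    # the trailing team went on to win

share_close = close_games / total_games        # how often a game is still in doubt
share_rallied = comeback_wins / close_games    # how often the trailing team wins
share_of_all_wins = comeback_wins / total_games

print(f"{share_close:.1%} of games were within one score in the fourth quarter")
print(f"{share_rallied:.1%} of those games were won by the trailing team")
print(f"{share_of_all_wins:.1%} of all games ended in a comeback win")
```

The last figure comes out at roughly 23 percent, which is the "nearly a quarter" in the text.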

Quarterbacks are associated most strongly with comebacks. How many times was it said that John Elway "willed his team to victory" following a come-from-behind rally? Johnny Unitas is often credited with the creation of the two-minute offense. In his Hall of Fame career, Joe Montana overcame multiple fourth-quarter deficits. In fact, Joe Montana overcame multiple fourth-quarter deficits in the postseason alone. Hell, Joe Montana overcame multiple fourth-quarter deficits in the postseason even if you only count his two seasons with the Chiefs. Today, quarterbacks like Tom Brady and Brett Favre are discussed in heroic terms mainly because they're able to pull out victories in situations where mere mortals would surely fail.

Obviously, many comeback attempts prove futile. While 603 games since 1996 have featured come-from-behind wins, another 1,322 games have ended with the close trailer still behind when the final gun sounded. Any instance in which a team had possession of the ball at some point in the fourth quarter and was trailing by eight points or less was considered for the study. This naturally would include any successful comeback regardless of the largest deficit faced – one can't complete a twenty-point comeback without getting the score under 9 points at some time. It also gives a reasonable cut-off for failed comebacks: where one drive could potentially change the lead or send the game to overtime. On occasion a team will get the ball very early in the fourth quarter, then again later, and (very rarely) a third or fourth time, and be within one score every time. In this study, that is counted as only one failed game opportunity.
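The counting rule described above can be sketched in a few lines. This is a minimal illustration, not the code used for the study: the `Drive` record, its field names, and the function names are all assumptions, and ties are left out for simplicity.

```python
from dataclasses import dataclass

@dataclass
class Drive:
    in_fourth_quarter: bool   # the team held the ball at some point in the fourth quarter
    deficit: int              # points the team trailed by on that possession

def is_comeback_opportunity(drives, max_deficit=8):
    """True if any fourth-quarter possession came while trailing by 1..max_deficit points."""
    return any(d.in_fourth_quarter and 1 <= d.deficit <= max_deficit for d in drives)

def classify_game(drives, won):
    """Return 'comeback', 'failed', or None for one team's game.

    A game counts once, no matter how many qualifying possessions it contains.
    """
    if not is_comeback_opportunity(drives):
        return None
    return "comeback" if won else "failed"

# Three one-score possessions in a loss still count as a single failed opportunity.
print(classify_game([Drive(True, 3), Drive(True, 7), Drive(True, 3)], won=False))
```

A twenty-point rally is captured automatically, because at least one of its possessions must begin with the deficit at eight or less.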

While quarterbacks get accolades for the comebacks, head coaches often get blamed for failures. Why do we think of Tom Brady and John Elway as responsible for leading come-from-behind victories, but not Bill Belichick, Mike Shanahan, and Dan Reeves? Is a quarterback more responsible for a comeback, or is it the head coach? Is a head coach more responsible for holding a small lead, or is it the quarterback?

We can start to figure this out by looking at the individual performance of quarterbacks in comeback situations. Over the past decade, no quarterback has had more fourth-quarter comebacks than Drew Bledsoe. Then again, no quarterback has had more fourth-quarter comeback opportunities than Bledsoe. Are Bledsoe's 19 wins in 61 comeback chances more impressive than Donovan McNabb's 12 in 27? A raw total says that Bledsoe is better, and a straight winning percentage says that McNabb is better. Neither seems like an ideal ranking tool.

A simple comparative ranking system can be formulated with the help of a statistical method known as a "t-test." A t-test is generally used to test a statistical hypothesis against some population parameter. The result is given as a "p value," where a lower p value indicates a more significant result. A t-test will usually reward a good average over a large number of trials more than a great average over a small number of trials, which is necessary in this study due to the wildly different and often very small sample sizes for each quarterback. A Shapiro-Wilk test was executed and it was determined that the population had an approximately normal distribution, which is necessary for this type of testing. Here the t-test is used to determine the significance of a quarterback's actual number of comebacks compared to the expected number given his opportunities.

(A word of caution is in order: Please do not use t-tests the way they are used in this article. For our purposes, the significance levels were actually used to rank order every quarterback that has been in a comeback situation since 1996. This is not in the spirit of what a t-test is supposed to do. However, saying someone is in a group that is significantly better than average, worse than average, or not significant versus the sample population at some p value is not nearly as fun as saying "This guy ranks 8th and this guy only ranks 37th!" Anyway, if you are a professional statistician, do not attempt this at work!)
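For readers who want to see the general shape of such a test, here is a minimal sketch of a textbook one-sample t statistic applied to a quarterback's win/loss outcomes, with the league-wide comeback rate as the reference value. The article does not spell out its modified test, so this is an assumption-laden stand-in and need not reproduce the published rankings; `LEAGUE_RATE` and `t_statistic` are illustrative names.

```python
import math

# League-wide comeback rate from the article's totals: 603 wins, 1,322 failures.
LEAGUE_RATE = 603 / (603 + 1322)

def t_statistic(wins, losses, p0=LEAGUE_RATE):
    """One-sample t statistic for 0/1 outcomes tested against the rate p0."""
    n = wins + losses
    if n < 2:
        raise ValueError("need at least two opportunities")
    p_hat = wins / n
    # Sample variance of n zeros and ones, with n - 1 in the denominator.
    variance = n * p_hat * (1 - p_hat) / (n - 1)
    if variance == 0:
        raise ValueError("all outcomes identical; t statistic undefined")
    return (p_hat - p0) / math.sqrt(variance / n)

# The two records compared earlier in the article.
for name, wins, losses in [("Drew Bledsoe", 19, 42), ("Donovan McNabb", 12, 15)]:
    print(f"{name}: {wins}-{losses}, t = {t_statistic(wins, losses):.2f}")
```

A statistic near zero means the record is close to what the league rate predicts. The further it sits from zero, the less likely the record is a product of chance alone.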

With a test in place for ranking performance, we can then implement another test, an analysis of variance, to help determine responsibility for comebacks and holding leads. An analysis of variance can break down the components of variation between and within groups and help determine which factors (if any) are important. In this study all possible two-way combinations of quarterbacks and coaches were examined. All quarterbacks who have been in a comeback or lead-holding situation under more than one head coach comprised one group. All coaches that have had multiple quarterbacks in comeback or lead-holding situations comprised the other group. Analyses of variance were run on each group, examining comeback ability and the ability to maintain leads, using modified t-test results as the dependent variables.
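The core idea of an analysis of variance, comparing variation between groups with variation within them, can be sketched with a simplified one-way version. The study itself examined two-way quarterback and coach combinations, so this is only an illustration of the mechanics. The scores below are hypothetical, invented purely for the example.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# HYPOTHETICAL comeback scores: one list per quarterback, one value per head coach.
scores_by_quarterback = [
    [1.2, 1.0, 1.4],
    [-0.5, -0.8, -0.3],
    [0.1, 0.3, -0.1],
]
print(f"F = {one_way_anova_f(scores_by_quarterback):.1f}")
```

A large F in this setup would mean a quarterback's score changes little from coach to coach compared with how much quarterbacks differ from one another, which is the pattern the study reports for comebacks.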

The general picture from these analyses of variance is that quarterbacks are more important than coaches in coming from behind to win, and coaches are more important than quarterbacks in holding leads. For example, the results indicate that Tom Brady should maintain a similar ability to bring his team from behind to win regardless of whether or not his coach is Bill Belichick. Furthermore, Bill Belichick should maintain a similar ability to hold on to a one-score, fourth-quarter lead whether or not his quarterback is Tom Brady.

These results make intuitive sense. A team that is trailing needs to be able to move the ball and score. Calling the right plays in this situation is certainly important, but execution by the quarterback and his surrounding cast is paramount. Meanwhile, a team that leads by a small margin will require a defensive stop, followed by utilization of a clock-killing offensive strategy usually predicated on the running game. This largely negates the quarterback's role.

We looked at every game from the past ten seasons to see which recent quarterbacks have been the best at rallying their teams back from a deficit. And although he was known for comebacks early in his career, the top comeback quarterback might surprise you: it's Jake Plummer, slightly ahead of Peyton Manning and Vinny Testaverde. It's a surprising conclusion, but that's the kind of insight rational statistical analysis can provide.

Table 1. Top 10 quarterbacks at comebacks since 1996
Rank Quarterback Wins Losses
1 Jake Plummer 19 28
2 Peyton Manning 19 29
2 Vinny Testaverde 19 29
4 Tom Brady 13 8
5 Jon Kitna 15 23
6 Kerry Collins 17 30
7 Donovan McNabb 12 15
8 Marc Bulger 10 5
9 Jake Delhomme 10 12
9 Jay Fiedler 10 12

Plummer's comeback ability has drifted towards average since his first few seasons but his overall numbers still rate the best. Under Vince Tobin, he was a stellar 10-11 when trailing by one score in the fourth, and since then he is a solid 9-17. Peyton Manning holds the single-season mark with six comeback wins in 1999. Only the coldest, hardest, football fact-seeking Patriots fans would express surprise at Peyton Manning's high rank, but Vinny Testaverde is a bit of an eye-opener. However, Vinny was a comeback machine in 2000 and 2001, racking up nine wins in 17 comeback chances during those two seasons with the Jets. No other quarterback in the last decade had more than eight comeback wins in any two-year span.

Tom Brady is next on the list and, remember, these numbers do include the postseason. Brady is one of only a handful of quarterbacks with a winning record in more than three games with a fourth-quarter deficit. The others are Marc Bulger (10-5), Ben Roethlisberger (7-2), Steve Young (7-4) and John Elway (7-6). Bulger, who ranks eighth by this metric, is the only one in that group that has yet to win a Super Bowl.

(Elway holds the NFL record for fourth-quarter comebacks with 47, but this study only includes the final three years of his career.)

Donovan McNabb managed to crack the top 10 despite the handicap of not having Terrell Owens on his team for most of his career. When trailing by a close score in the fourth quarter, McNabb's record is 10-12 without Owens and 2-3 with him. McNabb is one of three NFC Champion quarterbacks in the top 10, along with Jake Delhomme and Kerry Collins. Delhomme has had at least one successful comeback in every year he's had an opportunity. That includes his one chance playing for Mike Ditka's Saints at the end of the 1999 season. Another of Ditka's short-term quarterbacks with the Saints, Kerry Collins, actually did most of his comeback damage between 2000 and 2002 with the Giants. He had 11 comebacks in 23 opportunities in those three seasons, which is the highest number of comebacks for any quarterback over a three-year span.

The interesting cases of Jon Kitna and Jay Fiedler complete the top 10. Neither has ever been considered a franchise quarterback. Neither has a rifle arm. Neither was drafted coming out of college and both had to cut their teeth in NFL Europe before getting a chance in the NFL. Yet based upon their performance, it can certainly be argued that they deserved every one of the 139 NFL starts that they've racked up. In the last decade, Kitna and Fiedler combined for 25 comeback wins in 60 opportunities. This is six more come-from-behind victories than Drew Bledsoe, with one fewer opportunity. That is not meant as a condemnation of Bledsoe, who has been very average in this situation.

Among the 10 quarterbacks with the worst comeback records over the past 10 years, four have been to the Pro Bowl (Table 2). But most have spent more time holding a clipboard and signaling in plays than actually starting. Six have started a playoff game and three even have Super Bowl rings (OK, Banks and Griese were backups in their Super Bowls). Their career quarterback ratings range from 63.2 to 94.1. They came from all over the draft: two first-rounders including a number one overall selection, one each from the second, third, fifth and seventh rounds, two fourth-rounders, and two undrafted free agents. Despite these differences in background and performance, they have at least one thing in common: Each has lost at least one starting job during his career.

Table 2. Bottom 10 quarterbacks at comebacks since 1996
Rank Quarterback Wins Losses
153 Tim Rattay 1 10
154 Kelly Holcomb 3 13
155 Danny Kanell 1 11
156 Patrick Ramsey 0 11
157 Tony Banks 5 18
158 Jeff George 2 14
159 Kurt Warner 5 19
160 Brian Griese 4 20
161 Steve Beuerlein 5 23
162 Mark Brunell 14 39

Sometimes they lost the job because of injuries, other times because of poor performances. In other cases the starting job was clearly temporary, until their team's hot, new, highly-drafted prospect was proclaimed ready. And in Jeff George's case with the Falcons, a nationally televised, profanity-laced sideline tantrum directed at the head coach signaled the end.

The poor showing by these players doesn't mean they don't have the overall ability to play quarterback at a high level. It is, however, an indictment of their ability to consistently bring their teams from behind. Mark Brunell is a fine quarterback despite his 14-39 record since 1996 when trailing by one score in the fourth quarter. Brunell has enjoyed a positive DVOA in five of his last six seasons. Since 1996, his teams' combined record when he starts is 76-64. But when he's down in the fourth quarter, he stays down.

One of the more surprising names on the "worst" list is Kurt Warner. He started two Super Bowls and won one. In each of those years he was the league MVP. Yet that obviously had more to do with gaining early leads than it had to do with coming back. One of his five career comebacks came in the 1999 NFC Championship game against the Buccaneers. His touchdown pass to Ricky Proehl with 4:44 left gave the Rams an 11-6 victory and a trip to the Super Bowl. It was a great time for the first comeback win of Warner's career, but he would go on to have only four more in more than 20 opportunities through 2005.

As Warner's case shows, a team does not necessarily need a quarterback with gaudy comeback ability to make a Super Bowl. There are six quarterbacks who began their careers after 1995 that have made the Super Bowl. Brady, McNabb, Delhomme and Warner have been discussed as members of the top 10 and bottom 10. The other two are Ben Roethlisberger and Matt Hasselbeck. Big Ben ranks 16th out of 162, but Hasselbeck ranks 149th. Hasselbeck has the same comeback record (5-16) as David Carr. Roethlisberger's record is comparable to Bulger's: he also began his career with a 7-2 record in comeback situations. In addition, Brady is 7-2 in his most recent comeback chances.

Brett Favre is often cited as a master of the comeback. He has been above average over the last decade, but not by much. His 16-34 record is approximately one win better than would be expected – reasonably good, but not great.

Another surprise is the rating for Doug Flutie. Some call Flutie's comeback ability "Flutie Magic." Mozart called it "Die Zauberflutie." (Flutie has been around a long time.) The idea is that if Flutie is in the game, his team always has a good chance to come from behind and win. But that's not true -- in fact, he's actually below average. Six come-from-behind wins in 23 opportunities ranks Flutie 146th out of 162 eligible candidates.

It is probably not appropriate to judge Hall of Fame inductees by their rankings on this list, because of its limited time frame. There are six Hall of Famers on our list, but all played most of their careers prior to 1996. Steve Young and John Elway still grade out very well, and Dan Marino and Jim Kelly are above average. Troy Aikman and Warren Moon had records of 6-16 and 4-12 when presented with a comeback opportunity after 1995. Those match the records of Joey Harrington and Scott Mitchell, respectively, and rank them in the bottom quartile. It's unclear from this study if this is a true indication of their total career performances. However, based on the analysis of variance results discussed earlier, it may be more reasonable to assume that this is somewhat representative of their total careers than to assume that it is not.

Turning our attention to head coaches, seven of the last 10 Super Bowls have been won by coaches who rank among the 10 best at holding a one-score, fourth-quarter lead (Table 3). Nine of the top 10 have either been to a Super Bowl or coached in multiple championship games, with the exception being Jim Haslett. Of course, Haslett spent his entire head coaching career with the Saints; he's good, but he's not a miracle worker. Interestingly, three of these coaches have been relegated to subordinate jobs: joining Haslett are Jim Fassel and media whipping boy Mike Martz. Readers of Pro Football Prospectus 2005 will not be surprised to see Martz ranked so highly, although Bob Ryan and Michael Wilbon may feel that his inclusion in the top five invalidates the entire study.

Table 3. Top 10 coaches at holding a lead since 1996
Rank Coach Wins Losses
1 Tony Dungy 51 14
2 Dennis Green 48 13
3 Bill Belichick 32 5
4 Bill Parcells 40 13
5 Mike Martz 27 5
6 Jim Fassel 35 11
7 Mike Shanahan 47 18
8 Jim Haslett 28 8
9 Bill Cowher 44 17
10 Brian Billick 27 8

The worst coaches at holding a one-score, fourth-quarter lead include Marty Schottenheimer and Mike Holmgren (Table 4). Both coaches are long-tenured and boast career records that are more than 50 games over .500. Holmgren has won a Super Bowl and coached in two others. Schottenheimer's postseason record is horrific, but his teams generally perform very well in the regular season. (He's the Flip Saunders of football!) Yet they both consistently field teams that get beaten in the fourth quarter more often than they should. George Seifert's appearance in the bottom 10 might come as a surprise as well. This study includes his time with Carolina, and just one of his glory years with the 49ers.

Table 4. Bottom 10 coaches at holding a lead since 1996
Rank Coach Wins Losses
67 Dom Capers 25 14
68 Ted Marchibroda 14 10
69 Mike Holmgren 44 22
70 Jim Mora, Sr. 19 12
71 Butch Davis 15 11
72 Marty Schottenheimer 33 18
73 Dave Campo 7 11
74 George Seifert 11 12
75 Dennis Erickson 15 13
76 Mike Riley 9 13

The Chiefs, in their coaching transition, have made a nearly-perfect swap according to this metric. Since returning to the NFL in 1997, Dick Vermeil has been almost perfectly average when it comes to keeping a small, late lead. Only one coach has him beat when it comes to mediocrity: Herman Edwards. That's right; the second most average coach by this metric is being replaced by the most average coach. Obviously, the Chiefs really did their homework in finding the best possible replacement for Vermeil.

Of course, not all comebacks are created equal. In the average come-from-behind, fourth-quarter victory, a winning quarterback first gets the ball down by an average of 5.5 points with 11:47 to play. Meanwhile, in the average failed comeback, a losing quarterback first gets the ball down by an average of 4.6 points with about 8:10 to play. Those parameters were determined by looking at the point of the game where a team is down by the most points in the quarter with the most time remaining in instances of successful comebacks (whether or not the first drive yields any points), and by looking at the point of the game where a team is down by the fewest points with the most time remaining when they get the ball in the case of a failed comeback. The drive with the smallest deficit faced (or in the case of multiple drives with the same deficit, the largest time remaining) was recorded.

Under these guidelines the vast majority of quarterbacks' comebacks and failed comebacks begin when they are down by between three and eight points (i.e., more than a field goal and less than a touchdown). But a few quarterbacks have had easier or harder comebacks and failed attempts.
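The drive-selection rule described above can be sketched as follows. This is an illustrative reading of the rule rather than the study's own code; the tuple layout and the function name are assumptions.

```python
def reference_drive(drives, won):
    """Pick the possession used to describe a comeback attempt.

    drives: list of (deficit_points, seconds_remaining) tuples, one per
            qualifying fourth-quarter possession.
    Successful comeback: largest deficit, ties broken by most time remaining.
    Failed comeback:     smallest deficit, ties broken by most time remaining.
    """
    if won:
        return max(drives, key=lambda d: (d[0], d[1]))
    return min(drives, key=lambda d: (d[0], -d[1]))

# Successful rally: down 10 twice, so the earlier possession (650 seconds left) is used.
print(reference_drive([(3, 700), (10, 400), (10, 650)], won=True))
# Failed rally: down 3 twice, so the possession with more time left is used.
print(reference_drive([(7, 800), (3, 300), (3, 120)], won=False))
```

Averaging the selected deficits and times across a quarterback's games gives per-player figures of the kind shown in the tables that follow.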

Table 5. Smallest average deficits overcome, minimum 2 games, 1996-present
Rank Quarterback Deficit Games Time Remaining
1 Ryan Leaf 1.50 2 14:33
2 Joey Harrington 1.50 6 10:38
3 Kurt Warner 1.80 5 12:51
4 Paul Justin 2.00 2 15:52
5 Bobby Hoying 2.33 3 10:05
6 David Garrard 2.50 2 13:32
7 Kyle Boller 2.50 4 11:34

Seven quarterbacks have more than one successful comeback win with an average score to overcome of less than three points (Table 5). Yes, that's Ryan Leaf atop the list; he actually had two comeback wins in his NFL career. Eight quarterbacks with multiple career comeback wins have needed two scores on average to finish the deal (Table 6). Troy Aikman and Tony Banks didn't bring their teams from behind quite as often as the league average indicates they should have. But when they did, the degree of difficulty was "ALCOA Fantastic Finishes" impressive. "Time Remaining" in the tables represents the time left in the game when the quarterback got the ball for a drive that would be his first in the fourth quarter, regardless of whether or not that drive resulted in points, or whether or not the drive began in the fourth quarter.

Table 6. Largest average deficits overcome, minimum 2 games, 1996-present
Rank Quarterback Deficit Games Time Remaining
1 Sage Rosenfels 13.50 2 14:43
2 Shane Matthews 13.00 3 12:10
3 Tommy Maddox 10.33 3 10:41
4 Erik Kramer 10.00 4 14:06
5 Craig Whelihan 10.00 2 17:06
6 Jamie Martin 9.25 4 14:01
7 Tony Banks 9.20 5 11:10
8 Troy Aikman 8.83 6 11:56

Looking at the comebacks that did not materialize, six quarterbacks with more than one failed opportunity have an average points needed of less than three (Table 7). Ryan Leaf certainly had a lot of close calls in his career.

Table 7. Smallest average deficit not overcome, minimum 2 games, 1996-present
Rank Quarterback Deficit Games Time Remaining
1 Jeff Hostetler 2.00 3 6:10
2 Chris Redman 2.50 2 14:44
3 Kyle Orton 2.50 2 7:51
4 Ryan Leaf 2.57 7 7:44
5 A.J. Feeley 2.67 6 6:00
6 Rob Johnson 2.75 4 6:53

A complicating factor in this study has to do with a "near" comeback. For our purposes, a comeback was only counted as successful if it concluded with a victory (half-victories and half-losses were assigned to ties). What about a situation in which a quarterback brings his team from behind to take a lead only to see the other team march down and retake the lead with no time left? What about a quarterback putting his team in a position to win the game with a last minute field goal only to watch as the kicker shanks it? Including situations such as these would require a William Krasker-esque mathematical model which would assign appropriate win probabilities to different game circumstances. Such a model would be a wonderful achievement, but would be far too labor intensive in its creation to justify in this project.

The numbers tell a useful story as they are currently presented. For example, if every quarterback that is significantly "bad" at p < .001 (the bottom eight on the list) were given an extra comeback win, just for being nice guys, they would all still be significantly bad at p < .05. Some would still be significantly bad at p < .05 if given two or three extra wins, and that is without adjusting for a new league where comebacks are decidedly more rampant because we're suddenly giving away wins or partial wins just for being close.

Football is a team game, and usually the culpability in a failed comeback lies with other players in addition to the quarterback. Quarterbacks receive a higher share of the glory regarding comebacks, but this is tempered by those times when they receive a higher share of the blame for losses. Jeff George was only 2-14 in games in which he had a chance in the fourth to bring his team back and win. But would anyone ever say, "Jeff George would have been great at bringing his teams back to win -- if his teammates hadn't constantly failed him"? Well, okay, would anyone besides Jeff George ever say that?

(Ed. Note: Jason Whitlock.)

Some quarterbacks have proven better than others in the art of the comeback. Some coaches have proven better than others at holding a small, late lead. These differences appear to be meaningful. Old coaches in new places like Dick Jauron and Herman Edwards should continue their career trends in terms of holding fourth-quarter leads, while quarterbacks with new teams like Daunte Culpepper and Aaron Brooks should retain similar abilities to bring their teams back and win.

So on October 22, if Mark Brunell gets the ball in the fourth quarter trailing by eight points or less to Tony Dungy's Colts, don't be surprised if the Redskins end up losing. And on November 19, if Jake Plummer gets the ball in the fourth quarter trailing by eight points or less to Marty Schottenheimer's Chargers, don't be surprised if the Broncos end up winning.

Full Results

162 quarterbacks have had at least one fourth-quarter comeback opportunity since 1996. There are 24 instances in which two quarterbacks saw action in a failed comeback game, usually due to injury or ineffectiveness of the primary quarterback.

Based on t-tests, 21 quarterbacks have been significantly "good" at fourth-quarter comebacks since 1996 and 18 quarterbacks have been significantly "not good" by the same test (all at the .95 level). The top 10 and bottom 10 are listed above in Table 1 and Table 2. Here are the others:

Significantly good: Aaron Brooks, Tim Couch, Trent Green, Kent Graham, Steve Young, Ben Roethlisberger, John Elway, Kordell Stewart, Daunte Culpepper, Elvis Grbac, Rich Gannon.

Significantly bad: Neil O'Donnell, Doug Flutie, Frank Reich (yes, the same Frank Reich who led the greatest postseason comeback in NFL history in 1992, prior to this study), Gus Frerotte, David Carr, Jim Harbaugh, Matt Hasselbeck, Billy Joe Tolliver.

Posted by: Guest on 26 Jun 2006

95 comments, Last at 09 Jan 2007, 4:33pm by Ray

Comments

1
by Israel (not verified) :: Mon, 06/26/2006 - 12:08pm

How did you count cases where the quarterback comes back from an eight point deficit to take the lead, only to have the coach blow it?

2
by Sunil (not verified) :: Mon, 06/26/2006 - 12:17pm

Really cool article Jason - I enjoyed the analysis.

I'm wondering if you were able to correct for home / away games in assessing 4th quarter comebacks. Does a top comeback QB perform better at a home stadium or is his performance independent of stadium?

3
by Jesse (not verified) :: Mon, 06/26/2006 - 12:29pm

Jim Harbaugh? aka, "Captain Comeback"

imagine that

4
by sublime33 (not verified) :: Mon, 06/26/2006 - 12:36pm

For years, I firmly believed that Brett Favre was one of the greatest QBs at running the two minute drill - at the end of the first half. But he was nothing special at the end of the game despite what the pundits said. This study proves my point about his end of the game heroics. Nice job.

5
by NYCowboy (not verified) :: Mon, 06/26/2006 - 12:36pm

1st of all, the most surprising thing about this study was that we had the lead in the 4th quarter 18 times during the Campo Era! Good times. But that confuses me as to how if he won 15 games, he's only credited with 7 wins in that Table. Does that mean the Boys were blowing away their opponents the other 8 games?

Also, don't u have to take the different defenses into account? Certain comeback attempts are more difficult than others.

Finally, the Boys started a precipitous decline in 1996, so I would guess if you looked at Aikman's comeback attempts before then, they would be much better, variances notwithstanding.

Good article!

6
by Matthew Furtek (not verified) :: Mon, 06/26/2006 - 12:37pm

Re: Table 2
Not good to be a Redskins QB at all...

7
by James, London (not verified) :: Mon, 06/26/2006 - 12:52pm

Nice article Jason.

One observation. While I'd generally agree with the premise that successful comebacks are the result of good QB play, I'd also say that this is less applicable to the games typical of Table 5 (and Table 7).

When a team is down by less than 8 with lots of time remaining, the QB becomes less important, as a team is able to run its normal offense and not go "pass wacky". OTOH, the situations typical of Table 6, big deficit and/or little time remaining would seem to be heavily dependent on the QB.

Like #2, I'm also interested by the home/away effect. Good stuff anyway!

8
by Hutz (not verified) :: Mon, 06/26/2006 - 12:52pm

Nice article. Well written and well researched. This type of article is why I love the site.

For any Broncos fans, do you think that there is any truth to the long held belief that Reeves would keep Elway (and the offense for that matter) on a leash for 3 quarters and then turn him loose in the 4th when the team was behind and needed a comeback?

9
by Adam (not verified) :: Mon, 06/26/2006 - 1:01pm

But what's going to happen when Ben has to step up and win it for the Steelers by himself? When he has to do it in "crunch time?" All he's shown to this point is he can hand it off.

Oh....wait.

10
by Dan (not verified) :: Mon, 06/26/2006 - 1:04pm

As Jason notes, this isn't really how a t-test is meant to be used. What the p value tells you is how confident we can be that a person's above average performance (or below average performance) is not just due to chance. The fact that we can be more confident about Peyton Manning than about Tom Brady (to pick two quarterbacks at random) that the above average comeback performance is not due to chance does not mean that Manning is better at comebacks than Brady. It's mostly just a consequence of sample size, the same way that you can be very very confident that a coin is weighted if you flip it a billion times and get 51% heads, even though it's only slightly biased.

Another way of ranking comeback QBs, analogous to PAR or VOA, is to look at how many extra wins a QB earned for his team, compared to how they would have done with an average QB. Just take the number of comeback wins a team got minus the number of wins that they would have been expected to get with an average quarterback, given the number of comeback chances that they had. The number of expected wins is just 31.3% of the number of comeback chances (since there were 603 successful comeback attempts and 1322 failed comeback attempts). Here are those numbers for the 20 quarterbacks whose records are given in the article. The quarterbacks' records and their rankings using the p value method are in parentheses.

+6.4 Brady (13-8, 4)
+5.3 Bulger (10-5, 8)
+4.3 Plummer (19-28, 1)
+4.0 Manning (19-29, 2)
+4.0 Testaverde (19-29, 2)
+3.5 McNabb (12-15, 7)
+3.1 Delhomme (10-12, 9)
+3.1 Fiedler (10-12, 9)
+3.1 Kitna (15-23, 5)
+2.3 Collins (17-30, 6)

-2.0 Holcomb (3-13, 154)
-2.2 Banks (5-18, 157)
-2.4 Rattay (1-10, 153)
-2.5 Warner (5-19, 159)
-2.6 Brunell (14-39, 162)
-2.8 Kanell (1-11, 155)
-3.0 George (2-14, 158)
-3.4 Ramsey (0-11, 156)
-3.5 Griese (4-20, 160)
-3.8 Beuerlein (5-23, 161)
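The calculation behind these numbers can be sketched in a few lines, assuming the league rate is taken from the article's totals of 603 successful and 1,322 failed comebacks. The function and variable names are illustrative.

```python
# League-wide comeback rate: 603 successes out of 603 + 1,322 chances (about 31.3%).
LEAGUE_RATE = 603 / (603 + 1322)

def wins_above_average(wins, losses, rate=LEAGUE_RATE):
    """Comeback wins minus the wins an average quarterback would expect in the same chances."""
    return wins - rate * (wins + losses)

# Three of the records listed above.
records = {"Brady": (13, 8), "Plummer": (19, 28), "Brunell": (14, 39)}
ranked = sorted(records.items(), key=lambda kv: wins_above_average(*kv[1]), reverse=True)
for name, (wins, losses) in ranked:
    print(f"{wins_above_average(wins, losses):+.1f} {name} ({wins}-{losses})")
```

Unlike a p value, this figure is in units of wins, so a short, excellent record such as 13-8 is not pushed down the list merely for having fewer chances.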

11
by Robert Visser (not verified) :: Mon, 06/26/2006 - 1:05pm

Why would you open a hypothesis about where the greats such as Elway and Montana actually stood statistically speaking in leading comebacks only to then cut out the most important years of their respective careers for the analysis? In his last two years did Elway even have to play from behind? Great build-up to what appeared an interesting case only to completely miss the mark. Why not re-do the analysis looking at an average of say 30 - 50 career games trailing in the 4th quarter instead of limiting the study to post 1996 numbers. Love the FO insights but this one fell wide right.

12
by Peter (not verified) :: Mon, 06/26/2006 - 1:18pm

#11, I actually agree, this article was sooo close to being awesome. My first problem is why we're even made to look at a listing of smallest comeback average... when the sample size is 2 games. I'm interested in how quarterbacks I've heard of perform and will perform, especially in the next season, not how Ryan Leaf did in the two games where he got a field goal at the end and won.
Second, yes, I would like to know about famous QBs, though I understand there are huge problems with number crunching and adjustment (Maybe there was a 50% comeback rate in Unitas' time, I haven't a clue).
Third, the defense problem (not the defense faced, I mean the quarterback's defense) is enormous. I'm not convinced by the fact that little difference was made by adding one or two wins to the very worst... the natural compacting of players around the center in a bell curve could make those differences very big. If the 50th percentile people had won 3-4 more comebacks, I'm guessing they would have a very significant jump. Also there's a possibility that certain teams, cursed with atrocious defenses (Chiefs?) could have had the comeback-only-to-lose thing happen far more than 3-4 times. It's happened several times to Jake Delhomme, and he's only started a couple years! (Superbowl XXXVIII still weighs heavy on my soul... at least he gets some praise here).

13
by Peter (not verified) :: Mon, 06/26/2006 - 1:20pm

By the way, that came off very critical, but this was still a very interesting and well-put-together article, I agree with 8, this is why I read here.

14
by Dan (not verified) :: Mon, 06/26/2006 - 1:25pm

Begone, yellowman! That's Bulger (10-5, eight). Typos shouldn't be there either, of course.

Now for coaches, leads held above average (where average hold percentage is 68.7%). These are more similar to the p value rankings, since the sample size does not vary as much (fewer coaches have small sample sizes):

+6.6 Belichick (32-5, 3)
+6.4 Dungy (51-14, 1)
+6.1 Green (48-13, 2)
+5.0 Martz (27-5, 5)
+3.6 Parcells (40-13, 4)
+3.4 Fassel (35-11, 6)
+3.3 Haslett (28-8, eight)
+3.0 Billick (27-8, 10)
+2.4 Shanahan (47-18, 7)

-1.3 Holmgren (44-22, 69)
-1.8 Capers (25-14, 67)
-2.0 Schottenheimer (33-18, 72)
-2.3 Mora (19-12, 70)
-2.5 Marchibroda (14-10, 68)
-2.9 Davis (15-11, 71)
-4.2 Erickson (15-13, 75)
-4.8 Seifert (11-12, 74)
-5.4 Campo (7-11, 73)
-6.1 Riley (9-13, 76)
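For anyone who wants to reproduce the "leads held above average" figures, here is a minimal Python sketch. It assumes the figure is simply actual wins minus the league-average hold rate (68.7%) times the number of opportunities; the function name is made up for illustration, and because 68.7% is itself rounded, a result can occasionally differ from the list by a tenth.

```python
LEAGUE_HOLD_RATE = 0.687  # average hold percentage quoted above (rounded)

def wins_above_average(wins, losses, league_rate=LEAGUE_HOLD_RATE):
    """Actual wins minus the wins an average coach would expect.

    Assumed formula: wins - league_rate * (wins + losses).
    """
    opportunities = wins + losses
    return wins - league_rate * opportunities

# Records copied from the list above.
for name, wins, losses in [("Belichick", 32, 5), ("Martz", 27, 5), ("Riley", 9, 13)]:
    print(f"{wins_above_average(wins, losses):+.1f} {name} ({wins}-{losses})")
```

The same function would work for the quarterback list by passing the comeback rate instead of the hold rate as `league_rate`.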

15
by GBS (not verified) :: Mon, 06/26/2006 - 1:29pm

I guess I'm about to reveal my ignorance of statistics here. I'm especially reluctant to do so as I'm a Colts' fan about to argue in favor of Tom Brady.

T-test values notwithstanding, how do Plummer, Manning, and Testaverde rank ahead of Tom Brady in Table 1? I understand the concept of a higher cumulative "good" performance being better than a smaller "great performance," but Jake, Peyton and Vinnie have an extra 6 wins in an extra 26 or 27 tries. Is that still considered good?

16
by GBS (not verified) :: Mon, 06/26/2006 - 1:33pm

Sorry about that. I guess my question has been largely answered already. There were only 8 comments up when I wrote mine, but I got, um, distracted, before submitting...

17
by Countertorque (not verified) :: Mon, 06/26/2006 - 1:52pm

I like the analysis and the article. But, it seems pretty unfair to rank people on only the end of their careers. Quarterbacks especially are going to do worse later in their careers, since their ability to come back is based on physical skills.

Perhaps head coaches will stay about the same. But, I'd like to know how Belichick looks when his years with the Browns are included.

18
by GBS (not verified) :: Mon, 06/26/2006 - 1:55pm

Sorry about the triple post, but I want to retract my retraction. Manning finished ahead of Brady by posting an additional 6 wins and 21 losses. 5 wins and 18 losses got Pretty Boy Tony Banks a spot in the Bottom 10. Something doesn't seem to compute.

19
by ABW (not verified) :: Mon, 06/26/2006 - 2:24pm

Re: 18

You can't split Manning's career into the "Brady" and "Banks" sections. Manning, Plummer and Testaverde are all somewhat above average, but since they have been consistently above average for a fairly large number of samples they are rated highly by the measure (a "t-test") being used in this article. This is a little like DPAR - since Manning, Plummer and Testaverde have been above average for a long time, this measure ranks them highly, just like DPAR ranks players who perform slightly above average consistently (although DPAR would rank them highly because it's a cumulative measure, whereas this ranks them highly because the sample size is large, giving greater confidence in the result).

I think you should read comment #10 - what that commenter is doing is a little bit more like DVOA, in that it will rate short-term success very highly while not rewarding consistent performance over a period of time. This is probably a better way to do a strict ranking of who is the "best" at 4th quarter comeback situations (although it gives us no idea of whether Brady is really good or has just gotten lucky).

The purpose of this article was to determine whether there are certain QBs who we can confidently say are better at 4th quarter comebacks than others, so the t-test measure makes sense if you want to prove that in general, but I agree that the "wins above average" method makes more sense for a ranking.

20
by johnt (not verified) :: Mon, 06/26/2006 - 2:42pm

#19 sums it up well. T test is kind of an odd choice here because of the large emphasis it puts on sample size (hence 10-12 Fiedler making the top list and 7-2 Roethlisberger not - I sure know which one I'd rather have, at least pre-Evel Knievel).

It was still an interesting article, I'm just not sure how much I buy the statistical conclusions. I'm willing to trust the aggregate (QB = important for comeback, coach = important for holding lead) as an appropriate use of a t-test, but for individuals I think it's pretty close to worthless at providing any more insight than eyeballing the W-L record would.

21
by MarkB (not verified) :: Mon, 06/26/2006 - 2:48pm

If a cold hard Pats fan were to comment on this article, he might say that nothing here contradicts his position. His whole point is that PM is Mr October. Great production, inversely proportional to team need.

22
by Sophandros (not verified) :: Mon, 06/26/2006 - 2:50pm

Among Saints fans, I've brought up Aaron Brooks and his comeback ability in response to the "he's not a leader" comments.

I'm happy that the Saints have Brees, and I'm hopeful that AB will do well in Oakland.

But about this as a statistic, doesn't "fourth quarter comeback" imply that you led your team to a loss for 3+ quarters?

23
by Dan (not verified) :: Mon, 06/26/2006 - 2:55pm

I think GBS (#18) is right. Looking at them again, the rankings don't look like what you'd get from p values from a statistical test. Is it possible to make the author's data and statistical analyses available for other people to look over (or even just to email them to me)?

The p value tells you how likely it is that the results you have would happen by chance. Look at Mark Bulger: he has 10 successful comebacks and 5 failed comebacks in his 15 comeback opportunities. The p value is the answer to the question "how likely is it that an average quarterback would do as well as Bulger has in 15 comeback opportunities?" In other words, if a QB had a 31.3% chance of winning the game whenever he had a chance at a comeback, and he had 15 chances at a comeback, how likely is it that he would succeed in at least 10 of those comeback chances? Instead of using t-tests, I think it's easier to understand (and also a better model for the data) if we use a binomial model (which is basically a coin flip model). The probability that a QB with a 31.3% chance of winning each game would win at least 10 out of 15 games is the same as the probability that a coin with a 31.3% chance of landing Heads would come up Heads at least 10 times in 15 flips.

You can input those numbers into a Binomial Calculator on the internet (there's one linked under my name). For Bulger, the numbers you want to use are n = 15, p = .3132, and then you ask it to compute "Prob. X at least 10". The calculator gives you the p value, .0052, which tells you that there is a 0.52% chance that an average quarterback would have done as well as Bulger has at coming back to win in those 15 games. That's very unlikely, which means that Bulger's teams were probably doing something right, and not just getting lucky.
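For readers without the online calculator handy, the same "at least 10 out of 15" probability can be checked with a few lines of Python. This is only a sketch of the binomial calculation described above; the function name is made up for illustration, and the inputs (n = 15, k = 10, p = .3132) are the ones quoted for Bulger.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k wins in n tries."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Bulger: 10 comeback wins in 15 opportunities, against a 31.32% baseline.
p_value = binomial_tail(n=15, k=10, p=0.3132)
print(f"{p_value:.4f}")
```

Swapping in another quarterback's record only needs a different `n` and `k`. For the "worst" list, the matching question is "at most k wins", which would sum from 0 to k instead.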

Using this method for all 20 QBs on the list, it turns out that Bulger has the second lowest p value (which means that his results are the second least likely to have occurred by chance). He's also one of only three quarterbacks whose p values are statistically significant (less than .05). Here are the binomial p values for the 10 "best" comeback QBs, given as percents (with Jason's rankings followed by my wins over average rankings from before in parentheses):

00.37% Brady (4, 1)
00.52% Bulger (8, 2)
10.53% McNabb (7, 6)
11.68% Delhomme (9, 7)
11.68% Fiedler (9, 7)
11.85% Plummer (1, 3)
14.08% Testaverde (2, 4)
14.08% Manning (2, 4)
18.07% Kitna (5, 9)
28.35% Collins (6, 10)

And the worst (where ranking of 1 = worst and 10 = least bad):

01.60% Ramsey (7, 3)
07.13% Kanell (8, 5)
08.15% George (5, 4)
08.71% Beuerlein (2, 1)
08.75% Griese (3, 2)
09.65% Rattay (10, 8)
18.92% Warner (4, 7)
21.16% Holcomb (9, 10)
22.59% Banks (6, 9)
27.10% Brunell (1, 6)

These rankings are very different from Jason's. The order of each list seems completely unrelated to his ordering, and for most players the p values are relatively large (greater than the traditional .05 standard for statistical significance), which means that it would not be surprising for these results to occur just by chance. I don't think that the difference between a t-test and a binomial model would be enough to account for these differences.

24
by Sophandros (not verified) :: Mon, 06/26/2006 - 2:57pm

21: I don't see how the month of the year means anything here.

25
by Dan (not verified) :: Mon, 06/26/2006 - 3:08pm

Again, misspellings (MarK Bulger) and yellow head unintentional.

I found a t-test calculator online, linked under my name, and found that it gave p values similar to the binomial values I got. It's harder to input data there (you have to include each win as a 1 and each loss as a 0), so I only looked at Brady and Plummer, and found that the t-test p values are 0.0053 for Brady (compared with 0.0037 for the binomial p value) and 0.1073 for Plummer (compared with 0.1185 for the binomial p value). (For those with statistical knowhow, I did a one-sample single-tailed t-test, comparing the data set of 0's and 1's to the expected mean of .3132.)
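The one-sample t statistic behind that kind of calculator can be sketched in Python as below. This is an illustration under stated assumptions, not the calculator's own code: each win is entered as a 1 and each loss as a 0, the comparison value is the .3132 baseline, and the function name is hypothetical. It stops at the t statistic; turning that into a one-tailed p value needs the t distribution, which the standard library does not provide. Bulger's 10-5 record is used because it is the one spelled out in this thread.

```python
from math import sqrt
from statistics import mean, variance

def one_sample_t(wins, losses, expected_rate):
    """t statistic for a 0/1 win-loss sample against an expected win rate."""
    outcomes = [1] * wins + [0] * losses
    n = len(outcomes)
    # statistics.variance is the sample variance (divides by n - 1).
    standard_error = sqrt(variance(outcomes) / n)
    return (mean(outcomes) - expected_rate) / standard_error

t_stat = one_sample_t(wins=10, losses=5, expected_rate=0.3132)
degrees_of_freedom = 10 + 5 - 1
print(f"t = {t_stat:.2f} with {degrees_of_freedom} degrees of freedom")
```

Note that a record with no wins or no losses has zero sample variance, so this sketch would divide by zero for it.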

26
by Pat (not verified) :: Mon, 06/26/2006 - 3:48pm

to the expected mean of .3132.)

Why'd you use .3132? The values listed in the article are that 603 out of 1474 late-game comebacks were successful. That's a mean of 0.409, not 0.3132.

27
by admin :: Mon, 06/26/2006 - 3:51pm

"Why would you open a hypothesis about where the greats such as Elway and Montana actually stood statistically speaking in leading comebacks only to then cut out the most important years of their respective careers for the analysis?"

We've mentioned this many times, but I'll do it again -- we only have pbp data going back to 1996. That's why so many studies on FO only go back to 1996.

28
by AD (not verified) :: Mon, 06/26/2006 - 4:30pm

I think any cold, hard Pats fan would prefer to have Dan write this article....

29
by GBS (not verified) :: Mon, 06/26/2006 - 4:36pm

The .3132 probably comes from the third paragraph (603/(603+1322)). I never could square those numbers with the figures in the first paragraph, but assumed it had something to do with games in which BOTH teams trailed during the 4th quarter and then I stopped thinking about it.
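The two candidate baselines can be laid side by side with a quick Python check. The counts are the ones quoted from the article (603 comeback wins, 1,474 games with an opportunity, 1,322 failed comebacks); the variable names are made up for illustration.

```python
comeback_wins = 603
games_with_opportunity = 1474  # first paragraph of the article
failed_comebacks = 1322        # third paragraph of the article

# Rate per game that featured at least one comeback opportunity.
rate_per_game = comeback_wins / games_with_opportunity

# Rate per team opportunity (wins plus failures).
rate_per_opportunity = comeback_wins / (comeback_wins + failed_comebacks)

print(f"{rate_per_game:.4f}")
print(f"{rate_per_opportunity:.4f}")
```

Wins plus failures add up to more than 1,474, which would fit the guess that some games count once for each team that trailed.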

30
by CaffeineMan (not verified) :: Mon, 06/26/2006 - 4:50pm

Since I have nothing to contribute to the statistical discussion, I'll just say: cool article! And cool posts, too, especially Dan, who gave a good explanation of what the p value means, understandable by a (well, this) layman. Very interesting.

31
by Pat (not verified) :: Mon, 06/26/2006 - 5:19pm

#29: Oh, I see. But I think you're right - I think the 1322 also included some of the 603 games due to multiple team overlap. So it should be 603/1474, I think.

32
by Dan (not verified) :: Mon, 06/26/2006 - 5:36pm

On .3132 vs. .409, what GBS said. Plus, there's this verification:

Kerry Collins, listed as one of the top 10 comeback QBs, has a winning percentage of only 36.2% (17/47), so the average QB must be worse than that. And the average QB must be better than 26.4%, which is Mark Brunell's winning percentage (14/53), since he's on the bottom 10 list. And looking at coaches Cowher & Holmgren, we can narrow that gap to 27.9%-33.3%. So it looks like the number that Jason is using is close to the 31% that came out of paragraph 3 of the article, and not anything as high as 40%.

For the record, I don't have a dog in the fight as far as QB rankings go - my team has had far too many QBs for any of them to crack any of these lists (besides Table 7), and none of their coaches made it either. Now that I think of it, though, it would be nice to see the data broken down by team rather than by QB or coach. It is an interesting article, and it's cool to see the players' and coaches' comeback records, I just think that something went wrong in the way that statistics were used to analyze the records.

33
by Insancipitory (not verified) :: Mon, 06/26/2006 - 6:03pm

As awesome as the article was, I can't help but mention how much I would have liked to see a digest of defensive contributions, and even special teams, in these situations.

34
by Vince (not verified) :: Mon, 06/26/2006 - 6:52pm

I agree that "wins above/below expected" in comment 10 is the simplest, most intuitive way to rank players/coaches.

If a QB is not listed, can we just assume he is average? I'm thinking of Michael Vick here.

35
by PackerNation (not verified) :: Mon, 06/26/2006 - 8:03pm

This was a good article. I think it would be nice if you'd do a followup on QBs who have the ball in their hands with six minutes or less to go and a chance to win or tie.

Having watched Favre in every game he's played, I can guarantee you, he's infinitely more likely to turn it over than he is to win the game since about 1998.

36
by Travis (not verified) :: Mon, 06/26/2006 - 8:42pm

Were comebacks in which the offense contributed 0 points counted as a positive for the quarterback? Examples would include this for Donovan McNabb and this for Brooks Bollinger. I apologize if this was touched on in the article.

37
by MdM (not verified) :: Mon, 06/26/2006 - 9:35pm

Great article and great comments. I feel like I'm learning something (more about statistics than football :)).

38
by the K (not verified) :: Mon, 06/26/2006 - 10:10pm

This is a very good article. I don't have much statistical analysis to contribute, so let me offer something more philosophical to think about: The reason QBs like Flutie and Harbaugh are known for their penchant for comebacks, despite evidence that they are not really that good for comebacks, is because of the *nature* of the comebacks they did have. For example, I remember the entire final set of downs in the 1998 Jags-Bills game. I remember Moulds' catch to the 1, the stuffed run by Thurman Thomas, the incomplete passes on 2nd and 3rd downs, and the bootleg on 4th and goal with 17 seconds left in the game by Flutie for the win. When the comebacks are memorable, like that, the man becomes the legend. I'm guessing Chargers fans in particular would have similar recollections of Jim Harbaugh.

The second thing I'd like to offer, which was touched upon in the article, is that the QB is the one most looked at for responsibility for a 4th quarter comeback because most close comebacks are on the arm of the QB, as running is taken off the table with time winding down, and the defense plays pass because they know the run isn't coming. So the QB that can beat the pass defense with the pass is indeed the most responsible for the comeback. He's more responsible than the coach in the sense that the playcalling, against mostly predictable medium to deep pass defense, isn't as important as the QB's ability to make the right reads and make the good throws, as well as his ability to run the 2 minute drill, getting his team to the line, knowing when to quick snap another play, knowing when to spike the ball, and knowing when to use a timeout. At least, in games closer to the beginning of the study, before communication with QBs through their helmets became more commonplace.

Since this study considers the fourth quarter in general, and given the reasons I just mentioned above, I wouldn't be surprised to find the statistical results to be very different if recalculated to only include the final five minutes, two minutes, or even simply the final drive of the game. For example, how many of Plummer's 19 wins were secured in the final two minutes, and how many were in the final five? Ten? It would be interesting to see how much the results change. (I'd guess, though it's only speculation, the coaches wouldn't deviate as much as the QBs.)

39
by the K (not verified) :: Mon, 06/26/2006 - 10:15pm

Sorry if that's a bit incoherent and rambling. I'm a bit distracted, and really excited by the prospect of having NFL Head Coach here within the next day or two, perhaps even tonight. 8) Not to get too OT but are you guys still planning on a review? If you're not able I'd be happy to submit a guest article/review after a few days with the game.

40
by admin :: Mon, 06/26/2006 - 11:52pm

Here's the deal with NFL Head Coach. EA told us they would send a review copy. EA hasn't sent one. When they send one, we'll do a review.

41
by Bobman (not verified) :: Tue, 06/27/2006 - 2:56am

I hate to get all holistic and such, especially when I liked this article so much, but Harbaugh was Captain Comeback for one season, when the Colts had a particularly stout D. So sure, if you're pretty good and given the ball with 6 minutes to go, and know your D will not F-up or may even get you the ball back if you slip-up, you'll get that rep. As a Colt fan, I have a pretty high pain tolerance (1-15 season, anyone?), and am cynical about them finding ways to lose, but back in 1995, I NEVER had a doubt if it was within 8 and Harbaugh had it near the end. Even in the AFCC game vs Pitt, there was a Hail Mary at the end that could've won it, and they had no business being there.

Similarly, Manning's high number of opportunities is likely the result of Indy's formerly soft D, and his large number of losses in these situations could well be related to a porous D that could not stop the run (and therefore allowed the opponents to grind away for 9 minutes in the 4th qtr). (I'd like to see a breakdown on Manning pre-2004 and since.) Note that Jim Mora Sr was on the loser coach list, but Dungy on the winner coach list. One wonders (well, maybe just me, a few folks in Indy, and Archie and Olivia M.) what Manning's numbers would look like had he worked with Dungy a few more seasons rather than Jim "We Sucked" Mora.

Likewise, Brady, McNabb, Delhomme and maybe a few others are boosted a bit by stout D's that probably held the opponents to 3-and-outs, or got the ball back near the end. I'd include Manning in that lucky latter group, but they haven't trailed often the past couple years, compared to his first few.

Is it August yet?

42
by Sophandros (not verified) :: Tue, 06/27/2006 - 11:18am

Players are noted for their comeback abilities for the same reason that guys are called "clutch". It's selective memory.

43
by AD (not verified) :: Tue, 06/27/2006 - 11:27am

Or maybe it's all those Super Bowl rings that the 'clutch' guys earned by not being an idiot when it counts.

44
by Jason McKinley (not verified) :: Tue, 06/27/2006 - 12:12pm

First off: Thank you to everyone that read the article. I was afraid that the section where I try to explain t-tests and so forth might put people off (or put them to sleep) before they got to the more fun stuff.

Dan, I like the way you think. Partly because it's the way I think as well. When I first was trying to figure out how to turn the data into a usable article I decided that it would be fun to try and rank everyone in some way. That might give a good starting point for discussion. The first way I tried was, as you suggested, with binomials. In fact, I had that in place for a couple weeks before I really noticed a problem with doing it that way. Using the binomial methodology led to guys with great rankings over a handful of games outranking guys with good rankings over a lot of games. For example, Sage Rosenfels at 2-1 grades out better than Trent Green at 15-28. Jamie Martin at 4-4 is better than Kent Graham at 8-9. How can this be, when Rosenfels and Martin would both need to be above average over their next 40 and 9 games (respectively) just to get to the levels of Green and Graham? It just didn't feel right; thus, enter the t-test. The t-tests were calculated in the traditional way, using the difference in actual versus expected wins, the sample variance and N. Excel performed the calculations, but I set up the tests within Excel, and I do trust that I set them up correctly.

The most important message from the t-test isn't rank. It's significance. Therefore, don't get too hung up on the fact that Roethlisberger isn't in the top ten even though he's 7-2. Do get hung up on the fact that Roethlisberger's teams have significantly more comeback wins than expected, even though he hasn't even been in fourth-quarter comeback situations ten times yet. That's pretty amazing.

For the people that wanted still more breakdowns of home/away or for team/defense faced, all I can say is that I would also like that, but I'm just one guy. I have a job and also take classes. When the work and studying are done, I like to spend time with my wife. When my wife goes to bed, I do research on this type of project. I entered almost 50 thousand data points by hand as it was. Accounting for other things would have added more data points and made the project too cumbersome for me to do "for fun."

Peter in #12 (and later Bobman had a variation on the theme) had a good criticism that I agree with to a point. He argues that giving a middle-of-the-pack quarterback 3 or 4 more wins could very well send them into the "significant" group. Absolutely. The only problem with this is that if we start to give credit to guys for games where they brought the team back and the defense blew it, the entire league rate for comebacks would also go up. The parameters change and suddenly it's harder to get significantly better than expected if the league average has gone up a lot. You can't look at a guy in isolation here - change it for no one or change it for everyone. After changing for everyone, we might find that adding three or four wins to one middle-of-the-pack quarterback's numbers doesn't carry him as far as it otherwise would. (By the way, for Bobman: I looked up the 1995 Colts and it looks like they had a 3-7 record when trailing by one score in the fourth quarter. However, I can't tell how much of that record would go on Harbaugh because during that era the Colts would often play two or three quarterbacks in a game. I don't have play-by-play logs from before 1996. Also, Peyton Manning since 2004 is 3-4 in comeback situations as I describe them.)

Regarding where .313 comes from: GBS got it right in #29.

Again, thanks for reading and for the great comments. You really thought about the information and brought up a lot of good points. I'm always impressed by the readership at this site.

45
by coltrane23 (not verified) :: Tue, 06/27/2006 - 12:16pm

Not much to add to the discussion, since I know little about statistics (but there are some nice explanations in the article and the comments). Still, I did want to chime in with a "kudos" to the author. Some of the statistical permutations may go over my head, but it's obvious that a lot of work went into this article. Nicely done.

One question, and it may have been addressed but I missed it: how were games counted where a QB successfully leads a comeback drive in the closing minutes of a game, only for his team's defense to then relinquish the lead? Does that game count as a loss against the QB? Seems to me like it shouldn't--the QB did his job well, he can't be expected to play defense too.

But nice work . . . much better stuff than I find anywhere else on the Internet.

46
by coltrane23 (not verified) :: Tue, 06/27/2006 - 12:20pm

And I see you addressed my question as I was typing it . . . weird. :-)

47
by chris (not verified) :: Tue, 06/27/2006 - 12:44pm

QBs don't win championships, that should be obvious with Ben's 22.6 QB rating in SB40.

Defense wins championships...Defenses also make it possible for those 4th quarter comebacks. None of those QBs are going to have a chance to win if their D can't stop the other team.

Better defense..better chance of a comeback.

Ben Roth "coming back" against NE would be a lot less impressive than David Carr "coming back" against NE.

Ben has the D...David don't.

48
by Englishbob (not verified) :: Tue, 06/27/2006 - 2:03pm

It's a really good article, but if you wanted to rank the players, maybe you could have gone binomial and used t-tests to set a threshold for what was statistically significant and thus included. A second table could then have listed the rated players who were not significant, which would avoid Rosenfels ranking over Green. With that said, it's still better work than I'd be capable of!

49
by jetsgrumbler (not verified) :: Tue, 06/27/2006 - 2:48pm

very cool premise, but i can't really believe that the analysis yields any meaningful results. any chart that has vinny t ranked ahead of brady and bulger is clearly flawed. and i say this as a devout JETS fan.

how many of you out there want jake or peyton playing qb for your team when they are behind in an important game?

50
by Pyper (not verified) :: Tue, 06/27/2006 - 3:05pm

Great article. Great research.

But using t-tests doesn't seem to be the answer here. It's overweighting opportunities and undervaluing production.

Jason, I think your criticism of the binomial system is better dealt with by applying a requirement for minimum opportunities than by switching to a t-test.

A binomial system with a requirement of at least "X" number of opportunities would be the best way to rank the players and coaches.

It seems clear to me that the statistical results point towards Brady and Belichick being kings of their respective fields.

Interestingly, I get the same impression when I watch both of these guys perform in real life.

);p

Great Job!!!

51
by B (not verified) :: Tue, 06/27/2006 - 3:07pm

I would want Peyton as QB when my team is behind late, as long as Vanderjagt isn't the kicker. And I say this as a Pats fan.

52
by Peter (not verified) :: Tue, 06/27/2006 - 3:22pm

I agree with what has been said above, I would prefer (obviously it's not me doing the work though, heh) if it were done favoring the Sage Rosenfels of the world, then simply cut out everyone with fewer than, say, 8 attempts. Scrubs are mostly cut out, sample size isn't as big an issue, and Brady/Bulger/Roeth would be the tops.

As for the defense adjustment: you're right, Jason, basically the whole thing would have to change if defenses were taken into account, and there's no telling where any given quarterback would end up on the new scale. It depends on how you newly define a 'clutch' drive. Since it's the whole fourth quarter, there could be three or four scoring drives (or ST plays) involved there. If Manning takes back the lead early in the 4th, then Bulger takes it back again, then Manning back again, but Bulger ends up winning after a punt return for TD, who gets the "win" here? Clearly they both had ridiculous clutch performances (though against bad D's) even though neither had a "game-winning" drive. In conclusion, I have no point and don't know what I'm talking about. Also, this is all discussing work that I will never have to put time into.

By the way 47: y'know what wins championships? Being a good team. Sometimes that means offense, sometimes that means defense. Remember the Rams? They weren't awful on D, but I certainly wouldn't say it carried them to the title. Ben did indeed suck; it's amazing the Steelers were able to cover for the most important position in football failing miserably.

53
by Peter (not verified) :: Tue, 06/27/2006 - 3:22pm

I agree with what has been said above, I would prefer (obviously it's not me doing the work though, heh) if it were done favoring the Sage Rosenfels of the world, then simply cut out everyone with fewer than, say, 8 attempts. Scrubs are mostly cut out, sample size isn't as big an issue, and Brady/Bulger/Roeth would be the tops.

As for the defense adjustment: you're right, Jason, basically the whole thing would have to change if defenses were taken into account, and there's no telling where any given quarterback would end up on the new scale. It depends on how you newly define a 'clutch' drive. Since it's the whole fourth quarter, there could be three or four scoring drives (or ST plays) involved there. If Manning takes back the lead early in the 4th, then Bulger takes it back again, then Manning back again, but Bulger ends up winning after a punt return for TD, who gets the "win" here? Clearly they both had ridiculous clutch performances (though against bad D's) even though neither had a "game-winning" drive. In conclusion, I have no point and don't know what I'm talking about. Also, this is all discussing work that I will never have to put time into.

By the way 47: y'know what wins championships? Being a good team. Sometimes that means offense, sometimes that means defense. Remember the Rams? They weren't awful on D, but I certainly wouldn't say it carried them to the title. Ben did indeed suck; it's amazing the Steelers were able to cover for the most important position in football failing miserably.

54
by Peter (not verified) :: Tue, 06/27/2006 - 3:23pm

A double post! My shame is great. =(

55
by Tom Kelso (not verified) :: Tue, 06/27/2006 - 3:47pm

51:

Then you would have lost in a snowstorm in Denver; but then, you are a Pats fan, so you might be used to that.

56
by Bruce Anderson (not verified) :: Tue, 06/27/2006 - 3:51pm

Interesting article!

Unfortunately, I don't think these statistics mean much of anything. As others have said, it doesn't seem reasonable to compare players with few opportunities to those with many, and it doesn't take into account other confounders like players at the end of their careers or the impact of the defense (which brings up another suggestion: why not do an analysis of use of the "prevent defense"? Does use of a prevent defense really just end up preventing wins?).

But it clearly shows one thing: it demonstrates the need for football season to start. :D

Bruce

57
by B (not verified) :: Tue, 06/27/2006 - 5:36pm

53: Actually, the Rams had a great defense when they won the Super Bowl, ranked 4th by DVOA.

58
by AD (not verified) :: Tue, 06/27/2006 - 5:37pm

Bobman: Perhaps we should just re-write the whole article to see how best to highlight Manning by comparing whatever stats suits you best?

But maybe you should lay out all the stats about Brady's comebacks before you just assume he had so much more help than Manning. I think the quality of the offense skill players would be worth comparing, such as running game (Edge vs. Antowain Smith, etc.), O-Line, and WR skill and depth. Of course this would be from 2001-2005. I imagine that a good offense helps in a pinch?

And then when it comes to defense, let's look at the whole picture. Like the fact that the Pats were average on D in most every category in 2001, dead last and much worse than the Colts in 2002, and barely cracked the top ten in 2003-2004 (in most categories), and again dead last in 2005. The fact that the Pats had some success limiting points, as in 2001,03 and 04, was due to a bend but don't break defense, which is of very little value in a pressure situation since the Pats can't seem to stop anybody until the red zone in many situations. This type of defense does not preserve time on the clock, in fact, it wastes time on the clock.

Perhaps until you examine all the facts you should not just make baseless claims about how much more help Brady gets from his team. The Pats defense has never been able to limit yards and time of possession like the Steelers, for instance. And on offense, they only had one year with a decent line-up, 2004. But nothing nearly as special as what Manning got to work with from 2001 all the way thru 2005.

Maybe quality of opponents should also be looked at. The Pats have played a lot tougher schedule than just about anybody in the league since 2001, with all the challenges that a bullseye on your back brings with it. Part of the toughness is in the way games are scheduled, as well as the quality of opponent. Another part is the sheer amount of games that Brady has played since 2001.

Maybe you can do an extensive article that weighs every possible variable and then let us know why Brady's straight percentage is currently MUCH BETTER than Manning's, but you feel that it should be the other way around.

Better yet, maybe somebody objective should do it!

59
by cjfarls (not verified) :: Tue, 06/27/2006 - 7:00pm

Re: 58

Actually AD, if you click your way over to the FO stats, you'll see that until the big flip-flop of 2005 (Indy D ranked 6th, NE 26th), NE's defense was always 10+ spots better than Indy's 2001-04:
2001 NE 19th - Indy 30th
2002 NE 10th - Indy 22nd
2003 NE 3rd - Indy 20th
2004 NE 6th - Indy 18th

Basically, your defense isn't as bad as you portray it... in fact, in 03-04 it was really good. DVOA also accounts for your "bend not break" argument, because it looks at the Def's ability to stop "success", which includes large gains & first downs, not just points.

I think we all agree that a good QB can't do a comeback on their own, and Manning and Brady are both great QBs... let's remember that the rankings here, by all manners proposed so far, are statistically suspect in one way or another.

As a Denver fan, Plummer scares me with his idiot gunslinger tendencies, but I also know that they can be a benefit in comebacks, etc. I just wish he could channel a little more Elway-ness for the rest of the time... but I'd still take him in a close 4th Qtr comeback. (Where I REALLY don't trust Plummer is down lots early, a la AFCC game... gunslinging can work for 1 or 2 drives... but then the odds of "bad things" catch up to you).

I'd also take Manning or Brady or McNabb or Delhomme or pre-wreck Ben... and I guess off this analysis, Bulger looks pretty good too...

As a 'Skins fan (my current home), Brunell's pitiful performance in this analysis is depressing... especially since I mainly remember him as a Jag kicking Elway & friends' butt in the playoffs... (which, coincidentally, had 3 or 4 4th quarter lead switches with Brunell & Elway both running hogwild over the opposing Def's)

60
by Ben B. (not verified) :: Tue, 06/27/2006 - 7:39pm

Re 49:
Basically the point of an article like this is to debunk conventional wisdom like yours which is based on limited and subjective personal experience. (Granted, sometimes it also confirms conventional wisdom, as this one does in part.) Sure, there are problems with the study, as outlined by others here, but not because they contradict what you personally have experienced watching football.

61
by Trogdor (not verified) :: Tue, 06/27/2006 - 9:50pm

Hey! Hey! Hey! Get that crap outta here! Take it over to the official irrational Brady-Manning arguments thread where it belongs!

http://footballoutsiders.com/ramblings.php?p=232&cat=1

Oh, and great article by the way. I'd talk more about it, but I'm not gonna. Bye.

62
by Larry (not verified) :: Tue, 06/27/2006 - 10:31pm

I admire your effort, but I believe the results are skewed without taking a player's entire career into account. Elway, as an example, was shafted in this article because he was already ahead in the game more often than not with his 1996-98 teams . . . they were his best teams. But nice job, insofar as you conducted it.

63
by AD (not verified) :: Tue, 06/27/2006 - 10:32pm

CJFARLS: In order for Brady to drastically regress to a Manning level of comebacks, Brady will need to go from 13-8 to 19-29. This means Brady would have to lose 21 and only come back 6 times. His comeback percent would have to drop from 62% now to 22% from this moment forward to match Manning.
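For anyone who wants to check that arithmetic, here is a minimal Python sketch. It only uses the two records quoted above (13-8 and 19-29, read as comebacks and failures); the variable and function names are made up for illustration.

```python
# Records are (comebacks, failures), as quoted in the comment above.
brady_now = (13, 8)
manning = (19, 29)

def rate(wins, losses):
    """Comeback percentage for a wins-losses record."""
    return wins / (wins + losses)

# Results Brady would have to add to land exactly on Manning's record.
extra_wins = manning[0] - brady_now[0]
extra_losses = manning[1] - brady_now[1]

print(f"Brady so far:  {rate(*brady_now):.1%}")
print(f"Manning:       {rate(*manning):.1%}")
print(f"Brady needs:   {extra_wins}-{extra_losses}, "
      f"or {rate(extra_wins, extra_losses):.1%} over the next "
      f"{extra_wins + extra_losses} chances")
```

The 62% and 22% figures both check out: 13 of 21 is about 61.9%, and 6 of 27 is about 22.2%.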

I had to assume that people would immediately refer to DVOA as the one and only stat to portray the Pats defense. How about ALL the other categories which also matter, how about yards allowed, first downs, etc. The fact is that a bend but don't break defense that gives up lots of yards and takes a lot of time off the clock doesn't help nearly as much as Bobman implies.

Secondly, you have not addressed the OFFENSE side of the ball. Let's talk Manning's skill players in 2001 vs. Brady's. Or 2002, or 2003, even 04-05. Let's weigh both sides of this instead of making wild assumptions about Manning or Brady. I have watched Brady mount several comebacks, only to have some rookie like Givens or Graham (2002), or some total scrub, like Patten, or some fumbler, like Faulk, fail to make an easy catch or fumble the ball. Brady would have MUCH better stats in general, and more comebacks, if he had ever had an all-star cast on offense, like Manning has every season. Offense skill players make as big a difference as defense in a comeback. And more when you are trying to rate the QUARTERBACK.

I think we can be a little less simplistic than ignoring offense AND using only one stat on defense.

This article is already written from an angle which downplays Brady's much better comeback stats than Manning's because his sample size is 'only' 21 games. But Bobman wants to really break things down, so I say let's play fair and compare apples to apples. You have to compare EVERYTHING on both teams before saying that Brady has some special advantage over Manning that explains away his ability to come back.

64
by Luke (not verified) :: Tue, 06/27/2006 - 10:48pm

This bears out my long held opinion that Mike Holmgren, though a great coach and play caller in many situations, is one of the worst when it comes to the end-game.
And Hasselbeck's numbers are a bit unfair, as he has had more than his fair share of game winning drives stalled by a dropped pass, or lost after regaining the lead by a porous D. Both these things were fixed last year. Which brings up another point. It's hard to hold onto a lead if your defense can't stop the pass, or you have a lame running game.

65
by Jim A (not verified) :: Tue, 06/27/2006 - 11:41pm

Great research, Jason. There's no real good way to rank the QBs in this study, and I agree that the t-test isn't really appropriate, but if one must rank them, it's probably the best alternative for the reasons stated.

I also wanted to point out that this study doesn't necessarily tell you whether a QB has any clutch ability above and beyond his overall ability. The data used in this research are just another "split" of stats which may or may not represent anything useful. It may turn out that something like career QB rating is a better predictor of future comeback performances than past comeback performances. But that's another study for another day.

66
by centrifuge (not verified) :: Tue, 06/27/2006 - 11:53pm

Wasn't there an article on this site sometime last season that showed a "bend-but-don't-break" defense is just a bad defense that is often bailed out by the offense pinning the other team deep?

And if you don't want DVOA to be the main stat used in discussions, you're on the wrong website.

67
by AD (not verified) :: Wed, 06/28/2006 - 10:15am

Centrifuge: The Pats defense is a somewhat above average defense that happens to be able to limit red zone scoring during certain years and create turnovers during certain years. The Colts defense has also had good years in terms of turnovers. When you look at the yards and time off the clock this defense gives up, any rational person would not claim that this defense explains away Brady's vastly superior comeback percentage. The Steelers run the kind of defense you want in a comeback. They STOP people now!

So what you are saying about DVOA is that this website REFUSES to use any stats which exist in the world today EXCEPT DVOA? Even what is relevant to getting a full picture?

WHEN are we going to talk about offensive skill players making a difference?

By the way, I have read what other websites have posted about this article. Jason's clever comments about 'cold, hard Pats fans' end up being inaccurate. Pats fans think that the stats in this article re-arrange the reality. Why would Brady's vastly superior comeback percentage be turned upside down? It would take years of choking for him to match Manning's stats. He will have to go 22% for like 4 seasons to catch up with Manning's stats.

68
by AD (not verified) :: Wed, 06/28/2006 - 10:19am

Of course, the real losers in this article are people like Elway and Montana, who get the shaft by not even having their stats represented correctly.

69
by Pat (not verified) :: Wed, 06/28/2006 - 10:45am

So what you are saying about DVOA is that this website REFUSES to use any stats which exist in the world today EXCEPT DVOA? Even what is relevant to getting a full picture?

If you can find another stat which is better than DVOA, feel free. I doubt you will. The ones you mentioned (yards allowed, number of first downs allowed, etc.) have all been shown or could all easily be shown to be more biased and less useful.

What you're actually saying is that you'd like to use stats which more fully support your picture.

70
by AD (not verified) :: Wed, 06/28/2006 - 10:54am

What I am saying is that we would need to analyze the entire offense at every skill position from 2001-2005 and every category of defense relative to Manning and Brady's teams before drawing any of Bobman's conclusions.

What I am also saying is that this article blows off Elway and the old timers completely, and then turns around the stats of the current QB's in a way which I happen to find highly questionable. I am not alone in my opinion, because some people like cold, straight stats.

Maybe Brady will go from 62 percent comebacks, to 22 percent for the next several seasons, and my point will be proven wrong, but I will believe it when it happens.

71
by centrifuge (not verified) :: Wed, 06/28/2006 - 11:59am

70. How's about the sum of each team's skill players' DPAR, or the averages of their personal DVOAs? Or are those also unacceptable?

72
by Pat (not verified) :: Wed, 06/28/2006 - 12:07pm

What I am also saying is that this article blows off Elway and the old timers completely,

Did you miss this comment?

We’ve mentioned this many times, but I’ll do it again — we only have pbp data going back to 1996. That’s why so many studies on FO only go back to 1996.

If you'd like to get copies of all NFL games from, say, 1980 to 1996 and extract the play-by-play from them, I'm sure Jason'd be more than happy to redo the article.

73
by Rocco (not verified) :: Wed, 06/28/2006 - 12:27pm

"I am not alone in my opinion, because some people like cold, straight stats."

The "cold, straight stats" had Arizona in the Top 10 in both offense and defense, the only team in the league to accomplish that. I didn't see a lot of the Cards, but I saw enough to know that they aren't that good. Sometimes, conventional stats are misleading.

I don't think anyone here considers DVOA to be the one true stat. Even the creators keep revising the formula at times. It's more of a "totality of the circumstances" test (apologies for the legal terms- bar studying has corrupted my mind)- you look at all the stats and indicators, and make an opinion. Sometimes, the conventional numbers and higher math line up together, and sometimes they conflict. If they conflict, the question then becomes why they conflict, and which stats are closer to the truth.

74
by coltrane23 (not verified) :: Wed, 06/28/2006 - 12:30pm

Ditto to what Luke said (#64) re: Holmgren. He does seem to come apart at the seams in crunch time, although it didn't help that receivers were dropping passes and the defense imitated a cheesecloth for a few years. It's hard to complain about a guy who has gotten his team to the Super Bowl three times, but man, is he frustrating to watch sometimes as a fan when the game is tight.

He tried to give away the game in STL last year with a horrible play-calling sequence just prior to the Rams' fumbling away the punt, and don't get me started on the atrocious two-minute drills in the SB. Some of the trouble is attributable to execution, of course, but not all of it. He's a good coach, that much is obvious, but he frequently makes close games a little more interesting than they need to be at the end.

And as for Bulger, yeah, he's got ice in his veins. Doesn't surprise me that he'd be in the top 10 at this particular metric.

Too bad there's not PBP data available prior to '96, because this would be a really interesting project to take back even further in time. (As long as I didn't have to write it. :-) )

75
by Rodrigo (not verified) :: Wed, 06/28/2006 - 12:44pm

The alternative headline for this article was "Look How Bad Every Redskin QB has Been" --- A not so shocked fan

76
by Dan (not verified) :: Wed, 06/28/2006 - 1:25pm

Sage Rosenfels at 2-1 grades out better than Trent Green at 15-28. Jamie Martin at 4-4 is better than Kent Graham at 8-9. How can this be, when Rosenfels and Martin would both need to be above average over their next 40 and 9 games (respectively) just to get to the levels of Green and Graham?

I like your thinking here, Jason (#44). In fact, you could build the whole ratings system out of it. If one player has more attempts than another, and he was above average in those extra attempts, then he should be rated above the other guy. If he was below average in those extra attempts, then he should be rated below the guy. That would give you ... the rankings from my first comment (#10).

Your t-test ranking doesn't meet those criteria (as GBS pointed out in comments fifteen & eighteen), which is why I started to suspect that it wasn't actually based on the p values of a t-test. And when I tried a t-test on a couple QBs, I found that their p-values were not in the order of your ranking (#25). Are you sure you didn't make a mistake with a square root or something in your formula? And could you share the p values that you found, at least for Brady & Plummer?

If anything, ranking quarterbacks by wins over average puts QBs with lots of chances too high (on the best QBs list, and too close to the bottom on the worst QBs list). First of all, it's a cumulative stat like DPAR, not an average stat like DVOA, so it's going to give above average players with more attempts higher rankings than similar players with fewer attempts (even if the player with more attempts isn't quite as good on a per-attempt basis). And second, even average players with lots of attempts can look very good (or very bad) compared to players with fewer attempts, just by chance. It's not unusual for a fair coin to turn up heads 1015 times in 2000 flips just by chance, and that would rank above a coin that had 29 heads in 30 flips in total heads above average, even though the latter result is much less likely to happen by chance, in addition to being a much higher percentage heads. I don't think that it's a problem for the binomial to rank Trent Green at 15-28 close to average, because that performance is close to average - just a win and a half more than expected after 43 attempts, which could easily happen just by chance, and just 32.5% success rate in the games beyond what Sage Rosenfels played. (The statistical argument here is that, when you're looking at the total number of successes for some statistic, or the total number minus the expected number, the error is going to be roughly proportional to the square root of the sample size, which means that it will increase as sample size increases. Larger sample size does mean that you have a more accurate estimate when you're looking at averages like the success rate, since then the error is roughly proportional to one over the square root of the sample size.)
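The coin example can be worked through exactly. The sketch below is an illustration, not anything from the article: it computes exact binomial tail probabilities with Python's standard library, and for Trent Green it assumes a league-wide success rate of 603/1925 (the article's 603 comeback wins against 1,322 failures), which is consistent with the "win and a half more than expected" figure above.

```python
from fractions import Fraction
from math import comb, sqrt

def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p); Fractions avoid float underflow."""
    p = Fraction(p)
    total = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
    return float(total)

fair = Fraction(1, 2)

# The two coins from the comment: (heads, flips)
for heads, flips in ((1015, 2000), (29, 30)):
    above_avg = heads - flips / 2      # cumulative "heads above average"
    sd_count = sqrt(flips) / 2         # spread of the total grows like sqrt(n)
    sd_rate = 1 / (2 * sqrt(flips))    # spread of the average shrinks like 1/sqrt(n)
    print(f"{heads}/{flips}: {above_avg:+.0f} heads, "
          f"sd of count {sd_count:.1f}, sd of rate {sd_rate:.3f}, "
          f"chance of doing this well by luck {upper_tail(heads, flips, fair):.2g}")

# Trent Green at 15-28, against the assumed league-wide rate.
league_rate = Fraction(603, 1925)
green_wins, green_games = 15, 43
green_expected = float(green_games * league_rate)
print(f"Green: {green_wins - green_expected:+.2f} wins over expected, "
      f"tail probability {upper_tail(green_wins, green_games, league_rate):.2g}")
```

The first coin wins on total heads above average (15 against 14), while the second is far less likely to be luck, which is the distinction between a cumulative stat and a rate stat.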

77
by Jason McKinley (not verified) :: Wed, 06/28/2006 - 3:16pm

My presentation of at least one important thing in this article was very poor. That is, how many smiley-faced emoticons did I need to place around my "word of caution" section to make people really believe that the order in the rankings wasn't as important as being significant in the first place? Going back and reading it, I should have been a lot more emphatic on that point. I failed to drive that point home nearly as hard as I should have. The rankings were practically arbitrary as presented, but gave a good jumping off point for debate and discussion. In retrospect, I probably should have just not ranked anyone and talked instead about "significantly good," "significantly bad" and "not significant." But then where would the Brady-Manning debate go if they're both presented as "significantly good" equals? We all know how very important it is to try and add more to that debate.

Here's more about t-tests: They are a great way to check for significant differences in a group with a small sample size. That's why they are a good fit here. Back in my pharmaceutical research and development days we often made use of transgenic mice. These mice would be genetically altered so they would die of some "human" disease if left unchecked. Let's say an untreated transgenic colony would have a mortality rate of 68.7% after 4 months. We want to try various treatments on them, but homozygous transgenic mice that die at that age can't breed, so we can't have large groups. We may only have 8-20 to work with for any treatment. After treatment, we look at the mortality rate at four months and compare it to the control group. We can do this with a t-test. Now, the fewer the mice, the smaller the p-value we want to see; if we didn't see an effect at the p

78
by Jason McKinley (not verified) :: Wed, 06/28/2006 - 3:21pm

(whoops, used a "less than" sign) Now, the fewer the mice, the smaller the p-value we want to see; if we didn't see an effect at the p "less than" .01 level, we probably wouldn't go ahead from there.

But let's look at two drugs that are significant. We'll call one Roethlistatin and another one Kentgrahamalol. Kentgrahamalol has a stronger p-value after the t-test than Roethlistatin, but both are significant at p "less than" .01. We investigate further. The Kentgrahamalol mice had nine out of 17 deaths, while the Roethlistatin mice had only two out of nine deaths at the 4 month time point. Both are worthy of consideration, but which drug is actually better? (Consider that we may be forced for whatever reasons to only proceed with one drug at this development stage.) Well, the t-test already did its job: It told us that the mortality rate in both groups of treated mice is significantly better than in the control population. There really isn't any need to look at the t-test further. (I did look at the t-tests further and used them for ranking. I explained in the article that it was wrong, but felt so right. I realized this would be controversial when I did it. I just didn't like my other options quite as much.) Now, if we don't just want to guess, we could break out the binomial method to see which of the significant drugs is better. Based on this, Roethlistatin is a solid winner, despite not being significant at quite the same level as Kentgrahamalol.
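As a rough illustration of that last step, here is one way the binomial comparison could be sketched in Python. This assumes "the binomial method" means an exact one-sided binomial tail against the 68.7% control mortality; the exact procedure used for the article is not spelled out, and the function and variable names here are hypothetical.

```python
from math import comb

def lower_tail(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

CONTROL_MORTALITY = 0.687   # untreated four-month mortality from the example

# (deaths, mice) for the two made-up drugs in the example
kentgrahamalol = (9, 17)
roethlistatin = (2, 9)

# Chance of seeing this few deaths (or fewer) if the drug did nothing at all.
p_kent = lower_tail(*kentgrahamalol, CONTROL_MORTALITY)
p_roeth = lower_tail(*roethlistatin, CONTROL_MORTALITY)

print(f"Kentgrahamalol: {kentgrahamalol[0]}/{kentgrahamalol[1]} deaths, tail = {p_kent:.4f}")
print(f"Roethlistatin:  {roethlistatin[0]}/{roethlistatin[1]} deaths, tail = {p_roeth:.4f}")
```

Under this reading Roethlistatin comes out with the much smaller tail probability, which agrees with the conclusion above that it is the solid winner.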

Personally I felt that the article spent far too much time explaining statistics as it was, or perhaps I would have changed the rankings in the way I've intimated above - by running binomials on only the players that have been shown to be statistically good or bad. But then I would have needed to explain binomials, on top of explaining t-tests, ANOVAs and mentioning Shapiro-Wilk. It was just too much for one football article, in my opinion. Maybe I was wrong.

Anyway, if we run binomials on only the significant groups of quarterbacks, the new order for the "good" group would actually be, from number 1 to number 21: Brady, Bulger, Roethlisberger, Young, Elway, McNabb, Delhomme & Fiedler, Plummer, Graham, Manning & Testaverde, Couch, Kitna, Brooks, Stewart, Collins, Green, Culpepper, Grbac and Gannon.

The new order for the "bad" group would be (from number 145 to number 162): Flutie, O'Donnell, Carr & Harbaugh & Hasselbeck, Frerotte, Brunell, Banks, Holcomb, Warner, Rattay, Griese, Beuerlein, George, Kanell, Reich, Tolliver and Ramsey.

While I much prefer these as actual rankings compared to using the t-test significance levels as I did in the article, it doesn't mean that doing it this way is perfect. Again, the main point I wanted to get across in the article as a whole was that some quarterbacks are better than others at bringing their teams from behind, some coaches are better than others at holding small, late leads, and based on the data we have to this point, these individual differences may actually be important or at least somewhat consistent. Then I wanted to mention who looks good and bad according to the t-tests. That's pretty much it. The rankings, as I've said again and again now, were more for fun and to move the discussion.

79
by AD (not verified) :: Wed, 06/28/2006 - 3:42pm

Consider the discussion moved!

80
by Pat (not verified) :: Wed, 06/28/2006 - 3:46pm

I can't remember offhand the proper way to compute quantization errors on the p-values when comparing to a binomial test, but given the small sample sizes, you'd have to take that into account. That is, Brady and Bulger are close (0.37% compared to 0.51%) but in actuality, even if they were exactly the same player, they couldn't've ended up with the same values given the numbers of games they played - had Bulger won one more game, his would've been 0.1%, and had Brady lost one more game, his would've been 1.2%. Similar with Roethlisberger as well.

There's just no way to call those players anything but equivalent. But that's the big problem with ranking things by p-value, of course: for small numbers, things can change by a lot with just one more piece of data.
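The size of those jumps is easy to see in a short Python sketch. It assumes the baseline is the league-wide rate implied by the article's totals (603 comeback wins against 1,322 failures) and that the p-value is an exact one-sided binomial tail; neither is confirmed by the article. Under those assumptions Brady's 13 of 21 lands close to the 0.37% quoted above.

```python
from math import comb

def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

LEAGUE_RATE = 603 / 1925   # assumed baseline, from the article's totals
GAMES = 21                 # Brady's comeback opportunities (13-8)

tail_14 = upper_tail(14, GAMES, LEAGUE_RATE)   # one more win
tail_13 = upper_tail(13, GAMES, LEAGUE_RATE)   # actual record
tail_12 = upper_tail(12, GAMES, LEAGUE_RATE)   # one more loss

for wins, tail in ((14, tail_14), (13, tail_13), (12, tail_12)):
    print(f"{wins}-{GAMES - wins}: p = {tail:.2%}")

print(f"one game changes the p-value by a factor of about {tail_12 / tail_13:.1f}")
```

With only 21 games there are just 22 possible p-values, and neighboring ones differ by a factor of three or more, so a ranking built on them can reshuffle after a single result.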

81
by stan (not verified) :: Wed, 06/28/2006 - 4:20pm

Fun article, but not worth a whole lot in terms of serious evaluation.

Any serious stat would have to adjust for the quality of the teams involved. If QB X plays great and manages to keep his worthless team destined for a 4-12 record within 8 points against the champs until the end of the 3rd Q, saying he "blew" a comeback chance is really silly.

As noted by others, the quality of the defenses for both teams involved makes a big difference. This is really a smaller subset of adjusting for the overall quality of talent on both teams. Let's face it, a team with superior talent should mount a higher percentage of 4th Q comebacks than a team with inferior talent. (Of course, this is true of every stat which tries to measure QBs. The quality of the O-line is the single greatest factor affecting a QB's performance.)

Timing is an issue. I don't think that a drive early in the 4th Q to overcome a 3 pt deficit is a big deal. As long as a dominant offensive line has time to win the game by driving it down the throats of the defense, QB "heroics" really aren't involved. It would be better to look at situations where a team was down to its last chance and had to drive most of the field to win.

Given the small subset of games, it would also help to throw out instances where the winning score was made or set up by the defense or special teams. Why give a QB credit for being clutch if he had nothing to do with the comeback?

I'm not criticizing someone who does a study like this for not putting in weeks and weeks of extra work. You don't have time for it and no one expects it. My point is to say that we ought not put a whole lot of stock in the results given the limitations.

A fun thing to look at. A big attaboy for trying.

82
by AD (not verified) :: Wed, 06/28/2006 - 4:28pm

RE#78: Thanks, Jason, for appeasing even the most cold, hard football guy on this thread.

83
by Peter (not verified) :: Wed, 06/28/2006 - 5:22pm

A look at only the last drive would be a nice start that doesn't seem like it should take a horribly long time... at least, less than adjusting for defenses and all that. I'm guessing that would make the sample sizes truly tiny for most of these though.

84
by stan (not verified) :: Wed, 06/28/2006 - 7:51pm

Peter,

Good point on sample size. If we spent a great deal of time examining game tape to isolate only situations which were truly "clutch", found a way to adjust for talent differential, etc., we'd likely come up with such small sample sizes that we'd be unable to craft a meaningful stat.

Of course, baseball number crunchers swear that "clutch hitting" is a myth. And there, you have a huge data set and a sport where hitters are not dependent on teammates to succeed (although the luck factor is huge).

85
by Pat (not verified) :: Wed, 06/28/2006 - 8:45pm

Of course, baseball number crunchers swear that “clutch hitting” is a myth.

Actually, be careful - baseball statistics show that clutch hitters are a myth. Clutch hits are of course real - the hit that wins a game is a clutch hit. What they're saying is that there aren't players who consistently hit better than other players in only clutch situations.

There are, incidentally, NFL teams which consistently play better in tight situations, but clock management is a real factor in NFL games that only really comes into play in tight situations, so there are other factors than just "magic clutch" to consider.

86
by B (not verified) :: Thu, 06/29/2006 - 11:40am

The latest edition of Baseball Prospectus, which I guess would be the definitive word from the "baseball number crunchers", has an essay about clutch hitters. The conclusion was there are some players who do better in clutch situations.

87
by B (not verified) :: Thu, 06/29/2006 - 11:40am

88
by Jim A (not verified) :: Thu, 06/29/2006 - 2:01pm

I doubt that even the Baseball Prospectus authors would be so presumptuous as to describe their work as being the "definitive" word from baseball number crunchers, especially today during the annual SABR convention weekend.

A better consensus of sabermetricians would probably be that clutch hitters may exist, but that it's highly questionable whether any such evidence will ever be sufficient to serve any practical purpose in decision-making. See link for more discussion.

89
by Pat (not verified) :: Thu, 06/29/2006 - 3:38pm

The conclusion was there are some players who do better in clutch situations.

That's actually "there have been players who have done better in clutch situations", not "there are players who consistently will do better in clutch situations."

I'm actually a little surprised by that article - it would've been relatively simple to run a quick, simple simulation and see if the players listed there fell well outside a random distribution. My guess would be not, but that's just my guess.
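A simulation along those lines might look like the Python sketch below. Every number in it is a hypothetical placeholder (100 identical hitters, 150 clutch chances each, a true success rate of .270), not data from the Baseball Prospectus essay; the point is only to show how good the best player looks when nobody has any clutch ability at all.

```python
import random

def best_by_luck(n_players, chances, true_rate, trials, seed=0):
    """Simulate identical players; return the best observed rate in each trial."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    best_rates = []
    for _ in range(trials):
        best_hits = 0
        for _ in range(n_players):
            hits = sum(rng.random() < true_rate for _ in range(chances))
            best_hits = max(best_hits, hits)
        best_rates.append(best_hits / chances)
    return best_rates

# Hypothetical placeholders, chosen only for illustration.
TRUE_RATE = 0.270
rates = best_by_luck(n_players=100, chances=150, true_rate=TRUE_RATE, trials=100)

median_best = sorted(rates)[len(rates) // 2]
print(f"true rate for everyone:  {TRUE_RATE:.3f}")
print(f"median 'best' by luck:   {median_best:.3f}")
print(f"largest 'best' by luck:  {max(rates):.3f}")
```

A real player's clutch numbers would only count as evidence if they fell well beyond what this kind of pure-luck distribution produces for the best of the group.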

90
by zlionsfan (not verified) :: Mon, 07/03/2006 - 2:01pm

I believe that you could summarize the sabermetric community's position on the existence of a "clutch" ability like so:

Phase I - initial position (up until a year or two ago?): There is no such thing as "clutch" ability.

Phase II - current position: There may be a "clutch" ability, but it is unlikely that we will be able to isolate it.

Now, if you can't isolate clutch play in a sport that consists largely of isolated events that have been carefully logged for decades and decades, I do not believe it would be reasonable to suggest it could be identified in football.

I like the article - it displays some interesting data (maybe Jason Hanson gets a little credit for the Harrington "comebacks"?) and gives us something to discuss. The resurrection of the M-B silliness certainly wasn't the author's fault.

It sounds like what might be helpful is a Faithful Reader with time and statistical knowledge to help Jason out with further research while he makes up for the time he spent on this project ... (no, I'm not volunteering myself, but I suspect there are those above who would be qualified)

91
by morganja (not verified) :: Mon, 07/03/2006 - 8:47pm

Interesting game with the statistics but the analysis only tells us one thing, which quarterbacks have been on the field when their team came back in the fourth quarter to win the game. That's all. This is another example of statistical models being used outside the limited scope of the model. There is too much extrapolation of conclusions. It answers a simple tightly defined historical question, no more. It really doesn't say anything about which quarterbacks are actually playing better than other quarterbacks in fourth quarter comebacks and certainly doesn't address the question of who are good fourth quarter comeback quarterbacks.
I'm not criticizing the author, merely pointing out that statistics are not a magic wand. So many statistical studies are done without really analyzing the relevance of the model to the question asked.

92
by Sophandros (not verified) :: Wed, 07/05/2006 - 11:33am

91: Great point, which is something that has been mentioned in the past in some of the debates of who's a better QB, etc.

When people talk about "intangibles" and "rings", what are they REALLY talking about? Is it the player, his team, or his circumstances? Or is it a combination of all three?

93
by Nino (not verified) :: Fri, 07/07/2006 - 1:14pm

Well, as a Dolphins fan, I'm not surprised by Fiedler making the list. He always distinguished himself by playing nightmare first quarters, usually including an INT or 2 returned for a TD, and turning it around in the second half. Heck, in 2003 he got into a game, coming back from an injury, that B. Griese had been horrible in, and won it in the 4th coming back from at least 10 points, against Bills or Skins, I don't remember exactly. In fact, if we could only erase 1st halves from our memories, we'd remember a pretty darn good QB.

I bet you can't come up with a stat to account for THAT!

94
by Lee Courtney (not verified) :: Tue, 07/11/2006 - 2:35pm

Loved the take on comebacks.
When you have the time I'd be interested in comebacks in the last 2 minutes.
and
Did you factor in a defensive comeback-a la an interception run back for a touchdown?

thanks,
Lee

95
by Ray (not verified) :: Tue, 01/09/2007 - 4:33pm

What would be interesting to see is weighting comebacks by time remaining. Brady is great, but generally the Patriots can take the lead with 14:00 remaining and the defense will hold the opponent. Brady gets credit for a 4th quarter comeback and the media says WOW! Whereas Marino or Manning may have to retake the lead 2-3 times in the 4th quarter - because the Dolphins and Colts didn't play defense in the respective eras. Although overall, this is a fantastic analysis.