Presenting Adjusted Pythagorean Theorem


Guest column by Brett Lieblich

Garbage time in the NFL is relatively rare. Teams that are winning big tend to shorten the game by keeping the clock moving; likewise, teams that are losing big tend to allow the opposing team to shorten the game either by pulling starters or running out the clock themselves. Nevertheless, points are scored by almost every team each year in virtually meaningless situations. For a good example, consider the 2009 Buffalo Bills. If you were following this 6-10 team, the Bills rewarded you with seven blowout games that included scoring in the last 4:30 of regulation. For whatever reason, teams really wanted Dick Jauron fired.

I have volunteered to wade through the muck of these otherwise unwatchable games to determine if even the ugliest points still count. Specifically, I wanted to know whether removing garbage-time points from a team's scoring totals improves our ability to predict its future results.

A common theory among sports statisticians is that a given team's winning percentage can be estimated by its totals of points scored and allowed. As explained in the FO glossary:

Pythagorean Theorem: The principle, made famous by baseball analyst Bill James, that states that the record of a baseball team can be approximated by taking the square of team runs scored and dividing it by the square of team runs scored plus the square of team runs allowed. Statistician Daryl Morey later extended this theorem to other sports including professional football. Teams that win a game or more over what the Pythagorean theorem would project tend to regress the following year; teams that lose a game or more under what the Pythagorean theorem would project tend to win more the following year, particularly if they were 8-8 or better despite underachieving.

One of the benefits of Pythagorean expected wins is that they are relatively simple to digest, and they rely entirely on point differential rather than win/loss outcomes to measure performance for each team while eliminating the luck factor of close-game results. From there, we can look at the team's actual record and winning percentage and determine whether they over- or under-performed their true level of performance, which often helps predict that team's win-loss record in the following season. What happens if we adjust this equation for garbage-time scoring?

First, a definition of "garbage time" is in order. For this study, I limited "garbage time" to two specific scenarios:

  1. Points scored by the winning team while winning by 17 points or more with fewer than nine minutes left in the game.
  2. Points scored by the winning team while winning by nine points or more with fewer than four minutes left in the game.

Note that garbage-time points were removed for both the offense and defense. I also examined the effect of removing points scored by the losing team while in comeback mode -- i.e., down 24 points or more in the fourth quarter. However, these points actually proved to be a good indicator of future results.

Since the point of adjusting for garbage time is to increase the predictive quality of Pythagorean, we want to ensure that we don't overcorrect and eliminate points that were scored in competitive situations. Ultimately, we would rather overlook a few garbage-time points than accidentally ignore points that actually mattered. Therefore, I chose "up by three scores with nine minutes remaining" and "up two scores with four minutes remaining" as the parameters for garbage time. Varying the time -- such as from nine minutes to eight or ten -- had a minuscule effect on the results, and it was more important to establish the score differential than the exact time left in the fourth quarter. A team that is up 17 points with nine minutes left has a 99.9 percent chance of winning according to Pro Football Reference's win probability model.
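The two scenarios can be written down as a small classifier. This is only a sketch of the definition, not the query actually used for the study: the function name and arguments are hypothetical, and it assumes the lead is measured at the time of the score (before the points are added) and that time left means time left in regulation.

```python
def is_garbage_time(lead_before_score: int, seconds_left: int) -> bool:
    """Return True if a score falls under either garbage-time scenario.

    lead_before_score: the scoring team's lead just before the score
        (assumption: negative when the scoring team is trailing).
    seconds_left: seconds remaining in regulation (assumption).
    """
    # Scenario 1: up 17 or more with fewer than nine minutes left.
    up_three_scores = lead_before_score >= 17 and seconds_left < 9 * 60
    # Scenario 2: up nine or more with fewer than four minutes left.
    up_two_scores = lead_before_score >= 9 and seconds_left < 4 * 60
    return up_three_scores or up_two_scores


# A team up 20 that scores with 6:00 left: counted as garbage time.
print(is_garbage_time(20, 6 * 60))
# A team down 24 that scores with 2:00 left: comeback points are kept.
print(is_garbage_time(-24, 2 * 60))
```

Points flagged this way would be subtracted from the scoring team's PF and from the opponent's PA, since garbage-time points were removed for both the offense and the defense.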

A quick refresher on Pythagorean Expected Wins:

Pythagorean Wins = 16 * PF^EXP / (PF^EXP + PA^EXP)

EXP = 1.5*LOG((PF+PA)/16)

Where EXP is the exponent in the Pythagorean equation, LOG is the base-10 logarithm, PF is points scored by a team, and PA is points against a team. The exponent, developed by Aaron Schatz and Football Outsiders, adjusts for the offensive environment of each team and replaces the static exponent of 2.37 suggested by Daryl Morey when he adapted Pythagorean wins for the NFL. Further reading on Offensive Environment can be found here.
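As a worked example, here is a minimal sketch of the calculation for the 2009 Colts (416 points scored, 307 allowed). It assumes LOG is the base-10 logarithm and that expected wins take the usual form of PF^EXP / (PF^EXP + PA^EXP) scaled to a 16-game season; the function names are my own.

```python
import math

GAMES = 16  # regular-season games per team in the 2006-2016 sample


def pyth_exponent(pf: float, pa: float, games: int = GAMES) -> float:
    """Exponent adjusted for scoring environment: 1.5 * log10(points per game)."""
    return 1.5 * math.log10((pf + pa) / games)


def pyth_wins(pf: float, pa: float, games: int = GAMES) -> float:
    """Pythagorean expected wins from points for (pf) and points against (pa)."""
    exp = pyth_exponent(pf, pa, games)
    return games * pf**exp / (pf**exp + pa**exp)


# 2009 Colts: the table below lists 10.88 Pythagorean wins for this team.
print(round(pyth_exponent(416, 307), 2))
print(round(pyth_wins(416, 307), 2))
```

Under those assumptions the result agrees with the 10.88 listed for the Colts in the table below, which suggests this is the form of the equation behind the published numbers.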

Using the Pro Football Reference Game Play Finder Tool, I isolated the scoring for our garbage-time scenarios. Once I removed the garbage-time points from each team, I had new values for PF and PA to plug into the Pythagorean and Exponent equations above. I then calculated the Pythagorean expected wins over an 11-year sample from 2006-2016, which we can call Adjusted Pythagorean (ADJ PYTH). In the sample there are 17 teams with at least a half-win difference between Adjusted Pythagorean and regular Pythagorean.
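The adjustment itself is just the same formula run twice, once on the raw totals and once on the totals with garbage-time points removed. The sketch below uses PF, PA, ADJ PF, and ADJ PA for two teams copied from the table that follows; the helper names are hypothetical, and the base-10 log and 16-game scaling are assumptions.

```python
import math


def pyth_wins(pf: float, pa: float, games: int = 16) -> float:
    """Pythagorean expected wins with the scoring-environment exponent."""
    exp = 1.5 * math.log10((pf + pa) / games)
    return games * pf**exp / (pf**exp + pa**exp)


def adjusted_delta(pf, pa, adj_pf, adj_pa):
    """ADJ PYTH minus regular PYTH: negative means garbage time inflated the team."""
    return pyth_wins(adj_pf, adj_pa) - pyth_wins(pf, pa)


# (PF, PA, ADJ PF, ADJ PA) as listed in the table.
rows = {
    "2009 IND": (416, 307, 383, 307),
    "2007 MIA": (267, 437, 267, 406),
}

for team, (pf, pa, adj_pf, adj_pa) in rows.items():
    delta = adjusted_delta(pf, pa, adj_pf, adj_pa)
    # The table keeps teams whose two estimates differ by at least half a win.
    note = "major swing" if abs(delta) >= 0.5 else "minor"
    print(f"{team}: delta {delta:+.2f} ({note})")
```

The Colts lose about three-quarters of a win once their late scoring is removed, while the Dolphins gain more than half a win once opponents' late scoring is removed.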

Major Swings In Adjusted Pythagorean Theorem
Year  Team  W   L   PF   PA   Pythagorean  ADJ PF  ADJ PA  ADJ PYTH  Delta  Following Year Change in Wins
2009  IND*  14  2   416  307  10.88        383     307     10.12     -0.76  -4
2016  PIT*  11  5   399  327  9.94         372     327     9.26      -0.68  ?
2006  BAL*  13  3   353  201  12.57        322     201     11.91     -0.66  -8
2007  JAC*  11  5   411  304  10.85        380     301     10.22     -0.64  -6
2012  CHI   10  6   375  277  10.80        351     277     10.21     -0.60  -2
2012  JAC   2   14  255  444  3.26         255     410     3.84      0.58   +2
2007  MIA   1   15  267  437  3.66         267     406     4.24      0.58   +10
2008  PIT*  12  4   347  223  11.79        323     223     11.22     -0.57  -3
2015  IND*  8   8   333  408  6.01         333     385     6.58      0.56   0
2008  SD*   8   8   439  347  10.32        415     347     9.77      -0.55  +5
2012  TEN   6   10  330  471  4.60         330     444     5.13      0.53   +1
2007  BUF   7   9   252  354  4.94         252     333     5.48      0.53   0
2016  SEA*  10  5   354  292  9.82         347     269     10.35     0.53   ?
2009  SF    8   8   330  281  9.51         330     265     10.02     0.51   -2
2006  HOU   6   10  267  366  5.11         264     342     5.62      0.51   +2
2015  TEN   3   13  299  423  4.75         299     400     5.25      0.50   +6
2008  OAK   5   11  263  388  4.50         256     357     5.00      0.50   0
* Playoff Team.
Teams in red font saw estimated wins decline in adjusted Pythagorean theorem.
Teams in black font saw estimated wins increase in adjusted Pythagorean theorem.

Of particular note are the 2006 Ravens, the 2007 Dolphins, and the 2015 Titans. In the case of the Ravens, their 12.57 Pythagorean wins were fifth-highest in this sample, but they ultimately followed up with a disappointing 5-11 campaign in 2007. As always, a large number of factors weigh into such a staggering decline (the declining play of an aging Steve McNair, replacing Jamal Lewis with Willis McGahee, and the free-agent departure of Adalius Thomas, to name a few). Traditional Pythagorean considers this Ravens team one of the best of the decade, but adjusting for garbage time reveals that they were a very good team but perhaps not an all-time great one. Their Adjusted Pythagorean wins of 11.91 ranked only 18th due to the 31 garbage-time points they scored in blowouts over the Steelers, Chiefs, Raiders, and Bucs. Once we remove the meaningless scoring, the Ravens appear more susceptible to a rapid decline.

In the case of the 2007 Dolphins, a team that increased its win total in 2008 by a staggering 10 wins, Adjusted Pythagorean suggests that the Dolphins were indeed a much better team than both their record and their regular Pythagorean would indicate. Miami was the unfortunate subject of other teams running up the score on them. As Tom Brady, Daunte Culpepper, and the Bengals' Shayne Graham ran up the score late in the fourth quarter to the tune of 31 points, the Dolphins were trying out backup quarterbacks like "Miss" Cleo Lemon and trying to stave off embarrassment with a washed-up Trent Green.


A similar story follows for the recently improved 2015 Titans, who were more than two full wins better than their record according to Adjusted Pythagorean and became a borderline playoff contender at 9-7 last year. Coming into last season, most were skeptical of Mike Mularkey's "exotic smashmouth" Titans, but this overlooked their underrated defense (which was subjected to 23 garbage-time points the year before) as well as their improving and healthier young quarterback. These factors combined for a dramatic six-win turnaround, and Mularkey ripped off the interim label while returning the Titans to a watchable state.

As is always the case, there are a couple of exceptions in the above group. The 2008 Steelers and 2009 49ers are important reminders that Pythagorean will never account for all the changes, improvements, and declines from season to season for an NFL team. You may also note that the 2009 Colts actually have the largest difference between Regular and Adjusted Pythagorean. While that is true, they are not an especially interesting case for comparison since both metrics predict the resulting decline of around four wins.

Both Pythagorean and Adjusted Pythagorean identify teams that are likely to improve or decline, but once we remove the garbage-time points, Adjusted Pythagorean does a better job of predicting future wins, with an overall R-squared of 0.23 against 0.19 for regular Pythagorean over this sample.

In particular, Adjusted Pythagorean does especially well in predicting the future of teams that outperformed their Pythagorean record by one or more wins -- teams like the already mentioned 2006 Ravens. In the first two data points, regular Pythagorean actually shows a dip from the largest outperformers to the teams that outperformed by one or two wins. Adjusted Pythagorean corrects that and produces the more intuitive result that the greatest over-performers will decline the most, and the greatest under-performers will improve the most. For the bulk of the teams in the middle of the curve, Adjusted Pythagorean does as good a job as regular Pythagorean. At the extremes, it represents an incremental improvement with no associated drawback.

Let's break the data into groups of teams with similar Pythagorean differences (expected wins minus actual wins). Segments were chosen to contain 40 to 60 teams with the exception of the extremes, which contain the remaining teams.

Adjusted Pythagorean Theorem vs. Regular Pythagorean Theorem
Expected Wins -    ------- Adjusted Pythagorean -------    -------- Regular Pythagorean -------
Actual Wins        Mean Exp -   Number of   Avg. Change    Mean Exp -   Number of   Avg. Change
                   Actual       Teams       in Wins        Actual       Teams       in Wins
Less than -2.0     -2.57        15          -3.07          -2.52        15          -1.47
-1.0 to -2.0       -1.47        58          -1.97          -1.44        56          -2.11
-0.5 to -1.0       -0.73        43          -1.49          -0.72        46          -1.30
-0.5 to 0.0        -0.25        45          -0.18          -0.23        50          -0.36
0.0 to 0.5         0.23         45          0.27           0.26         43          0.02
0.5 to 1.0         0.76         42          0.98           0.75         45          1.40
1.0 to 2.0         1.36         51          2.12           1.34         47          1.66
More than 2.0      2.40         21          3.29           2.40         19          3.79

Overall R-Squared Value: 0.23 (Adjusted Pythagorean), 0.19 (Regular Pythagorean)

Here the averages of the segments in the table, and the resulting average change in wins for each group, are plotted. For the majority of teams, their expected wins were within one win of their actual wins for the season. Another third of teams differed by one or two wins, with the remainder having the highest variation of more than two wins.
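For readers who want to repeat the grouping, the sketch below shows one way to bucket teams by expected wins minus actual wins, average the following-year change in each bucket, and compute an R-squared from the team-level pairs. It is an illustration rather than the code behind the table: the function names are hypothetical, boundary values are assumed to fall in the higher segment, R-squared is assumed to be the squared Pearson correlation, and the six sample rows are simply copied from the "Major Swings" table, so they are not representative of the full sample.

```python
from collections import defaultdict

# Upper bounds and labels for the segments used in the table.
# Assumption: a value exactly on a boundary falls into the higher segment.
SEGMENTS = [
    (-2.0, "Less than -2.0"),
    (-1.0, "-1.0 to -2.0"),
    (-0.5, "-0.5 to -1.0"),
    (0.0, "-0.5 to 0.0"),
    (0.5, "0.0 to 0.5"),
    (1.0, "0.5 to 1.0"),
    (2.0, "1.0 to 2.0"),
]


def segment(diff: float) -> str:
    """Map an expected-minus-actual difference to its segment label."""
    for upper, label in SEGMENTS:
        if diff < upper:
            return label
    return "More than 2.0"


def mean_change_by_segment(teams):
    """teams: iterable of (expected_minus_actual, next_year_change_in_wins)."""
    grouped = defaultdict(list)
    for diff, change in teams:
        grouped[segment(diff)].append(change)
    return {label: sum(v) / len(v) for label, v in grouped.items()}


def r_squared(xs, ys) -> float:
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy**2 / (sxx * syy)


# (ADJ PYTH, actual wins, following-year change) for a handful of table rows.
sample = [(11.91, 13, -8), (4.24, 1, 10), (10.12, 14, -4),
          (5.25, 3, 6), (11.22, 12, -3), (10.02, 8, -2)]
teams = [(adj - wins, change) for adj, wins, change in sample]

print(mean_change_by_segment(teams))
print(r_squared([d for d, _ in teams], [c for _, c in teams]))
```

Because the sample rows are the most extreme cases in the data, the numbers this prints should not be compared with the 0.23 and 0.19 reported for the full sample.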

As we head into the 2017 NFL draft, let's take an early look at what Adjusted Pythagorean can tell us about the upcoming season. Regular and garbage-time Adjusted Pythagorean disagree most strongly for 2016 when it comes to the Steelers and Seahawks, but they disagree for different reasons. In the case of the Steelers, regular Pythagorean predicts a moderate decline of about one win, whereas Adjusted predicts a decline of nearly two wins due to the Steelers' league-leading 27 offensive garbage-time points. The Seahawks are the opposite, the victim of the most garbage-time points allowed (23). ADJ PYTH predicts improvement in 2017, while PYTH would actually predict a slight decline for Seattle.


When looking forward to 2017, Adjusted Pythagorean ranks the Raiders, Texans, Dolphins, Giants, and Cowboys as the greatest over-performers, and thus most likely to decline (all other things equal). All were nine-plus-win playoff teams who over-performed by more than two wins. On the flip side are the Jaguars, Browns, Chargers, and Bears, cellar-dwelling teams that underperformed their Adjusted Pythagorean wins by more than two full wins. Also notable on the underperformers list are the Bengals and Cardinals, two teams who failed to return to the playoffs after reaching the postseason in 2015. Adjusted Pythagorean predicted them to be eight- or nine-win teams in 2016 despite their six- and seven-win seasons. Teams in the data set that underperformed by more than two wins increased their win totals by an average of 3.29 wins, so we can justifiably expect the Bengals and Cardinals to contend for the playoffs next season.
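The over- and under-performer lists can be reproduced from the 2016 table with a simple filter. The sketch below copies W and ADJ PYTH for a subset of teams from that table; the two-win threshold comes from the text, while the function and variable names are my own.

```python
# (actual wins, ADJ PYTH) for a subset of 2016 teams, copied from the table.
TEAMS_2016 = {
    "OAK": (12, 8.61), "HOU": (9, 6.57), "MIA": (10, 7.73),
    "NYG": (11, 8.92), "DAL": (13, 10.95), "KC": (12, 10.11),
    "PIT": (11, 9.26), "PHI": (7, 9.00), "CHI": (3, 5.02),
    "CIN": (6, 8.21), "ARI": (7, 9.46), "CLE": (1, 3.55),
    "SD": (5, 7.60), "JAC": (3, 6.03),
}

THRESHOLD = 2.0  # "more than two wins" in either direction


def split_performers(teams, threshold=THRESHOLD):
    """Return (over_performers, under_performers), most extreme first."""
    diffs = {team: adj - wins for team, (wins, adj) in teams.items()}
    # Over-performers won more games than ADJ PYTH expected (negative diff).
    over = sorted((t for t, d in diffs.items() if d < -threshold),
                  key=lambda t: diffs[t])
    # Under-performers won fewer games than ADJ PYTH expected (positive diff).
    under = sorted((t for t, d in diffs.items() if d > threshold),
                   key=lambda t: -diffs[t])
    return over, under


over, under = split_performers(TEAMS_2016)
print("Likely to decline:", over)
print("Likely to improve:", under)
```

With a strict "more than two" cutoff, the Eagles (exactly 2.00) fall just outside the under-performer list while the Bears (2.02) make it.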

With 4,800 garbage-time points removed, we have a clearer view into the tea leaves of the NFL. Adjusted Pythagorean is especially well suited to help us make sense of a few teams per season, and it fortunately does so without muddying the waters of the rest. While we can't go back in time and un-watch those 2009 Bills, perhaps we can learn to appropriately dismiss meaningless blowout scoring. Taking out the garbage wasn't an overly illuminating improvement to Pythagorean, but the result is a cleaner version of an accessible statistic. Adjusted Pythagorean shows that a blowout is like a coincidence… there are no small blowouts and big blowouts, just blowouts.

The following table shows regular and adjusted Pythagorean wins for all teams in 2016.

2016 Adjusted Pythagorean Theorem
Team W PYTH ADJ PYTH PYTH-Actual ADJ PYTH-Actual Abs ADJ PYTH-Actual Abs Delta
OAK* 12 8.79 8.61 -3.21 -3.39 3.39 0.17
HOU* 9 6.49 6.57 -2.51 -2.43 2.43 0.09
MIA* 10 7.54 7.73 -2.46 -2.27 2.27 0.19
NYG* 11 8.82 8.92 -2.18 -2.08 2.08 0.10
DAL* 13 11.02 10.95 -1.98 -2.05 2.05 0.06
KC* 12 10.15 10.11 -1.85 -1.89 1.89 0.04
PIT* 11 9.94 9.26 -1.06 -1.74 1.74 0.68
TB 9 7.59 7.47 -1.41 -1.53 1.53 0.11
NE* 14 12.82 12.56 -1.18 -1.44 1.44 0.26
DET* 9 7.66 7.66 -1.34 -1.34 1.34 0.00
TEN 9 8.08 8.08 -0.92 -0.92 0.92 0.00
GB* 10 9.10 9.19 -0.90 -0.81 0.81 0.10
LARM 4 3.31 3.43 -0.69 -0.57 0.57 0.12
ATL* 11 10.89 10.60 -0.11 -0.40 0.40 0.29
NYJ 5 4.39 4.60 -0.61 -0.40 0.40 0.21
WAS 8 8.34 8.16 0.34 0.16 0.16 0.18
DEN 9 9.09 9.22 0.09 0.22 0.22 0.13
SEA* 10 9.82 10.35 -0.18 0.35 0.35 0.53
MIN 8 8.60 8.52 0.60 0.52 0.52 0.09
IND 8 8.48 8.56 0.48 0.56 0.56 0.08
BAL 8 8.64 8.65 0.64 0.65 0.65 0.01
CAR 6 7.14 7.02 1.14 1.02 1.02 0.12
NO 7 8.34 8.34 1.34 1.34 1.34 0.00
BUF 7 8.55 8.53 1.55 1.53 1.53 0.02
SF 2 3.94 3.99 1.94 1.99 1.99 0.05
PHI 7 9.01 9.00 2.01 2.00 2.00 0.01
CHI 3 4.71 5.02 1.71 2.02 2.02 0.31
CIN 6 8.30 8.21 2.30 2.21 2.21 0.09
ARI 7 9.44 9.46 2.44 2.46 2.46 0.02
CLE 1 3.34 3.55 2.34 2.55 2.55 0.21
SD 5 7.68 7.60 2.68 2.60 2.60 0.08
JAC 3 5.79 6.03 2.79 3.03 3.03 0.24
* Playoff Team.
Teams in red font saw estimated wins decline in adjusted Pythagorean theorem.
Teams in black font saw estimated wins increase in adjusted Pythagorean theorem.

Brett Lieblich is a Green Bay Packers fan living in Philadelphia who does cloud software sales. He is a sports analytics advocate who once charted all the passes in a Tufts University Ultimate Frisbee game. You can contact him on Twitter @reptilestuff.


10 comments, Last at 21 Apr 2018, 12:50pm

#1 by jawillis87 // Apr 14, 2017 - 2:46pm

Great stuff.

Maybe the Bills should bring Jauron back to talk to Richie Incognito about the effects of bullying.


#2 by Pen // Apr 15, 2017 - 12:38am

Someone should tell Richard Sherman his team isn't actually in decline.


#3 by ChrisLong // Apr 15, 2017 - 11:10am

"I also examined the effect of removing points scored by the losing team while in comeback mode -- i.e., down 24 points or more in the fourth quarter. However, these points actually proved to be a good indicator of future results."

I found this statement to be interesting, and wondered if you could expand upon it a bit. How much did losing team garbage time scoring improve the fit to the data? Do you have any thoughts on an explanation? Thanks!


#5 by skibrett15 // Apr 16, 2017 - 4:30pm

Great question.

By "these points actually proved to be a good indicator of future results" I meant that removing losing team garbage time scoring actually decreased the predictive quality of Pythagorean. Honestly this was quite a surprise to me, even when I upped the threshold to down 24+.

The overall correlation for this sample for traditional Pythagorean difference and future change in wins was 0.19, and it dropped to about 0.17 if you remove the comeback scoring. Once I saw this general trend, I didn't attempt to optimize the range for "comeback garbage time", though there may yet be some room for optimization on this as well as the overall definitions I used for garbage time scoring.

If I had to venture a theory on why these scores are relevant, I think there are a couple of factors. First, it's in the defense's interest to end the game as quickly as possible, so they are often still competing at their highest strategic and effort levels. Teams are also far, far less likely to pull their starting QB or superstar offensive players in a desperation comeback effort. So despite the scoreboard differential, both of these situations lead to competitive football, so Pythagorean should count these points with equal weight to normal game scoring.

Even though the game outcome was already decided, Pythagorean is simply measuring the overall team quality, and the results show that mounting a small yet futile comeback is an indicator of future improvement. Likewise, allowing a bend/don't break "prevent" style defensive comeback is probably not a good indicator of an excellent defense despite the blowout win.


#4 by RobotBoy // Apr 16, 2017 - 8:32am

A degree in the humanities doesn't qualify me to say anything except, 'This seems like a good thing.'
One incidental remark: I don't know if '...the free-agent departure of Adalius Thomas...' hurt Baltimore but he sure as heck didn't help his new team.


#8 by Aaron Schatz // Apr 17, 2017 - 4:37pm

It's generally true in the NFL that free-agent departures hurt the team losing the player more often than they help the team adding the player.


#6 by Hoodie_Sleeves // Apr 17, 2017 - 11:06am

This is probably a small enough question that it doesn't matter, but when you say "winning" do you mean "winning at the time" or "team that won the game?"

There are probably few enough of these comebacks that it doesn't make a difference.


#9 by jgibson_hmc95 // Apr 20, 2017 - 11:24am

I just wanted to say that this was really good work. Something that seems intuitive, yet everybody I know that has ever tried to improve their power ranking in any sports by capping blowouts has made them less predictive. Even if the effect is tiny here, I think this could lead to some big breakthroughs in this area.

That blue dot vs. the red dot on the left is very telling.


#10 by medelste // Apr 21, 2018 - 12:50pm

an annual update on this, please!

