Two months ago, I discussed the strength of schedule challenge that the College Football Playoff committee would likely face at the end of this highly unusual season. The Big Ten and Pac-12 had not yet begun their respective seasons back in mid-October, while other teams had already faced six FBS opponents at that point in the year. Comparing the respective bodies of work of playoff contenders would require balanced judgments of both the quality and the quantity of results. We've had silly "extra data point" arguments bubble to the surface in previous playoff scenarios with teams that didn't play in a conference championship game, but we've never had to settle "twice as many data points" arguments like we're blessed with in 2020.
As we've discussed and debated many times, the selection committee's stated charge to pick the "four best teams" at season's end does not always match up well with what our efficiency ratings indicate are the four "best," nor with other computer models or betting market versions of the same. The selection committee's picks do match up very well, however, with measures of the four "most deserving" teams at season's end. Team achievement -- who you defeated, and who you lost to -- matters more to the committee in the end than how you won or lost those games in terms of performance metrics. The committee claims to do one thing, but it has proven to do another in practice.
The table below demonstrates this distinction between efficiency against schedule (performance) and record against schedule (achievement) through the lens of FEI and the schedule strength ratings derived from those numbers. The four teams selected to each of the first six College Football Playoff fields are provided in the table, along with distinct opponent-adjusted metrics calculated as of the conclusion of conference championship week of the given season:
- FEI (the per-possession scoring advantage the given team would be expected to have on a neutral field against an average opponent)
- EWR (elite win rating -- the given team's win total compared with what a team two standard deviations better than average would expect to have against its schedule)
- GWR (good win rating -- the given team's win total compared with what a team one standard deviation better than average would expect to have against its schedule)
- AWR (average win rating -- the given team's win total compared with what an average team would expect to have against its schedule)
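The exact FEI formulas aren't spelled out here, so purely as an illustration: the win ratings above can be sketched as actual wins minus the expected wins a benchmark team (elite = two standard deviations above average, good = one, average = zero) would accumulate against the same schedule. The logistic mapping from rating gap to win probability, its `scale` value, and all ratings below are made-up assumptions, not real FEI data:

```python
import math

def win_prob(rating_diff, scale=0.14):
    # Hypothetical logistic mapping from per-possession rating gap to
    # single-game win probability (illustrative; not the published FEI mapping).
    return 1 / (1 + math.exp(-rating_diff / scale))

def expected_wins(benchmark_rating, opponent_ratings):
    # Sum of single-game win probabilities for a benchmark team vs. each opponent.
    return sum(win_prob(benchmark_rating - opp) for opp in opponent_ratings)

def win_rating(actual_wins, benchmark_rating, opponent_ratings):
    # EWR/GWR/AWR-style rating: wins above the benchmark's expectation.
    return actual_wins - expected_wins(benchmark_rating, opponent_ratings)

# Hypothetical five-game schedule of opponent FEI values (avg = 0.0, sd = 0.1)
schedule = [0.05, -0.02, 0.12, 0.00, -0.08]
elite, good, avg = 0.2, 0.1, 0.0  # +2 sd, +1 sd, average benchmarks

ewr = win_rating(5, elite, schedule)
gwr = win_rating(5, good, schedule)
awr = win_rating(5, avg, schedule)
```

Because a stronger benchmark team is expected to win more games against the same slate, a given record always produces a lower elite win rating than good, and lower good than average.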
The second-best team in terms of opponent-adjusted drive efficiency (the FEI data column) has missed the playoff field in each of the last two seasons, and only 17 of the 24 "best" teams according to FEI have been selected to the field in the first six iterations of the playoff. But check out the AWR data column -- the four "most deserving" teams in terms of record against schedule have made the playoff field in five of six years. 2017 Alabama -- which somewhat controversially made the field without winning its division or playing in a conference championship game -- is the lone exception to this "most deserving" standard for playoff selection. (Note that the Crimson Tide in 2017 would go on to win the national championship after defeating Clemson and Georgia. Note also that Wisconsin, 12-1 after losing the Big Ten championship, had the third-best AWR rating at the time of playoff selection in 2017.)
Selecting a playoff based on "most deserving" rather than "best" is obviously a judgment call, and one that I honestly don't think is a terrible idea in concept. Since teams play very different schedules every year, there will always be valid arguments to be made around the selection margins, and we don't have (and may never have) a particularly egalitarian set of playoff-qualifying standards in college football. It's one of the sport's enduring ... charms. I'm also more in favor of a selection committee process than a computer formula (or combination of formulae) determining the field, so long as the committee can be trusted to be well-informed and reasonably consistent in its deliberations and selections. The committee has new representation every year, and they are by no means infallible, but I think they've done a fair job every season in meeting that well-informed and reasonably consistent standard, especially as it relates to the playoff field itself.
Perhaps it's unfair then to harp on the apparent inconsistencies with the 2020 selection committee rankings. Given the extraordinary disruptions to the typical schedule, and the cancellation of all but a few decent non-conference games, can we fault the committee for applying a different standard than in previous seasons? The AWR metric that has lined up so well with the committee's "most deserving" selections over the years does not line up well at all with the most recent CFP rankings.
| Rk | Team | FEI | Rk | EWR | Rk | GWR | Rk | AWR | Rk |
|----|------|-----|----|-----|----|-----|----|-----|----|
| 24 | San Jose State | 0.09 | 54 | 0.28 | 9 | 0.99 | 16 | 2.30 | 23 |
Number of games played is a big factor in the EWR, GWR, and AWR ratings, since these are calculated from the number of wins accumulated compared with how many an elite, good, or average team would be expected to accumulate. Ohio State has the No. 2 FEI rating through Week 15, but has played only five games to date. It simply hasn't racked up enough wins to meet the same "most deserving" standard we've applied in previous seasons. In fact, Indiana has better GWR and AWR ratings than Ohio State, despite losing head-to-head, by virtue of having played two more opponents. Maybe that's an indication that these metrics aren't the best way to measure "most deserving," but I think it's a clue to the unique challenge this committee faces. How does 6-1 compare to 5-0, to 8-2, to 10-0, and so on? We're used to making simple judgments by looking strictly at the loss column, but this year the win column matters just as much.
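As a toy illustration of how records with different game counts land on one scale: against a slate of league-average opponents, an average team expects to win half its games, so an AWR-style rating reduces to wins above half the games played. This deliberately ignores schedule strength differences and is not the actual FEI calculation:

```python
def awr_vs_average(wins, games, p_avg=0.5):
    # Against all-average opponents, an average team expects to win
    # p_avg of its games; the rating is wins above that expectation.
    return wins - games * p_avg

# Records from the discussion above (games = wins + losses)
records = {"5-0": (5, 5), "6-1": (6, 7), "8-2": (8, 10), "10-0": (10, 10)}
ratings = {rec: awr_vs_average(w, g) for rec, (w, g) in records.items()}
# 8-2 (3.0) grades out ahead of both 5-0 and 6-1 (2.5 each)
```

On this simplified view, extra games played can outweigh an extra loss, which is exactly the tension between Ohio State's 5-0 and the longer resumes elsewhere.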
Coastal Carolina has played twice as many games and has twice as many FBS wins as Ohio State. Per EWR, a typical elite team would be expected to have a 9-1 record against Coastal Carolina's schedule to date, a typical good team would be expected to have seven or eight wins, and a typical average team would be expected to go .500. Perhaps the lack of a typical full non-conference schedule is artificially boosting the Sun Belt a little bit this season, but consider that the conference did go 3-0 against the Big 12. Louisiana beat Iowa State by 17 points. Coastal handed BYU its only loss.
Would the Chanticleers be favored over the Buckeyes? Certainly not. Would Coastal Carolina have been just as likely to rack up a 5-0 record against Ohio State's schedule to date if they swapped opponents? It's not that much of a stretch to say yes, I think so.
All this being said, "most deserving" is a dicey term to use in a pandemic-plagued season, especially when teams often could not control whether they were able to play in a given week, or even whether to start their season in August versus late October. I'm not going to try to make an affirmative argument for Coastal Carolina to make a four-team playoff field over Ohio State, nor am I going to use 2020 data to argue for or against future playoff field expansion. I do think the committee's primary faults are evidenced by its modest-to-severe suppression of the on-field achievements of Group of 5 teams. Wins against good teams, however you define "good," seem to matter more to the committee when they are earned by Power 5 teams than by Group of 5 teams. The fact that the Big 12's two-loss champion may squeeze into the playoff picture if chaos reigns this weekend, while the Sun Belt's potentially undefeated champion most certainly will not, is the best evidence of this committee bias we've had in the playoff era.
Chaos may yet reign this weekend, but we're most likely to end up with a playoff field consisting of Alabama, Clemson, Ohio State, and Notre Dame when the dust settles. If this comes to pass, there will be little dispute that the committee "got it right" and selected the four best teams. But we should be willing to admit that measuring "best" this year is based significantly on our prior assumptions about the relative strength of teams and conferences. When we remove those assumptions altogether and put every conference on a level playing field, Group of 5 teams rise up.
2020 FEI Ratings (through Week 15)
FEI ratings (FEI) represent the per-possession scoring advantage a team would be expected to have on a neutral field against an average opponent. Offense ratings (OFEI) and defense ratings (DFEI) represent the per-possession scoring advantages for each team unit against an average opponent unit. FEI, OFEI, and DFEI ratings are based on a combination of opponent-adjusted results to date and preseason projections.
Net points per drive (NPD) is the difference between points scored per offensive drive and points allowed per opponent offensive drive. Net available yards percentage (NAY) is the difference between offensive available yards percentage and opponent offensive available yards percentage. Net yards per play (NPP) is the difference between drive yards per offensive play and drive yards allowed per opponent offensive play.

Three different schedule strength ratings for games played to date are provided, based on current FEI ratings, representing the expected number of losses an elite team two standard deviations better than average would have against the given team's schedule (ELS), the expected number of losses a good team one standard deviation above average would have against the schedule (GLS), and the expected number of losses an average team would have against the schedule (ALS).
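As a rough sketch of how expected-loss figures like ELS, GLS, and ALS could be computed (the logistic win-probability mapping, its `scale`, and the opponent ratings below are illustrative assumptions, not the published FEI method):

```python
import math

def win_prob(rating_diff, scale=0.14):
    # Hypothetical logistic mapping from rating gap to win probability
    # (illustrative only; not the actual FEI conversion).
    return 1 / (1 + math.exp(-rating_diff / scale))

def expected_losses(benchmark_rating, opponent_fei):
    # Expected losses a benchmark team (elite = +2 sd, good = +1 sd,
    # average = 0) would suffer against this schedule: sum of loss
    # probabilities across the games played.
    return sum(1 - win_prob(benchmark_rating - opp) for opp in opponent_fei)

# Hypothetical opponent FEI values for a five-game schedule
schedule = [0.15, 0.02, -0.05, 0.30, 0.08]

els = expected_losses(0.2, schedule)  # elite benchmark
gls = expected_losses(0.1, schedule)  # good benchmark
als = expected_losses(0.0, schedule)  # average benchmark
```

A stronger benchmark team expects fewer losses against the same schedule, so ELS is always the smallest of the three and ALS the largest.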
Ratings and supporting data are calculated from the results of non-garbage possessions in FBS vs. FBS games.