09 Nov 2006

Introducing the Fremeau Efficiency Index

Guest Column by Brian C. Fremeau

Many readers have asked us to introduce a DVOA system for college football. The Fremeau Efficiency ratings are based on drives rather than individual plays, and they don't consider field position to separate offense, defense, and special teams in the same way as DVOA. But we think they represent a first step towards FO-worthy college football statistics. We hope they make you think differently about where some teams are ranked in the BCS, and spur debate that will help produce even better ratings in the future. By the way, it's called the Fremeau Efficiency Index because it seems like every college rating system is named after the creator, so why not this one too?

As the college football season advances through what promises to be an eventful month of November, a tireless debate rages on: Which teams are the best? For better or worse, the much-maligned Bowl Championship Series (BCS) rating system attempts to answer that question definitively. Every voter's ballot may disagree and every computer system may weight certain factors over others, but when combined, doesn't the BCS actually represent a kind of utopian model for finding consensus among a collection of disparate voices?

BCS-as-utopia may be overstating it, and it certainly doesn't sit well with fans of the 117 teams not ranked #1 or #2 at season's end. But at the risk of provoking the growing mob determined to bury the BCS where it may never be found again, can college football be served by yet another statistical rating system joining the conversation? A Division-1A playoff may be just around the corner or fifty years away. While we wait, more than 650 games are played between 119 teams every fall. When considering the question "Which teams are the best," can we better understand and evaluate the games that are settled on the field?

Game Efficiency -- A New Statistic

The criticism of the BCS computer elements is inseparably wed not just to a distrust of cold data analysis but to the severe handicaps imposed upon the computers themselves. Margin of Victory (MOV) data was eliminated from the BCS computers after the 2001 season in order to negate the impact of blowout wins and losses. However, even when MOV is used soundly by other ranking systems, can it be trusted? Is there not a difference between a 44-41 shootout and a 10-7 defensive battle? When win/loss outcomes or an unreliable stat are the only data input used by a computer ranking system, is it any wonder that the average fan distrusts the output? Let's address this problem by first collecting better game data.

The game of football is basically divided into individual series of play, offense versus defense. A team on offense advances the ball until the series results in either a defensive stop (turnover, turnover on downs, punt, failed field goal, blocked kick, safety) or offensive score (field goal, touchdown), after which its opponent begins its own offensive series. This basic, alternating series structure is familiar to even the most novice fan. But note the method in which a typical game box score is published:

Team  1st  2nd  3rd  4th    F
USC     7    3   14   14   38
TEX     0   16    7   18   41

Statistic                  USC     TEX
1st Downs                   30      30
3rd down efficiency       8-14    3-11
4th down efficiency        1-3     1-2
Total Yards                574     556
Passing                    365     267
Comp-Att                 29-41   30-40
Yards per pass             8.3     6.7
Rushing                    209     289
Rushing Attempts            41      36
Yards per rush             5.1     8.0
Penalties                 5-30    4-34
Turnovers                    2       1
Fumbles lost                 1       1
Interceptions thrown         1       0
Possession               32:00   28:00

Take a moment to consider the value of this information. The scoring summary divides points scored by quarter. The team statistics divide yardage gained by passing and rushing. Possessions for each team are divided by total time elapsed while in control of the ball. Third- and fourth-down efficiency are given without any drive context. Is it not strange that the basic division of play, the succession of possession series alternately played by the two teams, is totally ignored?

How would a fan having watched last January's BCS championship game describe it to someone afterwards? By margin of victory? By a breakdown of team yardage? Wouldn't the description more likely include important details like USC scoring touchdowns on each of its first four possessions of the second half, Vince Young's heroics leading Texas' final two possessions, and the game-hinging turnover on downs that set up the game-winning score?

Drive and play-by-play summaries are sometimes included as supporting information to the game box score, but these are presented in a comprehensive format that is difficult to synthesize. How well did a team maximize its own possessions and negate its opponent's possessions? It is the essential question in football, and it is answered statistically by Game Efficiency.

Game Efficiency quantifies the success rate of a team scoring while in possession of the ball and preventing scores while not in possession of the ball over the competitive course of a game. Since the success of a drive is contingent on the number of points it produces, there is a relationship between Game Efficiency and Margin of Victory, with two critical distinctions:

1. Game Efficiency represents not just an observed final outcome but how well each team played a given game to arrive at that outcome. In a sense, it is an enhancement of MOV, able to describe the difference between high-scoring shootouts and low-scoring defensive struggles, but also between a 17-14 ball-control game of only 15 possessions and a 17-14 triple-overtime game of 35 possessions.

2. Game Efficiency measures only the competitive possessions of a game, ignoring "garbage-time" scores and stops by both opponents. The only garbage-time adjustment options available for systems based on MOV are limited to arbitrarily-assigned scoring ceilings or a formula of diminishing returns. Neither of these options can distinguish, for example, a 24-point lead earned in the waning moments of the 4th quarter from a 24-point lead earned before halftime. By charting games series-by-series, Game Efficiency is able to make such distinctions, measure late-game scoring opportunities against scoring leads/deficits, and weight the conclusive possessions accordingly for the fairest measure of how well two teams played a given game.

Game Efficiency = ((Points For – Points Against)/7) / (Total Competitive Possessions/2)

Statistic                      USC       TEX
Points                          38        41
Competitive Possessions         13        12
Game Efficiency            -0.0343    0.0343
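
For readers who want to check the arithmetic, here is a minimal sketch of the formula in Python. The function name `game_efficiency` and its parameter names are illustrative assumptions, not part of any published FEI code. It treats "Total Competitive Possessions" as the sum of both teams' competitive possessions (13 + 12 = 25 in the example), which is the reading that reproduces the figures in the table.

```python
def game_efficiency(points_for, points_against, total_competitive_possessions):
    """Game Efficiency as defined by the formula in the article.

    total_competitive_possessions is assumed to be the combined count
    for both teams, so dividing by 2 gives possessions per team.
    """
    touchdown_margin = (points_for - points_against) / 7
    possessions_per_team = total_competitive_possessions / 2
    return touchdown_margin / possessions_per_team


# USC vs. Texas: 13 + 12 competitive possessions
total = 13 + 12
usc = game_efficiency(38, 41, total)
tex = game_efficiency(41, 38, total)
print(round(usc, 4), round(tex, 4))  # -0.0343 0.0343
```

Because both teams share the same possession total, the two values are always equal in size and opposite in sign.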

Processing the Data -- Fremeau Efficiency Index

Collecting Game Efficiency data from all games played thus far in the 2006 Division-1A college football season is a relatively basic exercise. But what do we do with the data once it is collected? How do we answer the question: Which teams are the best?

We could rank each team's average Game Efficiency over the course of the season (SE):

Rank  Team            Rec    SE
1     LSU             7-2    0.37403
2     BYU             7-2    0.36415
3     Ohio St.        10-0   0.35899
4     Clemson         7-3    0.31308
5     Hawaii          6-2    0.30757
6     West Virginia   6-1    0.30595
7     Louisville      8-0    0.29914
8     Wisconsin       8-1    0.28118
9     Boise St.       8-0    0.27188
10    Texas           8-1    0.25075
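
The averaging step behind a table like this can be sketched in a few lines of Python. The helper names and the sample numbers below are hypothetical, chosen only to illustrate the calculation; they are not real 2006 game data.

```python
from statistics import mean


def season_efficiency(game_efficiencies):
    """Season Efficiency (SE): the plain average of a team's Game Efficiency values."""
    return mean(game_efficiencies)


def rank_by_season_efficiency(games_by_team):
    """Return (team, SE) pairs sorted from most to least efficient."""
    table = [(team, season_efficiency(ges)) for team, ges in games_by_team.items()]
    return sorted(table, key=lambda row: row[1], reverse=True)


# Hypothetical per-game efficiencies, for illustration only
sample = {
    "Team A": [0.40, 0.10, -0.05],
    "Team B": [0.20, 0.25, 0.30],
    "Team C": [-0.10, 0.05],
}
ranking = rank_by_season_efficiency(sample)
for rank, (team, se) in enumerate(ranking, start=1):
    print(rank, team, f"{se:.5f}")
```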

This method of processing the data, of course, does not take into account the quality of the opposition faced. A team could play extremely efficiently against a weak slate of opponents and hardly be considered "better" than a team that played less efficiently against a strong slate. We next adjust each Game Efficiency data point to account for the quality of opponent, and rank each team's average Adjusted Game Efficiency over the course of the season (ASE):

Rank  Team            Rec    ASE
1     Ohio St.        10-0   0.35435
2     LSU             7-2    0.34519
3     BYU             7-2    0.28392
4     West Virginia   6-1    0.28200
5     Texas           8-1    0.27832
6     Louisville      8-0    0.26366
7     Tennessee       7-2    0.26217
8     California      7-1    0.26175
9     Michigan        10-0   0.25227
10    Florida         8-1    0.24789

As valuable as this output (and subsequent-order versions of it) may be, it raises new questions that are more complex and unique to the challenge of evaluating 119 Division-1A teams: Can an efficiency margin recorded against the worst team in college football be effectively compared to an efficiency margin recorded against the best team? Are all data points and the results of all games played equally valuable?

The Fremeau Efficiency Index (FEI) weights the value of Adjusted Game Efficiency data by first evaluating the following criteria:

1. Who did you beat and how did you win those games?
2. Who did you lose to and how did you lose those games?

As the quality of the opponent decreases, the value of the first question receives less weight than the second. In other words, FEI rewards teams for playing well against good teams, win or lose, and treats losing to poor teams more harshly than it rewards winning against poor teams.

Fremeau Efficiency Index -- Week 10

Rank  Team            Rec    FEI
1     Ohio St.        10-0   0.46081
2     LSU             7-2    0.44000
3     Tennessee       7-2    0.41829
4     Florida         8-1    0.41673
5     Louisville      8-0    0.40941
6     Michigan        10-0   0.40736
7     West Virginia   6-1    0.35332
8     Wisconsin       8-1    0.34683
9     Auburn          9-1    0.34377
10    California      7-1    0.33922
11    Notre Dame      8-1    0.32785
12    USC             7-1    0.31207
13    Boise St.       8-0    0.29543
14    Texas           8-1    0.29167
15    Arkansas        7-1    0.27023
16    Rutgers         7-0    0.24013
17    Clemson         7-3    0.23923
18    Georgia Tech    6-2    0.22491
19    Alabama         6-4    0.22465
20    South Carolina  4-4    0.21608
21    Georgia         5-4    0.21544
22    Hawaii          6-2    0.21249
23    Oklahoma        7-2    0.20434
24    BYU             7-2    0.19695
25    Boston College  6-2    0.19677

For complete rankings of all 119 Division-1A teams, click here.

FEI represents a weighted and opponent-adjusted Season Efficiency for each team as compared with an average Division-1A football team. The ratings may be interpreted as follows:

Ohio St. is 46% more efficient than an average D1A team. Following the Game Efficiency definition outlined above, 46% game efficiency over the course of a 21-possession game would translate into a MOV of approximately 34 points.

Ohio St. is 2% more efficient than LSU. Over the course of a 21-possession game, 2% game efficiency translates to approximately 1.5 points.
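
Those point conversions come from running the Game Efficiency formula in reverse. A short Python sketch follows; the function name `efficiency_to_margin` is an assumption made for illustration, and it treats the 21 possessions as the combined total for both teams, which is the reading that matches the figures quoted above.

```python
def efficiency_to_margin(efficiency, total_possessions):
    """Invert the Game Efficiency formula to get an expected point margin.

    GE = (margin / 7) / (total_possessions / 2)
    so margin = GE * 7 * (total_possessions / 2).
    """
    return efficiency * 7 * (total_possessions / 2)


ohio_st, lsu = 0.46081, 0.44000  # Week 10 FEI ratings from the table

vs_average = efficiency_to_margin(ohio_st, 21)
vs_lsu = efficiency_to_margin(ohio_st - lsu, 21)
print(round(vs_average, 1))  # 33.9
print(round(vs_lsu, 1))      # 1.5
```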

What About Home-Field Advantage?

Home-field advantage is supposed to be a much larger issue in college football than in the NFL, and originally the FEI calculations included a home-field adjustment. But after further study, that was removed. Though teams are 305-205 (59.8%) in home games this year with an average competitive time MOV of 4.5 points, those numbers are skewed by blowout MOVs and BCS-conference vs. non-BCS-conference scheduling (Big Ten powers aren't playing home-and-home dates with the MAC, for instance). In fact, in all Division-1A games decided by 7 points or less this year, home teams are exactly 86-86 (50%), with an average MOV of -0.2 points. Home-field advantage may be an emotional factor, but it does not appear to be a significant statistical one.
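
The percentages quoted above are easy to reproduce. This tiny Python sketch uses a hypothetical helper, `win_pct`, and only the records stated in the paragraph.

```python
def win_pct(wins, losses):
    """Winning percentage from a win-loss record (ties not considered)."""
    return 100 * wins / (wins + losses)


all_home_games = win_pct(305, 205)
close_home_games = win_pct(86, 86)
print(f"{all_home_games:.1f}%")    # 59.8%
print(f"{close_home_games:.1f}%")  # 50.0%
```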

FEI and the Polls -- A Comparison

A quick glance at the FEI changes from Week 9 to Week 10 clearly distinguishes this rating system from the methodology employed by voters. While most voters deliberately anchor teams to certain slots week-to-week until a single result sways their opinion, FEI poll positions are anything but stable. This is because FEI reevaluates the value of each game played over the entire season every week. Notre Dame's #8 to #11 drop, for instance, had more to do with weaker performances by former opponents Michigan and Penn St. than with its handling of overmatched North Carolina over the weekend.

LSU's thrilling victory over Tennessee vaults the Tigers into the #2 spot in FEI this week, a leap that flies in the face of the logic of voters, who unanimously rate LSU as the top team with two losses but can't convince themselves to elevate them over the pack of one-loss teams across the country. Is LSU the best team in the SEC? They suffered road losses to Auburn and Florida, true, but unlike the rest of the conference, have completely obliterated the rest of their schedule. They boast the fifth-most efficient offense in the nation and the second-most efficient defense. Last year, a two-loss Ohio State team racked up similar credentials. This year, without either a USC or Texas behemoth distancing itself from the pack, LSU is right there in the mix.

Michigan dropped all the way to #6 this week after flirting with disaster against Ball St., a fall mostly attributable to the elevation of the teams around them rather than a "punishment" of the Wolverines. But though their 10-0 record (including wins over #8 Wisconsin and #11 Notre Dame) is nothing to sneeze at, Michigan's weaknesses may be catching up with them. Their sixth-most efficient defense is a force to be reckoned with, but is their 24th-ranked offense, healthy or not, truly one of the elite? One trend that Ball St. took advantage of was holding the Michigan offense to a long field. The Wolverines have been extremely effective this season in converting short fields created by their defense into points, but they have struggled with longer fields. In 43 competitive possessions started inside their own 30-yard line (the national average starting field position is the 30), Michigan had scored only 10 times prior to Saturday (eight TDs, two FGs), and only one of those came against either Notre Dame or Wisconsin (a 70-yard first quarter TD by Mario Manningham against the Irish). Ball St. didn't keep enough long drives out of the end zone against Michigan (two of eight drives over 70 yards went for scores), but field position kept them in the game longer than it should have.

Despite their particular differences, the voted polls and FEI interestingly agree on the names of the top 16 teams in the country this week. Among these and the rest of the 119 Division-1A teams, let's take a closer look at a few others who are overrated or underrated by the polls.

Overrated: TEXAS (#4 AP, #3 USA Today; #14 FEI)

Texas should take no shame in their lone loss back in Week 2 to #1 Ohio State -- the Buckeyes have efficiently handled everyone in their path thus far. The problem with Texas' poll ranking is that the Longhorns' body of wins compares unfavorably with other high-profile one-loss teams and several two-loss teams (who all have no-shame losses themselves). Back-to-back narrow escapes over #36 Nebraska and #63 Texas Tech and a best win against #23 Oklahoma are the core of the Texas resume. The pollsters may be enamored with Colt McCoy's gaudy touchdown and yard stats racked up against #82 Iowa St., #85 Baylor, #98 Rice and #118 North Texas, but FEI is not. In a way-less-than-stellar Big 12 conference, Texas may not meet a true test until their bowl game, more than 16 weeks removed from their September 9th match-up with Ohio State. Regardless of their raw Game Efficiency numbers the rest of the way, will they be ready for a BCS bowl opponent?

Underrated: TENNESSEE (#13 AP, #15 USA Today; #3 FEI)

Narrow defeats to #2 LSU and #4 Florida and wins over the rest of their meaty SEC schedule give Tennessee a solid backbone on which to hang their September 2nd blowout of otherwise undefeated #10 California. Their offense is scoring touchdowns on 41% of its possessions (11th nationally), and the Volunteers will travel to Arkansas this weekend for what should be their final regular season test. Unless Florida mails it in down the stretch, Tennessee won't play for the SEC title this year but may be in a prime position to steamroll an unsuspecting bowl opponent in January.

Overrated: OREGON (#20 AP, #21 USA Today; #30 FEI)

Controversial or not, Oregon's "Onside-gate" victory over Oklahoma back on September 16th has been pretty much their only significant win on the year so far. Add in a blowout loss to #10 California and a run-of-the-mill record through a run-of-the-mill conference slate, and it's easy to see why Oregon is an underwhelming candidate for a higher ranking. The Ducks travel to USC this weekend with a chance to change perceptions, but with the way they have played thus far against inferior competition, don't count on it.

Underrated: CLEMSON (Not Ranked AP, USA Today; #17 FEI)

In the anything-but-glamorous ACC, Clemson finds itself in fourth place in its division, yet ranks as the top team in its conference according to FEI. Why? They boast the best win in conference play (a dominating performance over #18 Georgia Tech), and though they have played efficiently in one-point defeats to #25 Boston College and #35 Maryland, the recent skid has turned the voters cold. Other ACC teams have yet to step up with a commanding win of their own, or in the case of Virginia Tech (a 24-7 winner over Clemson two weeks ago), haven't dominated the bulk of their schedules comparatively. A season-ending clash with rival South Carolina will reveal with certainty Clemson's true identity.

Overrated: TULSA (Receiving votes AP, USA Today; #79 FEI)

It isn't as though an overwhelming number of votes have been cast their way, but Tulsa's receiving any consideration at all from the polls is a total head-scratcher. They have the best overall record in Conference USA, okay -- but a best win (by one) over #60 Navy ... and 3-score losses to #24 BYU and #65 Houston (a team now with a de facto two-game lead over Tulsa in their C-USA division)? FEI names 78 teams with a better resume than Tulsa -- it shouldn't be that hard for a voter to find 25 of them to like.

Underrated: HAWAII (Receiving Votes AP, USA Today; #22 FEI)

Colt Brennan's 39 touchdown passes are probably responsible for Hawaii's gathering votes in the polls this week, so the Warriors aren't getting totally dissed. Sure, the WAC isn't murderers' row, but at a certain point, you can't ignore the absurd efficiency with which Hawaii is playing offense right now, scoring touchdowns on 57% (first nationally) of its competitive offensive possessions. Single-score losses to #13 Boise St. and #19 Alabama are their worst games played to date. Hawaii will close a 13-game regular season schedule against Purdue and Oregon St. at the end of the month. Could an 11-2 Hawaii be ignored?

Opening the debate

Are Game Efficiency and FEI the best way to determine the best teams in college football? The FEI Forecast (click here) will continue to predict winners of all Division-1A games each week based on the previous week's rankings (38-16 -- 70.4% -- in Week 10). But like DVOA in its infancy, several years' worth of Game Efficiency data needs to be collected and evaluated in order to develop and advance the statistics and system going forward. Are we anywhere near utopia? If your vision of utopia allows for better and more in-depth statistical analysis and a healthy level of debate, we're already there.

Posted by: Brian Fremeau on 09 Nov 2006

89 comments, Last at 11 Nov 2006, 9:10am by grailsearch

Comments

1
by SG (not verified) :: Thu, 11/09/2006 - 1:32pm

First!

2
by SteelBoots (not verified) :: Thu, 11/09/2006 - 1:34pm

Just want to say, very interesting read. Thank you for posting it here.

3
by Pat (not verified) :: Thu, 11/09/2006 - 1:43pm

Is there not a difference between a 44-41 shootout and a 10-7 defensive battle?

Not all statistical rankings use a one-dimensional margin-of-victory game output function. Massey's ratings use a 2D game output function, so it can easily distinguish between a 44-41 game and a 10-7 game.

I'm also confused by this:

By charting games series-by-series, Game Efficiency is able to make such distinctions, measure late-game scoring opportunities against scoring leads/deficits, and weight the conclusive possessions accordingly for the fairest measure of how well two teams played a given game.

followed by the definition given for "game efficiency index". It doesn't seem like that definition could do any of the above. There's also no definition of what "competitive possession" is, so I don't really see how it's different than arbitrary score limits on a margin-of-victory measure.

4
by Travis (not verified) :: Thu, 11/09/2006 - 1:44pm

Definitely interesting. A couple questions, though, so I can understand it better:

1. I-A vs. I-AA games aren't included, right? (For example, Rutgers is listed at 7-0, not 8-0.)

2. What exactly is a "competitive possession?" Any possession that doesn't end in a kneel down? Any possession outside of an unstated combination of time and score? Something else?

5
by BadgerT1000 (not verified) :: Thu, 11/09/2006 - 1:47pm

I am somewhat surprised that Penn State didn't make the "top 25". They have lost games to:

1. Ohio State
6. Michigan
8. Big-time cheaters may they burn in everlasting H*ll their coach is known to kill, skin, and eat baby seals Wisconsin
11. Notre Dame

You give me PSU and any of the teams ranked 14 and lower on a neutral field I take the Lions. Their QB stinks but everything else on that team is pretty impressive. If the coaching staff would just run the ball about 35 times a game and keep Morelli's damage to a minimum they would be better off.

6
by Brian (not verified) :: Thu, 11/09/2006 - 2:02pm

I'm really not a fan of college football, but this was a very interesting read. A big part of the reason I don't watch college ball is the absurdity of using the human polls as a determining factor in ranking the teams. The human pollsters have inherent biases, don't watch every game, and refuse to lower a top-ranked team unless it loses, even if teams ranked below them are playing better. It seems every year that the top team is the one with the best record that was ranked highest in the preseason polls or that lost early in the season so other losing teams dropped below them.

Statistics like efficiency and margin of victory are very important factors in determining how good a team is, yet these statistics have been either downplayed or completely removed from the BCS formula. It seems like every year, the BCS tweaks their system to make it better. "Better" in this case is usually defined as "getting results more similar to the human polls." Why bother to have the computer rankings in the first place? The BCS was best in the beginning when it could account for margin of victory, before it was tweaked to appease the proponents of human polls.

Yay computer rankings! Boo human polls!

7
by Pat (not verified) :: Thu, 11/09/2006 - 2:02pm

Also, I should say that this:

Since the success of a drive is contingent on the number of points it produces

isn't true. It's the best argument against margin-of-victory, or any purely points-based system.

Football has three objects of value: downs, clock, and yardage. There are situations in every game where teams risk wasting one for another: teams risk wasting a down for a better chance to gain large yardage. Teams waste the possibility of a gain in yardage for a better chance to gain a new set of downs. Teams waste a chance at yardage to burn off clock, and teams are willing to risk losing downs in order to conserve clock.

The only one of all of those three that directly relates to points is yardage. A drive that doesn't end in points isn't always unsuccessful - if it consumes significant clock, it could be quite successful.

It's a minor thing, since there are only maybe two or three of those per game, and they're rarely in a game anyway. But still, if you've got a ranking system which doesn't reward a team for successfully taking an action that actually increased their chances of winning, that's not right.

8
by Pat (not verified) :: Thu, 11/09/2006 - 2:05pm

The BCS was best in the beginning when it could account for margin of victory, before it was tweaked to appease the proponents of human polls.

Sigh.

It wasn't tweaked to appease the proponents of human polls. It was tweaked because people thought something was wrong in 2003. They asked statisticians "what could be wrong?" and the statisticians actually suggested removing margin of victory, reducing the weight of the statistical rankings, etc.

The human pollsters have inherent biases

What makes you think the statistical rankings don't?

9
by Doug (not verified) :: Thu, 11/09/2006 - 2:11pm

Very well done, I enjoyed this a lot. Nice to see BYU on your top 25, they deserve some credit.

10
by kubiwan (not verified) :: Thu, 11/09/2006 - 2:13pm

Penn State is a hard team to rank this year -- using FEI, all their losses are to Top 10 teams (if you use Big Ten counting methods :), but all their wins are against teams ranked 45th or lower (see below). They just haven't played any good, but not great, teams and haven't been involved in any upsets. Partially based on the crappy PI call that allowed them to win the Dome, I think 27th is probably a bit high, but not outrageously so.

45. Minnesota
46. Purdue
52. Illinois
57. Northwestern
96. Akron
I-AA. Youngstown State

11
by Tim (not verified) :: Thu, 11/09/2006 - 2:14pm

#8 - Because Pat, if you plug in the same results and the same numbers but change the names of the schools, the computers don't care.

But humans do.

12
by Pat (not verified) :: Thu, 11/09/2006 - 2:17pm

And wait, wait, I missed this:

In fact, in all Division-1A games decided by 7 points or less this year, home teams are exactly 86-86 (50%)

This isn't a compelling argument. Games decided by 7 points or less are essentially statistical tossups. It's unsurprising that home-field advantage can't sway a close game significantly.

What you'd be more interested in is looking at expected results versus actual results for home vs. away. Take all of a team's home games and calculate the efficiency for them. Then take all of their away games and calculate the efficiency for them. What's the average difference in all of college football?

13
by Pat (not verified) :: Thu, 11/09/2006 - 2:21pm

#8 - Because Pat, if you plug in the same results and the same numbers but change the names of the schools, the computers don’t care.

But humans do.

That's only one bias. That's not the only kind of bias you can have. With the statistical rankings, you can plug in the same results, same numbers, from a 24-21 victory where the team was down 24-0 until the last minutes, and the other team put in scrubs, whereas they went up 24-21 on the last play of the game, never with any realistic chance to win.

Statistical rankings are biased as well. They're either biased by their model of the way football works (margin of victory) or their own uncertainty (pure wins/losses).

Just a different kind of bias. And in some way, it's worse. Teams can be playing much better football and statistical rankings will screw them for it, whereas a perfect human poll, even biased by preseason results, wouldn't.

14
by Dennis (not verified) :: Thu, 11/09/2006 - 2:30pm

Just a different kind of bias. And in some way, it’s worse. Teams can be playing much better football and statistical rankings will screw them for it, whereas a perfect human poll, even biased by preseason results, wouldn’t.

But a perfect human poll doesn't exist and never will. Not that a perfect statistical ranking exists either, but I think we'll get a lot closer to perfection with statistical/computer systems than we ever will with a human poll.

The only reason to have human polls is if you think subjective factors should be included. A lot of us think the ranking should be purely objective (whatever that criteria might be), so therefore there is no need for human polls, in our opinion.

15
by Brian (not verified) :: Thu, 11/09/2006 - 2:32pm

#8

It wasn’t tweaked to appease the proponents of human polls. It was tweaked because people thought something was wrong in 2003. They asked statisticians “what could be wrong?” and the statisticians actually suggested removing margin of victory, reducing the weight of the statistical rankings, etc.

As I stated above, I don't follow college football to a large extent (this is especially odd since I now attend the University of Texas). I'm basing my assumption that the statisticians weren't in favor of removing margin of victory on 2 things. The first is my own vague memories of a bunch of debates I didn't really pay attention to at the time. Not a very good basis, I admit. The second is this statement Jeff Sagarin puts on his rankings every week:

In ELO-CHESS, only winning and losing matters; the score margin is of no consequence, which makes it very "politically correct". However it is less accurate in its predictions for upcoming games than is the PURE POINTS, in which the score margin is the only thing that matters. PURE POINTS is... the best single PREDICTOR of future games. The ELO-CHESS will be utilized by the Bowl Championship Series(BCS).

This hardly seems like he endorses removing the margin of victory, and he's the only computer pollster who makes his opinion readily public.

What makes you think the statistical rankings don’t (have inherent biases)?

Well, it's hard to accuse a statistical system of having a bias unless its creator is purposefully giving different weight to data in such a way that its "favored" teams come out better. I've never seen any evidence that this is happening. Certainly, all statistical measures are going to give different values to different aspects of the game. This is why (a) a number of computer polls are considered, and (b) the highest and lowest computer ranking are thrown out when calculating the final result. Human pollsters, on the other hand, are notorious for voting to raise teams in their conference to make them look better, and for raising teams who are staffed by friends of the coach, DA, or (usually) some assistant. More to the point, however, is that humans only see a certain number of games. They are likely to rank some teams based on games they have seen, and other teams based on perusing the box scores. Generally, the teams one has seen are going to be ranked higher than the teams one has only read about. Computers, on the other hand, "see" every game equally.

16
by Fnor (not verified) :: Thu, 11/09/2006 - 2:35pm

This seems pretty cool, actually, despite my dislike for college football.

One concern I have is the use of drive stats: we discussed this last year (at least the commenters did), and I came away with the opinion that they are, for the most part, either misleading or of questionable use.

The other problem is what others have mentioned: the definition of competitive possession is extremely important and would likely warrant a gigantic statistic study in and of itself, if it is indeed possible to quantify. This seems problematic as the foundation of a complicated statistic.

17
by Richard (not verified) :: Thu, 11/09/2006 - 2:40pm

I found this interesting, but flawed. Sadly, Pat has addressed all of my points (plus stuff I hadn't seen/noticed).

18
by Pat (not verified) :: Thu, 11/09/2006 - 2:55pm

This hardly seems like he endorses removing the margin of victory, and he’s the only computer pollster who makes his opinion readily public.

Yet note that he uses the combination of Elo and Predictor for his rankings.

And I'm not sure what you mean by he's the only one who makes his opinion readily public. Most of them do, and most of them don't really have an opinion on it. It doesn't hurt the overall accuracy that bad, and you can introduce significant biases if you use the wrong kind of game output function (like Sagarin's, for instance).

Margin of victory makes a rating more predictive in aggregate: but the ratings shouldn't really be predictive. That's what the games are for. They should be measuring past performance. And for that, you don't want to rate teams by anything other than whether or not they won or lost.

Well, it’s hard to accuse a statistical system of having a bias unless its creator is purposefully giving different weight to data in such a way that its “favored” teams come out better. I’ve never seen any evidence that this is happening.

Nono! You're attributing to maliciousness what I'm trying to attribute to stupidity. Or rather, lack of knowledge. I'm not talking about a system biased for or against a specific team knowingly. I'm talking about systems biased for or against classes of teams unknowingly.

And this does happen. Sagarin's Predictor ratings tend to be wildly inaccurate with unbalanced teams. That's why there are Bayesian corrections in some of the rankings: to look for teams which consistently win in excess of predictions. Unfortunately, there aren't nearly enough games in the season to do that accurately.

(To quote Massey's site, which has a good summary: "The results obtained by the MLE will be predictive in nature since they are based entirely on the scores of games and contain no provision for teams that win, but don't always win big. Other teams will tend to perform in a way that is highly correlated with the strength of their opponent. Differences in style, coaching philosophy, and performance in close games can easily be overlooked if we look at scores alone.")

19
by Pat (not verified) :: Thu, 11/09/2006 - 3:00pm

The only reason to have human polls is if you think subjective factors should be included.

No, the reason you include human polls is because you believe that the data sufficient to create an accurate game output function doesn't exist yet.

20
by DrewTS (not verified) :: Thu, 11/09/2006 - 3:21pm

Interesting read, and a good starting point for a DVOA-style ranking for college.

Personally, I would like to see more information about how the various adjustments are done. Ohio State's number changes from .35899 to .35435 to .46081 via the various adjustments, but I didn't see those adjustments explained very much. Oh, and someone already mentioned that it would be good to have a clear definition of "quality possession." I think we understand the basic idea behind it, but folks on here tend to be detail oriented.

Once again, I thought this was good. And I personally was surprised at how it's actually not that far off of what the human polls say. Rutgers is particularly noteworthy, in light of the debate surrounding their ranking in the polls. There are only a few teams where they seriously disagree, and that seems to be mostly a result of FEI loving the SEC.

21
by DrewTS (not verified) :: Thu, 11/09/2006 - 3:24pm

Oops, I meant competitive possession, not quality. I've got quality wins on the brain.

22
by Pat (not verified) :: Thu, 11/09/2006 - 3:36pm

Incidentally, for the poster who was looking for other statistical ranking authors' opinions on margin of victory, here's Colley's (www.colleymatrix.com):

Ignoring margin of victory eliminates the need for ad hoc score deflation methods and home/away adjustments. If you have to go to great lengths to deflate scores, why use scores?

Or, Massey's:

Over the years, the BCS has gotten criticized for fine-tuning its formula. Recent changes have simplified the system for the better and removed extraneous redundancies. The current setup is a good balance of the traditional human polls, which the fan base is most comfortable with, and the objective computer component.

Massey also makes a good point that the mere existence of the objective statistical rankings is very likely seriously improving the human polls as well. There's some statistical evidence for that, too.

23
by Lincoln (not verified) :: Thu, 11/09/2006 - 3:52pm

You hit 70% winners last week, I wonder what that number was against the spread.

24
by DavidH (not verified) :: Thu, 11/09/2006 - 3:52pm

Margin of victory makes a rating more predictive in aggregate: but the ratings shouldn’t really be predictive. That’s what the games are for. They should be measuring past performance.

Really? It seems like a LOT of humans, when they make their rankings, use criteria along the lines of "if these 2 teams met on a neutral field, who would win?" That's predictive, and not just a measure of past performance.

I'm not saying you are totally wrong, just that reasonable people can disagree over whether the ideal ratings would be predictive or ... retrodictive(?). I fall into the predictive camp. I guess it comes down to whether you view the playoffs/bowl games as a reward for the best season, or as a method to crown the best team.

25
by Gary (not verified) :: Thu, 11/09/2006 - 4:02pm

Regarding home field advantage, wasn't there an article in the Journal of Quantitative Analysis of Sports some months ago that showed the home field advantage for Big XII teams?

Additionally, wouldn't it not make sense to look at the results of close games, since home field advantage would presumably make otherwise not-close games close and otherwise close games not-close? While keeping the close games at 50/50?

26
by BlueStarDude (not verified) :: Thu, 11/09/2006 - 4:10pm

As much as I enjoyed this read, and the idea of a DVOA-like metric for college football, this doesn't really solve the "problem" of college rankings and proclaiming a legitimate national champion. Statistical analysis shouldn't trump wins and losses when it comes to Bowl invites; a 7-2 team shouldn't get to play for the national title when there are a bunch of one-loss teams. I suppose FEI or something like it could be the best way to break ties, but the real issue is that if a school like Rutgers from a supposedly "major conference" can be one of two undefeated "major conference" teams and still not play for the championship, then the NCAA needs to restructure itself so that it can be clear from the outset that Rutgers is not actually in the same "league" as Ohio St. or Florida and has no shot at a championship no matter what they do. Advanced metrics are wonderful, but they can't replace wins and losses.

27
by Pat (not verified) :: Thu, 11/09/2006 - 4:13pm

It's retrodictive. And I said that poorly: what I meant was that I don't think they should exclusively be optimized for predictiveness, especially given the fact that the accuracy is so poor, and biases so prevalent. You basically need a balance of the two, and in my mind, that's what the BCS does: the human polls add the predictive element, and the statistical rankings add the retrodictive element.

Now, granted, the human polls currently kinda suck as a predictive element, but at least they suck in an understandable way.

28
by Dennis (not verified) :: Thu, 11/09/2006 - 4:35pm

No, the reason you include human polls is because you believe that the data sufficient to create an accurate game output function doesn’t exist yet.

If the data doesn't exist, then what's the human poll going to be based on? Subjective opinions.

29
by Pat (not verified) :: Thu, 11/09/2006 - 4:39pm

Oh, no, the data exist. It's the games themselves. They're just not anywhere near machine parsable.

30
by Whiskey (not verified) :: Thu, 11/09/2006 - 4:42pm

This is an interesting article that is similar to Ken Pomeroy's method for modeling college basketball games.

Of course, I categorically reject the use of statistical models to replace a playoff system and believe this method should be optimized for predictiveness only.

31
by DavidH (not verified) :: Thu, 11/09/2006 - 4:47pm

a 7-2 team shouldn’t get to play for the national title when there are a bunch of one-loss teams.

What if the 7-2 team played 9 best teams in the country, and the one-loss teams played the 9 worst?

What if the 7-2 team played 9 best teams in the country, and the one-loss teams played the 9 most average?

What if the 7-2 team played 10th-best through 18th-best teams in the country, and the one-loss teams played the 20th-best through 28th-best?

See what I'm getting at here? Where do you draw the line and say "Yeah, they have one more loss, but their competition was so much better that they still deserve it"?

If Boise St. goes undefeated, and Michigan only loses to Ohio St in sextuple overtime, who gets to play in the championship? I'd vote for Michigan.

32
by Crushinator (not verified) :: Thu, 11/09/2006 - 4:56pm

Go, go, SEC!

Beating each other's ass out of BCS contention year after year.

33
by noahpoah (not verified) :: Thu, 11/09/2006 - 4:57pm

For everyone discussing whether or not and how purely statistical ranking algorithms exhibit bias, there is a mathematical definition of bias that should clear things up a bit.

Applied to the present situation, if there is a true underlying rank ordering of teams, an estimate of the rankings is biased insofar as it differs from the true ranking.

Also, while the FEI ranking system is interesting, I will repeat questions asked above in the comments. How are (quality of) opponent adjustments made, and what is a competitive possession?

34
by Dennis (not verified) :: Thu, 11/09/2006 - 5:02pm

Pat, first you said:

No, the reason you include human polls is because you believe that the data sufficient to create an accurate game output function doesn’t exist yet.

Then you said:

Oh, no, the data exist. It’s the games themselves. They’re just not anywhere near machine parsable.

So does the data exist or not?

35
by NF (not verified) :: Thu, 11/09/2006 - 5:04pm

I think a better way to measure the success of a drive instead of by just looking at points is to come up with an expected score system based on field position, such as exists for the DVOA special teams, and use that on drives that did not score points in addition to points scored to rate drives.

36
by Jeremy Billones (not verified) :: Thu, 11/09/2006 - 5:05pm

Re: 31

Wouldn't undefeated Louisville make for a more relevant what-if?

37
by Pat (not verified) :: Thu, 11/09/2006 - 5:09pm

So does the data exist or not?

Depends if you consider the game films themselves 'data' or not.

38
by DavidH (not verified) :: Thu, 11/09/2006 - 5:10pm

34:
It seems to me that Pat is saying all the necessary information is there, in the form of "what happened in the games." But that nobody keeps track of exactly what happened at every point in every game in a way that it can be broken down into data that an algorithm can use to make a ranking. So the only current way we have of combining all the information is to ask the opinions of the people who watched the games.

39
by DavidH (not verified) :: Thu, 11/09/2006 - 5:10pm

Or I could let Pat answer himself.

40
by Dennis (not verified) :: Thu, 11/09/2006 - 5:15pm

Re 37: If you would evaluate a team differently based on watching the game than based on reading the play-by-account of the game, then you aren't even in the same universe as 'data'.

41
by Dennis (not verified) :: Thu, 11/09/2006 - 5:16pm

That should be 'play-by-play account'

42
by Pat (not verified) :: Thu, 11/09/2006 - 5:20pm

Applied to the present situation, if there is a true underlying rank ordering of teams, an estimate of the rankings is biased insofar as it differs from the true ranking.

Not exactly. Bias is when the mean value of a parameter, with uncertainty, differs from the true value of the parameter. It's the systematic error present in the system. Problem is - no true value. But you can also estimate bias via another way: if you have a known unbiased measure, with uncertainty, you can look at the difference between the biased measure and the unbiased measure, and average away the uncertainty.

Win-loss rankings are purely unbiased - they're just less precise (they're not less accurate) than others including more data.

So a simple estimate of bias would be to compare the rankings developed via a more advanced method with those of a pure win-loss, and look to see if certain classes of teams are constantly overrated or underrated. That's bias.

(Of course, you run into a slight problem in that rankings are coarse-grained.)
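
The comparison described above can be sketched in a few lines. This is only an illustration of the idea, not anyone's actual method: the function name, the team labels, the class labels, and the toy rankings are all hypothetical, and a real check would need many seasons of data.

```python
from collections import defaultdict

def mean_rank_shift_by_class(advanced_rank, win_loss_rank, team_class):
    """Average (win_loss_rank - advanced_rank) for each class of team.

    A positive value means the advanced method ranks that class better
    (a smaller rank number) than the win-loss baseline does, i.e. the
    class looks overrated relative to the baseline.
    """
    shifts = defaultdict(list)
    for team, adv in advanced_rank.items():
        shifts[team_class[team]].append(win_loss_rank[team] - adv)
    return {cls: sum(vals) / len(vals) for cls, vals in shifts.items()}

# Hypothetical toy data: four made-up teams in two made-up classes.
advanced = {"A": 1, "B": 2, "C": 3, "D": 4}   # margin-of-victory style ranking
baseline = {"A": 3, "B": 4, "C": 1, "D": 2}   # win-loss-only ranking
classes = {"A": "high-scoring", "B": "high-scoring",
           "C": "defensive", "D": "defensive"}

shift = mean_rank_shift_by_class(advanced, baseline, classes)
print(shift)
```

A class whose average shift stays well away from zero over many seasons would be a candidate for the kind of systematic bias being discussed here. With only a handful of teams, as in the toy data, the number means nothing.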

43
by Pat (not verified) :: Thu, 11/09/2006 - 5:22pm

Re 37: If you would evaluate a team differently based on watching the game than based on reading the play-by-account of the game, then you aren’t even in the same universe as ‘data’.

What's that supposed to mean?

I think most coaches at any level of football would do exactly that.

44
by Pete (not verified) :: Thu, 11/09/2006 - 5:54pm

As Sagarin will tell you, even his predictor model (margin of victory) is far from perfect. He does prefer it to ELO-CHESS as do I.
A computer model does things without normally biasing itself with a specific team ("Notre Dame is awesome" vs. "Notre Dame is over-rated"). However, some forms of bias can be good. For instance, Florida lost to Auburn by 11 points. However, the last 7 points came on a turnover during a last-ditch effort when down by 4 points. Is this really just comparable to a 4-point loss?
Also, some people may be better suited to take injuries and suspensions into consideration. If the Ohio State QB is out for a single game due to a death in his family and Ohio State loses by a field goal, how good a team are they when he returns?

45
by zlionsfan (not verified) :: Thu, 11/09/2006 - 5:56pm

I liked this. It's a good starting point. I wonder if one obstacle to examining drives in more detail is that it can be difficult to get play-by-play info at that level for some games (usually between non-BCS teams).

46
by Wanker79 (not verified) :: Thu, 11/09/2006 - 5:57pm

Re: WTH is "Competitive Possession"?

Game Efficiency measures only the competitive possessions of a game, ignoring “garbage-time” scores and stops by both opponents.

If I'm understanding that correctly, a competitive possession is any drive that ends in a score.

So if you look at the scoring summary of the USC/Tex game:
USC TD
Tex FG
Tex TD
Tex TD
USC FG
USC TD
Tex TD
USC TD
USC TD
Tex FG
USC TD
Tex TD
Tex TD

USC had 6 offensive Competitive Possessions and 7 defensive Competitive Possessions.

Texas had the exact opposite.

But if you look at the "Game Efficiency" table:
USC - 13 Competitive Possessions
Tex - 12 Competitive Possessions

So I guess I must be missing something (I can't figure out why Texas has one fewer Competitive Possession), but I'm pretty sure I'm close.

47
by doktarr (not verified) :: Thu, 11/09/2006 - 6:07pm

Wanker, no, a "competitive possession" includes non-scoring drives. You could re-write that sentence as:

Game Efficiency measures only the competitive possessions of a game, ignoring “garbage-time” scores and “garbage-time” stops by both opponents.

But it only considers your offensive possessions, as I understand it, so there's nothing unusual about one team having one more possession than the other.

48
by Kal (not verified) :: Thu, 11/09/2006 - 6:09pm

I too would like to see how the opponent adjustment is made. I also think home field advantage has to factor in here one way or another; as remarked above, HFA doesn't matter in close games but should show up quite heavily in the big blowout games. I'd see whether you could factor it in and see how much better it was as a predictive quality vs. the normal way. That's arguably the better indication of whether it should be there or not.

49
by Pat (not verified) :: Thu, 11/09/2006 - 6:09pm

A computer model does things without normally biasing itself with a specific team

They don't bias themselves to any particular team. They bias themselves to classes of teams: ones that don't agree with their model for the game. In Sagarin's case, teams that play games that are on the low side of scoring, and teams that play games on the high side of scoring. (That's why Massey uses a 2D function).

That's the one benefit of using drive-by-drive data, although as I've said elsewhere, I really, really don't understand the way this is being presented. If it's really just "(points for - points against)/number of competitive possessions" I don't see how it can do what it's claiming. I don't see how it can tell between a quarter-by-quarter score of "0-0 24-0 0-0 0-0" and "0-0 0-0 0-0 24-0", both of which could be two ridiculously different games.

50
by Wanker79 (not verified) :: Thu, 11/09/2006 - 6:11pm

Re: 47

You're right. It was just a coincidence that the total number of scoring drives equaled the number of offensive possessions by USC and came damn close to the number of offensive possessions by Texas.

51
by Wanker79 (not verified) :: Thu, 11/09/2006 - 6:19pm

I have absolutely no objective data to support this, but I think I agree with ignoring HFA. I think the point is that if a game is a blow-out, it likely doesn't matter where that game was played. It may make the game closer, but it probably doesn't affect the overall w/l outcome. So if HFA is only good for helping a team win by more than it would have on the road, does it really matter? OTOH, the 0.500 record of close games may be the result of what would have been a loss but the HFA kept the game close enough to be a coin-flip. I don't have a damn clue what I'm rambling about anymore..Nevermi...*trails off*

52
by DavidH (not verified) :: Thu, 11/09/2006 - 6:24pm

How's this for defining "garbage time" ...

Working backwards from the end of the game, drive by drive, replace all drives by the winning team with 0 pts, and all drives by the losing team with 8 pts. The point at which the adjusted score crosses over to favoring the losing team is the last drive that counts as "competitive."

And of course, don't count drives that only consist of kneeling

Haven't thought this through. Maybe you should use 7 instead of 8, or there should be an adjustment of 1 drive in either direction. But I think it's a start.

53
by Dan (not verified) :: Thu, 11/09/2006 - 6:39pm

Pat, isn't W-L record biased against teams with hard schedules (and in favor of teams with easy schedules)?

54
by Pat (not verified) :: Thu, 11/09/2006 - 6:42pm

#51: Where it affects the W/L outcome is where it pushes two teams separated by the equivalent of less than 1 score (which is a tossup - say, OSU-LSU by those rankings) into the two-score region (which is not).

Just to inject a bit of math: with 172 events, the best you can say is that you know that there's less than a 1% chance that HFA is less than an 11% effect (something called the Chernoff bound). That's the best you can say. And that's even assuming that the factors that go into a less-than-7 point victory are affected, which is not necessarily true.
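
For anyone who wants to see where a number of that size comes from, here is a minimal sketch. It assumes the bound in question is the Hoeffding form of the Chernoff bound for a proportion estimated from n independent games, P(observed - true >= t) <= exp(-2 n t^2), solved for t. The function name is made up for illustration, and the comment above may have been computed differently.

```python
import math

def hoeffding_margin(n_games, tail_prob):
    """Smallest deviation t such that exp(-2 * n * t^2) <= tail_prob.

    This is the one-sided Hoeffding bound solved for t: with n_games
    independent outcomes, the observed win fraction exceeds the true
    one by more than t with probability at most tail_prob.
    """
    return math.sqrt(math.log(1.0 / tail_prob) / (2.0 * n_games))

# 172 close games and a 1% tail, the figures quoted in the comment above.
margin = hoeffding_margin(172, 0.01)
print(round(margin, 3))
```

Under these assumptions the margin comes out at a little under 0.12, the same ballpark as the 11% quoted above. In other words, 172 games cannot pin down an effect much smaller than about ten percentage points.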

55
by Wanker79 (not verified) :: Thu, 11/09/2006 - 6:45pm

Re: 52

I like that idea. But if I'm thinking correctly, that would mean that in a 35-14 game where all scoring is in the first 3 quarters, only the final 3 drives of the game would be considered "garbage"?

56
by Pat (not verified) :: Thu, 11/09/2006 - 6:46pm

Pat, isn’t W-L record biased against teams with hard schedules (and in favor of teams with easy schedules)?

If all you did is rank teams by win-loss record, yes. But that's not what I meant - I meant models that use win-loss as their only inputs. Like all of the current BCS models. They are demonstrably unbiased, because there are no teams that win more than they win. :)

(In contrast, there are teams that win more games within 10 points than others - defensive teams - which is why margin of victory is biased.)

57
by DavidH (not verified) :: Thu, 11/09/2006 - 7:20pm

55:
Last 3 drives for the losing team, and last 2 or 3 for the winning team (depending on who had the ball last).

I don't think this is as big of a problem as you think. There have been 7 games so far where nobody has scored in the 4th quarter. 3 of those have had margins of victory of 20 or more points. I'll go through them real quick and post what would be defined as garbage time.

58
by DavidH (not verified) :: Thu, 11/09/2006 - 7:32pm

Oops. I meant 2 have had 20+ MOV's.

NO 23 ATL 3
http://tinyurl.com/ykvq4u
The entire 4th quarter would be garbage time. (3 ATL drives, 2 NO drives)

NE 31 MIN 7
http://tinyurl.com/ykoge5
OK, if you follow my exact rule and wait until the losing team is "ahead" and not tied, the entire 4th quarter is garbage time. (4 NE drives, 3 MIN drives) If you only go until they are tied, then the last 10:18 is garbage. (3 NE drives, 2 MIN drives)

------
There is a 3rd that would have a 21 pt MOV if we change a NO FG to a TD. I'm going to treat it like that, because then it would be your exact scenario.

NO 31 TB 14 (i.e. NO 35 TB 14)
http://tinyurl.com/yfr6dz
The entire 4th quarter would be garbage time (2 drives for each team). If you use 7 points instead of 8 points, then the 4th quarter PLUS TB's last drive of the 3rd quarter would be garbage.

59
by Pat (not verified) :: Thu, 11/09/2006 - 7:38pm

David: well, yeah, but that's the NFL, where there's a ton more parity. In college, you routinely get several 40+ point spreads. Heck, last year, the Penn State-Illinois game was a 60-point spread, and no one could call the entire second half anything but "garbage time". Penn State didn't throw the ball once in the second half.

60
by Travis (not verified) :: Thu, 11/09/2006 - 7:39pm

If all you did is rank teams by win-loss record, yes. But that’s not what I meant - I meant models that use win-loss as their only inputs. Like all of the current BCS models. They are demonstrably unbiased, because there are no teams that win more than they win. :)

Except for, of course, the Billingsley ratings.

61
by DavidH (not verified) :: Thu, 11/09/2006 - 7:42pm

Let's look at the Chi-SF game from two weeks ago, where Chicago was up 41-0 at half time.
http://tinyurl.com/ye36pv

If you use the rule I laid out, the entire 2nd half PLUS the last 49ers drive of the 1st half are garbage time. (4 CHI drives, 5 SF drives)

If you adjust the rules so that the kneeldown by SF at the end of the 1st half doesn't count as a drive, then you also include the last Bears TD in garbage time. (5 drives each)

Using 7 instead of 8 gives the same results.

62
by DavidH (not verified) :: Thu, 11/09/2006 - 7:53pm

OK, in that 63-10 Penn St-Illinois game, the entire 2nd half plus the last play of the 1st half is garbage time.

Or, if you use 7 instead of 8 (or don't count that last play of the 1st half as a possible score for Illinois), then it's all of the 2nd half, plus the last drive for each team in the first half.

Subjectively, should even more of the game be garbage? Maybe there should be an additional "up by 35 rule" or something.

63
by Travis (not verified) :: Thu, 11/09/2006 - 8:02pm

Subjectively, should even more of the game be garbage? Maybe there should be an additional “up by 35 rule” or something.

Yeah, because no college team has ever lost when up by 35. :)

64
by DavidH (not verified) :: Thu, 11/09/2006 - 8:09pm

OK, how about an "up by 36" rule? :)

65
by kibbles (not verified) :: Thu, 11/09/2006 - 8:40pm

Unless Florida mails it in down the stretch, Tennessee won’t play for the SEC title this year but may be in a prime position to steamroll an unsuspecting bowl opponent in January.

Even if Florida mails it in down the stretch, Tennessee won't play for the SEC title. Florida has already clinched the SEC East, with only one SEC game remaining, one fewer loss than Tennessee, and the head-to-head tiebreaker.

66
by BlueStarDude (not verified) :: Thu, 11/09/2006 - 8:43pm

Hi DavidH. RE: "See what I’m getting at here? Where do you draw the line and say “Yeah, they have one more loss, but their competition was so much better that they still deserve it.”"

But this speaks to my point, which I'm probably not articulating well. Boise St. and Michigan shouldn't both be Division 1, because really Boise St. has no chance to win the Division 1 championship. If a team has no realistic chance at a championship, then WTF are they playing for? The fun of it?

College football has a structural problem and no ranking system is going to fix it. They need to reorganize so that there are fewer teams on the top level (the six major conferences, say) and let those teams duke it out (limit each team from playing more than one game against lower level competition).

67
by NY expat (not verified) :: Thu, 11/09/2006 - 8:51pm

Great article and great discussion. Just to toss in an additional suggestion, what about lessening weights of older games? In particular, California doesn't seem to be the same team that got blown out by Tennessee -- wasn't Ayoob replaced as QB after a few games? Unfortunately, then anyone who plays in a weaker conference suffers because the conference schedule covers most of the end of the season.

68
by DavidH (not verified) :: Thu, 11/09/2006 - 9:24pm

66:
Oooooooooh, OK. That would be sweet.

69
by Arkaein (not verified) :: Thu, 11/09/2006 - 9:49pm

Re 26, 31:
I agree that as a DVOA-like metric, these ratings are not suitable for directly deciding who gets to play in the National Championship game. However (assuming they are accurate), they would make a great strength of schedule adjustment to factor in with total record, creating something like the RPI (rating percentage index, I think) used in NCAA basketball and hockey tournament selections.

One possible formula could work like this:

RPI = WinPct * (SoS * K + C)

and

SoS = avg of opponents' FEI

where K is a factor to adjust the weighting of record vs. strength of schedule between 0 and 1, and C is a constant to shift all values to a positive range. This way a team with a loss or two could jump undefeated teams in the ranking if they had a sufficiently high SoS.
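
To make the proposal concrete, here is a small sketch of the formula as written. Everything numeric is made up for illustration: the two teams, their opponents' FEI values, and the choices K = 1.0 and C = 0.5 are hypothetical, not real ratings.

```python
def strength_of_schedule(opponent_fei):
    """SoS as proposed above: the plain average of the opponents' FEI."""
    return sum(opponent_fei) / len(opponent_fei)

def rpi(win_pct, sos, k, c):
    """RPI = WinPct * (SoS * K + C), exactly as in the comment above."""
    return win_pct * (sos * k + c)

K = 1.0   # hypothetical weight on strength of schedule (0 to 1)
C = 0.5   # hypothetical shift that keeps (SoS * K + C) positive

# Hypothetical one-loss team with a hard schedule (9-3, win pct 0.75).
hard_sos = strength_of_schedule([0.30, 0.20, 0.10])
hard_schedule_team = rpi(0.75, hard_sos, K, C)

# Hypothetical undefeated team with a weak schedule.
weak_sos = strength_of_schedule([-0.20, -0.10, 0.00])
weak_schedule_team = rpi(1.00, weak_sos, K, C)

print(hard_schedule_team, weak_schedule_team)
```

With these made-up inputs the team with losses comes out ahead of the undefeated one, which is the behavior the proposal is after. Note that C has to be larger than K times the worst possible SoS, or the bracketed term goes negative and a better record would lower the rating.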

70
by Whiskey (not verified) :: Thu, 11/09/2006 - 11:02pm

69:

The problem with using a statistical model in lieu of a playoff is that, by using the model, you destroy its objectivity. Take RPI, for example. The selection committee for the NCAA tournament has been moving away from the RPI, since teams long ago figured out how to game the system to make their RPI's higher. Those that didn't, like the 2002 Butler team, suffered. In subsequent years, they and other competitive teams scheduled in such a way as to raise their RPI's without necessarily improving as teams.

Another example is point differential, which caused stronger teams to run up the score on inferior teams.

If an efficiency score was used, teams in contention for the so-called national championship would then try to maximize their efficiency scores against lousy teams, making the deciding factor boil down to a team's ability to run up the score.

71
by Jason Scheib (not verified) :: Thu, 11/09/2006 - 11:06pm

Sounds like a variation on my Actual Turnover Ratio - every drive ends in either a score or a turnover, and this is just a different way of going about measuring that.

72
by Pat (not verified) :: Thu, 11/09/2006 - 11:15pm

College football has a structural problem and no ranking system is going to fix it. They need to reorganize so that there are fewer teams on the top level (the six major conferences, say) and let those teams duke it out (limit each team from playing more than one game against lower level competition).

Or, probably more appropriately, they need to split Division IA in two. There's just too much of a difference between the BCS conferences and the non-BCS conferences. And yes, you'll get a good team in non-BCS conferences occasionally, but you get Division IAA teams who beat Division IA occasionally too, and there's no one really crowing for a Division IA-IAA showdown.

73
by Travis (not verified) :: Thu, 11/09/2006 - 11:27pm

In particular, California doesn’t seem to be the same team that got blown out by Tennessee — wasn’t Ayoob replaced as QB after a few games?

Longshore has started every game. Ayoob only came in 1) what many would consider garbage time against Tennessee - down 35-0, 8 minutes to go in the 3rd quarter and 2) in the 3rd quarter against Portland State, up 42-16 (and was himself replaced for Steve Levy in the 4th).

74
by Pat (not verified) :: Fri, 11/10/2006 - 12:17am

Except for, of course, the Billingsley ratings.

Incidentally, for those who don't know, Billingsley does use margin of victory, in a twisted way. Teams get bonuses for shutouts and near-shutouts.

God, I wish they would get rid of that rating. It's so ridiculously stupid it's not even worth talking about.

Another example is point differential, which caused stronger teams to run up the score on inferior teams.

Exactly. And you're not trying to "promote good sportsmanship", because those teams weren't always not running up the score to be good sports. The problem is that in football, once you get a strong enough lead, the best way to ensure a victory is to end the game.

Which means it's worse than saying "oh, we'll just cap margin of victory at X points." Whatever you come up with, unless it perfectly measures the likelihood of the outcome of the game, it will change the way the game is played.

Imagine if a team, on the last play of the last game of the season, knows that they have to win by, I dunno, 4, instead of just winning by 3, in order to get to the National Championship. And so they go for the TD rather than a field goal they were already in range for.

That's why you can include ranking systems based on wins/losses only: because you can't bias them.

75
by Becephalus (not verified) :: Fri, 11/10/2006 - 12:29am

-66 hit the nail right on the head

College football needs to be made just like English Football Leagues. Roughly the 24 best teams in one league, next 24 best in another, etc.

The number of meaningful games would probably increase by a factor of 5, players would be better prepared for pros, the games would be more fun to watch, and many more teams could truthfully say they achieved their maximum expectation for the year.

For travel reasons you could break each "league" into two 12 team divisions geographically organized if you like.

76
by Travis (not verified) :: Fri, 11/10/2006 - 12:34am

College football needs to be made just like English Football Leagues. Roughly the 24 best teams in one league, next 24 best in another, etc.

Seriously, a great idea. College football is the one American sport where promotion/relegation could work. (Boise State would be in the Premier League by now, and Duke would be playing in one of the non-League divisions.)

77
by BlueStarDude (not verified) :: Fri, 11/10/2006 - 1:16am

Hi Pat: RE: "Or, probably more appropriately, they need to split Division IA in two."

Yeah, that's basically what I was getting at. Thanks for putting it more eloquently. ;) Really, I don't care what they would call the various divisions because, like I said earlier, I only care about college football as a minor league for the pros - but I think I would care more about it if the system for determining a national champ made more sense; if we all knew clearly beforehand whether a school like Rutgers was playing in the same league as Ohio State or not. Enough of these wishy-washy polls and BCS rankings which are only good for generating the same old pointless arguments year after year on SportsCenter, PTI, etc.

78
by Scott de B. (not verified) :: Fri, 11/10/2006 - 1:51am

It wasn’t tweaked to appease the proponents of human polls. It was tweaked because people thought something was wrong in 2003.

People = conventional wisdom = polls. They're the same.

79
by Brian C Fremeau (not verified) :: Fri, 11/10/2006 - 3:14am

Thanks to everyone at Football Outsiders for taking a first step to college DVOA. And thanks to everyone's feedback so far. Hopefully I can clear up some of the questions posed today.

DavidH's definition of competitive time versus garbage time is pretty close to what I actually use. The possession that begins garbage time occurs when a team's score deficit is greater than 8 times the number of their remaining team possessions plus one. All other possessions (except for 1st-half-ending kneel downs/clock kills) are competitive possessions. This methodology is admittedly simplistic, and I agree that it should be refined as FEI develops.
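
As a rough illustration of that rule, here is a sketch of the test applied at the start of a possession. It reads "8 times the number of their remaining team possessions plus one" as 8 * (remaining + 1), counting the possession about to start as the "plus one"; the other reading, 8 * remaining + 1, is also possible, so treat the function name, parameters, and interpretation as assumptions.

```python
def garbage_time_begins(deficit, remaining_possessions, points_per_possession=8):
    """True if the trailing team's deficit can no longer be erased.

    Assumed reading of the rule above: the deficit must exceed
    points_per_possession * (remaining_possessions + 1), where the
    "+ 1" stands for the possession that is about to be played.
    """
    return deficit > points_per_possession * (remaining_possessions + 1)

# Down 25 with two possessions left after this one: 25 > 8 * 3, so garbage time.
print(garbage_time_begins(25, 2))
# Down 24 in the same spot: three 8-point drives would tie it, still competitive.
print(garbage_time_begins(24, 2))
```

The hard part in practice is the second argument: the number of remaining possessions is only known after the game, which is presumably why the rule is applied looking back at completed drive charts.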

While I agree with Pat that the value of the result of an isolated possession should consider yardage, clock and points, Game Efficiency and FEI as described above are more interested in the collective success of all drives played in a game. Imagine a drive sequence that digs a team out of its own endzone then punts, pins its opponent deep in its own territory forcing a punt and return to the opp 45, and concludes with a field goal. You may assign a positive success value to the drive that changed field position and the defensive stop, but a slightly negative value to the short-field drive that netted only a field goal. I assign a Game (or in this case Sequence) Efficiency to those three drives as (3 points/7)/(3 possessions/2), or 28.6%. Separating out the drives would certainly distribute credit to the offense, defense and special teams efficiency, but Game Efficiency, as currently defined, is interested only in a single figure that succinctly describes how the team played the sequence as a whole.
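
The arithmetic in that example can be written out as a tiny function. This is a sketch generalized from the single worked case above, so the function name and the use of net points for longer sequences are assumptions rather than the published FEI definition.

```python
def sequence_efficiency(net_points, possessions):
    """(net points / 7) divided by (possessions / 2), as in the example above.

    Dividing by 7 expresses points in touchdown-sized units; dividing
    the possession count by 2 reflects that the sequence mixes one
    team's offensive and defensive possessions.
    """
    return (net_points / 7) / (possessions / 2)

# The three-drive sequence described above: punt, defensive stop, field goal.
efficiency = sequence_efficiency(net_points=3, possessions=3)
print(f"{efficiency:.4f}")
```

For the three-drive example this works out to roughly 0.2857, or a bit under 29%.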

I'll take the advice suggested and reexamine the reasons for factoring/discounting home field advantage.

And as for the merits of a ranking system versus a playoff, I certainly don't propose we eliminate a championship game just because computers or humans unanimously believe one team is better than the other. But I do think rating systems attempt to answer the question "Who is the best team?", while a playoff answers a related, but different question, "Who played the best sequence of games at season's end among the teams selected for the playoff?" With the BCS as it currently exists (a 2-team playoff) or with a hypothetical future expanded playoff, both questions are in play in college football. I kind of like it that way.

80
by Andrew (not verified) :: Fri, 11/10/2006 - 3:55am

Wanker79 #51:

I think I agree with ignoring HFA. I think the point is that if a game is a blow-out, it likely doesn’t matter where that game was played. It may make the game closer, but it probably doesn’t effect the overall w/l outcome. So if HFA is only good for helping a team win by more than it would have on the road, does it really matter?

Or perhaps more to the point, a lot of the home field advantage stemming from blowouts is from games where a powerhouse team shells out $750,000 for Temple or Ball State or Texas Christian to come and play homecoming. If most of the blowouts are in highly uncompetitive games due to the relative ability of the teams playing, then HFA really doesn't do much when two closely matched teams meet up.

81
by Andrew (not verified) :: Fri, 11/10/2006 - 4:09am

More regarding Garbage Time.

Garbage Time should not merely be when you go up by 35 points or whatever. It should be rationally tied in to scoring records by quarter, because this is what truly influences the beginning of Garbage Time.

I.e., if the records are 42 points in a half, 28 in a quarter, and 21 in the last 6 minutes, and no team ever overcoming a 35 point deficit, the MOV during the game would need to be balanced off against these records to determine when Garbage Time actually begins. Of course part of why no team has ever overcome a 35 point deficit is because teams in blowout wins don't take their foot off the other team's throat in those games until the margin is far above 35 points unless it is already the 4th quarter.

It might also help when eliminating Garbage Time in the periodic blowout to look at the play-by-play data to see when the coach sends in the 2nd and 3rd and 4th stringers. For example, Penn State began pulling starters midway through the 2nd quarter against Illinois last year.

82
by Subrata Sircar (not verified) :: Fri, 11/10/2006 - 7:11am

How does Rutgers' FEI index change with their win tonight?

83
by Subrata Sircar (not verified) :: Fri, 11/10/2006 - 7:12am

Oops, sorry for the redundancy there. Taking this win into account, what are Rutgers and Louisville's new FEI ratings?

84
by Arkaein (not verified) :: Fri, 11/10/2006 - 11:41am

Re 70:
Note that I did indicate that we would have to assume that FEI was an accurate representation of a team's ability. I think that any DVOA-like metric that looks at every play (which I realize this doesn't quite do) and takes into account strength of opponents would be very difficult to game. RPI tends to use much simpler formulas, which is probably why it can be gamed more easily.

I still like to see a tournament in college football (for superior entertainment value as well as fairness), even if just the top four teams, but I think an appropriate purely objective measure would be the best way to select those teams. The BCS isn't willing to go this far, but I think that the idea has merit.

85
by jacknd71 (not verified) :: Fri, 11/10/2006 - 11:47am

In response to questions #82 and #83, I think you will discover that not only are the FEIs of Louisville and Rutgers affected by the outcome of last night's game, but also the FEIs of many other teams which have played them or played teams which played them. I suspect that a calculation now, before the rest of week 11 plays out, will be meaningless.

86
by Wanker79 (not verified) :: Fri, 11/10/2006 - 12:14pm

Re: 80

Yeah, that's pretty much what I was trying to get at, but then I started to talk myself out of it halfway through the thought.

87
by Wanker79 (not verified) :: Fri, 11/10/2006 - 12:26pm

I still like to see a tournament in college football (for superior entertainment value as well as fairness), even if just the top four teams

That's exactly what I've been hoping for. You wouldn't even have to call it a playoff, just add one additional Bowl game to be played by the winner of the #1 vs #4 and the winner of the #2 vs #3 Bowl games. All the seeding could still be chosen using the exact same method as now (just so there isn't TOO much change all at once). I really can't fathom why anyone would be against this. The biggest argument I've heard against a playoff system is that you'd be taking away some of the luster of the Bowl season or that it'd take away from the importance of the regular season. But you're not taking away from the Bowl games. If anything, you're adding interest to 2 Bowls. And the regular season is still just as important (if not more so) because the seeding for the 2 play-in Bowls would still work the same as it does now, except there'd be even more teams with a chance. I just don't understand. And until they figure it out, I just can't take college football as seriously as the other major sports (plus hockey, dammit).

88
by zlionsfan (not verified) :: Fri, 11/10/2006 - 4:33pm

Re 75: yes, that would be very interesting, and I do like the relegation/promotion idea. However, you have to keep in mind that you'd get the bad parts of the system as well as the good parts, something to which you alluded. I think this would have to wait until football and colleges are separated.

For one thing, in England, you've got four leagues distributed in an area roughly the size of Louisiana. Even dividing teams into geographic divisions, you'd eventually have some really long road trips, which would rule out most student trips to road games (and quite a few non-student trips). Interestingly enough, right now in the Massey ratings (which helpfully rank everyone together), our League 2 (using their terminology) would be split pretty evenly, with the West covering Texas to Oregon and the East covering Louisiana to Massachusetts, so it's not quite as bad right now as it could be.

For another, you'd have the same financial disparities here that you see there. So Northwest Missouri makes it up into League 1 with Georgia, Alabama, and Miami (FL). It would hurt those schools just as much to be in League 1 (and play smaller schools with smaller gate receipts) as it would hurt NW Mo to try to compete with them. But maybe that would be motivation to avoid relegation ...

Plus you wouldn't be able to do the in-season competitions - you'd have to stick to postseason play.

I still think it would be really cool, but I don't think it would ever happen.

89
by grailsearch (not verified) :: Sat, 11/11/2006 - 9:10am

The problem with almost all RPI-type rankings is that they're very easy to manipulate if you're a team with legitimate championship aspirations. Other than conference-mandated contests, play 0 elite teams and 0 bottom half teams.
The RPI-type rankings are misleading because it's not your averaged SOS that really defines your record but the # of teams you play that you could reasonably lose to. Two teams with overall SOS that average equally could have very different numbers in that case. There's a HUGE difference between playing 2 teams that could reasonably defeat you and playing 4.
The bowl system completely undermines the legitimacy of national championships. (USC is always listed as having won recent consecutive national championships; but by the logic that allows them to claim two, it makes more sense to claim they won zero. They weren't in the national title game the first year they supposedly won, and the year of their "second" title an undefeated Auburn team clearly had more quality wins.) It also destroys the integrity of the regular season. For all the denigration of smaller conference schools, very few schools from power conferences have more than two good wins, and the system encourages lopsided matchups, which are tedious for the fan, put good teams in the position of either being bad sports or wasting their time, and put bad teams in a position to be humiliated.
How much fun is a sport where you're legitimately interested in what your team does only a couple of times a year?