
21 Nov 2012

FEI Week 12: Similarity Scores

by Brian Fremeau

Another Saturday is in the books, and another day full of surprises shook up the college football universe. In the last two weeks, Alabama, Kansas State, and Oregon were each positioned in the driver’s seat with a clear path to the BCS national championship game. Then, each lost a game as a double-digit favorite. Notre Dame now finds itself at the top of the BCS standings, looking to avoid a similar fate against rival Southern Cal.

There’s nothing specific in the data we track that would have predicted the recent poll carnage with any degree of certainty. The numbers did suggest that Oregon was more likely than not to stumble at some point in the home stretch, and Stanford’s defense did pose a major threat to the Ducks’ offensive attack, but FEI still projected Oregon to win by two touchdowns last weekend. Baylor was certainly capable of big plays and efficient offense against Kansas State, but FEI didn’t expect the Bears defense to play their best game of the season and turn it into a rout.

Projecting single games is a bit of a crapshoot, of course, and we need to see the forest for the (Stanford) trees. How successful are aggregated game-by-game projections? How successful are season-long projections? I haven’t focused much on projections in the weekly FEI column here at Football Outsiders, but I have been tracking the data at my personal website. There is still much more analysis to be done, but I’m satisfied with the progress so far, and I’m interested in making more sophisticated projections by including more data going forward.

I’ve been playing around recently with a new model I developed to calculate team similarity scores. I calculate a variety of metrics each week that characterize the offensive, defensive, special teams, and overall success and efficiency of each FBS team. The overall FEI rating is the most important of these metrics, but I’ve calculated the relationship between dozens of other stats (my drive-based stats, as well as many common stats) and each is important in its own context.

To build a new team similarity model, I selected 25 measures to define each team’s efficiency profile. The profile includes the opponent-adjusted overall FEI, offensive FEI, and defensive FEI ratings, as well as unadjusted data such as game efficiency, field-position advantage, special-teams efficiency, and 19 other offense, defense, and special teams splits. The significance of each measure in the profile was weighted in accordance with that stat’s correlation with overall winning percentage. I then compared the efficiency profile of the contenders against the efficiency profiles of each FBS team in the last five seasons, 599 teams in all.
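The comparison described above can be sketched as a weighted distance between profiles. This is only an illustration under stated assumptions: the column does not say which distance measure or scaling is used, so the weighted Euclidean distance, the standardized (z-score) inputs, the function names, and every number below are hypothetical.

```python
import math


def weighted_distance(profile_a, profile_b, weights):
    """Weighted Euclidean distance between two efficiency profiles.

    Assumption: each profile maps a measure name to a standardized value,
    and each weight reflects that measure's correlation with winning
    percentage. The actual FEI similarity formula is not published here.
    """
    total = 0.0
    for name, weight in weights.items():
        diff = profile_a[name] - profile_b[name]
        total += weight * diff * diff
    return math.sqrt(total)


def most_similar(target, candidates, weights, k=1):
    """Return the labels of the k candidate profiles closest to the target."""
    ranked = sorted(
        candidates,
        key=lambda label: weighted_distance(target, candidates[label], weights),
    )
    return ranked[:k]


# Made-up weights and standardized values, for illustration only.
weights = {"FEI": 0.9, "OFEI": 0.7, "DFEI": 0.7, "STE": 0.3}
target = {"FEI": 1.8, "OFEI": 1.5, "DFEI": -1.9, "STE": -0.6}
candidates = {
    "Team A": {"FEI": 1.7, "OFEI": 1.4, "DFEI": -1.6, "STE": 0.9},
    "Team B": {"FEI": 0.2, "OFEI": 1.6, "DFEI": 0.5, "STE": -0.5},
}

closest = most_similar(target, candidates, weights)
print(closest)
```

In this toy data, Team A differs sharply from the target on special teams but still comes out as the closest match, because that measure carries a small weight. A lightly weighted measure can be an outlier without breaking the overall match.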

By this analysis, the team with the most similar efficiency profile to the 2012 Notre Dame Fighting Irish is the 2010 Oklahoma Sooners, as illustrated in the chart below.

I find this team similarity model to be really interesting, but what can actually be learned from it? For starters, it is a good reminder that similarities in advanced stats don’t always translate to traditional statistics. The 2010 Oklahoma team scored at least 40 points seven times, whereas this year’s Notre Dame team has done so only twice. The Irish have given up 20 points only once this year, but the 2010 Sooners did so ten times. There might not be any obvious similarities on the surface, and yet the opponent-adjusted data suggests that the profiles are quite similar. If this year’s Notre Dame team had played the 2010 Oklahoma schedule, would it have been expected to post results similar to the Sooners’? If the 2010 Sooners had played Notre Dame’s schedule this year, would their results have been expected to be similar to those of the Irish? That's the idea, at least.

I’m also interested in exploring the significance of the outlier data. 2010 Oklahoma and 2012 Notre Dame are a close match in many categories, but they are dramatically dissimilar in terms of special-teams efficiency. The Sooners ranked 27th in that category in 2010, whereas Notre Dame currently ranks 90th. The Irish had a special-teams deficit by FEI's numbers in eight of their 11 games this year, and while it hasn’t cost them a victory yet, there have been a couple of close calls. The closest came in their narrow victories over Purdue and Pittsburgh, games in which the Irish lost a combined 8.8 points on special teams.

Here are the efficiency profiles most similar to the current FEI top-10 teams:

FEI Team Similarity Profiles

| FEI Rank | 2012 Team | Team Similarity Profile |
|---|---|---|
| 1 | Alabama (9-1) | 2008 Penn State (10-2) |
| 2 | Notre Dame (11-0) | 2010 Oklahoma (10-2) |
| 3 | Oregon (9-1) | 2010 Oregon (11-1) |
| 4 | Kansas State (9-1) | 2010 Virginia Tech (11-2) |
| 5 | Oklahoma (7-2) | 2010 Auburn (13-0) |
| 6 | Texas A&M (7-2) | 2011 Stanford (11-2) |
| 7 | Florida (9-1) | 2007 Virginia Tech (10-3) |
| 8 | Oregon State (8-2) | 2010 Missouri (9-3) |
| 9 | Ohio State (11-0) | 2009 Oregon (10-3) |
| 10 | Stanford (9-2) | 2011 Michigan State (10-3) |

There are many more things that can be done with the team similarity project, including an incorporation of that complete efficiency profile in the weekly projections. Currently, my weekly FEI projections are produced from a formula derived from aggregate FEI data. But what if, instead, we ran a similarity exercise comparing data for both teams in a given matchup against every FBS game played over the last five seasons (3576 games in all)?

Using FEI, OFEI, DFEI, and STE data only, I compared the Notre Dame and USC matchup against every game over the last five seasons to find matchups that were most similar across those efficiency data points. The most similar game was a 38-14 victory by Alabama over Arkansas last season. That is, Notre Dame’s profile compared with 2011 Alabama’s profile and USC’s profile compared with 2011 Arkansas’ profile produced the closest game match.

Not every similar matchup favors a victory for Notre Dame, of course. Oregon’s 2007 losses to California (24-31) and Arizona (24-34), LSU’s 2007 losses to Arkansas (48-50 in triple overtime) and Kentucky (37-43 in triple overtime), and USC’s 2008 loss to Oregon State (21-27) all rank among the 25 closest game similarity matches for this weekend’s ND-USC game.

If we aggregate the data, can we use it to make a more precise projection for the game? Taking the 25 most similar games, the Irish would be expected to win 80 percent of the time by an average final score of 33-24. Taking the 50 most similar games, the Irish would be expected to win 84 percent of the time by an average final score of 36-23. Taking the 100 most similar games, the Irish would be expected to win 80 percent of the time by an average final score of 35-23. Using my current methodology, FEI projects the Irish to have a 71 percent chance of victory by a score of 30-20.
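The aggregation step above amounts to taking the k closest historical matchups and averaging their outcomes. The sketch below shows one way to do that; the tuple layout, the function name, and the sample distances and scores are all made up for illustration and are not the games or similarity values behind the figures quoted above.

```python
def project_from_similar_games(similar_games, k):
    """Summarize the k most similar historical matchups.

    similar_games: list of (distance, points_for, points_against) tuples,
    scored from the side of the team whose profile matched the team being
    projected. Smaller distance means a closer match. (Hypothetical layout.)
    """
    nearest = sorted(similar_games, key=lambda game: game[0])[:k]
    if not nearest:
        raise ValueError("need at least one similar game")
    count = len(nearest)
    wins = sum(1 for _, pts_for, pts_against in nearest if pts_for > pts_against)
    return {
        "win_pct": wins / count,
        "avg_for": sum(game[1] for game in nearest) / count,
        "avg_against": sum(game[2] for game in nearest) / count,
    }


# Made-up sample: (distance, points for, points against).
games = [
    (0.10, 38, 14),
    (0.20, 24, 31),
    (0.30, 31, 17),
    (0.40, 27, 20),
    (0.90, 10, 45),
]

projection = project_from_similar_games(games, k=4)
print(projection)
```

Changing k changes which games count, which is why the 25-, 50-, and 100-game samples above give somewhat different win rates and average scores.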

This project is still in its infancy, but I’m looking forward to doing more with projections in the coming weeks. Any feedback would be appreciated.

Week 12 Revisionist Box Scores

This weekly feature identifies the games played each week that were most impacted by turnovers, special teams, field position, or some combination of the three. The neutralized margin of victory is a function of the point values earned and surrendered based on field position and expected scoring rates.

Week 12 Games In Which Total Turnover Value Exceeded Non-Garbage Final Score Margin

| Date | Winning Team | Non-Garbage Final Score | Losing Team | TTV + | TTV - | TTV Net | TO Neutral Score Margin |
|---|---|---|---|---|---|---|---|
| 11/17 | Eastern Michigan | 29-23 | Western Michigan | 6.3 | 0.0 | 6.3 | -0.3 |
| 11/17 | Middle Tennessee | 20-12 | South Alabama | 13.2 | 0.0 | 13.2 | -5.2 |
| 11/17 | Northwestern | 23-20 | Michigan State | 17.3 | 0.0 | 17.3 | -14.3 |
| 11/17 | Oklahoma | 50-49 | West Virginia | 7.2 | 5.0 | 2.2 | -1.2 |
| 11/17 | Purdue | 20-17 | Illinois | 10.2 | 0.0 | 10.2 | -7.2 |
| 11/17 | San Jose State | 20-14 | BYU | 11.7 | 2.3 | 9.4 | -3.4 |
| 11/17 | Utah State | 48-41 | Louisiana Tech | 14.1 | 0.0 | 14.1 | -7.1 |
| 11/17 | UTEP | 34-33 | Southern Mississippi | 4.6 | 0.0 | 4.6 | -3.6 |
| 11/17 | UTSA | 34-27 | Idaho | 11.2 | 3.5 | 7.7 | -0.7 |
| 11/17 | Wyoming | 28-23 | UNLV | 6.3 | 0.0 | 6.3 | -1.3 |

Week 12 Games In Which Special Teams Value Exceeded Non-Garbage Final Score Margin

| Date | Winning Team | Non-Garbage Final Score | Losing Team | STV + | STV Neutral Score Margin |
|---|---|---|---|---|---|
| 11/17 | LSU | 41-35 | Mississippi | 12.6 | -6.6 |
| 11/17 | Oklahoma | 50-49 | West Virginia | 2.7 | -1.7 |
| 11/17 | Stanford | 17-14 | Oregon | 6.1 | -3.1 |
| 11/17 | UTEP | 34-33 | Southern Mississippi | 9.6 | -8.6 |

Week 12 Games In Which Field Position Value Exceeded Non-Garbage Final Score Margin

| Date | Winning Team | Non-Garbage Final Score | Losing Team | FPV + | FPV - | FPV Net | FPV Neutral Score Margin |
|---|---|---|---|---|---|---|---|
| 11/17 | LSU | 41-35 | Mississippi | 38.5 | 24.8 | 13.7 | -7.7 |
| 11/17 | Oklahoma | 50-49 | West Virginia | 25.9 | 19.3 | 6.6 | -5.6 |
| 11/17 | Purdue | 20-17 | Illinois | 21.6 | 16.1 | 5.5 | -2.5 |
| 11/17 | UTEP | 34-33 | Southern Mississippi | 24.9 | 13.7 | 11.2 | -10.2 |
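The rows in the three tables above are consistent with a simple subtraction: the neutral score margin equals the non-garbage final margin minus the winner's net value in that category. The sketch below encodes that reading. It is inferred from the published rows rather than taken from a stated formula, and the function name is hypothetical.

```python
def neutral_margin(winner_points, loser_points, value_plus, value_minus=0.0):
    """Final score margin with one category's net value removed.

    Inferred from the tables: net = plus - minus, and the neutral margin is
    the non-garbage final margin minus that net value. A negative result
    means the net value exceeded the actual margin of victory.
    """
    net_value = value_plus - value_minus
    return (winner_points - loser_points) - net_value


# Oklahoma 50-49 West Virginia, turnover values 7.2 and 5.0 (from the table).
print(round(neutral_margin(50, 49, 7.2, 5.0), 1))

# LSU 41-35 Mississippi, field position values 38.5 and 24.8 (from the table).
print(round(neutral_margin(41, 35, 38.5, 24.8), 1))
```

Rounding to one decimal place matches the precision of the published figures.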

2012 totals to date:

  • Net Total Turnover Value was the difference in 98 of 621 FBS games (15.8 percent)
  • Net Special Teams Value was the difference in 46 of 621 FBS games (7.4 percent)
  • Net Field Position Value was the difference in 59 of 621 FBS games (9.5 percent)
  • Turnovers, Special Teams and/or Field Position was the difference in 135 of 621 FBS games (21.7 percent)

2012 Game Splits for all teams, including the offensive, defensive, special teams, field position, and turnover values recorded in each FBS game are provided here.

FEI Week 12 Top 25

The Fremeau Efficiency Index (FEI) rewards playing well against good teams, win or lose, and punishes losing to poor teams more harshly than it rewards defeating poor teams. FEI is drive-based and it is specifically engineered to measure the college game. FEI is the opponent-adjusted value of Game Efficiency (GE), a measurement of the success rate of a team scoring and preventing opponent scoring throughout the non-garbage-time possessions of a game. FEI represents a team's efficiency value over average.

Other definitions:

  • SOS Pvs: Strength of schedule to date, based on the likelihood of an elite team going undefeated against the given team's schedule to date.
  • SOS Tot: Strength of schedule, based on the likelihood of an elite team going undefeated against the given team's entire schedule. Conference championship games and bowl games are not yet included.
  • FBS MW: Mean Wins, the average number of games a team with the given FEI rating would be expected to win against its entire schedule.
  • FBS RMW: Remaining Mean Wins, the average number of games a team with the given FEI rating would be expected to win against its remaining schedule.
  • OFEI: Offensive FEI, the opponent-adjusted efficiency of the given team's offense.
  • DFEI: Defensive FEI, the opponent-adjusted efficiency of the given team's defense.
  • STE: Special Teams Efficiency, the scoring value earned by field goal, punt and kickoff units measured in points per average game.
  • FPA: Field Position Advantage, the share of the value of total starting field position earned by each team against its opponents.
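The SOS and Mean Wins definitions above can be read as two summaries of the same list of per-game win probabilities. The sketch below assumes those probabilities are already available and that games are treated as independent. How FEI turns ratings into a win probability is not described here, so the inputs and function names are hypothetical.

```python
import math


def undefeated_likelihood(win_probs):
    """SOS-style figure: chance of winning every game on the schedule.

    Assumes independent games, so the per-game probabilities multiply.
    """
    return math.prod(win_probs)


def mean_wins(win_probs):
    """Mean Wins-style figure: expected number of wins on the schedule."""
    return sum(win_probs)


# Made-up per-game win probabilities for a three-game schedule.
probs = [0.9, 0.8, 0.5]

print(round(undefeated_likelihood(probs), 3))
print(round(mean_wins(probs), 1))
```

Under this reading, a lower SOS value means a tougher schedule, which is consistent with the rankings table, where the smallest SOS figures carry the highest SOS ranks.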

These FEI ratings are a function of results of games played through November 17th. The ratings for all FBS teams, including FEI splits for Offense, Defense, and Special Teams can be found here. Program FEI (five-year weighted) ratings and other supplemental drive-based data can be found here.

| Rk | Team | FBS Rec | FEI | LW | GE | GE Rk | SOS Pvs | SOS Pvs Rk | SOS Tot | SOS Tot Rk | FBS MW | FBS RMW | OFEI | OFEI Rk | DFEI | DFEI Rk | STE | STE Rk | FPA | FPA Rk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Alabama | 9-1 | .283 | 2 | .314 | 2 | .289 | 54 | .285 | 58 | 9.9 | 1.0 | .366 | 18 | -.583 | 6 | 2.411 | 9 | .562 | 4 |
| 2 | Notre Dame | 11-0 | .275 | 4 | .197 | 8 | .210 | 34 | .153 | 31 | 10.3 | 0.7 | .541 | 6 | -.731 | 2 | -.949 | 90 | .492 | 75 |
| 3 | Oregon | 9-1 | .268 | 3 | .339 | 1 | .397 | 73 | .231 | 46 | 9.6 | 0.5 | .442 | 10 | -.542 | 11 | .470 | 47 | .536 | 20 |
| 4 | Kansas State | 9-1 | .262 | 1 | .242 | 5 | .205 | 33 | .172 | 37 | 9.2 | 0.8 | .312 | 23 | -.538 | 12 | 4.279 | 1 | .587 | 1 |
| 5 | Oklahoma | 7-2 | .253 | 5 | .196 | 9 | .198 | 30 | .137 | 24 | 8.7 | 1.6 | .463 | 9 | -.510 | 14 | 1.136 | 34 | .518 | 44 |
| 6 | Texas A&M | 7-2 | .244 | 6 | .183 | 13 | .190 | 27 | .186 | 40 | 8.2 | 1.0 | .625 | 2 | -.316 | 28 | -.750 | 86 | .506 | 57 |
| 7 | Florida | 9-1 | .239 | 7 | .146 | 20 | .218 | 35 | .136 | 22 | 8.7 | 0.5 | .095 | 47 | -.740 | 1 | 3.170 | 2 | .552 | 11 |
| 8 | Oregon State | 8-2 | .224 | 8 | .119 | 28 | .191 | 28 | .123 | 15 | 8.2 | 0.5 | .431 | 12 | -.440 | 21 | .607 | 43 | .510 | 53 |
| 9 | Ohio State | 11-0 | .215 | 12 | .162 | 17 | .280 | 52 | .244 | 51 | 9.9 | 0.8 | .522 | 7 | -.415 | 23 | -.814 | 87 | .501 | 64 |
| 10 | Stanford | 9-2 | .212 | 13 | .127 | 26 | .105 | 6 | .074 | 3 | 8.7 | 0.6 | .091 | 48 | -.706 | 4 | 1.509 | 26 | .556 | 7 |
| 11 | Nebraska | 8-2 | .209 | 11 | .094 | 37 | .153 | 19 | .147 | 29 | 8.2 | 0.9 | .589 | 4 | -.474 | 15 | -1.690 | 108 | .461 | 105 |
| 12 | Florida State | 8-1 | .203 | 9 | .259 | 3 | .588 | 104 | .422 | 85 | 8.3 | 0.5 | .127 | 42 | -.544 | 10 | 2.252 | 12 | .554 | 9 |
| 13 | LSU | 8-2 | .195 | 10 | .129 | 24 | .132 | 12 | .130 | 19 | 8.2 | 0.9 | .158 | 40 | -.554 | 8 | 1.798 | 20 | .558 | 6 |
| 14 | Georgia | 9-1 | .177 | 15 | .227 | 7 | .358 | 65 | .338 | 68 | 9.0 | 0.8 | .397 | 15 | -.328 | 26 | .209 | 54 | .527 | 29 |
| 15 | Oklahoma State | 6-3 | .173 | 25 | .129 | 23 | .278 | 49 | .114 | 11 | 7.4 | 0.8 | .467 | 8 | -.293 | 30 | 2.683 | 4 | .486 | 78 |
| 16 | South Carolina | 8-2 | .168 | 14 | .181 | 14 | .239 | 39 | .176 | 39 | 8.0 | 0.5 | .117 | 45 | -.553 | 9 | -.494 | 79 | .506 | 58 |
| 17 | Texas | 8-2 | .168 | 20 | .113 | 33 | .276 | 47 | .129 | 18 | 8.4 | 1.0 | .405 | 14 | -.102 | 47 | 2.551 | 7 | .564 | 3 |
| 18 | UCLA | 9-2 | .158 | 22 | .118 | 30 | .368 | 70 | .282 | 57 | 8.9 | 0.4 | .276 | 26 | -.426 | 22 | -.407 | 74 | .553 | 10 |
| 19 | Wisconsin | 6-4 | .151 | 16 | .123 | 27 | .188 | 26 | .148 | 30 | 7.3 | 0.5 | .095 | 46 | -.521 | 13 | -.060 | 60 | .539 | 14 |
| 20 | USC | 7-4 | .145 | 18 | .119 | 29 | .152 | 18 | .095 | 7 | 7.5 | 0.3 | .289 | 24 | -.197 | 37 | 1.389 | 27 | .522 | 37 |
| 21 | Clemson | 9-1 | .143 | 19 | .189 | 11 | .505 | 92 | .421 | 84 | 8.6 | 0.5 | .441 | 11 | -.049 | 59 | 2.074 | 13 | .524 | 33 |
| 22 | Michigan | 8-3 | .142 | 28 | .153 | 19 | .100 | 5 | .060 | 1 | 7.5 | 0.2 | .330 | 21 | -.408 | 24 | .169 | 56 | .487 | 77 |
| 23 | Utah State | 8-2 | .142 | 26 | .195 | 10 | .364 | 67 | .359 | 73 | 8.8 | 1.0 | -.030 | 66 | -.453 | 19 | .639 | 42 | .513 | 49 |
| 24 | Cincinnati | 5-3 | .141 | 17 | .117 | 31 | .567 | 101 | .549 | 105 | 7.7 | 1.7 | .176 | 37 | -.559 | 7 | .286 | 51 | .526 | 30 |
| 25 | Northwestern | 7-3 | .141 | 27 | .043 | 48 | .247 | 42 | .244 | 50 | 7.7 | 1.0 | .255 | 30 | -.194 | 38 | 2.530 | 8 | .520 | 40 |

Posted by: Brian Fremeau on 21 Nov 2012

4 comments, Last at 23 Nov 2012, 1:16am by IrishGush

Comments

1
by Kal :: Wed, 11/21/2012 - 6:33pm

This is really fantastic stuff. I wish we got to see more of these sorts of things in the NFL side; both you and Bill have been knocking it out of the park in analyzing things like program success, coaching success, team 'identity' and the like.

I would imagine your next step is to do a regression analysis and see if comparing past results is actually 'predictive' of the real result.

2
by Brian Fremeau :: Wed, 11/21/2012 - 6:45pm

Thanks, Kal. I had a similar thought. Apply this approach and recreate projections over the last few weeks and see if it actually produces more accurate results.

The trick is that I don't have a reliable collection of team efficiency splits until the middle of the year, and each week provides more data to strengthen that collection. Also, I'm comparing current 2012 team profiles to full-season team profiles over the last few years, and I wonder what it would look like if I compared only same-week data.

3
by Kal :: Wed, 11/21/2012 - 8:03pm

Oh - for that, I figured you'd simply look at a prior year and do the analysis. No need to talk about 2012. See if doing an analysis of similarities could reasonably predict, say, Alabama/LSU. Or Oregon/USC. Or non-upsets, like Oregon/Stanford in 2011. You've got all the data for that year; all you need to do is take the 25 closest games to those and see how they work.

4
by IrishGush (not verified) :: Fri, 11/23/2012 - 1:16am

Thanks for the great work, Brian. Fantastic job, as usual, and these statistics really help hone in on key factors that otherwise might be overlooked and under-appreciated.

You might also want to dive into what you think drives the difference between your methodologies' predictions and your similar teams' projections. In the case of ND-USC, for some reason, the ND-parallel beat the USC-parallel 80% and 84% of the time (among your top 25, 50 and 100 similar games); the 71% projected odds aren't far off, but that 10% distinction doesn't help this anxious Irish fan's confidence (although Barkley's injury more than makes up for it).

Is that partly because there's still one game remaining, or do you think that eleven games provide enough sample size and other factors explain the difference?