Comparing the final Football Outsiders F/+ projections to the AP poll and conventional wisdom.
26 Sep 2008
by Bill Connelly
The most frustrating thing about college football stats is that while one team is putting up good numbers against a good team, some other team is putting up great numbers against a terrible team. It's impossible to get too much information from statistical rankings because of this.
On August 30, Graham Harrell threw for 536 yards against Eastern Washington; meanwhile, Sam Bradford threw for 395 yards against Cincinnati on September 6. Last year, Colt Brennan threw for 416 yards and 6 touchdowns against Northern Colorado on September 1, while Tim Tebow threw for 304 yards and 2 touchdowns against South Carolina on November 10. By all basic statistical accounts, Harrell's and Brennan's stats were insanely good and looked better than Bradford's and Tebow's performances on the ESPN scroll. However, could Brennan have put up Tebow's numbers against SC? What would Bradford have done against Eastern Washington? Thanks to the "+" concept, we can start to approximate that.
When I started entering all this play-by-play data, one of my main goals was simply to apply some of the basic sabermetric ideas to football. If they make sense in one sport, they should make sense in another, no? The idea behind my EqPts (and therefore Points Per Play) measure came from two baseball concepts: EqR, the Equivalent Runs concept that takes a series of offensive stats and determines how many runs those stats should have produced on average, and Expected Runs, the matrix that shows you, on average, how many runs you can expect out of specific "__ runners on, __ outs" situations. And of course the S&P measure (Success Rate + Points Per Play) was an admitted and obvious take-off of OPS.
Well, the "+" concept is a co-opt of the ERA+ and OPS+ (also known as Adjusted ERA or Adjusted OPS) figures. It starts with saying, basically, that not every 3.68 ERA or 0.890 OPS is created equal. Was it during the deadball era? Was it in a hitter's park or the Polo Grounds? You try to put everybody on as even a playing field as possible to evaluate their stats. That idea should work for football too, right?
The goal of the "+" concept is to adjust for what's expected against different opponents. There is no need to take things like "park factors" into account just yet; the "+" figures I am using for now simply compare a team's output to the average output of the opponent's opponents. For every major measure I use, both the ones I created and the ones I borrowed ("borrowed" sounds much better than "stole") from others -- Success Rate, PPP, S&P, Line Yards/Sack Rates, etc. -- you could create "+" measures that compare an offense's or defense's performance to what their opponents typically averaged.
Since I'm a Mizzou fan, and I'm still a bit bitter that the Tigers were left out of last year's BCS bowls, I'll use last year's Rose Bowl as an example. Last January 1, USC thumped Illinois, 49-17. Without taking special teams and turnovers into account, USC put up 48.44 EqPts and a 1.092 S&P to Illinois' 12.11 EqPts and 0.536 S&P. How did USC's offensive performance compare to what a typical team did to Illinois? As one could probably surmise, it compares quite favorably. For the season, the Illinois defense gave up an average of 18.95 EqPts and a 0.669 S&P. For EqPts, USC gained 2.56 times what the average Illinois opponent gained, 256 percent of normal. Meanwhile, its S&P was 163 percent what the typical Illinois opponent managed. One of the main ideas behind the "+" concept is that 100 = 100 percent of normal. Therefore, USC's EqPts+ against Illinois was a stellar 256, and its S&P+ 163.
Meanwhile, if you flip the equation, you can come up with a defensive score as well. (You have to flip the equation so a good defensive performance also results in a score above 100. Keeping everything on the same scale is good for sanity.) Illinois' average offensive numbers were 25.15 EqPts Per Game and a 0.815 S&P; that means USC put up a Defensive EqPts+ of 208 (25.15 divided by 12.11 = 2.08) and a Defensive S&P+ of 152 (0.815 divided by 0.536).
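For readers who prefer to see the arithmetic spelled out, here is a minimal Python sketch of the two calculations above. The function names are hypothetical and are not taken from any actual tooling behind these numbers; the inputs are the Rose Bowl figures quoted in the text.

```python
def offensive_plus(actual, opp_defense_avg_allowed):
    """Offense: this game's output as a percentage of what the opposing defense usually allows."""
    return 100.0 * actual / opp_defense_avg_allowed


def defensive_plus(allowed, opp_offense_avg):
    """Defense: the ratio is flipped so that holding an offense below its average scores above 100."""
    return 100.0 * opp_offense_avg / allowed


# Rose Bowl figures quoted above (USC vs. Illinois)
usc_off_eqpts_plus = offensive_plus(48.44, 18.95)
usc_off_sp_plus = offensive_plus(1.092, 0.669)
usc_def_eqpts_plus = defensive_plus(12.11, 25.15)
usc_def_sp_plus = defensive_plus(0.536, 0.815)

print(round(usc_off_eqpts_plus), round(usc_off_sp_plus))  # 256 163
print(round(usc_def_eqpts_plus), round(usc_def_sp_plus))  # 208 152
```

The same two functions would apply to any of the other measures (Success Rate, PPP, Line Yards and so on), provided a bigger raw number is better for the offense.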
Wasn't that fun? Quite simply, we can more accurately measure how good teams really were. The "+" concept is obviously similar to the VOA number that FO has mastered. If the "100 is average, above 100 is good, below 100 is bad" idea is hard to remember or grasp, you could easily think of a 163 S&P+ as something similar to a 63% VOA. Whatever floats your boat. I know that FO wants to move toward a collegiate DVOA figure if at all possible, but in the meantime consider this a crude substitute. Get used to the "+." You're going to be seeing a lot of it in future Varsity Numbers columns.
The bottom line is that the "+" concept gives you a way to factor in teams' strengths of schedule to their overall stats. Technically you could do this same thing with rushing yards, actual points, or any of the other standard box score stats, but since I've been doing all of this measuring of EqPts, success rates, etc., and since I'm very much sold on the quality of these measurements (and I want you to be as well), by god I'm using them.
The best way to illustrate what the "+" concept can do is probably to show you some rankings.
|Top 10 Offenses in EqPts Per Game|
|Rank||Team||EqPts Per Game|
Now, none of the names on that list are particularly surprising, but how do these rankings compare to pure scoring and yardage rankings?
|Top 10 Offenses, with Traditional Rankings|
|Team||Points Per Game Rank||Yards Per Game Rank|
And what about some of the teams who ranked high in the "regular" rankings but didn't appear in the top 10 above?
|Other Notable Offenses, with EqPts+ Rankings|
|Team||Points Per Game Rank||EqPts+ Per Game Rank|
|* This is Houston's ranking in yards per game.|
As you would expect, teams with tougher slates -- i.e., a lot of SEC teams -- were held in higher regard using the "+" concept. And Houston played a really weak schedule.
So what about the S&P+ measure? That takes both efficiency and explosiveness into account. It also eliminates the built-in advantages of spread offenses that huddle rarely and run a lot more plays when it comes to points per game.
|Top 10 Offenses in S&P+|
First of all, kudos to Florida and to Heisman voters. Tim Tebow quarterbacked what was simply the best offense in the country according to these numbers, and since he basically was the rushing game ... yeah, Tebow gets some dap.
And just for fun...
|Rushing S&P+, Offense||Passing EqPts+, Offense|
Some dap for Navy there as well -- of course, they put up big-time rushing numbers running Paul Johnson's option system, but they apparently did it against a series of respectable rushing defenses. Johnson is already seeing success on the ground at Georgia Tech too.
On to defense.
But first, a caveat: It's very much possible for an offense to put up something like 0.32 EqPts in a given game. Since you're flipping the equation now, the opposing team's offensive average is in the numerator, and the 0.32 would be in the denominator. If you take that team's average (say, 15.0) and divide it by their 0.32 output for that game, you're going to get an insanely high defensive EqPts+ score (4687.5, to be exact), and obviously that would significantly skew averages. So I installed a cap: no "+" score for an individual game can be higher than 300. I'm open to suggestion on whether or not there's a better cap to use, but that's what I've applied to date.
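A minimal Python sketch of the cap, under a few stated assumptions: the function name is hypothetical, the three-game sample is invented for illustration (only the 0.32 and 15.0 figures come from the text), and treating a zero or negative EqPts output as an automatic cap is my own guess at a sensible edge case, not something described above.

```python
from statistics import mean

PLUS_CAP = 300.0  # no single-game "+" score may exceed this


def capped_defensive_plus(allowed, opp_offense_avg, cap=PLUS_CAP):
    """Single-game defensive "+" score, limited to the cap."""
    if allowed <= 0:
        # Assumption: avoid dividing by zero by awarding the maximum score.
        return cap
    return min(cap, 100.0 * opp_offense_avg / allowed)


# (EqPts allowed in the game, opponent's average EqPts); the first pair is the
# example from the text, the other two are made up.
games = [(0.32, 15.0), (20.0, 18.0), (14.0, 21.0)]

uncapped = [100.0 * avg / allowed for allowed, avg in games]
capped = [capped_defensive_plus(allowed, avg) for allowed, avg in games]

print(round(mean(uncapped), 1))  # dominated by the single 4687.5 game
print(round(mean(capped), 1))    # 180.0
```

With the cap in place, one freakish game still counts as an outstanding performance, but it no longer swamps the other two in the average.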
|EqPts+, Defense||S&P+, Defense|
|1||Ohio State (134.91)||Ohio State (163.65)|
|2||USC (123.62)||USC (144.67)|
|3||LSU (122.75)||LSU (144.45)|
|4||Virginia Tech (119.97)||Virginia Tech (140.65)|
|5||Oklahoma (119.77)||Rutgers (140.34)|
|6||TCU (118.16)||Oregon State (140.24)|
|7||Rutgers (117.95)||Oklahoma (134.54)|
|8||South Florida (117.72)||Penn State (133.84)|
|9||Penn State (117.67)||Boise State (133.01)|
|10||Auburn (117.41)||Arizona State (131.68)|
|Rushing S&P+, Defense||Passing S&P+, Defense|
|1||Ohio State||Ohio State|
This obviously opens the door for a ranking system. This is nothing official by any means, but what happens if you add together the two EqPts+ measures and the two S&P+ measures? You get the following list:
|Rank||Team||EqPts+ plus S&P+|
Aside from the Boise State outlier (and the fact Georgia is inexplicably No. 27), that's really not a bad list. It's a start, anyway. Next week, we'll talk about which stats are the most correlated with actual victories, and that will give us a good idea of which categories need to be included in any sort of "+" ranking.
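The combined figure is nothing more than a sum of four season-long numbers. As a rough Python sketch, with placeholder team names and invented values (these are not the 2007 results), and with hypothetical dictionary keys:

```python
# Invented season-long "+" averages for three placeholder teams.
teams = {
    "Team A": {"off_eqpts_plus": 120.0, "def_eqpts_plus": 118.0,
               "off_sp_plus": 125.0, "def_sp_plus": 130.0},
    "Team B": {"off_eqpts_plus": 110.0, "def_eqpts_plus": 105.0,
               "off_sp_plus": 112.0, "def_sp_plus": 108.0},
    "Team C": {"off_eqpts_plus": 130.0, "def_eqpts_plus": 95.0,
               "off_sp_plus": 135.0, "def_sp_plus": 98.0},
}


def combined_plus(scores):
    """Add the two EqPts+ measures and the two S&P+ measures together."""
    return sum(scores.values())


# Highest combined total first.
ranking = sorted(teams, key=lambda name: combined_plus(teams[name]), reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(rank, name, combined_plus(teams[name]))
```

An unweighted sum treats all four measures as equally important, which is exactly the assumption the correlation work mentioned above would test.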
One thing I'll look into doing when I have more time on my hands is seeing which of the following methods is best for doing this.
Method A: What I've described above. Team A's S&P vs. Team B divided by Team B's average allowed S&P.
Method B: Comparing it more intricately to what would have been expected by saying "Team A rushed 27 times against Team B and passed 38 times. That should have produced 18.67 EqPts and a 0.734 S&P." It's more specific (and time-consuming) than the current method, and while it wouldn't change much when it comes to the major categories (Overall EqPts+, et al.), it might be better for the categories with smaller sample sizes ("Third Down Non-Passing Downs Rushing S&P+" and things like that) where one really good or really bad game (or even one play) can skew the numbers.
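One way Method B might look in Python, purely as a sketch: the per-rush and per-pass rates below are invented (they are not meant to reproduce the 18.67 figure), the function names are hypothetical, and splitting the expectation only by run versus pass is an assumption about how fine-grained the comparison would be.

```python
def expected_eqpts(rushes, passes, allowed_per_rush, allowed_per_pass):
    """EqPts an average opponent would be expected to gain with this play mix."""
    return rushes * allowed_per_rush + passes * allowed_per_pass


def method_b_plus(actual_eqpts, rushes, passes, allowed_per_rush, allowed_per_pass):
    """"+" score measured against the play-mix expectation instead of a flat per-game average."""
    expected = expected_eqpts(rushes, passes, allowed_per_rush, allowed_per_pass)
    return 100.0 * actual_eqpts / expected


# Team A's play mix from the example: 27 rushes, 38 passes.
# Invented rates for Team B's defense: 0.25 EqPts per rush, 0.30 per pass.
expected = expected_eqpts(27, 38, 0.25, 0.30)
score = method_b_plus(24.0, 27, 38, 0.25, 0.30)  # 24.0 actual EqPts is also invented

print(round(expected, 2), round(score, 1))
```

The appeal of this version is that a pass-heavy team is compared with what the defense allows to passing plays, not with a blended average that includes a lot of runs.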
As I said in my first column, I tend to think of three main reasons for getting as deep as possible into sports statistics: 1) to understand the game better, 2) to evaluate/rank stuff, and 3) to predict stuff. The "+" concept applies mostly to (2), and possibly a bit toward (3) -- if a team has an Offensive EqPts+ of 113, you could take their upcoming opponent's Defensive EqPts Per Game figure, multiply it by 1.13, and get a pretty decent read on how many points they may score. We'll go into a bit more detail on that in a future column. (All the degenerate gamblers' ears just perked up.)
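The back-of-the-envelope projection described above can be sketched in a few lines of Python. The function name is hypothetical, and the opponent's 20.0 EqPts allowed per game is an invented input; only the 113 comes from the example.

```python
def project_eqpts(offense_eqpts_plus, opp_def_eqpts_per_game):
    """Scale what the opposing defense usually allows by the offense's "+" score."""
    return offense_eqpts_plus * opp_def_eqpts_per_game / 100.0


# An offense with an EqPts+ of 113 against a defense allowing 20.0 EqPts per game.
projection = project_eqpts(113.0, 20.0)
print(round(projection, 1))  # 22.6
```

This is only a first-order estimate: it ignores the opposing defense's own "+" score, special teams and turnovers.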
There are two problems with the "+" concept, however:
1) As with most measures, it requires a pretty healthy sample size before it really becomes applicable. You probably need four or five games under your belt before your averages really mean anything.
2) It requires a full set of data/results from all 120 FBS teams (plus the six "tiers" of FCS teams into which I merge all FCS opponents) to give a 100 percent complete look, and right now that's not possible. I'm keeping up with all BCS conference results (40 to 50 play-by-plays) on a week-to-week basis, but there are still 15 to 20 non-BCS conference games that get left by the wayside. I'm working on ways to up my productivity in that regard, but until that happens I'm limited. For recent Mizzou previews (example here), I've been taking the 2007 "+" numbers and making manual adjustments as I see fit -- obviously not the most statistically pure way to go about it. But it's all I can do until I've got everybody's data.
In all, though, I like to consider the "+" concept a pretty strong step in the right direction, especially when some programmer takes pity on me and creates a play-by-play parser or something. Next week we'll begin to look at some of the ways the "+" can be used to both evaluate teams in detail and tell us what's most important when it comes to simply winning games.
More responses to comments from last week's column...
Good statistics are either explanatory or predictive -- they either show why something happened, or predict what will happen in the future. These stats are neither.
Nor were they supposed to be. Think of last week's column as a table setter for future columns.
What you should be doing is considering punts as turnovers, because that's what they are.
This is actually an interesting idea. I might have to tinker with this. No matter how I do it, the basic point impact will be the same, but how it's applied to an offense's point total can still be tweaked.
Might the other 3 "missing" points be from OVERTIME possessions, as both teams start on the other's 25 yd line? Not sure, but I bet a 25 yd run/pass TD on 1st down wouldn't generate 6/7 EqPts but would on the scoreboard.
This is an absolutely fantastic point. Starting a possession at your opponent's 25, with no special teams occurrence or turnover to get you there, wipes a few points (3.706, to be exact--that's the point value of the opponent's 25) off the board. There simply aren't a ton of overtime games, so I doubt this accounts for the full missing two points per game, but it has to account for something, and I slapped my forehead awfully hard when I read this.