Varsity Numbers: Projections and Whatnot
by Bill Connelly
Earlier this week, Brian Fremeau and I completed our rough-draft F/+ projections. We tend to create these to get a general idea of where the projections may land while we write the rough drafts of our conference chapters for the upcoming Football Outsiders Almanac 2012. Then, after spring practice, the NFL Draft, etc., we complete the projections for the book. These initial projections come from combining Brian's and my own FEI and S&P+ projections (just as we combine the actual ratings during the season).
Because of both my desire for conversation and my pathological need to share information when I have it, I posted this year's rough-draft projections at Football Study Hall this week. I felt I should elaborate on them a bit here, or more specifically, I felt I should elaborate on what information is missing from these projections (and won't be from the final version).
What's In There?
First, here is the current top 40 as projected:
1. Alabama (+37.2%)
2. LSU (+32.8%)
3. Oklahoma (+27.4%)
4. Oklahoma State (+26.8%)
5. Oregon (+23.3%)
6. USC (+21.2%)
7. Georgia (+20.3%)
8. Florida State (+19.1%)
9. Stanford (+19.0%)
10. Texas (+18.4%)
11. Notre Dame (+16.9%)
12. Michigan (+16.8%)
13. Florida (+16.3%)
14. Arkansas (+15.2%)
15. South Carolina (+15.1%)
16. Texas A&M (+15.0%)
17. West Virginia (+14.7%)
18. Wisconsin (+14.7%)
19. Michigan State (+14.6%)
20. Ohio State (+14.4%)
21. TCU (+13.9%)
22. Virginia Tech (+13.8%)
23. Tennessee (+12.0%)
24. Nebraska (+11.9%)
25. South Florida (+11.6%)
26. Georgia Tech (+11.6%)
27. Auburn (+9.9%)
28. Baylor (+9.6%)
29. Houston (+9.5%)
30. Missouri (+9.2%)
31. Clemson (+9.2%)
32. North Carolina (+8.2%)
33. Utah (+7.8%)
34. BYU (+7.6%)
35. Kansas State (+7.2%)
36. Vanderbilt (+7.0%)
37. Penn State (+7.0%)
38. Texas Tech (+6.5%)
39. Boise State (+6.3%)
40. Rutgers (+5.9%)
For the most part, the current projections include just the following:
Returning Starter Figures
For this, we used the numbers documented by Phil Steele a couple of months ago. These have obviously changed since then and probably will again. In work I have done this offseason, I have begun to realize that particularly high or low figures have a far greater impact than those in the middle range. Because of this, and its effect on the S&P+ side of the projections, you get some weird results like Tennessee (18 returning starters) at 23rd and Boise State (five) at 39th. You may see this impact tamped down a bit before the final projections are released.
Two-Year Recruiting Rankings
In the past, we have utilized a five-year recruiting ranking, but I switched to two for reasons I have elaborated on before:
For the last few years, we have incorporated recruiting rankings into our Football Outsiders Almanac projections. I have begun to realize over the last few months, however, that we may have been going about it all wrong. For the most part, we have used a weighted, five-year average of recruiting rankings, using the logic that most of your starters will have been recruited two, three, four or even five years ago, therefore their rankings are what matter the most. However, it appears that isn't necessarily the case. In general, recruiting rankings are indeed predictive, but they have a shelf life. By the time a class is four years old, their performance has trumped their potential. Classes from a while ago, therefore, should be judged by on-field performance; older recruiting rankings are no longer valid.
For instance, the fact that Florida quarterback John Brantley was a high-four-star recruit in 2007 no longer mattered by the time he was a senior in 2011. The only thing that mattered was the potential he had actually shown on the field (which was, to put it as politely as possible, not that of a high-four-star prospect).
The rankings that are valid, however, are of those who have not yet played major roles, but will soon. I've been tinkering with correlations recently, and it appears that use of recruiting rankings becomes quite a bit more accurate and helpful when only looking at the past two classes. After that, the on-field product takes over.
Recruiting rankings have thus far just been used on the S&P+ part of the equation. In all, that means it is carrying about one-sixth of the weight of the projections as a whole. When all is said and done, they may carry a bit more weight. We shall see.
On the S&P+ side, I am still tinkering with how much weight to place on performance from two to five years ago, as compared to last year's performance. We have shown in the past how a team's four- and five-year history can be as predictive as their past year's performance, but on the S&P+ side at least, only last year's data has been included. On the FEI side, more years may have been included.
And really, that's about it. These factors make up quite a heavy portion of the final projections, but there is still work to be done.
What Isn't In There?
Here is a look at what still needs to be factored into the projections.
Updated Returning Starter Data
As players get arrested/dismissed, suspended, injured, etc., these numbers will change. They do not factor incredibly heavily into the projections, but they do make a difference. It is still up in the air how we may handle suspensions.
Who This May Help: Anybody who doesn't get anybody hurt or kicked off the team for something stupid this spring.
Who This May Hurt: TCU. Let's just say that if you are leading the Fulmer Cup race by this much, this soon, your "returning starters" figure probably isn't going to end up as high in May or June as it was on January 20. (This could also impact Georgia once we factor suspensions into the mix. They have piled them up on the defensive side of the ball this offseason.)
Four- and Five-Year History
On the S&P+ side, at least, performance from previous seasons is still a factor for inclusion, once the weights have been determined.
Who This May Help: Arizona, Auburn, Boston College, Kansas, Kentucky, Maryland, Ole Miss, Oregon State, Texas Tech, Troy. Those are the ten teams whose five-year F/+ ratings differ the most (in a positive way) from their 2011 ratings.
Who This May Hurt: Baylor, Florida International, Houston, Kansas State, Louisiana Tech, Southern Miss, Temple, Texas A&M, Toledo, Utah State. Those are the ten teams on the other end of the spectrum.
Remember, 2011 will still almost certainly carry the most weight overall, but those previous seasons do still matter; a single-season's performance does not give us the same look at overall program health that a handful of seasons does. It is certainly possible for teams to make a one-year change and have it stick (hello, Kansas), but most of the time you regress back toward the four- or five-year mean.
Draft Points
You would expect that draft data would hold at least a decent correlation to the future season's success, but here's something you may not have expected: Using draft points and total picks lost, you can derive almost as much about a team's success in the upcoming season as you can looking at returning starter data.
Here are some correlations for you: On offense, returning starter data holds a 0.36 correlation to improvement or regression the following season. Draft Points holds a 0.33 correlation. On defense, returning starters hold a 0.27 correlation; Draft Points, 0.22. For a change factor, these are rather strong numbers. More concrete, stable values like 5-year data and recruiting rankings hold correlations in the 0.6s and 0.7s, and they do not change much from year to year. But, complementary factors like returning starters and draft data will absolutely play a role in FO's 2010 projections, if a more minor one.
Others might wonder why the correlations aren't even stronger. Two main reasons: 1) There are enough differences between college and pro styles of football that simply being successful at the college level doesn't promise pro success or, more importantly, high draft status. 2) The draft can be used almost as much as a sign of strength instead of a sign of soon-to-be weakness. As we will see, the teams that lose the most to the draft are the ones most likely to lose the most to the draft again in the near future.
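For readers who want to see what a figure like 0.36 or 0.33 actually measures, here is a minimal Python sketch of a correlation calculation. It assumes the correlations quoted above are ordinary Pearson coefficients, which the text does not state outright. The function name and the sample numbers are hypothetical, made up purely for illustration; they are not the data behind the figures above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical sample: returning offensive starters for seven teams and each
# team's change in offensive rating the following season (invented numbers).
returning_starters = [4, 5, 6, 7, 8, 9, 10]
rating_change = [-3.0, 1.5, -4.0, 2.0, -1.0, 4.5, 0.5]

print(round(pearson(returning_starters, rating_change), 2))
```

A value near 1 or -1 means the factor tracks next-season change almost perfectly; values in the 0.2 to 0.4 range, like those quoted above, mean the factor helps but leaves plenty unexplained.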
The idea is that some lost starters mean more than others, and while there is not a perfect correlation between the perceived quality of a player getting drafted highly (or at all) and the quality of that player in college, there is obviously a strong enough correlation to use this. We will see if the points I defined in that article are still best for use, but this will be factored in to some degree.
Who This May Help: Oklahoma, Oregon. Oklahoma is not projected to have anybody selected in the first round of the draft, though Ronnell Lewis may be close. In all, they have only two or three players that may be drafted at all. Meanwhile, Oregon might see LaMichael James (second or third round) and Cliff Harris (later) drafted. Of the projected top-five teams, the Sooners and Ducks will have the fewest players drafted by far.
Who This May Hurt: Stanford, Baylor, Oklahoma State. As much as anybody else, these three teams took hefty steps forward in recent years, spurred mostly by players that are expected to be taken quite high in this month's draft. Stanford could not only see Andrew Luck go No. 1, but they could see two offensive linemen -- David DeCastro and Jonathan Martin -- gone by the end of the first round. Baylor will have both Robert Griffin and Kendall Wright plucked on the first day. Oklahoma State should see Justin Blackmon gone in the first five to seven picks, followed by Brandon Weeden in the first few rounds. None of these teams are expected to have double-digit players drafted, but these players were difference makers, and this will ding them regardless.
Turnover Luck
As part of this year's team previews for SB Nation (already well underway), I am including full-team statistical profiles as well as one-page glances at a team's starters, stats and projection factors. Here is Alabama's profile, for example. As part of the "projection factors" section, I am including a per-game "Turnover Luck" measure. I have been playing with the concept recently at Football Study Hall, expanding it to include passes broken up.
PBUs are sometimes dropped interceptions. In 2011, the 120 FBS teams intercepted 1,436 passes and "broke up" 5,131. That means that 21.9 percent of what we call "passes defended" were interceptions. Only, the spread from team to team was enormous. For N.C. State, 43.5 percent of their passes defended were INTs. For Akron, 6.7 percent.
I looked at this data from year to year and found that, though it would appear based somewhat on skill (N.C. State had David Amerson and Akron didn't, after all), a team's year-to-year percentages have almost no correlation. Georgia's Bacarri Rambo was second in the country in interceptions (eight), but in his three years of participation, the Bulldogs' ratio of interceptions to overall passes defended has gone from 19.6 percent in 2009, to 35.6 percent in 2010, to 27.0 percent in 2011. Some years, you catch them. Others, you don't.
So what that means here is that we are going to look at a team's ratio of interceptions to passes defended and apply it to the Adjusted Turnover Margin as well. Even with Amerson, we can safely assume that part of N.C. State's ridiculous 27 interceptions was caused by luck, and as we know, luck is incredibly fickle.
(NOTE: this data is readily available for defenses, but not offenses, i.e. the number of passes they threw that were broken up. So for this go-round, we are only looking at the defensive numbers.)
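The arithmetic behind the passes-defended numbers can be sketched in a few lines of Python. The national totals come straight from the figures above. The exact formula for Adjusted Turnover Margin is not spelled out here, so the "expected interceptions" step below is only an assumption about how a regression toward the national rate might work, and the function names are hypothetical. The N.C. State breakup count (35) is not stated in the text; it is backed out from the 27 interceptions and the 43.5 percent share and should be treated as approximate.

```python
def int_share(interceptions, pass_breakups):
    """Share of passes defended (INTs + PBUs) that were interceptions."""
    passes_defended = interceptions + pass_breakups
    return interceptions / passes_defended

# 2011 FBS totals quoted in the text.
NATIONAL_INTS = 1436
NATIONAL_PBUS = 5131
national_rate = int_share(NATIONAL_INTS, NATIONAL_PBUS)
print(f"National INT share: {national_rate:.1%}")  # 21.9%

def expected_interceptions(interceptions, pass_breakups, rate=national_rate):
    """Assumed adjustment: INTs a team would have at the national rate."""
    return (interceptions + pass_breakups) * rate

# N.C. State: 27 INTs at a 43.5% share implies roughly 62 passes defended,
# so about 35 breakups (an inference from the text, not a reported figure).
nc_state_expected = expected_interceptions(27, 35)
print(f"N.C. State expected INTs at the national rate: {nc_state_expected:.1f}")
```

Under that assumption, roughly half of N.C. State's 27 interceptions would be chalked up to an unusually high catch rate rather than to the number of passes its defense got its hands on.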
Turnover Luck Per Game. I have found that, on average, turnovers are worth about 5.0 points. (For more on Turnover Points, start here.) So if we apply that point value to the difference between a team's turnover margin and its adjusted turnover margin, we can also take a look at how many points they tended to gain or lose from turnover luck.
Some teams are better at breaking up passes, forcing fumbles, avoiding fumbles, et cetera, but there is still luck involved in turnover margin, and with this per-game look, I am zeroing in on an adjustment that will probably be factored into projections to some degree.
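The per-game measure described above boils down to one line of arithmetic, sketched here in Python. The 5.0-point value is the one cited in the text; the function name and the example team are hypothetical, and the sketch simply assumes the difference in margins is spread evenly across games played.

```python
POINTS_PER_TURNOVER = 5.0  # average value of a turnover, per the text

def turnover_luck_per_game(actual_margin, adjusted_margin, games,
                           points_per_turnover=POINTS_PER_TURNOVER):
    """Points per game gained (+) or lost (-) to turnover luck."""
    if games <= 0:
        raise ValueError("games must be positive")
    lucky_turnovers = actual_margin - adjusted_margin
    return lucky_turnovers * points_per_turnover / games

# Hypothetical team: +12 actual turnover margin, +4 adjusted, 13 games played.
luck = turnover_luck_per_game(actual_margin=12, adjusted_margin=4, games=13)
print(f"{luck:+.2f} points per game")  # +3.08 points per game
```

A positive number means the team benefited from good bounces; a negative number means it was dinged by bad ones.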
Who This May Help: Duke, Fresno State, Texas A&M, SMU, Utah State. Those are the five teams that were dinged the worst by this measure in 2011.
Who This May Hurt: Maryland, Michigan, Oklahoma State, N.C. State, South Carolina. These are the five teams that potentially benefited the most from a level of luck that will probably even out in 2012.
(That's right, Maryland. They went 2-10 in 2011 despite quite a few good bounces. Yikes.)
As we dive further into the projections, there may be other factors taken into account as well. I have also tinkered in years past with the idea of creating different weights for major-conference and mid-major teams. Mid-majors are more likely to be hurt by a large number of draft points (or plain old starters) lost, for instance; meanwhile, mid-majors' recruiting rankings may not carry such a heavy weight because the difference between a low two-star and high two-star athlete (over whom many mid-majors are fighting) is probably not as great as the difference between a four-star and five-star athlete. Regardless, I wanted to dive into these numbers a bit and let readers know what might change them between now and when the Football Outsiders Almanac 2012 is released this summer.