Introducing Lewin Career Forecast v2.0
by Aaron Schatz
Five years ago, Football Outsiders unveiled our first college quarterback projection system. It came to be known as the Lewin Career Forecast, since it was created by a college kid named David Lewin who now works for the Cleveland Cavaliers. The elements were simple: The LCF did a surprisingly good job of projecting the success of first- and second-round quarterbacks using just college games started and college completion percentage. It was so popular that references to the Lewin Career Forecast started showing up all over the media, sometimes even "referencing" entire paragraphs of my writing.
There's only one problem: In the last couple years, the LCF hasn't done so well. The formula predicted success for a number of flops including Kellen Clemens, Brady Quinn, Brian Brohm, and Matt Leinart. I detailed these issues in an ESPN Insider piece last week, but let me summarize here for those of you who don't get ESPN Insider. From 1997 through 2005, there were 11 quarterbacks who:
- were chosen in the first two rounds
- had at least 33 games started in college
- completed at least 58 percent of passes in college.
Out of these 11 quarterbacks, the worst was Byron Leftwich, who was good enough to lead a 12-4 team to the playoffs in 2005. However, the same baselines between 2006 and 2009 produce this list of quarterbacks: Matt Leinart, Brady Quinn, Kevin Kolb, John Beck, Brian Brohm, Chad Henne, Josh Freeman, and Pat White. OK, maybe we don't consider White as a player who was drafted as a "conventional quarterback," but still, that list has four flops, one success (Freeman), and two guys who we're not sure about yet (Kolb and Henne). It's a huge change from 1997-2005.
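For readers who want to apply the same three baselines to their own list of prospects, here is a minimal sketch. The `Prospect` class, the `meets_v1_baseline` function, and the sample rows are hypothetical names and made-up values for illustration; only the three thresholds come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    draft_round: int
    college_starts: int
    completion_pct: float  # career college completion percentage, 0-100


def meets_v1_baseline(p: Prospect) -> bool:
    """True if the prospect clears all three baselines from the article."""
    return (
        p.draft_round <= 2
        and p.college_starts >= 33
        and p.completion_pct >= 58.0
    )


# Hypothetical rows for illustration only; these are not real prospects.
sample = [
    Prospect("Prospect A", 1, 40, 61.2),
    Prospect("Prospect B", 2, 30, 63.0),  # too few college starts
    Prospect("Prospect C", 3, 45, 65.0),  # drafted after round two
    Prospect("Prospect D", 1, 36, 55.1),  # completion rate too low
]

qualified = [p.name for p in sample if meets_v1_baseline(p)]
print(qualified)
```

Only the first hypothetical prospect clears all three bars, which is the point: the baselines are strict, and that is why the 1997-2005 group had just 11 names.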
With these problems in the last couple years, there have generally been two criticisms of LCF. The first is that completion rates don't clearly indicate NFL-level accuracy anymore because of the rise of the college spread option. However, this really isn't as big an issue as some readers seem to believe. Despite a slight rise in completion rates across college football due to the spread offense, the real issue is number of games started. Before 2005, games started were a strong clue as to whether scouts got it right or wrong on the top prospects. Since 2005, many quarterbacks with plenty of experience washed out while similarly accurate, but much less experienced quarterbacks like Aaron Rodgers and Joe Flacco have become successful NFL starters.
The phrase "before 2005" gets to the second criticism, which is that LCF is more descriptive than it is predictive. It describes the quarterbacks from the years that David Lewin used in his original data set, but a high number of games started only correlates to NFL success for the quarterbacks specifically in that data set. That data set has a small sample size and is "cherry-picked" by only using a small subset of years. That's not necessarily true, however. Two points:
1) When we first ran LCF in Pro Football Prospectus 2006, not every quarterback drafted between 1997 and 2005 was part of the research. Philip Rivers is perhaps the best example of a quarterback who gets a high projection because of collegiate games started; he had 49 starts at North Carolina State. But he wasn't part of the data set used to create LCF, because as of PFP 2006 Rivers had only 30 NFL pass attempts and zero games started. Based on the performance of other quarterbacks, LCF projected that Rivers would be an MVP-level superstar, and he has been.
2) Games started may not seem like an important variable if we go forward from the introduction of the LCF, but it is definitely important if we go backwards. From 1990 through 1997, games started are a hugely predictive variable for first- and second-round quarterbacks. Only two of the top quarterbacks drafted during this period were four-year starters: Steve McNair and Brett Favre. Those are also the most successful quarterbacks drafted during that eight-year period. There were also two quarterbacks drafted with only one year of starting experience: Dan McGwire and Matt Blundin. Unless you read my ESPN Insider piece last week, I'm guessing you have never even heard of Matt Blundin, and McGwire is a well-known flop. The further we go back, the harder it is to get exact college stats, and sometimes we have to guess whether a player started all the games he played in, but it looks like these quarterbacks also started fewer than 24 games in college: Browning Nagle, Todd Marinovich, Dave Brown, David Klingler, Tommy Maddox, Heath Shuler, and Tony Banks. Again, not a Hall of Fame list.
Therefore, we need to accept that any quarterback projection system that is based on past performance is going to value collegiate games started. For more than 15 years, it was far and away the most important variable in determining the success of highly-drafted quarterbacks. However, analysis of quarterbacks drafted between 1998 and 2008 showed that we could add some more variables to the Lewin Career Forecast to make it more accurate. Thus, I present to you Lewin Career Forecast v2.0.
I put together LCF v2.0 with a regression that attempted to forecast total DYAR for these quarterbacks in years 3-5 of their NFL careers. In order to include a larger data set, I did look at 2007 draftees (DYAR in years 2-4), and 2008 draftees (DYAR in years 2-3, multiplied by 150 percent). In his first LCF, David Lewin included only quarterbacks drafted in the first two rounds; for this new version, I included quarterbacks chosen in round three as well. In addition, many of the variables have upper or lower boundaries in order to try to limit the importance of extremes like Colt McCoy's 53 games started or Cam Newton and Tim Tebow's rushing statistics.
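The way the dependent variable shifts for the two most recent draft classes is easier to see in code. This is only a sketch of the rule as described in the paragraph above: the function name `target_dyar` and the input layout (a mapping from career year to DYAR) are my own hypothetical choices, and treating a missing season as zero DYAR is an assumption, not something spelled out in the method.

```python
from typing import Mapping


def target_dyar(draft_year: int, dyar_by_career_year: Mapping[int, float]) -> float:
    """Total DYAR used as the regression target for one quarterback.

    Assumption: a career year missing from the mapping counts as 0 DYAR.
    """
    if draft_year == 2008:
        # Only years 2-3 are available, so scale the two-year total by 150 percent.
        years, scale = (2, 3), 1.5
    elif draft_year == 2007:
        # Years 2-4 stand in for years 3-5.
        years, scale = (2, 3, 4), 1.0
    else:
        # Standard window: career years 3-5.
        years, scale = (3, 4, 5), 1.0
    return scale * sum(dyar_by_career_year.get(y, 0.0) for y in years)


# Hypothetical DYAR values by career year, for illustration only.
career = {1: 100.0, 2: 200.0, 3: 300.0, 4: 400.0, 5: 500.0}
print(target_dyar(2005, career))  # uses years 3-5
print(target_dyar(2007, career))  # uses years 2-4
print(target_dyar(2008, career))  # uses years 2-3, times 1.5
```

The 150 percent multiplier simply stretches two seasons of production to a three-season scale so that the 2008 class is comparable to everyone else.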
The new LCF has seven factors.
- Career college games started. This is still the most important variable in the equation. Uses a minimum of 20, a maximum of 48.
- Career completion rate; however, this is now a logarithmic variable. As a quarterback's completion percentage goes down, the penalty for low completion percentage gets gradually larger. As a result, the bonus for exceedingly accurate quarterbacks such as Tim Couch and Brian Brohm is smaller than the penalty for inaccurate quarterbacks such as Kyle Boller and Tarvaris Jackson.
- Difference between the quarterback's BMI and 28.0. This creates a small penalty for quarterbacks who don't exactly conform to the "ideal quarterback size." This year, that would include both Colin Kaepernick (BMI: 26.8) and Cam Newton (BMI: 29.4).
- Run-pass ratio in the quarterback's final college season, with a maximum of 0.5.
- Total rushing yards in the quarterback's final college season, with a minimum of 0 and a maximum of 600.
These two variables work together. Remember, there are two ways to have a high run-pass ratio in college football. Either you are a quarterback who relies a lot on his legs, or you are a quarterback who takes a lot of sacks, because sacks count as runs in college football. So with these two variables, both of those types of quarterbacks end up penalized, while pocket quarterbacks who are successful when they do run (and therefore have positive rushing yards) get a bonus. A good example here is Andrew Luck. Last year, Luck had a very low run-pass ratio of 0.15 -- among this year's top prospects, only Ryan Mallett had a lower ratio -- but when he did run, he gained an excellent 8.2 yards per carry.
- For quarterbacks who come out as seniors, the difference in NCAA passer rating between their junior and senior seasons.
This variable was a bit of a breakthrough when it came to explaining many of the failures of LCF v1.0. Quarterbacks who struggle as seniors often see their draft stock fall, but apparently not far enough. Obviously passer rating has its issues, but it was a good proxy for figuring out when a quarterback saw his improvement stagnate. There are nine quarterbacks in our data set whose NCAA passer rating fell by more than 10 points in their senior seasons: Rex Grossman (an astonishing 49.3-point collapse), Brodie Croyle, Drew Stanton, Quincy Carter, Trent Edwards, Chad Henne, Brady Quinn, Marques Tuiasosopo, and Patrick Ramsey. Brian Brohm's passer rating fell by 7.3 points. The quarterbacks with the largest senior-year improvements were Jason Campbell, John Beck, Kevin Kolb, Philip Rivers, Chad Pennington, Carson Palmer, and Eli Manning. Obviously this variable isn't foolproof -- besides Beck, guys like Joey Harrington and Kellen Clemens also had significant senior-year improvements, while Jay Cutler and Matt Schaub saw their passer ratings drop slightly as seniors. Still, this variable did a lot to improve results.
What does it mean? This variable could show that quarterbacks who don't keep improving as seniors aren't going to improve as professionals either. Or perhaps, it shows that certain players have flaws in their games that opponents figured out in their senior years.
For quarterbacks who come out as juniors or redshirt sophomores, this variable is always 5.0, which is the average increase for the seniors in our data set.
- Finally, a binary variable that penalizes quarterbacks who don't play for a team in a BCS-qualifying conference. We counted Notre Dame here as a BCS school, even though that actually lowered the accuracy of the projections. However, this variable applies only to Division I-A quarterbacks, not Division I-AA quarterbacks. Perhaps this means that scouts do a better job of identifying the few Division I-AA quarterbacks who can translate their games to the NFL. (The data set has only three of these players: Josh McCown, Tarvaris Jackson, and Joe Flacco.)
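To make the boundaries on the seven factors concrete, here is a rough sketch of how the raw college numbers could be turned into regression inputs. It deliberately stops short of a projection, because the regression coefficients are not listed here. The function and key names are hypothetical, the use of the natural log for completion rate is an assumption (the article says only "logarithmic"), and the absolute value on the BMI term is inferred from the fact that quarterbacks both above and below 28.0 are penalized.

```python
import math


def clamp(value, low, high):
    """Limit value to the range [low, high]."""
    return max(low, min(high, value))


def lcf_v2_inputs(games_started, completion_pct, bmi, run_pass_ratio,
                  rushing_yards, is_senior, division_1a, bcs_conference,
                  junior_rating=None, senior_rating=None):
    """Bounded inputs for the seven LCF v2.0 factors (no coefficients applied)."""
    if is_senior:
        rating_change = senior_rating - junior_rating
    else:
        # Juniors and redshirt sophomores get the average senior increase.
        rating_change = 5.0
    return {
        "games_started": clamp(games_started, 20, 48),
        "log_completion": math.log(completion_pct),  # assumed natural log
        "bmi_diff": abs(bmi - 28.0),                 # assumed absolute difference
        "run_pass_ratio": min(run_pass_ratio, 0.5),
        "rushing_yards": clamp(rushing_yards, 0, 600),
        "passer_rating_change": rating_change,
        # The penalty flag applies to Division I-A quarterbacks only.
        "non_bcs": int(division_1a and not bcs_conference),
    }


# Cam Newton's listed stats from this article; rushing_yards=1000 is a
# placeholder above the 600 cap, since his rushing total is not listed here.
newton = lcf_v2_inputs(games_started=14, completion_pct=65.4, bmi=29.4,
                       run_pass_ratio=0.94, rushing_yards=1000,
                       is_senior=False, division_1a=True, bcs_conference=True)
print(newton)
```

In this sketch Newton's 14 starts are lifted to the floor of 20 and his 0.94 run-pass ratio is capped at 0.5, which shows how the boundaries keep one extreme number from swamping the rest of the profile.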
How does this new, more complex version of LCF change our projections? To figure that out, I also created a formula that used the same data set (including the third-round picks) with the same dependent variable, but only used the same two factors as the original LCF: just games started and completion percentage. The old LCF had an R-square of .24. The new LCF has an R-square of .58. Here's a list of the best and worst projections from 1998 through 2008 using both the first LCF and the newer version. (Since the newer version is more accurate and has more variables, it's also going to give you higher highs and lower lows, which is why the best and worst projections are more extreme with LCF version 2.0.)
|LCF v1.0 Top 10|Projection|LCF v2.0 Top 10|Projection|LCF v1.0 Bottom 10|Projection|LCF v2.0 Bottom 10|Projection|
|Chad Pennington|1778|Philip Rivers|2476|Marques Tuiasosopo|-506|Alex Smith|-782|
|Philip Rivers|1671|Drew Brees|2190|Michael Vick|-473|Brodie Croyle|-736|
|Kevin Kolb|1626|Carson Palmer|1973|Akili Smith|-413|Marques Tuiasosopo|-621|
|Charlie Frye|1615|Peyton Manning|1784|Ryan Leaf|-326|Trent Edwards|-611|
|Daunte Culpepper|1396|Chad Pennington|1678|Tarvaris Jackson|-195|Ryan Leaf|-407|
|Peyton Manning|1379|Brady Quinn|1518|Joey Harrington|-14|Quincy Carter|-336|
|Chad Henne|1349|Jason Campbell|1506|Shaun King|54|Josh McCown|-311|
|Brady Quinn|1348|Jay Cutler|1444|J.P. Losman|64|David Carr|-299|
|Carson Palmer|1198|Chad Henne|1411|Brodie Croyle|97|Patrick Ramsey|-223|
|Donovan McNabb|1163|Matt Ryan|1403|Quincy Carter|122|J.P. Losman/Tim Couch|-195|
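For anyone unfamiliar with the R-square figures quoted before the table, the statistic is the share of the variation in actual DYAR that the projections account for. Here is a minimal sketch of the standard calculation; the function name `r_squared` and the sample numbers are hypothetical and are not drawn from the LCF data set.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_resid / ss_total


# Made-up DYAR totals for illustration only.
actual = [100.0, 500.0, 900.0, 1300.0]
perfect = list(actual)                         # projections that match exactly
mean_only = [sum(actual) / len(actual)] * 4    # projecting the average for everyone

print(r_squared(actual, perfect))
print(r_squared(actual, mean_only))
```

A perfect set of projections scores 1, and simply projecting the average for every quarterback scores 0, so moving from .24 to .58 means the new formula accounts for more than twice as much of the variation.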
Here's a look at which quarterbacks improved the most from version 1.0 to version 2.0, and which quarterbacks declined the most. The new formula does a good job of improving the projections for a lot of quarterbacks who became stars, although it now misses even more egregiously on Kellen Clemens and Brian Brohm. The list of the quarterbacks who declined the most seems like a good list of players who were overrated coming out of school, with the exception of Daunte Culpepper and Donovan McNabb. Those guys both appear on the "biggest decline" list because of the new BMI variable, as they are two of the three quarterbacks in the data set with BMI over 30. (The other is JaMarcus Russell.)
|Biggest Increase in LCF v2.0|LCF v1.0|LCF v2.0|Biggest Decrease in LCF v2.0|LCF v1.0|LCF v2.0|
|Drew Brees|835|2190|Charlie Frye|1615|117|
|Matt Ryan|473|1403|Alex Smith|221|-782|
|Philip Rivers|1671|2476|Trent Edwards|223|-611|
|Carson Palmer|1198|1973|Brodie Croyle|97|-736|
|Kellen Clemens|532|1248|Daunte Culpepper|1396|663|
|Vince Young|576|1059|Donovan McNabb|1163|472|
|Eli Manning|818|1292|Patrick Ramsey|420|-223|
|Brian Brohm|846|1290|Tim Couch|445|-195|
|Joe Flacco|305|732|Rex Grossman|472|-124|
|Peyton Manning|1379|1784|David Carr|275|-299|
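The ordering in this table is just the v2.0 projection minus the v1.0 projection, sorted. A quick sketch using six rows copied from the table shows the arithmetic; the variable names are my own.

```python
# (LCF v1.0, LCF v2.0) projections copied from the table above.
projections = {
    "Drew Brees": (835, 2190),
    "Matt Ryan": (473, 1403),
    "Philip Rivers": (1671, 2476),
    "Charlie Frye": (1615, 117),
    "Alex Smith": (221, -782),
    "Trent Edwards": (223, -611),
}

# Change in projected DYAR from version 1.0 to version 2.0.
changes = {name: v2 - v1 for name, (v1, v2) in projections.items()}

# Largest increase first, largest decrease last.
ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)
for name, change in ranked:
    print(f"{name}: {change:+d}")
```

Brees gains 1,355 DYAR between versions and Frye loses 1,498, which is why they sit at the top of their respective columns.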
Now, let's look at the projections for quarterbacks outside of our data set. First, we'll look at the projections for the quarterbacks chosen in rounds 1-3 of the past two drafts. The number listed is projected total DYAR for career years 3-5.
- Colt McCoy: 2,092
- Josh Freeman: 1,367
- Sam Bradford: 1,345
- Jimmy Clausen: 1,062
- Tim Tebow: 925
- Matthew Stafford: 714
- Mark Sanchez: 151
As you might expect, LCF v2.0 loves Colt McCoy. So did LCF v1.0 -- although McCoy wouldn't have been considered by LCF v1.0 because he was a third-round pick. McCoy had 53 college games started with a career completion rate above 70 percent. The new boundaries added to try to limit the importance of outlier variables do tamp down the McCoy excitement slightly. (Not to mention that without those boundaries, Sanchez's projection would actually be negative.) Still, McCoy has the third-highest projection of any quarterback since 1997. Philip Rivers and Drew Brees are the only other quarterbacks projected above 2,000.
Five of these seven quarterbacks have significantly higher projections using the new version of LCF. Only Tebow and McCoy are lower with LCF v2.0, and the difference with Tebow is pretty small.
It's important to understand that LCF is meant to be a tool used alongside the scouting reports, not instead of the scouting reports. Sam Bradford was still the proper number one overall selection in the 2010 draft. What's important is not that his projection is lower than Colt McCoy's projection -- instead, what's important is that he has a very good projection, which should give the Rams confidence that their scouts got it right. We don't claim to believe that the Lewin Career Forecast is a foolproof way of figuring out which quarterback an NFL team should draft. This is an interesting regression analysis, not Moses bringing the tablets down from Sinai. Still, we think that LCF v2.0 is valuable as a crosscheck device and should be part of the conversation about quarterback draft prospects.
With that in mind, let's look at the projections for this year's quarterbacks.
Andy Dalton, TCU: 1,616 DYAR
Important stats: 48 games started, 61.7% completion rate, senior passer rating improved 14.7 points.
Dalton is LCF's favorite prospect for 2011. He's also a great example of where LCF might go wrong. Our own Doug Farrar did a good job of running down Dalton's problems in this post on Yahoo's Shutdown Corner blog. Dalton played in a college spread offense where routes were generally designed to clear out specific spots in the defense. Plays didn't include a lot of receiver progressions. He has problems with arm strength, particularly on those intermediate-length throws that an NFL quarterback has to stick into very tight windows. Still, both his pros and his cons sound a lot like the pros and cons of last year's LCF favorite, Colt McCoy -- and McCoy had a more successful rookie year in the NFL than anyone expected. Dalton is a good example of how the LCF doesn't tell you that a quarterback is definitely going to be a star. It tells you "if your scouts determine that Andy Dalton fits your offensive scheme despite his weaknesses, he is very unlikely to be a complete bust."
Ricky Stanzi, Iowa: 1,305 DYAR
Important stats: 35 games started, 59.8% completion rate, senior passer rating improved 26.0 points, 48 carries for -6 yards.
Stanzi gets an asterisk. I don't think he's going in the first three rounds. He's another guy scouts have to do their due diligence on. Still, he did improve a lot as a senior and could be a nice fourth- or fifth-round sleeper. Rushing numbers suggest he may take too many sacks.
Colin Kaepernick, Nevada: 1,044 DYAR
Important stats: 48 games started, 58.2% completion rate, .482 run-pass ratio, 26.8 BMI.
Kaepernick of course played in a somewhat "gimmicky" offense in college, and a lot of his value was based on his running ability. He didn't have the greatest completion rate across his entire college career, although he's been a four-year starter so there's a lot of film to break down here. He had a moderate improvement as a senior, 11.3 points of passer rating. I don't have much of an opinion on him past these numbers.
Blaine Gabbert, Missouri: 656 DYAR
Important stats: 26 games started, 60.9% completion rate.
Here is where maybe you get the sense that this isn't the best year for low-risk quarterback prospects. From Gabbert on down, every quarterback prospect for 2011 is lower than every quarterback prospect from 2009-2010 except for Mark Sanchez. Gabbert is a little low on games started, a little high in completion rate, and basically average on all the other variables in the system, so LCF v2.0 thinks he's going to be a very average quarterback. His projection is close to the average projection for all the players in the data set used to create LCF v2.0, which is 604. An average quarterback can be a very useful thing on the right team, but it is not something you want to get with a top ten pick.
Jake Locker, Washington: 569 DYAR
Important stats: 40 games started, 53.9% completion rate, senior passer rating dropped 5.5 points.
I just don't think Jake Locker is ever going to be accurate enough to be an above-average NFL quarterback.
Ryan Mallett, Arkansas: 471 DYAR
Important stats: 29 games started, 57.8% completion rate, 26.8 BMI, 44 carries for -74 yards.
Perhaps you have heard that Ryan Mallett has some mobility issues? In three years of college football, he has a total of 135 carries for -141 yards. There are a lot of sacks in there. Maybe you don't think the rushing yardage thing is a big deal, but here's the list of players in our data set with fewer than -50 rushing yards in their final college season: Brodie Croyle, Tim Couch, Chris Simms, Carson Palmer, Patrick Ramsey, Andrew Walter, Kyle Boller, and Rex Grossman. I count one successful quarterback out of eight. Mallett's downside is Dan McGwire. His upside is "What if Drew Bledsoe was kind of a dick."
Christian Ponder, Florida State: 413 DYAR
Important stats: 33 games started, 61.8% completion rate, senior passer rating dropped 12.0 points.
Maybe somebody reaches for him because so many teams have quarterback needs this year, but Ponder just seems to me like a classic third-round pick. How high is his ceiling, really? Isn't he basically just Drew Stanton? I would be scared of how his improvement stagnated in his senior year.
Cam Newton, Auburn: 175 DYAR
Important stats: 14 games started, 65.4% completion rate, 29.4 BMI, 0.94 run-pass ratio.
I thought Tim Tebow was the most unique prospect in recent times, but Cam Newton may have surpassed him. You get most of the same questions, but you take out the questions about throwing motion and replace them with questions about character and inexperience. Nobody doubts that Newton is an amazing athlete who was a supremely valuable college football player. In the NFL, he is a massive risk-reward candidate. I just happen to think that the risk is larger than the reward. I would not take him with the first overall pick in the draft unless a) there was absolutely no other player worth that top pick, and b) I knew for certain that the post-lockout CBA would include a rookie salary slotting system that would go into effect immediately.
Let's throw in one more guy, because I know some people will be curious.
Andrew Luck, Stanford: 1,604 DYAR
Important stats: 25 games started, 64.4% completion rate, 453 rushing yards with only 0.15 run-pass ratio.
This would be Andrew Luck's projection if he had come out after his sophomore year. If he puts up the same stats as a junior, he'll come out with the second-highest projection of any quarterback since 1997, behind only Philip Rivers.