Introducing Lewin Career Forecast v2.0

Photo: USA Today Sports Images

by Aaron Schatz

Five years ago, Football Outsiders unveiled our first college quarterback projection system. It came to be known as the Lewin Career Forecast, since it was created by a college kid named David Lewin who now works for the Cleveland Cavaliers. The elements were simple: The LCF did a surprisingly good job of projecting the success of first- and second-round quarterbacks using just college games started and college completion percentage. It was so popular that references to the Lewin Career Forecast started showing up all over the media, sometimes even "referencing" entire paragraphs of my writing.

There's only one problem: In the last couple years, the LCF hasn't done so well. The formula predicted success for a number of flops including Kellen Clemens, Brady Quinn, Brian Brohm, and Matt Leinart. I detailed these issues in an ESPN Insider piece last week, but let me summarize here for those of you who don't get ESPN Insider. From 1997 through 2005, there were 11 quarterbacks who:

  • were chosen in the first two rounds
  • had at least 33 games started in college
  • completed at least 58 percent of passes in college.

Out of these 11 quarterbacks, the worst was Byron Leftwich, who was good enough to lead a 12-4 team to the playoffs in 2005. However, the same baselines between 2006 and 2009 produce this list of quarterbacks: Matt Leinart, Brady Quinn, Kevin Kolb, John Beck, Brian Brohm, Chad Henne, Josh Freeman, and Pat White. OK, maybe we don't consider White as a player who was drafted as a "conventional quarterback," but still, that list has four flops, one success (Freeman), and two guys who we're not sure about yet (Kolb and Henne). It's a huge change from 1997-2005.
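The three baselines behind both lists amount to a simple filter. Here is a minimal sketch in Python, assuming a hypothetical `Prospect` record; the sample rows are invented for illustration and are not real college statistics.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str              # placeholder label, not a real player
    draft_round: int
    college_starts: int
    completion_pct: float  # career college completion rate, in percent


def meets_v1_baselines(p: Prospect) -> bool:
    """True if a prospect clears all three baselines listed above."""
    return (
        p.draft_round <= 2
        and p.college_starts >= 33
        and p.completion_pct >= 58.0
    )


# Invented rows, for illustration only.
prospects = [
    Prospect("QB A", draft_round=1, college_starts=40, completion_pct=61.0),
    Prospect("QB B", draft_round=1, college_starts=25, completion_pct=64.0),
    Prospect("QB C", draft_round=3, college_starts=45, completion_pct=66.0),
    Prospect("QB D", draft_round=2, college_starts=36, completion_pct=55.5),
]

qualifiers = [p.name for p in prospects if meets_v1_baselines(p)]
print(qualifiers)
```

Only the first invented prospect passes: the others miss on starts, draft round, and completion rate respectively.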

With these problems in the last couple years, there have generally been two criticisms of LCF. The first is that completion rates don't clearly indicate NFL-level accuracy anymore because of the rise of the college spread option. However, this really isn't as big an issue as some readers seem to believe. Despite a slight rise in completion rates across college football due to the spread offense, the real issue is the number of games started. Before 2005, games started were a strong clue as to whether scouts got it right or wrong on the top prospects. Since 2005, many quarterbacks with plenty of experience have washed out, while similarly accurate but much less experienced quarterbacks like Aaron Rodgers and Joe Flacco have become successful NFL starters.

The phrase "before 2005" gets to the second criticism, which is that LCF is more descriptive than it is predictive. It describes the quarterbacks from the years that David Lewin used in his original data set, but a high number of games started only correlates to NFL success for the quarterbacks specifically in that data set. That data set has a small sample size and is "cherry-picked" by only using a small subset of years. That's not necessarily true, however. Two points:

1) When we first ran LCF in Pro Football Prospectus 2006, not every quarterback drafted between 1997 and 2005 was part of the research. Philip Rivers is perhaps the best example of a quarterback who gets a high projection because of collegiate games started; he had 49 starts at North Carolina State. But he wasn't part of the data set used to create LCF, because as of PFP 2006 Rivers had only 30 NFL pass attempts and zero games started. Based on the performance of other quarterbacks, LCF projected that Rivers would be an MVP-level superstar, and he has been.

2) Games started may not seem like an important variable if we go forward from the introduction of the LCF, but it is definitely important if we go backwards. From 1990 through 1997, games started are a hugely predictive variable for first- and second-round quarterbacks. Only two of the top quarterbacks drafted during this period were four-year starters: Steve McNair and Brett Favre. Those are also the most successful quarterbacks drafted during that eight-year period. There were also two quarterbacks drafted with only one year of starting experience: Dan McGwire and Matt Blundin. Unless you read my ESPN Insider piece last week, I'm guessing you have never even heard of Matt Blundin, and McGwire is a well-known flop. The further we go back, the harder it is to get exact college stats, and sometimes we have to guess whether a player started all the games he played in, but it looks like these quarterbacks also started fewer than 24 games in college: Browning Nagle, Todd Marinovich, Dave Brown, David Klingler, Tommy Maddox, Heath Shuler, and Tony Banks. Again, not a Hall of Fame list.

Therefore, we need to accept that any quarterback projection system that is based on past performance is going to value collegiate games started. For more than 15 years, it was far and away the most important variable in determining the success of highly-drafted quarterbacks. However, analysis of quarterbacks drafted between 1998 and 2008 showed that we could add some more variables to the Lewin Career Forecast to make it more accurate. Thus, I present to you Lewin Career Forecast v2.0.

I put together LCF v2.0 with a regression that attempted to forecast total DYAR for these quarterbacks in years 3-5 of their NFL careers. In order to include a larger data set, I did look at 2007 draftees (DYAR in years 2-4), and 2008 draftees (DYAR in years 2-3, multiplied by 150 percent). In his first LCF, David Lewin included only quarterbacks drafted in the first two rounds; for this new version, I included quarterbacks chosen in round three as well. In addition, many of the variables have upper or lower boundaries in order to try to limit the importance of extremes like Colt McCoy's 53 games started or Cam Newton and Tim Tebow's rushing statistics.
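The dependent variable described above can be written as a small function. This is a minimal sketch in Python, not the actual code behind the regression: `regression_target` and its mapping from NFL career year (1 = rookie season) to DYAR are hypothetical, a missing season is assumed to count as zero because the text does not say how those were handled, and the DYAR values in the example are invented.

```python
from typing import Mapping


def regression_target(draft_year: int, dyar_by_year: Mapping[int, float]) -> float:
    """Total DYAR used as the dependent variable for one quarterback."""
    if draft_year <= 2006:
        years, scale = (3, 4, 5), 1.0   # standard case: career years 3-5
    elif draft_year == 2007:
        years, scale = (2, 3, 4), 1.0   # 2007 draftees: years 2-4
    elif draft_year == 2008:
        years, scale = (2, 3), 1.5      # 2008 draftees: years 2-3, times 150 percent
    else:
        raise ValueError("draft class is too recent to have a target value")
    # Assumption: a season with no recorded DYAR contributes zero.
    return scale * sum(dyar_by_year.get(year, 0.0) for year in years)


# Invented career line, for illustration only.
career = {1: -100.0, 2: 200.0, 3: 400.0, 4: 500.0, 5: 600.0}
print(regression_target(2005, career))
print(regression_target(2007, career))
print(regression_target(2008, career))
```

The 150 percent multiplier scales two seasons up to a three-season total, so all three cases are on the same footing.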

The new LCF has seven factors.

  • Career college games started. This is still the most important variable in the equation. Uses a minimum of 20, a maximum of 48.
  • Career completion rate; however, this is now a logarithmic variable. As a quarterback's completion percentage goes down, the penalty for low completion percentage gets gradually larger. As a result, the bonus for exceedingly accurate quarterbacks such as Tim Couch and Brian Brohm is smaller than the penalty for inaccurate quarterbacks such as Kyle Boller and Tarvaris Jackson.
  • Difference between the quarterback's BMI and 28.0. This creates a small penalty for quarterbacks who don't exactly conform to the "ideal quarterback size." This year, that would include both Colin Kaepernick (BMI: 26.8) and Cam Newton (BMI: 29.4).
  • Run-pass ratio in the quarterback's final college season, with a maximum of 0.5.
  • Total rushing yards in the quarterback's final college season, with a minimum of 0 and a maximum of 600.

These two variables work together. Remember, there are two ways to have a high run-pass ratio in college football. Either you are a quarterback who relies a lot on his legs, or you are a quarterback who takes a lot of sacks, because sacks count as runs in college football. So with these two variables, both of those types of quarterbacks end up penalized, while pocket quarterbacks who are successful when they do run (and therefore have positive rushing yards) get a bonus. A good example here is Andrew Luck. Last year, Luck had a very low run-pass ratio of 0.15 -- among this year's top prospects, only Ryan Mallett had a lower ratio -- but when he did run, he gained an excellent 8.2 yards per carry.
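The five factors listed so far can be sketched as small transformation functions. This is a minimal illustration in Python, not the actual LCF v2.0 code: the function names are hypothetical, the natural log and the absolute BMI difference are assumptions about details the text does not spell out, and no regression coefficients are shown because none are given here.

```python
import math


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def games_started_term(starts: float) -> float:
    # Bounds from the text: minimum of 20, maximum of 48.
    return clamp(starts, 20, 48)


def completion_term(completion_pct: float) -> float:
    # Assumption: natural log of the career completion percentage.
    return math.log(completion_pct)


def bmi_term(bmi: float) -> float:
    # Assumption: absolute distance from the 28.0 reference value.
    return abs(bmi - 28.0)


def run_pass_term(run_pass_ratio: float) -> float:
    # Maximum of 0.5, per the text.
    return min(run_pass_ratio, 0.5)


def rushing_yards_term(yards: float) -> float:
    # Minimum of 0, maximum of 600, per the text.
    return clamp(yards, 0, 600)


# Inputs quoted elsewhere in the article.
print(games_started_term(53))    # Colt McCoy's 53 starts hit the upper bound
print(games_started_term(14))    # Cam Newton's 14 starts hit the lower bound
print(run_pass_term(0.94))       # Newton's run-pass ratio is capped
print(rushing_yards_term(-74))   # Ryan Mallett's negative yardage is floored
print(round(bmi_term(26.8), 1), round(bmi_term(29.4), 1))

# With a log, a 5-point drop costs more than a 5-point gain earns.
drop = completion_term(60.0) - completion_term(55.0)
gain = completion_term(65.0) - completion_term(60.0)
print(drop > gain)
```

The last comparison is the asymmetry described in the completion-rate bullet: the curve flattens as completion percentage rises, so penalties at the low end outweigh bonuses at the high end.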

  • For quarterbacks who come out as seniors, the difference in NCAA passer rating between their junior and senior seasons.

This variable was a bit of a breakthrough when it came to explaining many of the failures of LCF v1.0. Quarterbacks who struggle as seniors often see their draft stock fall, but apparently not far enough. Obviously passer rating has its issues, but it was a good proxy for figuring out when a quarterback saw his improvement stagnate. There are nine quarterbacks in our data set whose NCAA passer rating fell by more than 10 points in their senior seasons: Rex Grossman (an astonishing 49.3-point collapse), Brodie Croyle, Drew Stanton, Quincy Carter, Trent Edwards, Chad Henne, Brady Quinn, Marques Tuiasosopo, and Patrick Ramsey. Brian Brohm's passer rating fell by 7.3 points. The quarterbacks with the largest senior-year improvements were Jason Campbell, John Beck, Kevin Kolb, Philip Rivers, Chad Pennington, Carson Palmer, and Eli Manning. Obviously this variable isn't foolproof -- besides Beck, guys like Joey Harrington and Kellen Clemens also had significant senior-year improvements, while Jay Cutler and Matt Schaub saw their passer ratings drop slightly as seniors. Still, this variable did a lot to improve results.

What does it mean? This variable could show that quarterbacks who don't keep improving as seniors aren't going to improve as professionals either. Or perhaps, it shows that certain players have flaws in their games that opponents figured out in their senior years.

For quarterbacks who come out as juniors or redshirt sophomores, this variable is always 5.0, which is the average increase for the seniors in our data set.
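The rule for this variable is easy to state in code. A minimal sketch in Python, assuming a hypothetical function `senior_rating_change`; the ratings in the example are invented and chosen only so the differences are easy to read.

```python
from typing import Optional

# From the text: the average increase for the seniors in the data set.
NON_SENIOR_DEFAULT = 5.0


def senior_rating_change(
    is_senior: bool,
    junior_rating: Optional[float] = None,
    senior_rating: Optional[float] = None,
) -> float:
    """Change in NCAA passer rating from junior to senior season.

    Quarterbacks who leave before their senior season get the default value.
    """
    if not is_senior:
        return NON_SENIOR_DEFAULT
    if junior_rating is None or senior_rating is None:
        raise ValueError("seniors need both junior and senior passer ratings")
    return senior_rating - junior_rating


# Invented ratings, for illustration only.
print(senior_rating_change(True, junior_rating=140.0, senior_rating=152.0))
print(senior_rating_change(True, junior_rating=150.0, senior_rating=138.0))
print(senior_rating_change(False))
```

Giving early entrants the average senior improvement keeps the variable from either rewarding or punishing them for a season they never played.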

  • Finally, a binary variable that penalizes quarterbacks who don't play for a team in a BCS-qualifying conference. We counted Notre Dame here as a BCS school, even though that actually lowered the accuracy of the projections. However, this variable applies only to Division I-A quarterbacks, not Division I-AA quarterbacks. Perhaps this means that scouts do a better job of identifying the few Division I-AA quarterbacks who can translate their games to the NFL. (The data set has only three of these players: Josh McCown, Tarvaris Jackson, and Joe Flacco.)

How does this new, more complex version of LCF change our projections? To figure that out, I also created a formula that used the same data set (including the third-round picks) with the same dependent variable, but only the two factors from the original LCF: games started and completion percentage. The old LCF had an R-square of .24. The new LCF has an R-square of .58. Here's a list of the best and worst projections from 1998 through 2008 using both the first LCF and the newer version. (Since the newer version is more accurate and has more variables, it's also going to give you higher highs and lower lows, which is why the best and worst projections are more extreme with LCF version 2.0.)
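For reference, R-square is the share of the variance in the dependent variable that a formula's projections account for. Here is a minimal sketch of the calculation in Python, using invented numbers rather than the actual data set. The adjusted version is the standard textbook correction for the number of predictors; the sample size `n=60` is a placeholder, because the article does not state how many quarterbacks are in the data set.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """1 minus (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_resid / ss_total


def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Penalize R-square for using k predictors on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Invented values, for illustration only.
actual = [100.0, 500.0, 900.0, 1300.0]
close_fit = [300.0, 400.0, 1000.0, 1100.0]
no_fit = [700.0] * 4  # always predicting the mean explains nothing

print(r_squared(actual, close_fit))
print(r_squared(actual, no_fit))

# Reported values from the text, with a placeholder sample size.
print(adjusted_r_squared(0.24, n=60, k=2))
print(adjusted_r_squared(0.58, n=60, k=7))
```

Because in-sample R-square can only rise as variables are added, the adjusted figure is a fairer way to compare a two-factor formula with a seven-factor one.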

LCF v1.0 Top 10          LCF v2.0 Top 10         LCF v1.0 Bottom 10         LCF v2.0 Bottom 10
Chad Pennington   1778   Philip Rivers    2476   Marques Tuiasosopo  -506   Alex Smith             -782
Philip Rivers     1671   Drew Brees       2190   Michael Vick        -473   Brodie Croyle          -736
Kevin Kolb        1626   Carson Palmer    1973   Akili Smith         -413   Marques Tuiasosopo     -621
Charlie Frye      1615   Peyton Manning   1784   Ryan Leaf           -326   Trent Edwards          -611
Daunte Culpepper  1396   Chad Pennington  1678   Tarvaris Jackson    -195   Ryan Leaf              -407
Peyton Manning    1379   Brady Quinn      1518   Joey Harrington      -14   Quincy Carter          -336
Chad Henne        1349   Jason Campbell   1506   Shaun King            54   Josh McCown            -311
Brady Quinn       1348   Jay Cutler       1444   J.P. Losman           64   David Carr             -299
Carson Palmer     1198   Chad Henne       1411   Brodie Croyle         97   Patrick Ramsey         -223
Donovan McNabb    1163   Matt Ryan        1403   Quincy Carter        122   J.P. Losman/Tim Couch  -195

Here's a look at which quarterbacks improved the most from version 1.0 to version 2.0, and which quarterbacks declined the most. The new formula does a good job of improving the projections for a lot of quarterbacks who became stars, although it now misses even more egregiously on Kellen Clemens and Brian Brohm. The list of the quarterbacks who declined the most seems like a good list of players who were overrated coming out of school, with the exception of Daunte Culpepper and Donovan McNabb. Those guys both appear on the "biggest decline" list because of the new BMI variable, as they are two of the three quarterbacks in the data set with BMI over 30. (The other is JaMarcus Russell.)

Biggest Increase for Projection in LCF v2.0    Biggest Decrease for Projection in LCF v2.0
Player                   LCF v1.0   LCF v2.0   Player                   LCF v1.0   LCF v2.0
Drew Brees                    835       2190   Charlie Frye                 1615        117
Matt Ryan                     473       1403   Alex Smith                    221       -782
Philip Rivers                1671       2476   Trent Edwards                 223       -611
Carson Palmer                1198       1973   Brodie Croyle                  97       -736
Kellen Clemens                532       1248   Daunte Culpepper             1396        663
Vince Young                   576       1059   Donovan McNabb               1163        472
Eli Manning                   818       1292   Patrick Ramsey                420       -223
Brian Brohm                   846       1290   Tim Couch                     445       -195
Joe Flacco                    305        732   Rex Grossman                  472       -124
Peyton Manning               1379       1784   David Carr                    275       -299
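The ordering of both lists follows from subtracting each quarterback's v1.0 projection from his v2.0 projection. A short Python check using the figures from the table above; the variable names are just for illustration.

```python
# (LCF v1.0, LCF v2.0) projections copied from the table above.
projections = {
    "Drew Brees": (835, 2190),
    "Matt Ryan": (473, 1403),
    "Philip Rivers": (1671, 2476),
    "Carson Palmer": (1198, 1973),
    "Kellen Clemens": (532, 1248),
    "Vince Young": (576, 1059),
    "Eli Manning": (818, 1292),
    "Brian Brohm": (846, 1290),
    "Joe Flacco": (305, 732),
    "Peyton Manning": (1379, 1784),
    "Charlie Frye": (1615, 117),
    "Alex Smith": (221, -782),
    "Trent Edwards": (223, -611),
    "Brodie Croyle": (97, -736),
    "Daunte Culpepper": (1396, 663),
    "Donovan McNabb": (1163, 472),
    "Patrick Ramsey": (420, -223),
    "Tim Couch": (445, -195),
    "Rex Grossman": (472, -124),
    "David Carr": (275, -299),
}

# Change in projected DYAR when moving from v1.0 to v2.0.
changes = {name: v2 - v1 for name, (v1, v2) in projections.items()}

risers = sorted(changes, key=changes.get, reverse=True)
fallers = sorted(changes, key=changes.get)

for name in risers[:3]:
    print(f"{name}: {changes[name]:+d}")
for name in fallers[:3]:
    print(f"{name}: {changes[name]:+d}")
```

Brees gains 1,355 DYAR in the new version, while Frye loses 1,498, the largest swings in either direction.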

Now, let's look at the projections for quarterbacks outside of our data set. First, we'll look at the projections for the quarterbacks chosen in rounds 1-3 of the past two drafts. The number listed is projected total DYAR for career years 3-5.

  • Colt McCoy: 2,092
  • Josh Freeman: 1,367
  • Sam Bradford: 1,345
  • Jimmy Clausen: 1,062
  • Tim Tebow: 925
  • Matthew Stafford: 714
  • Mark Sanchez: 151

As you might expect, LCF v2.0 loves Colt McCoy. So did LCF v1.0 -- although McCoy wouldn't have been considered by LCF v1.0 because he was a third-round pick. McCoy had 53 college games started with a career completion rate above 70 percent. The new boundaries added to try to limit the importance of outlier variables do tamp down the McCoy excitement slightly. (Not to mention that without those boundaries, Sanchez's projection would actually be negative.) Still, McCoy has the third-highest projection of any quarterback since 1997. Philip Rivers and Drew Brees are the only other quarterbacks projected above 2,000.

Five of these seven quarterbacks have significantly higher projections using the new version of LCF. Only Tebow and McCoy are lower with LCF v2.0, and the difference with Tebow is pretty small.

It's important to understand that LCF is meant to be a tool used alongside the scouting reports, not instead of the scouting reports. Sam Bradford was still the proper number one overall selection in the 2010 draft. What's important is not that his projection is lower than Colt McCoy's projection -- instead, what's important is that he has a very good projection, which should give the Rams confidence that their scouts got it right. We don't claim to believe that the Lewin Career Forecast is a foolproof way of figuring out which quarterback an NFL team should draft. This is an interesting regression analysis, not Moses bringing the tablets down from Sinai. Still, we think that LCF v2.0 is valuable as a crosscheck device and should be part of the conversation about quarterback draft prospects.

With that in mind, let's look at the projections for this year's quarterbacks.

Andy Dalton, TCU: 1,616 DYAR

Important stats: 48 games started, 61.7% completion rate, senior passer rating improved 14.7 points.

Dalton is LCF's favorite prospect for 2011. He's also a great example of where LCF might go wrong. Our own Doug Farrar did a good job of running down Dalton's problems in this post on Yahoo's Shutdown Corner blog. Dalton played in a college spread offense where routes were generally designed to clear out specific spots in the defense. Plays didn't include a lot of receiver progressions. He has problems with arm strength, particularly on those intermediate-length throws that an NFL quarterback has to stick into very tight windows. Still, both his pros and his cons sound a lot like the pros and cons of last year's LCF favorite, Colt McCoy -- and McCoy had a more successful rookie year in the NFL than anyone expected. Dalton is a good example of how the LCF doesn't tell you that a quarterback is definitely going to be a star. It tells you "if your scouts determine that Andy Dalton fits your offensive scheme despite his weaknesses, he is very unlikely to be a complete bust."

Ricky Stanzi, Iowa: 1,305 DYAR

Important stats: 35 games started, 59.8% completion rate, senior passer rating improved 26.0 points, 48 carries for -6 yards.

Stanzi gets an asterisk. I don't think he's going in the first three rounds. He's another guy scouts have to do their due diligence on. Still, he did improve a lot as a senior and could be a nice fourth- or fifth-round sleeper. Rushing numbers suggest he may take too many sacks.

Colin Kaepernick, Nevada: 1,044 DYAR

Important stats: 48 games started, 58.2% completion rate, .482 run-pass ratio, 26.8 BMI.

Kaepernick of course played in a somewhat "gimmicky" offense in college, and a lot of his value was based on his running ability. He didn't have the greatest completion rate across his entire college career, although he's been a four-year starter so there's a lot of film to break down here. He had a moderate improvement as a senior, 11.3 points of passer rating. I don't have much of an opinion on him past these numbers.

For those curious, 6-foot-4, 220 pounds is the same size as Eli Manning, Joey Harrington, Chris Simms, and Tim Couch.
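The BMI figures used in the formula follow directly from listed height and weight. A minimal sketch in Python, assuming the standard imperial BMI formula (703 times pounds, divided by inches squared); the function names are hypothetical. At 6-foot-4, it takes about 220 pounds to land on Kaepernick's 26.8.

```python
def bmi_imperial(height_in: float, weight_lb: float) -> float:
    """Body mass index from inches and pounds (standard 703 conversion factor)."""
    return 703.0 * weight_lb / height_in ** 2


def weight_for_bmi(height_in: float, bmi: float) -> float:
    """Weight in pounds that produces the given BMI at the given height."""
    return bmi * height_in ** 2 / 703.0


height = 6 * 12 + 4  # 6-foot-4, in inches

print(round(bmi_imperial(height, 220), 1))
print(round(weight_for_bmi(height, 26.8)))
print(round(weight_for_bmi(height, 28.0)))  # weight at the formula's 28.0 reference
```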

Blaine Gabbert, Missouri: 656 DYAR

Important stats: 26 games started, 60.9% completion rate.

Here is where maybe you get the sense that this isn't the best year for low-risk quarterback prospects. From Gabbert on down, every quarterback prospect for 2011 is lower than every quarterback prospect from 2009-2010 except for Mark Sanchez. Gabbert is a little low on games started, a little high in completion rate, and basically average on all the other variables in the system, so LCF v2.0 thinks he's going to be a very average quarterback. His projection is close to the average projection for all the players in the data set used to create LCF v2.0, which is 604. An average quarterback can be a very useful thing on the right team, but it is not something you want to get with a top ten pick.

Jake Locker, Washington: 569 DYAR

Important stats: 40 games started, 53.9% completion rate, senior passer rating dropped 5.5 points.

I just don't think Jake Locker is ever going to be accurate enough to be an above-average NFL quarterback.

Ryan Mallett, Arkansas: 471 DYAR

Important stats: 29 games started, 57.8% completion rate, 26.8 BMI, 44 carries for -74 yards.

Perhaps you have heard that Ryan Mallett has some mobility issues? In three years of college football, he has a total of 135 carries for -141 yards. There are a lot of sacks in there. Maybe you don't think the rushing yardage thing is a big deal, but here's the list of players in our data set with fewer than -50 rushing yards in their final college season: Brodie Croyle, Tim Couch, Chris Simms, Carson Palmer, Patrick Ramsey, Andrew Walter, Kyle Boller, and Rex Grossman. I count one successful quarterback out of eight. Mallett's downside is Dan McGwire. His upside is "What if Drew Bledsoe was kind of a dick."

Christian Ponder, Florida State: 413 DYAR

Important stats: 33 games started, 61.8% completion rate, senior passer rating dropped 12.0 points.

Maybe somebody reaches for him because so many teams have quarterback needs this year, but Ponder just seems to me like a classic third-round pick. How high is his ceiling, really? Isn't he basically just Drew Stanton? I would be scared of how his improvement stagnated in his senior year.

Cam Newton, Auburn: 175 DYAR

Important stats: 14 games started, 65.4% completion rate, 29.4 BMI, 0.94 run-pass ratio.

I thought Tim Tebow was the most unique prospect in recent times, but Cam Newton may have surpassed him. You get most of the same questions, but you take out the questions about throwing motion and replace them with questions about character and inexperience. Nobody doubts that Newton is an amazing athlete who was a supremely valuable college football player. In the NFL, he is a massive risk-reward candidate. I just happen to think that the risk is larger than the reward. I would not take him with the first overall pick in the draft unless a) there was absolutely no other player worth that top pick, and b) I knew for certain that the post-lockout CBA would include a rookie salary slotting system that would go into effect immediately.

Let's throw in one more guy, because I know some people will be curious.

Andrew Luck, Stanford: 1,604 DYAR

Important stats: 25 games started, 64.4% completion rate, 453 rushing yards with only 0.15 run-pass ratio.

This would be Andrew Luck's projection if he had come out after his sophomore year. If he puts up the same stats as a junior, he'll come out with the second-highest projection of any quarterback since 1997, behind only Philip Rivers.


123 comments, Last at 11 Apr 2012, 1:10am

#1 by bubqr // Apr 20, 2011 - 1:23pm

That was a very enjoyable read, that will sure get a lot of comments.

I'd hate to be the Panthers.

#98 by Kibbles // Apr 25, 2011 - 6:42pm

Why would you hate to be the Panthers? There are several players in this draft worth the #1 overall pick (Peterson, Dareus, possibly Von Miller). True, they're probably going to blow it and grab a QB, but if you were the Panthers you could just choose to make a much wiser selection.

#2 by Jon // Apr 20, 2011 - 1:29pm

who did BMI excessively flag outside of Russell? My first intuition is that his egregious failure was such an outlier that it could break the entire system due to not having a large enough sample to work with.

#18 by Scott C // Apr 20, 2011 - 3:00pm

Yes, you can't just use BMI to flag JR, since BMI isn't a proxy for lazy.

It probably helped the projection for others, such a variable for just one player is not something I'd expect to be done.

#3 by AnonymousA (not verified) // Apr 20, 2011 - 1:38pm

How can I tell this system was created through over-use of regression and under-use of statistical understanding?

Reason one: because it attempts to predict games started. What?

Seriously. WHAT?

Games started is affected by quarterback ability, other quarterbacks available to the team, quarterback health, quarterback compatibility with the team's system/personnel, and other external variables. The only one of these that could possibly be predicted by college information is quarterback ability. So yes, games started will correlate with college stats to some extent, but to attempt to predict it off them is a clear sign that when you have a linear-regression hammer, everything looks like a linearly-related nail.

Reason two: because it's full of cut-offs. If the data isn't linear, it isn't linear -- stop trying to make it linear by chopping out chunks of it. You can get away with this as an approximation technique, but the number of times it's done in Lewin 2.0 seems dangerously high.

Reason three: 20, 48, 28.0, 0.5, 0, 600. Those are the constants mentioned in LCF2. Where did these numbers come from? Careful study of the problem indicating that these were the best choices? Or data-fiddling and over-use of regression? As examples, the 600 looks like a "woah, people with more than this many yards are weird outliers. Chop those bastards out, and watch R-squared go up!" and 28.0 looks like "regress DYAR against BMI. Round off. Hey, jackpot!" Unless there's some reason heavy quarterbacks shouldn't be as good as one would expect from past success, this is purposeful over-fitting.

Prediction: LCF2.0 fails *even harder* than LCF1.0 because it is even more over-fitted.

#7 by Scott P. (not verified) // Apr 20, 2011 - 2:01pm

Where do they attempt to predict games started?

Otherwise, good points.

#26 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 3:35pm

600 kind of makes sense.

Since 1990, 8 times has a QB in the NFL rushed for 600 or more yards, and 4 of those times was Michael Vick. (Cunningham, McNabb, Culpepper, and McNair each did it once)

Indeed, you could probably cap it at 540 if the QB is white. (Gannon and Young each broke 500 once)

#83 by BigCheese // Apr 23, 2011 - 11:06pm

How can I tell this post was crafted through pre-existing bias against the subject and without any actual comprehension of same? Because he thinks that the Games started stats under the prospects (and presumably all those other stats there) are projections instead of, you know, their college data.

And with that "peculiar" reading I'd have far more problem with a projection for BMI...

Seriously, WTF?!?!

- Alvaro

#115 by Anonymous1234 (not verified) // Sep 04, 2011 - 1:43am

Football Outsiders is statistical analysis gone wrong -- in some cases, horribly wrong. Pretty unfortunate.

#4 by commissionerleaf // Apr 20, 2011 - 1:56pm


Games started is a proxy, not a variable. I mean that in a kind way, as a positive for the projection. If a guy can't convince Nick Saban, Pete Carroll, or Steve Spurrier that he's the best quarterback on a college team, on what planet is he going to be successful in the NFL? Games started shows the ability of the player to convince smart football people that he can play football, which is important predictively because (1) It shows that he has the tools college scouts and coaches look for, which are different but highly correlated to NFL skills; and (2) The player will have to do EXACTLY THE SAME THING to win an NFL starting job and accumulate DYAR.

I'm hopeful, but unfortunately this business is a crapshoot; a lot depends on the willingness of a player to work hard after they've earned $25 Million guaranteed for how well they played in college.

#33 by DisplacedPackerFan // Apr 20, 2011 - 4:45pm

Two more words.

Seventh Round

A team took a flyer on him. Teams do that with 7th round picks or undrafted free agents. Graham Harrell wasn't drafted, but he's on an NFL roster and you know it's possible he could end up being a serviceable NFL starter. Of course there are tons of guys taken that late that don't pan out. How did the careers of JaJuan Seider, Tim Rattay, Joe Hamilton, Josh Heupel, Seth Burford, and Ken Dorsey turn out?

Sometimes you get Matt Cassel or Bart Starr, he was taken 200th overall in the 17th round (less teams back then) which compares well to the 230th pick used on Cassel. Most of the time you get Joe Hamilton or Tim Rattay.

#82 by Lebo // Apr 23, 2011 - 7:25pm

Yeah, I get that.

My post was a reaction to the sentence: "If a guy can't convince Nick Saban, Pete Carroll, or Steve Spurrier that he's the best quarterback on a college team, on what planet is he going to be successful in the NFL?".

I agree that college starts are probably a useful indicator of NFL ability. But I think it's silly to imply that a non-starter in college could never have a successful NFL career when Matt Cassel presents such an obvious and current contradiction.

#42 by Jerry // Apr 20, 2011 - 6:06pm

It's also been suggested that a greater number of starts gives scouts more film with which they can make a more accurate determination.

#84 by BigCheese // Apr 23, 2011 - 11:26pm

Which would be a wonderful rebuttal if he were complaining about Games Started as a variable. He actually thinks Games Started is an output!

Apparently AnonymousA is so hell-bent on leveling criticism and so self-assured that he can't be bothered to actually read past the GS stat, or he's so blinded he thinks they're also projecting the BMI of several prospects, their run-pass ratio and, most impressive of all, Ricky Stanzi to have 48 carries for -6 yards, which would be quite the astonishing stat-line in the NFL, where sacks count as passing yards, not rushing...

- Alvaro

#5 by DSMok1 (not verified) // Apr 20, 2011 - 1:57pm

I would really like to see some out-of-sample validation runs from this regression. It certainly has the appearance of a potentially over-parameterized regression.

#20 by djanyreason // Apr 20, 2011 - 3:07pm


I'm shocked, shocked, to learn that adding variables and the last 5 years of data makes the projection better fit the data we've seen over the last 5 years.

#36 by Steven T. (not verified) // Apr 20, 2011 - 5:21pm

+1 to yours and the original comment. Lots and lots of effort, but probably of no use beyond being an exercise.

#89 by Mountain Time … // Apr 24, 2011 - 6:17pm

So he made a bad point in amongst an overall good argument. I think his other points stand; namely, that V2.0 has been arrived at via regression analysis rather than truly predictive statistics. I am not as convinced as he is that that makes it likely to fail, but I agree this is an experiment to see whether regression-based formulas like this turn out predictive or not.

I'd say the jury is still out on DVOA (though it's easily the best descriptive statistic out there), and I still think it was a grave mistake to switch from DPAR to DYAR.

#6 by cisforcookie (not verified) // Apr 20, 2011 - 2:01pm

I like trying to create a metric for ability to avoid sacks. This seems like an underrated ability even in the nfl today.

I also find interesting the idea of penalizing players who regress statistically as seniors. I wonder if it'd be possible to temper this with info on teammates lost between those two seasons though, such as losing a future nfl receiver or left tackle and having them replaced by a much worse player.

the BMI thing strikes me as purely descriptive.

Is the data set too small to adjust the games started variable based on the reasons why a player didn't start? ie injury. plus considering whether a player was kept from starting because of another star a year or two ahead of him? I almost feel like it's more interesting to consider how many games a player didn't start and why vs how many games they did start. of course, this seems like it probably runs into sample size issues.

#27 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 3:41pm

I think the BMI metric is crap.

Starting NFL QBs have been right around a mean of 28 since 2000 with a standard deviation of about 1.4, (Meaning both Newton and Kaepernick are within a SD of the mean) but that's only true back to 2000. For a data set which goes back to 1990, though, those baselines are no longer true. Starters in 1990 averaged a BMI of 26.7 with a SD of 1.2 -- meaning a baseline QB (BMI of 28) would have been considered obese within the timeline of the background data. (from 1990-2000, QBs went from 26.7 to 28, and have been flat at 28 since)

#38 by Steven T. (not verified) // Apr 20, 2011 - 5:37pm

BMI in general is crap. It can't tell if you're totally ripped or have a huge beer belly, it thinks Michael Jordan was overweight and that Andre Agassi in his prime was within 10 pounds of being so. The only reason it lives on is it provides a measurement device to people who really want to have one--even one that's horribly inaccurate.

#8 by Aaron Schatz // Apr 20, 2011 - 2:18pm

Let me point out that a lot of the complaints about the statistical methods here are accurate, but there's really not much we can do about them. We're stuck with a small sample size. When you are stuck with a small sample size, you have to accept imperfection. If we tried to use out-of-sample runs to validate the formula, there wouldn't be enough runs left to create the formula. If we didn't try to correct for outliers, we would end up with some nutty-looking results. So we do what we can, and we say things like "this is not a guarantee" and "we are not perfect."

#51 by JonFrum // Apr 20, 2011 - 7:23pm

Classic statistics headslap. When you're stuck with a sample size that's too small, you stop what you're doing. There is no correction for a fundamental flaw.

If you wanted to know whether a particular variable predicted success in the NFL, you could test it. If you go on a data mining expedition to find what multiple variables you can feed into a model, you're going to come up with spurious correlations, regardless of sample size. And the more variables you add in an attempt to 'improve' your model, the worse it gets. And every revision you attempt is just creating a 'chasing your tail' problem.

If you want to say "We do this for fun - so don't worry about it," then fine. Just don't pretend you're doing something serious here.
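The data-mining worry can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the sample size (40 players) and the number of candidate variables (200) are hypothetical, and every variable is pure random noise, so any correlation it finds is spurious by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players = 40        # hypothetical sample size
n_candidates = 200    # hypothetical number of variables tried

# The outcome and every candidate predictor are independent noise.
outcome = rng.normal(size=n_players)
candidates = rng.normal(size=(n_candidates, n_players))

def abs_corr(x, y):
    """Absolute Pearson correlation between two 1-D arrays."""
    return abs(np.corrcoef(x, y)[0, 1])

correlations = np.array([abs_corr(c, outcome) for c in candidates])
best_noise_corr = correlations.max()
median_noise_corr = np.median(correlations)

print(f"median |r| among noise variables: {median_noise_corr:.2f}")
print(f"best   |r| among noise variables: {best_noise_corr:.2f}")
```

The typical noise variable shows almost no relationship with the outcome, but the best of 200 looks like a respectable predictor. That is the risk of picking variables after searching through many of them.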

Points: 0

#54 by cfn_ms // Apr 20, 2011 - 7:33pm

The whole article basically said "we sure hope these numbers are good, but we really don't know."

To a certain extent, every revision is a chasing tail problem, but they also have multiple more years of data to work with. Slowly but surely the sample size should be approaching the point where you can take it seriously, at least in part.

Also, they've got seven regression variables. Considering that they're looking at better than 10 QB's per year, and have a sample size of at least a decade, I'm inclined to think that overfitting isn't THAT big of a deal. It probably rears its ugly head in terms of variable selection, and maybe that's material, but I would think that at this point it should at least be slightly useful.

And given that there really isn't any other tool out there which does this type of analysis (or at least I don't know of any), I think it's reasonable to allow them to potentially reach a bit in terms of putting something out there that may not have the data it really ought to have.

Points: 0

#55 by jmaron // Apr 20, 2011 - 7:49pm

reminds me of Homer Simpson's advice to Bart when Bart failed at something

"What did you learn son - never try"

You're Homer Simpson with a little bit of education.

Points: 0

#56 by Thomas_beardown // Apr 20, 2011 - 7:54pm

I really thought this whole site was founded on "we're doing this for fun." Aaron isn't trying to get published in scientific journals or use his freely available information to get hired by an NFL team. He's just trying to understand a game he enjoys better.

Points: 0

#90 by Mountain Time … // Apr 24, 2011 - 6:28pm

"And the more variables you add in an attempt to 'improve' your model, the worse it gets"

Prediction: future LCFs will improve as they shed variables that turn out to be unpredictive (such as BMI, perhaps?).

Points: 0

#9 by MilkmanDanimal // Apr 20, 2011 - 2:19pm

Mallett's downside is Dan McGwire. His upside is "What if Drew Bledsoe was kind of a dick."

Heh heh heh heh heh . . .

Points: 0

#48 by JasonK // Apr 20, 2011 - 7:05pm

It's a great line, but isn't "Kerry Collins" the answer to "What if Drew Bledsoe was kind of a dick?"

Points: 0

#79 by Dales // Apr 23, 2011 - 4:20pm

It's colorful, no doubt.

Completely unfair to Mallett, unless there is some big 'dick' story that they managed to completely black out from Fayetteville.

Points: 0

#96 by Kevin from Philly // Apr 25, 2011 - 3:18pm

If I remember correctly, Bledsoe did a stage dive into a mosh pit full of slacker-sized teenagers circa 1998. Considering he was 240+ lbs jumping on 150 lb kids, I think that qualifies as "kind of a dick".

Points: 0

#102 by justanothersteve // Apr 26, 2011 - 1:52pm

I admit it's been over 20 years since I did a mosh pit. But the few I did had anything from 110# girls to guys weighing 250#. You knew the rules when you went in. I can't criticize anyone for stage-diving into a mosh pit, even someone the size of Meat Loaf. And just because you dive, doesn't mean they'll catch you.

Points: 0

#103 by Dean // Apr 26, 2011 - 2:46pm

I was the little kid in the pit in the 80s and I'm the old man in the pit now. The rules haven't changed any. About the only difference is nowadays you get douchebags who think the pit is somehow a place to impress people with their karate skills. You knock them on their ass a couple times and they get the hint - then they usually get the hell out.

Most places banned stage diving in the 90s just because it was such an insurance nightmare. Hell, in a lot of clubs you can't even crowd surf anymore. Of course, these days, they'd just drop me, but like you said, thems the breaks.

Points: 0

#10 by Will S // Apr 20, 2011 - 2:27pm

Excellent article. The changes make sense to me, and the paragraph putting it in proper context in regards to scouting reports and game film seems like the type of thing Phil Simms skims over in these kinds of articles. Also that last line about Ryan Mallett cracked me up. Do you think him standing up the Panthers today because he was "out late on the town" is a sign of his immaturity or a calculated move showing he's willing to have his draft stock take a hit to avoid going to Carolina?

Points: 0

#15 by Randy Hedberg (not verified) // Apr 20, 2011 - 2:53pm

It has to be immaturity. Either that, or he's an egomaniac who thought he could possibly be considered with the #1 pick (since Carolina has no second round pick).

Points: 0

#28 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 3:44pm

QBs intentionally pissing off the team with the #1 draft pick has a perfect correlation with Super Bowl wins. (Eli Manning and John Elway)

Points: 0

#49 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 7:12pm

The correlation was this:

If a QB pisses off the team with the #1 pick, he will win a Super Bowl. Elway did win two of them.

Points: 0

#53 by JonFrum // Apr 20, 2011 - 7:25pm

No - Terrell Davis won two. Elway the three time loser rode on his back.

Points: 0

#76 by JoRo // Apr 23, 2011 - 1:25pm

Pretty sure he threw a pass or two, and I would imagine Davis would have had a hard time gashing the Packers carrying a 220lb man on his back the whole time.

Points: 0

#11 by andrew // Apr 20, 2011 - 2:33pm

The two "bottom 10" charts have the same heading.

I assume the first one is supposed to read v1.0

Points: 0

#12 by andrew // Apr 20, 2011 - 2:34pm

How do we know whether a QB sucks because of a high BMI, or whether he has a high BMI because he sucks?

Points: 0

#17 by Randy Hedberg (not verified) // Apr 20, 2011 - 2:55pm

Unless he sucks down gravy, I'm guessing that any causation would go in the former direction.

Points: 0

#46 by Mr Shush // Apr 20, 2011 - 6:40pm

Jamarcus eats because he's crappy. He's crappy because he eats. It's a vicious circle.

Points: 0

#14 by cfn_ms // Apr 20, 2011 - 2:43pm

not sure if they'd be useful, but maybe they would:

1) Interceptions per attempt. I would think that this would tell you something, maybe even more than pure passer rating, since picks are death to a QB's future.

2) Difference between own team's rating (using F/+) and average opponent's rating. If this is really skewed, then it's probably a good indication the QB's stats are massively padded by easy opponents (I suspect that if this is material, it'd hurt Dalton in LCF v3.0). And if it's negative, then it's an indication the QB would have had better stats if he hadn't been on such a crappy team (good example: Jay Cutler at Vandy)

Points: 0

#19 by Aaron Schatz // Apr 20, 2011 - 3:02pm

I would like to use some of Bill Connelly and Brian Fremeau's stats, but unfortunately we don't have them for before 2005.

Points: 0

#22 by cfn_ms // Apr 20, 2011 - 3:15pm

You could always use Sagarin as a starting point to see if that would be relevant. He's got numbers going back well over a decade:
etc. (I think '98 was the earliest he's got published)

Obviously Sagarin doesn't have the same level of detail F/+ does, but I would think it's got enough years of data that it should at least be able to tell you whether some of the high-level numbers would be useful (schedule strength most obviously, but perhaps other items I haven't thought of as well). You could then eventually transition to F/+ numbers when they have enough years of data.

PS I recall McCoy doing not much of anything in his senior year the three times he played good defenses (Oklahoma, Nebraska, Bama). That made me think he'd be a bust. Still skeptical of him due to that, and would be quite curious to see if there's a way to isolate how well QB's do against better defenses and see if it has predictive value or if I'm overreacting.

Points: 0

#29 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 3:46pm

McCoy played a single series against Alabama before getting hurt. Texas scored on that series.

Points: 0

#32 by cfn_ms // Apr 20, 2011 - 4:10pm

Completion rate of about 55% if you add the three games together. 1 TD vs 4 INT's. ~ 150 yards / game throwing even if you ignore the Bama game. On the "Texas scored" drive, he was 2/2 for 9 yards.

Seriously, those are BAD numbers for a senior QB who's supposed to be a good prospect against quality defenses.

Points: 0

#43 by sjt (not verified) // Apr 20, 2011 - 6:12pm

Of course he struggled in those games. He's a QB, and whenever a QB spends half the game on his ass his numbers will suffer.

Against Nebraska, the top defense in the country that year, he was sacked 9 times. 4.5 of those came from Suh, and he had only 7.5 sacks during the previous 12 games. Suh also had 7 TFL in that one game, so clearly the Texas O-line was whipped from the start (as anyone who watched the game saw).

Against Oklahoma it was the same story. McCoy was running for his life most of the game. I can't find the stat numbers, but he ran 14 times for 33 yards, and 23 of those came on 1 play.

This story is also true for Bradford, who was very similar in a lot of ways to McCoy in terms of production. His biggest struggles came in games against top defenses where he was sacked and hit a lot (Orange Bowl vs. Florida, 2009 vs. Texas). Same is true for any QB.

Points: 0

#45 by cfn_ms // Apr 20, 2011 - 6:24pm

Except that plenty of other guys who weren't anywhere near the top of the draft board at least had better games against Oklahoma or Nebraska (not counting Bama since McCoy threw twice... but it's not like they constantly knocked out the other guy).

Max Hall:
Jacory Harris:
Taylor Potts:
Tyler Hansen:

And yes, I know I'm cherry-picking, and that plenty of others did worse than McCoy. The point is that it's not like these were defenses where it was impossible to do well against them. They COULD be beaten on the right day by the right guy, even when that guy really wasn't that great. On two (three if you count Bama) separate occasions, McCoy wasn't that guy.

Points: 0

#59 by sjt (not verified) // Apr 20, 2011 - 8:32pm

I didn't realize QB's are responsible for the fact that their O-lines couldn't block that day to save their lives. Let me repeat: 9 sacks. In one game. 4.5 by one guy, who also had 7 TFL in that same freaking game. Maybe some of that is on the QB, but that Nebraska defense was playing at a freakish level by that point in the season.

Even looking through all those games you cited, the QB play isn't always great. Hall and Harris both had 2 picks against Oklahoma. Hansen was 21-44 with 3 picks, mostly playing catch up. The only standouts were the Texas Tech guys, and they were apparently interchangeable within their system.

All this is assuming that the QB is the only guy who matters in this formula, as if there aren't 21 (or more accurately, 30, 40, even 50) other guys who affect how the game turns out. Matchups matter.

"They COULD be beaten on the right day by the right guy, even when that guy really wasn't that great. On two (three if you count Bama) separate occasions, McCoy wasn't that guy."

They were beaten by McCoy. 16-13, 13-12.

Points: 0

#61 by cfn_ms // Apr 20, 2011 - 8:56pm

"They were beaten by McCoy. 16-13, 13-12."
No, they were beaten by Texas. Mainly by Texas's defense. Unless you want to argue that Trent Dilfer won the Super Bowl in any kind of meaningful way.

"All this is assuming that the QB is the only guy who matters in this formula, as if there aren't 21 (or more accurately, 30, 40, even 50) other guys who affect how the game turns out. Matchups matter."

And in most of those cases, Texas (and their OL) were better, maybe even substantially better, than what you'd find on the teams that actually had success through the air.


All that said, this really isn't something I care a whole lot about. My point was that McCoy struggled against good D's (which he did), and that suggested to me that he'd have issues at the next level. I'm curious if the FO guys will research into this to see whether or not there's merit to that type of argument, or if it actually doesn't much matter.

Points: 0

#62 by sjt (not verified) // Apr 20, 2011 - 9:12pm

"No, they were beaten by Texas."

So when McCoy's numbers suffer, it's totally on him and has nothing to do with his O-line, backs, receivers, coaches, or anything. But when they win, it's the team. Got it.

"And in most of those cases, Texas (and their OL) were better, maybe even substantially better, than what you'd find on the teams that actually had success through the air."

Except that they clearly weren't, especially their O-line. When you get your ass whipped by one guy for 11.5 negative plays in a single game, you aren't better than that guy! When you give up 9 sacks in a game, your O-line is not better than the opposing defensive front! That's what "matchup" means. It doesn't matter if you can block 9 guys out of 10; if you can't block the 10th, then you aren't better than him. And just because some other guy couldn't block the other 9 doesn't mean he can't block the 10th. Football isn't transitive.

Points: 0

#67 by cfn_ms // Apr 20, 2011 - 11:12pm

I have to wonder if you're intentionally misreading me at this point. What I was saying was that Texas's O-Line (and rest of offensive talent) was most likely better than the O-Line and overall offensive talent for BYU, Miami, Texas Tech, and Colorado. So multiple QB's with lesser abilities and lesser supporting cast did less crappy than McCoy against Oklahoma and Nebraska. That's fairly troubling IMO.

If those defenses were ALWAYS dominant, then McCoy's struggles are more explainable, b/c it's reasonable to conclude the defenses were just too good for anyone to handle. Since that wasn't the case, then that potential saving explanation goes out the window.

wrt your other post, my point was that McCoy specifically struggled against good defenses, NOT that there happened to be random games he struggled. When you put up great numbers against mediocre to crappy defenses, and get shut down by good defenses, that suggests that you might have a problem at the next level, when ALL of the defenses are outstanding compared to what you see in even the best college defenses. When you randomly have crappy games from time to time, that just means you're inconsistent.

Of course, it could be that McCoy was just inconsistent, and that he simply happened to have his off games the times when he was facing quality defenses, but that seems a reach to me. IMO there was a fairly clear pattern, and it was troubling.

PS below is a box score more representative of what you'd want to see from a quality QB against one of the tougher defenses he faces. Obviously the numbers aren't as good as normal, but it's still a quality showing that doesn't come up as a red flag.

Points: 0

#69 by sjt (not verified) // Apr 21, 2011 - 12:07am

"I have to wonder if you're intentionally misreading me at this point."

And I have to wonder if you think we are talking about baseball instead of football. Sample sizes here are tiny, the intervening variables are too many to count, and outliers happen. And most importantly: QBs cannot operate while planted in the turf.

I'm reminded here of the 2007 Patriots. While they were on their tear, going 18-0 and smashing every record in the books, Tom Brady rarely got his jersey dirty. Then in the AFCCG against the Chargers, who were playing good defense at the time, he struggled, throwing 3 picks and getting sacked 5 times. Clearly a down game compared to his season to that point. Of course things got even worse in the Super Bowl, where the Giants had enough pass rush to hold his offense down and win. Does this make Tom Brady a lesser QB, or does it remind us that matchups matter far more than any individual talent in any given game?

"What I was saying was that Texas's O-Line (and rest of offensive talent) was most likely better than the O-Line and overall offensive talent for BYU, Miami, Texas Tech, and Colorado"

And yet the results show that wasn't the case. If the Texas O-line gives up 9 sacks to Nebraska on its way to scoring 13 points, and the Texas Tech o-line does a lot better on its way to scoring 31, clearly the Tech offense was a better matchup for the Nebraska defense. Maybe on paper Texas looks better. Maybe they had more players drafted or they were better suited to play most teams, but clearly Tech had something going that Texas didn't against that particular defense.

"So multiple QB's with lesser abilities and lesser supporting cast did less crappy than McCoy against Oklahoma and Nebraska"

A few QBs did somewhat better in a few games, though none did great. BYU scored fewer points than Texas did against Oklahoma. Miami scored a few more. Both QBs threw multiple picks. As for Nebraska, the only team which really lit them up in a meaningful way was Tech. I'd posit that it's possible, just possible, that the Tech offense was just a better matchup for Nebraska than the Texas offense. Maybe it was the line, maybe it was the receivers or the QB or the play calling, but clearly on that given day Tech had Nebraska's number. I submit that if you substitute Potts for McCoy in that title game he also would have been dumped on his ass all game and his numbers would have dropped.

"If those defenses were ALWAYS dominant, then McCoy's struggles are more explainable, b/c it's reasonable to conclude the defenses were just too good for anyone to handle."

Nebraska led the nation in defensive PPG that year, was #2 in sacks, and was led by one of the most dominant college defenders of all time. Lots of teams struggled against them, especially later in the year when they were playing really well. And here's the important point: Texas couldn't block them. How much of that do we put on McCoy?

I also find it funny that you're so obsessed with McCoy and his "warning flags". Dude was a 3rd round pick. For the Browns. He's playing with house money at this point.

Points: 0

#63 by sjt (not verified) // Apr 20, 2011 - 9:25pm

And if we are gonna cherry pick stuff, we can certainly go back and do it for other players in this model. I bet we can find the occasional bad performances by pretty much every QB listed here, regardless of what the model says or what his actual numbers have been.

Case in point: Philip Rivers. As a senior he laid an egg against Maryland (a good team that year which was ranked). 16-30 (53%) with 0 TDs and a pick, in a game his team lost. That same Maryland team lost to Northern Illinois (for real) and got hammered by an FSU team led by Chris Rix. The NIU QB hit 60% of his passes for 266 and 2 scores. Chris Rix outplayed Philip Rivers against Maryland (not by much, but he did). What does this tell us about Philip Rivers? Does the whole model get thrown out?

Points: 0

#21 by Karl Cuba // Apr 20, 2011 - 3:09pm

With regards to Ponder, should any accommodation be made for his decline possibly being due to his injured throwing arm? All reports from the medicals at the combine had him passing with flying colours. (Personally, his lightning release and ball placement remind me of a qb with a supermodel wife)

Points: 0

#30 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 3:48pm

"(Personally, his lightning release and ball placement remind me of a qb with a supermodel wife)"

Nah, his hair is too good to be Jeff Garcia.

Points: 0

#40 by Steven T. (not verified) // Apr 20, 2011 - 5:45pm

LOL. I'd give that a +10 if I could. Awesome.

And is he comparing Ponder to the current Hall of Fame lock or the guy who lasted until the 6th round because no NFL team thought that much of him?

Points: 0

#99 by bravehoptoad // Apr 25, 2011 - 6:53pm

This is where the scouting reports acquire extra importance, when there might be mitigating factors for poor performance.

SackSEER on Robert Quinn is another example.

Points: 0

#23 by Dean // Apr 20, 2011 - 3:16pm

"It's important to understand that LCF is meant to be a tool used alongside the scouting reports, not instead of the scouting reports. Sam Bradford was still the proper number one overall selection in the 2010 draft. "

I bet Phil Simms ignores this part of the essay.

Points: 0

#24 by Aaron Brooks' … (not verified) // Apr 20, 2011 - 3:29pm

"(The data set has only three of these players: Josh McCown, Tarvaris Jackson, and Joe Flacco.)"

You mentioned Steve McNair in the intro. Alcorn is Div-IAA.

Points: 0

#37 by speedegg // Apr 20, 2011 - 5:22pm

Awesome stuff! I'd really like to see how Campbell does in Oakland and Quinn does in Denver in the next couple of years. I thought those guys got off to a bad start with bad teams...just hope they didn't go from bad team to worse.

For the guys that complain about over use of regression and under use of stats, maybe think about LCF v2.0 as the alternate hypothesis Ha (H sub a) to the scouts' take, which is the null hypothesis or H0 (H sub 0), with the DYAR as the P-value. Does the P-value say we should accept or reject H0? Do we reject the null hypothesis that Newton and Gabbert are legitimate top 10 draft picks/quarterbacks? P-value/DYAR says yes, Cosell's tweets say maybe.

Also, assume an alpha (significance level) of 5%, and remember there can be a Type I or Type II error (false positive or false negative) even when all the numbers are right. Stuff isn't going to be perfect, but it's a lot better than most. Besides, numbers don't factor in intangibles like off-the-field issues (work ethic or lack of, slurping gravy, etc.).

Points: 0

#41 by Steven T. (not verified) // Apr 20, 2011 - 5:58pm

I can see where your "bad start with bad team" theory comes into play with Quinn who struggled from Day 1 and couldn't get onto the field, but it makes very little sense in Campbell's case given he had extensive starts and enjoyed a fair amount of success. He started 52 games for Washington, has thrown more TDs than INTs every year he's been in the league, has a career completion percentage over 60 and a quarterback rating of 82.6. His stats make him look like a Hall of Famer compared to Quinn. That all helps explain why Campbell is going to get a chance to start again and why Quinn will never see the field unless somebody gets hurt. His chances of being cut by Denver are far higher than him ever starting there.

Points: 0

#44 by sundown (not verified) // Apr 20, 2011 - 6:24pm

Quinn is done and has been ever since he failed last season to do anything with Denver. Really not that much different than what he did in Cleveland--opportunity was there waiting, but he couldn't seize it. McDaniels wasn't in love with Orton and would have been happy to have seen Quinn become the starter, but he did absolutely nothing. By the end of the season he'd fallen behind Tebow on the depth chart and based on what they looked like in the preseason, he probably should have been behind Tebow for the entire year.

Points: 0

#31 by Olbermann is a… (not verified) // Apr 20, 2011 - 4:02pm

What were the p-values of each of the variables?

Points: 0

#35 by speedegg // Apr 20, 2011 - 5:20pm

Sorry, no P-values. I was just drawing a parallel between high DYAR and high P-value for a one-sided test. Higher the DYAR projection for a QB is like a high P-value for a one-sided alternative hypothesis.

For Dalton his projected DYAR is impressive, but for any statistical hypothesis there is a chance it's wrong and good scouting comes into play. His lack of arm strength is a concern and a big question is how much stronger can he get? Scouts don't know, yet.

Points: 0

#34 by Jim C. (not verified) // Apr 20, 2011 - 5:03pm

Of course I remember Matt Blundin. In addition to quarterbacking at UVA, he was a useful bruising power forward for an Elite Eight basketball team.

Now, do I remember his NFL career? No.

Points: 0

#47 by Mr Shush // Apr 20, 2011 - 6:49pm

I'd be very interested to know what variables you tried and rejected. Attempts, or attempts/start? Draft position? Some combination of draft position and college starts (if we think that starts are important in part because they indicate that scouts had enough tape to get it right, then surely more starts is a much better thing for a #1 overall pick than a 3rd rounder?)

Points: 0

#52 by jmaron // Apr 20, 2011 - 7:25pm

Let me just start off saying it's so nice to get all this interesting information for free. It really is kind of wonderful.

As a Viking fan I hope they don't waste a 1st or 2nd on a QB but I fear they will. I've thought this crop of prospects looked pretty weak, this article adds to that belief.

Maybe the Vikes can suck large and then they get Andrew Luck next year. I'd sure as hell take 1-15 if it meant a shot at him.

Points: 0

#57 by Drunkmonkey // Apr 20, 2011 - 7:56pm

I'm pretty sure that the Panthers were taking 2-14 so well last year around week 14, because they thought they were getting Luck. Just remember, Luck still has another year of school left after this one. So if he doesn't realize by then that no matter when he comes out he's still going to a crappy team, he may stay for his senior year. Just saying...

Points: 0

#91 by Mountain Time … // Apr 24, 2011 - 6:45pm

The Colts were crappy when they drafted Peyton Manning. The Chargers were crappy when they traded for Philip Rivers. Good players make their teams good.

Points: 0

#58 by jmaron // Apr 20, 2011 - 7:56pm

I wonder what would happen if you grouped QB's with similar LCF ratings and then looked at the records of the management teams that drafted them?

Maybe there's nothing to it, but if Matt Cassel is drafted by Detroit under Millen instead of NE...I'm guessing we've all forgotten who Matt Cassel is.

Points: 0

#60 by morganja // Apr 20, 2011 - 8:36pm

There is no way the Panthers are really considering Cam Newton with the 1st, is there?

Points: 0

#66 by Joseph // Apr 20, 2011 - 10:39pm

From what Pat Yasinskas of ESPN's NFC South blog says, it's the most likely scenario.

[Now--before you knock ESPN's divisional bloggers--he started out as a Panthers' beat writer for newspaper--I think the Charlotte Observer, if I remember his bio info correctly. So I'm betting he has some good inside sources via ex-co-workers. And, if those reporter sources are inside the team, then they probably aren't speaking for the record, and then those reporters honor that and pass the word to PY that "well-placed source" is indicating "thus and such."]

Since I keep up with the NFCS blog as a Saints fan, he has slowly changed his opinion in the last +/- 6 weeks. Considering that Brees, Ryan, and Freeman are the other 3 QB's in the division, he says that the Panthers HAVE TO take Newton and hope he pans out. Because if not, then 2011 is a repeat of 2010--good talent on the team, horrible hole of suckitude at QB that dooms the entire team to double-digit losses. I mean, let's face it--are Panthers' fans going to be excited about Jimmy Clausen being the starting QB for the next 16 games???? Doesn't Cam Newton--even learning for part of the season on the bench--at least get your hopes up that there MIGHT be a decent NFL QB on the roster????

Points: 0

#65 by Anon (not verified) // Apr 20, 2011 - 10:09pm

Please please please tell us how McElroy rates.

Seems like LCF would like him a lot.

Points: 0

#68 by CraigB (not verified) // Apr 20, 2011 - 11:37pm

Correct me if I'm wrong, but shouldn't you use McNabb's BMI when he came out of Syracuse rather than his current BMI? He was a skinny thing, if I remember correctly. He might even have been an outlier in the other direction!

Points: 0

#73 by Shattenjager // Apr 23, 2011 - 11:51am

I did a little looking because I thought I remembered McNabb being big at draft time. I found a couple of articles about him from 1999 that say he was 6'2", 223 lbs. That would be a BMI of 28.6. However, neither says where they got these measurements.
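A quick sketch of the BMI arithmetic for the measurements quoted above. It uses the standard imperial approximation (703 times pounds, divided by inches squared); the function name is just illustrative.

```python
def bmi_imperial(height_in, weight_lb):
    """Body mass index from height in inches and weight in pounds."""
    return 703.0 * weight_lb / height_in ** 2

# 6'2", 223 lbs, as reported in the 1999 articles mentioned above.
mcnabb_height_in = 6 * 12 + 2
mcnabb_bmi = bmi_imperial(mcnabb_height_in, 223)

print(f"BMI: {mcnabb_bmi:.1f}")  # prints "BMI: 28.6"
```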

Points: 0

#74 by Aaron Schatz // Apr 23, 2011 - 12:45pm

Good point. One of the things we have a problem with in our databases is the changing sizes of players. I need to figure out the best way to account for that. It's also hard to get "past sizes" for players -- if you look them up, you just get current size.

Points: 0

#75 by AlanSP // Apr 23, 2011 - 1:23pm

I would look for old combine measurements, which should be available for basically all of the round 1-3 QBs (even the guys who don't work out still get measured). If I recall, NFLDraftScout has them going back to 1998.

The advantage of the combine is that they actually, you know, measure the guy. It drives me nuts when I hear things like "He's listed at 6-4, but scouts think he's more like 6-2." Weight can change, but almost everyone in the draft is at an age where he isn't going to get taller or shorter.

Points: 0

#77 by Shattenjager // Apr 23, 2011 - 2:36pm

That's why I was concerned about the fact that the stories I found about McNabb didn't say where they got the info.

I did check NFLDraftScout and they have his from both the combine and his pro day as 6'2" and 223 lbs.

Points: 0

#70 by Another Sean_C // Apr 21, 2011 - 1:12am

Cool stuff. But the most remarkable thing about Lewin 1.0 (to me, at least) was that it relied on just the 2 variables - each of them a simple, easy to comprehend stat. Maybe it was a bit like catching lightning in a bottle, though.

The fact that Cutler, Pennington, Campbell, Quinn, and Palmer are on the LCF 2.0 top ten will have a few people wrinkling their noses though. I could swear I've seen each of these names on at least one "Top 10 Draft Bust" list in the mainstream press. The NFL really is a league of "what have you done for me lately".

Points: 0

#80 by Dales // Apr 23, 2011 - 4:26pm

I think the only reason Pennington gets mentioned as a bust by anyone is that he was drafted the same year as Tom Brady. Pennington had a solid career, and likely would have had a great one if not for his injuries.

Palmer has also been solid, and was on the great path until getting injured versus the Steelers.

Too soon to give up on Campbell or Cutler.

Quinn, on the other hand...

Points: 0

#85 by BigCheese // Apr 23, 2011 - 11:53pm

Any "Top 10 Draft Busts" list that has Cutler, Pennington, Palmer, and even Campbell in it must be something incredibly narrow like "Top 10 QBs taken in the last 6 years by teams that have a hideous throwback jersey Draft Busts"

- Alvaro

Points: 0

#71 by AlanSP // Apr 21, 2011 - 1:36am

As others have noted, it seems like there are some serious issues with over-fitting here. I realize it's not a huge data set, but that's actually even more of a reason not to introduce so many variables; you run a greater risk of including factors that are really just noise.

(Note: the following is copied from my post in response to the XP linking to the ESPN article, since this seems to be the thread where most of the discussion's happening, and the point is relevant here)

"I don't think the issue with the original model is one of over-fitting so much as a disconnect between the positive and negative predictive value of the model.

That is, players with a good projection may or may not turn out to be good players, but players with a lousy projection are nearly always lousy. Intuitively, this makes sense, at least when talking about completion percentage. There are several factors that can inflate a QB's completion percentage in college, most notably the type of system he plays in, but there are few factors that can really depress it (basically a really lousy supporting cast a la Cutler at Vanderbilt).

So a high completion percentage can mean a number of things, but a low completion percentage usually means the guy sucks.

Notably, this holds true for the mid-late round guys as well as the early ones (David Garrard being the only real exception I'm aware of, and not exactly one to write home about at that). Completion percentage can't tell you that Tom Brady's going to be Tom Brady, but it can damn sure tell you that Spergon Wynn is going to suck.

Because the model uses linear regression (as far as I'm aware), it can't account for a better fit at one end than at the other, and I think that this is something that you should at least try to address."

I like the fact that you're treating completion percentage as a logarithmic variable. I was actually going to suggest this as a way to help account for the fact that a very low completion percentage has more predictive value than a very high one. I think at least that aspect is a step in the right direction.

Points: 0

#81 by Anonymouse 2 (not verified) // Apr 23, 2011 - 4:52pm

I disagree. To model complex processes, you need variables. In this case, I'd say we need more variables since the previous model only had two. For other processes, like a fermentation process (of beer or interferon), I want to find what variables are most predictive for yield. If I only looked at pH and temperature I'd find my model too inaccurate. I wouldn't take away variables, I'd add variables to see if it models the process accurately and increases R^2. I'd probably find that I need to include conductivity, time, nutrient flow, and mixer speed to get accurate results.

In a few years LCF will have a bigger n, so over fitting or small sample size won't be an issue.

Points: 0

#86 by AlanSP // Apr 24, 2011 - 12:44am

This isn't really analogous to a fermentation process. For one thing, we have no experimental control whatever. We can't, for instance, ask "what happens if we lower the completion percentage?" the way that we could ask "what happens if we lower the temperature?". You simply have to take what the draft gives you in that regard.

There's also something to be said for a) parsimony, and b) statistical significance. A model with fewer factors and similar explanatory power is more parsimonious. As far as significance, I would be stunned if all of these factors are independently significant predictors.

You can always increase the R^2 by adding in extra variables. It's actually impossible to decrease it. You could add eye color as a variable and it would increase the R^2 by some amount. The question is whether the variable tells you anything meaningful beyond the variables already in your model, and whether it's something that can be reliably used for prediction.

You'd get a nice little increase in your R^2 if you included "has a first name that starts with Peyt" as a binary variable in your model, but that's pretty stupid for obvious reasons. I worry that some of these variables could effectively do something along the same lines, but less obviously. The "is a big fatty" variable applies to fairly few guys in the sample, and one of them is JaMarcus Russell. Russell was indeed a big fatty and did indeed suck, but that doesn't mean the two are related. There are too few people that meet that criterion to say anything about it with any confidence.

The sample size isn't going to dramatically increase over the next few years, at least not if they intend to keep it as QBs taken in the first 3 rounds.

Points: 0

#88 by Anonymouse 2 (not verified) // Apr 24, 2011 - 6:05pm

Yeah, tell me about no control group. When you're doing something new, sometimes you substitute "this is what we did for the first five consecutive runs" or something along those lines. It gets difficult when you already have an established process, but need to make it more efficient without violating the license. In cases like that you just have to use previous runs as a baseline. I sympathize with LCF 2.0's control group.

As for not decreasing r^2 are you referring to not making r^2 negative or making it lower? Sometimes the r^2 just "hovers" around 0, so you get a weak correlation. That gets frustrating. What gets more frustrating is trying to find the best combination of different variables at different settings (increase temperature, lower pH, and decrease mixing speed?).

Points: 0

#100 by AlanSP // Apr 25, 2011 - 8:43pm

I was referring to not decreasing the R^2 of the model if you add in an extra variable. Regression's designed to maximize the R^2 given the variables you put in, and a variable can always be given a weight of 0, so it's impossible to add a variable to the model and have it decrease the R^2 (although many will have a negligible impact). This means that if you try enough different variables, you can make your R^2 very high because you're bound to hit on some spurious correlations.
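The point about R^2 never going down can be checked with a small simulation. The sketch below is only an illustration on made-up data, not anything from the LCF dataset: it fits ordinary least squares in plain Python (the helper names `solve` and `r_squared` are hypothetical), starts with one predictor that genuinely matters, and then adds pure-noise predictors one at a time.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """Fit y on an intercept plus the given predictor columns and return R^2."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

random.seed(0)
n = 40  # made-up sample size, chosen only for illustration
real = [random.gauss(0, 1) for _ in range(n)]          # a predictor that matters
y = [2.0 * x + random.gauss(0, 1) for x in real]       # outcome = signal + noise
junk = [[random.gauss(0, 1) for _ in range(n)] for _ in range(5)]  # pure noise

# R^2 with 0, 1, ..., 5 junk variables added to the real predictor
r2_path = [r_squared([real] + junk[:k], y) for k in range(6)]
for k, r2 in enumerate(r2_path):
    print(f"{k} junk variables: R^2 = {r2:.4f}")
```

Each junk column is unrelated to the outcome by construction, yet the printed R^2 should creep upward (or at worst stay flat) with every addition, which is the spurious-fit risk described above.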

There are measures that correct for this. For example, I know that if you do multiple regression in SPSS, you can get an "adjusted R^2" measure that adjusts for the number of variables in the model (though I don't know exactly how this is done, or if it's a particularly great metric).

Points: 0

#101 by qsi // Apr 26, 2011 - 3:32am

Adjusted R^2 is useful in determining whether each additional variable you add to your regression is meaningful. As you said, it imposes a penalty for additional variables, so if your adjusted R^2 goes down after adding a new variable, you're better off leaving it out, and adjusted R^2 can even be negative in extreme cases.

Note however that despite the similarity to R^2 the adjusted version does not live in the same space. You cannot compare an adjusted R^2 to a regular R^2, and more importantly, an adjusted R^2 cannot be interpreted as "explaining a percentage of variance" any longer.

Points: 0

#104 by Dan // Apr 26, 2011 - 2:46pm

All you need to calculate the adjusted R^2 is the sample size (number of quarterbacks), number of predictor variables in the model (2 in the old LCF, 7 in the new), and the R^2 (.24 for the old model, .58 for the new). The formula's on Wikipedia and there's an online calculator. If we guess a sample size of 40 (probably too small), the adjusted R^2 would be .20 for the old LCF and .49 for the new LCF. With a sample size of 100 (probably too big), it's .22 for the old LCF and .55 for the new.
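For anyone who wants to check those figures, here is a minimal sketch using the standard adjusted R^2 formula, 1 - (1 - R^2)(n - 1)/(n - p - 1). The function name is made up for illustration, and the sample sizes of 40 and 100 are the guesses from the comment above, not the actual LCF sample size.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# R^2 and predictor counts as quoted in the thread; n values are guesses.
for n in (40, 100):
    old = adjusted_r_squared(0.24, n, 2)  # old LCF: 2 predictors
    new = adjusted_r_squared(0.58, n, 7)  # new LCF: 7 predictors
    print(f"n={n}: old LCF {old:.2f}, new LCF {new:.2f}")
# prints:
# n=40: old LCF 0.20, new LCF 0.49
# n=100: old LCF 0.22, new LCF 0.55
```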

This would be a pretty strong indication that the new model is better (although I'd rather see an F test) if the 7 variables that they included in the model were the only variables that they looked at. But the problem is that they didn't just add 5 more variables to the model, they tried a bunch of things and kept the ones that seemed to work best. That means that the adjusted R^2 is still an inflated estimate of how well the model applies - possibly by a lot.

Points: 0

#72 by lionsbob // Apr 21, 2011 - 2:01am

To be fair for Brodie Croyle's drop in QB rating from his junior to senior season, he did only play 2.5 games his junior season before tearing up his knee against Western Carolina (thanks Mike Shula), but I doubt Brodie would have kept that high QB rating throughout the year.

Points: 0

#78 by Dales // Apr 23, 2011 - 4:16pm

Would love to hear the supporting information for thinking Mallett to be a dick.

His teammates (Mario Manningham excluded) love him, and this has been true back to high school.

I live in Fayetteville, and he was not a troublemaker here. I've seen him out and about, in restaurants and such, and he was never showing attitude and seemed like a normal college kid. He was always good to fans around here, and when fans were waiting in line to buy tickets (I forget what it was for but they camped overnight) he served the fans pizza.

I don't know about all of the concerns over him and booze and/or drugs. But the idea that he was some sort of jerk just doesn't fit with everything I've seen.

Points: 0

#105 by CoachDave // Apr 26, 2011 - 4:39pm

He was a major dick in Ann Arbor and his teammates and the coaching staff wholeheartedly couldn't stand him.

Don't know about his time at Ark. maybe he matured and got his head on straight...but at UM, he was very much a "dick"...maybe that's where they are getting it from?

Not sure, but either way, I wouldn't pick him in the first round, nor any of these guys. Maybe Gabbert if I had a QB need, but not anywhere close to where they are projecting him to go.

Think about it, who would you rather have, Ryan Fitzpatrick + Von Miller or Cam Newton? To me, I just don't see this being a "draft a QB in the first round" draft...sure one or two of these guys might make a PB here or there, but the risk is just too large when you've got Miller, Dareus and Peterson...3 guys who look like the most "can't miss" picks on the board.

Should be one interesting draft to watch.

Points: 0

#106 by Dean // Apr 26, 2011 - 4:53pm

Just finished reading this on Mallett. Completely different picture than the accepted conventional wisdom. Maybe some money changed hands to get this puff piece written? Or maybe the rest of the world is wrong about the kid and it's all an elaborate smear campaign? Regardless, very much worth the read...

Points: 0

#107 by Dales // Apr 26, 2011 - 5:48pm

He was a major dick in Ann Arbor and his teammates and the coaching staff wholeheartedly couldn't stand him.

I have found no evidence at all to support this, excepting for Mario Manningham. Dave Hyde of the Sun Sentinel spent a good amount of time looking into those rumors and could not find anything to substantiate it. Further, he found plenty that directly contradicts your assertion that "his teammates and the coaching staff wholeheartedly couldn't stand him."

Part of his article:

"But let’s look at the evidence; anyone that’s ever played competitive sport at any level does, at one stage or another, clashes with coaches or team-mates. And as you’ll see later on, Mallett is a perfectionist. He doesn’t expect mistakes from the offensive squad that he leads. And why should he? So could these “clashes” have been down to merely shouting at his linemen not to jump offside or hustling them back to the line of scrimmage in a two-minute drill? Who knows? But unless these rumours are given some sort of context then it’s awfully tough on Mallett to use them as a stick to beat him with.

If you check Ryan’s Twitter account he spends an awful lot of time talking with former Michigan players such as Steve Breaston and Lamarr Woodley. Both of them are respected NFL players with solid reputations. They were also strong leaders at Michigan. Mike Hart spoke strongly in Mallett’s favour about the way he carried himself.

So who clashed? I’m more inclined to believe the following quote from our own Jake Long who’d been around Mallett for more than a year at this point, than I am any unsubstantiated internet hyperbole. This came ahead of Ryan’s first career start as a freshman for 0-2 Michigan against Notre Dame:

“Ryan Mallett has really stepped up. He's gotten better all through camp and spring ball and it's kind of like he's a red-shirt freshman. He was here all spring and learned the offense then. He was here all summer, and then during camp, and he just fine tuned what he needed to with the offense. I think that gives him confidence. It gives everyone around him confidence. He didn't hesitate and he took charge. I can guarantee you, he's not intimidated by anything. I think that's one of the things I like about him. He's got a lot of confidence. He's got a great arm, and yet he's got to go into this week and play within himself and within the context of the game plan, because it's really about winning. And with him it's not about Ryan Mallett. It's about doing the things that will help this team win.”

That’s possible Hall of Famer Jake Long, in case you missed it.

Towards the end of his freshman campaign things did start to go a little sour. It wasn’t a great season for the Wolverines and the regular season ending defeats to Wisconsin and Ohio State were hard to take. In that Wisconsin game, Mallett and receiver Mario Manningham got into some verbal jousting with one paper reporting that the QB “got brave and lipped off to Manningham”. The TV camera caught Manningham tapping his temple as if to say: ‘It's your future, kid. Think about it.’

After a miserable 11-for-36 performance that included two interceptions, general inaccuracy and numerous poor decisions in that game, he stood alone on the sidelines and then later was isolated on one bench with the rest of his offense on another. He vented his frustration again at his teammates during the game although Long again defended him:

“It's frustrating getting hit, and I'm sure he was getting really frustrated because we got him hit and pressured too many times.”

That of course speaks to the class of Long but surely, time and again teammates going to bat for Mallett in the face of exterior hyperbole speaks to something of Mallett as well?

In the end, when Carr was fired and Rich Rodriguez took over and introduced the same offense that had made Pat White a star at West Virginia, the writing was on the wall. The final straw was the firing of QB coach Scott Loeffler who had recruited Mallett to Michigan and a decision was made for him to leave. On his departure, his father made a comment to a local reporter:

“It just wasn't a fit and yet, it was a hard decision. When he came home for Christmas, he told me and my wife that he really was falling in love with being at Michigan. Ryan wants to thank coach Carr, his staff and all the players for helping him have a truly great experience the past year."

Lloyd Carr also cited that perhaps there had been some difficulties but that he’d overcome them when he admitted that it was a shame Mallett was departing as he’d “matured dramatically in the last two or three months”. The context of that quote is unknown; does he mean on or off the field? If it’s on field, then you can see how easily something like that could be taken out of context and used against him."

So, again, I ask. Where is the substantiation of the idea that Mallett was a dick?

Points: 0

#108 by CoachDave // Apr 27, 2011 - 1:55pm

You aren't the only guy on this site who lives in a town where Mallett played college football.

Lloyd Carr couldn't say anything bad about Adolf Hitler; the Jake Long quote clearly comes before he imploded at the end of the season, when the team turned on him; who he tweets with is meaningless; and the kid was drunk in Scorekeepers so much in the back half of his freshman year that they still to this day point to the seat he always sat at as the "Mallett seat".

Look, you are clearly pulling for the kid and seem to have a personal interest in him and quite frankly I hope he's turned his life around and has a great career and lives a great life, but I'm not the one who has been writing "drug use and character issues" articles about this kid for two years...and one cherry-picked quote puff piece article by a guy in 2011 using old and not relevant quotes about 2007 isn't going to change what's out there.

Points: 0

#111 by Dales // Apr 28, 2011 - 7:46am

No, but you are one who is throwing out things without providing any substantiation.

Should not be too hard, if the problems were as bad as you say, to find articles written at that time (rather than now) talking about the issues. Should not be too hard to find quotes from former teammates that are less than supportive. Should not be too hard to speak from your own personal experiences, if you have any with him.

Am I rooting for the kid? Absolutely. He did well for us in and around Razorback Nation. Do I have a personal interest in him? Not at all. And did I make a ludicrous argument involving Adolf Hitler, in a discussion about a kid's NFL draft prospects? No.

Points: 0

#112 by CoachDave // Apr 28, 2011 - 10:06am

Oh dear God...did I compare the kid to Hitler...of course not, stop being so purposefully obtuse with what others are writing...the point is that Lloyd Carr is a very lovely, silver-lining kind of guy who goes out of his way to see the good in people, especially when he's been quoted by the media...which isn't a bad thing, but also means that you should view his comments with a grain of salt.

And one article with dated quotes that don't even fit the timeline for the disastrous end of his freshman year, plus running into the kid in a pizza place, doesn't equal substantiation, nor does it effectively counter the mountains of things on this kid and his drug and character issues, no matter how badly you'd like it to.

Points: 0

#87 by PurpleJesus28 (not verified) // Apr 24, 2011 - 12:34pm

You list Blaine Gabbert's completion percentage at 65.4 percent. Shouldn't that be 60.9 percent (568/933)? Not sure if the error was in the formula or just in the article.

Points: 0

#95 by Aaron Schatz // Apr 25, 2011 - 11:30am

Sorry, I mistakenly listed his final year completion rate above instead of his career completion rate. The proper one is used in the projection. I will fix.

Points: 0

#92 by Mountain Time … // Apr 24, 2011 - 7:02pm

No complaints on "most unique?"

Points: 0

#94 by qsi // Apr 25, 2011 - 2:55am

Very interesting article. It's good to see previous work being updated and re-evaluated in the light of new data, and in this case also the recent failure of the LCF.

I do have a few concerns about the results though, some of which have been commented on previously. In order to assess the improvement in LCF 2 vs the original, there's a danger in looking at R^2 in isolation, as comparing R^2 values for regressions with different numbers of factors is misleading. While the improvement from 0.24 to 0.58 is large enough to suspect that the new LCF is indeed significantly better, a better way of comparing the two would be to show the F-values for each regression as those take degrees of freedom into account. Comparing the F-values would allow us to judge the "real" improvement in the explanatory power.
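As a rough illustration of what such a comparison involves, the overall F statistic of a regression can be computed from R^2, the number of predictors p, and the sample size n as (R^2 / p) / ((1 - R^2) / (n - p - 1)). The sketch below is hedged accordingly: the function name is hypothetical, the R^2 values are the ones quoted in this thread, and n = 40 is purely an assumed sample size, since the article does not state one.

```python
def overall_f(r2, n, p):
    """Overall F statistic for a regression with p predictors and n observations.

    Degrees of freedom are p (numerator) and n - p - 1 (denominator).
    """
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

ASSUMED_N = 40  # assumption for illustration only; the real sample size is not given

f_old = overall_f(0.24, ASSUMED_N, 2)  # old LCF: 2 predictors
f_new = overall_f(0.58, ASSUMED_N, 7)  # new LCF: 7 predictors
print(f"old LCF: F = {f_old:.2f} on (2, {ASSUMED_N - 3}) degrees of freedom")
print(f"new LCF: F = {f_new:.2f} on (7, {ASSUMED_N - 8}) degrees of freedom")
```

Because the two statistics have different degrees of freedom, they would still need to be converted to p-values against the appropriate F distributions before being compared, and the result shifts with the true sample size.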

The other issue is the jump in the number of factors as the article does not show the significance of each. A t-stat for each factor can clear this up easily, and I suspect that at least one or two factors are not significant at even a 5% level (but that's just a wild guess based on experience with many regressions that end up incorporating this many factors).

It would be very helpful (and indeed would address the majority of the concerns voiced here) to give us the full results of the regression, or failing that (if you wish to protect your proprietary work) just to give the t-stats for each factor, and the F-value for the overall regression. These numbers should be readily available (or computable) based on the regression work that you've done.

Many thanks for doing the work though. I know a lot more effort goes into this kind of thing than is apparent in a write-up!

Points: 0

#110 by Anthony (not verified) // Apr 27, 2011 - 6:22pm

I don't understand the Kaepernick being too skinny thing. He's not 6'4 220 pounds; he measured at 6'4 233 pounds at the combine, which is about the same measurements as Gabbert who came in at 6'4 and 234 pounds. There are no concerns that Gabbert is too skinny so why is there concern about Kaepernick? College listed heights and weights are meaningless.

Points: 0

#113 by Shattenjager // Apr 28, 2011 - 12:53pm

I looked around and I keep finding 6'5", 233 pounds as the combine measurements for Kaepernick. That one inch, one pound difference actually makes his BMI 27.7 to Gabbert's 28.5. However, that would actually mean that Kaepernick is closer to ideal, since the formula includes difference from a BMI of 28.0.
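A quick way to reproduce this comparison is the usual imperial-unit BMI approximation, 703 x pounds / inches squared. The sketch below is an illustration only: the function name is made up, and the 28.0 target is taken from the description of the formula in this comment rather than from any published LCF code.

```python
def bmi(height_in, weight_lb):
    """Body mass index from imperial units: 703 * pounds / inches squared."""
    return 703.0 * weight_lb / height_in ** 2

IDEAL_BMI = 28.0  # target value the comment attributes to the LCF formula

# Combine measurements as quoted in the thread: (height in inches, weight in pounds)
players = {
    "Gabbert": (6 * 12 + 4, 234),
    "Kaepernick": (6 * 12 + 5, 233),
}
for name, (height, weight) in players.items():
    value = bmi(height, weight)
    gap = abs(value - IDEAL_BMI)
    print(f"{name}: BMI {value:.1f}, distance from {IDEAL_BMI}: {gap:.2f}")
```

With these inputs the approximation gives about 28.5 for Gabbert and about 27.6 for Kaepernick, a shade under the 27.7 quoted above; the small gap presumably comes from rounding or slightly different measurements. Either way, Kaepernick lands closer to 28.0.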

Aaron's earlier comment about player sizes in FO's database makes me think he's already going to be looking into this issue.

Now, if we want something damning on Kaepernick, there's the fact that Al Davis apparently wants him, which has not exactly been good for anyone since Ken Stabler (Steve Beuerlein was actually pretty good but couldn't hold onto a starting job until he was 31, and he's undoubtedly the best QB Davis has drafted since 1968):

Points: 0

#114 by sterr (not verified) // Jun 20, 2011 - 1:20pm

Anyone have an idea of how Terrelle Pryor would score in LCF?

Points: 0

#117 by tabsports // Nov 29, 2011 - 1:49pm

Quick! Somebody crunch the numbers on TJ Yates! (Hopefully, this bumps up to the FO editing crew... this certainly will be interesting.)

Points: 0

#118 by Mr Shush // Nov 30, 2011 - 6:29am

Career starts: 44
Completion percentage: 62.3%
BMI: 26.8
Run-pass ratio: 0.187
Rushing yards: -77
Passer rating differential: 30.1
AQ Conference: yes

All very positive apart from the bogglingly awful rushing yards figure. Given that he's reputedly pretty mobile, that suggests he was a sack-taking machine. Obviously as a fifth round pick none of this strictly applies to him anyway.

You might also be interested to look at this old article from the PFR blog about late round QB success stories. It concludes that you should be looking for a tall player from a BCS school. I would take from the Lewin findings that you almost certainly don't want someone with a bad completion percentage.

Long and short of it: I think that what's quantifiable about Yates is about as good as you're ever likely to find in a fifth round pick. He's still a fifth round rookie, which means he's likely to amount to nothing even in the long run, and likely to suck something fierce this year even if he is ultimately going to be a good player.

Points: 0

#119 by Thomas_beardown // Dec 01, 2011 - 5:33pm

Interesting. I wish there was an easy to use database of this kind of stuff so I could see how he compares to Kyle Orton and Marc Bulger who are about as good as anyone has any right to expect from a late round pick.

Points: 0

#120 by Mr Shush // Dec 02, 2011 - 10:34am


Kyle Orton
Career starts: 35
Completion percentage: 59.3%
BMI: 28.4 (6'4 233)
Run-pass ratio: 0.206
Rushing yards: 112
Passer rating differential: 23.4
AQ conference: yes


Marc Bulger
Career starts: 32*
Completion percentage: 61.9%
BMI: 26.7 (6'2 208)
Run-pass ratio: 0.079
Rushing yards: -92
Passer rating differential: 25.6
AQ conference: yes


Matt Hasselbeck
Career starts: 21*
Completion percentage: 55.6%
BMI: 27.4** (6'4 225)
Run-pass ratio: 0.233
Rushing yards: 24
Passer rating differential: 22.6
AQ conference: yes


Tom Brady
Career starts: 22*
Completion percentage: 62.3%
BMI: 25.0 (6'5 211)
Run-pass ratio: 0.109
Rushing yards: -47
Passer rating differential: 10.6
AQ conference: yes

Some thoughts: These guys seem to broadly bear out PFR's hypothesis that for late round picks height matters but bulk probably doesn't. All are at least 6'2 and three of the four are 6'4 or taller, but BMIs range from 25.0 to 28.4. Maybe being fat would be a problem.

The four improved by an average of 20.5 points of passer rating over their junior seasons, with Brady the lowest at 10.6. Maybe NFL scouts are failing to sufficiently adjust their pre-season expectations. This might be a very positive sign for Yates, with his whopping 30.1 point improvement.

All of them were pretty accurate, at least as seniors. Hasselbeck's low career percentage splits into a terrible junior season and a pretty solid senior year. Yates should probably be adjusted down a little compared to them for era, but his performance here seems fine - there's no reason to think he's not accurate enough to be a good pro.

Yates started a lot more games than any of them. At the back of the draft, that may well be a negative indicator, as more tape may suggest more accurate judgement from scouts.

None of them were remotely as sack-tastic as Yates as seniors. Hasselbeck was even worse as a junior, though.

On balance, I would say that Yates probably has a better chance than your average 5th round pick of being a decent pro. That's still not a massive chance. I would also say that we should expect Houston opponents to be racking up a lot of sacks in any game where they have to throw the ball. Orton's 2005 season is actually probably not far from what we should expect. Of course, the 2011 Texans defense, while pretty good, is not the 2005 Bears.

* I can only find games played data for Bulger and Hasselbeck. Based on attempt totals, I am assuming that Bulger started every game he played between 1997 and 1999, but no games in 1996. Similarly, I assume that Hasselbeck started every game in 1996 and 1997, but no games prior to that. For Brady, I have game-by-game attempt counts which lead me to conclude that there was one game in 1998 (Hawaii) and two in 1999 (Rice and Syracuse) which he may well not have started and I have not counted. Equally, it's possible he was benched or injured, so the correct number could be as high as 25.

** I can only find current measurements for Hasselbeck. Presumably his height has not changed, but his weight may well have done.

Points: 0

#121 by Thomas_beardown // Dec 03, 2011 - 9:37pm

This is some cool work you've done.

One thing I want to say about Orton 2005. He played better than his stats looked. He was by no means good, he was bad. However, he was just a normal kind of bad, while his stats say he was unbelievably horrific. The Texans will be helped a lot by having Arian Foster instead of Thomas Jones. So that's two reasons for mild optimism.

The Texans aren't likely to win any playoff games with a QB like that, but if he can be steady they should be able to finish off getting to the playoffs. Especially with the schedule that's coming up.

Also, Brady's whole college career is so strange I'm not sure there is anything to learn from it to apply to other QBs. Other than coaches have weird tendencies.

Points: 0

#122 by Mr Shush // Dec 04, 2011 - 8:29am

I pretty much agree. I actually think the Texans would have a decent shot at beating the likely #6 seeds (Jets, Bengals, Titans) at home. Unfortunately, the head-to-head loss with Oakland means they're likely to be seeded 4th, which means Steelers or Ravens, which means one and done.

Points: 0

#123 by Lewin Hater (not verified) // Apr 11, 2012 - 1:10am

Fuck this guy he is a fucking moron pussy bitch

Points: 0
