Writers of Pro Football Prospectus 2008

26 Apr 2016

Introducing BackCAST

by Nathan Forster

In 2008, Football Outsiders introduced Speed Score, a metric that projected the likelihood of success for running back prospects available in the NFL draft. Speed Score found a correlation between NFL success and a prospect's 40-yard dash time after adjusting for his weight. In short, players who tested as big and fast at the NFL combine ultimately found more success than players who tested as small and slow. Speed Score has had a number of hits, projecting success for players like Chris Johnson (who was considered a huge reach as a first-round pick at the time) and third-rounder turned All-Pro Jamaal Charles.

However, Speed Score also yielded some counterintuitive results that ultimately proved to be incorrect. For example, Speed Score gave high marks to Arizona's Chris Henry, who dazzled by running a 4.40-second 40-yard dash at 230 pounds. Henry, however, was otherwise unimpressive. In his best college season, Henry recorded just 581 rushing yards and he averaged only 3.3 yards per carry -- exceptionally poor for a running back headed to the NFL. Henry was severely overdrafted by the Tennessee Titans in the second round, and his career ended with just 122 rushing yards.

Over the past few months, we have worked to expand Speed Score to include more than NFL combine results, with a new system called BackCAST. We are still developing the new model, and have some kinks to work out, but we wanted to share with you our preliminary findings -- and what they suggest about the prospects available in this year's NFL draft.

The current database contains the 279 Division I halfbacks drafted from 1998 to 2014 (the 2015 class was excluded because those prospects do not yet have enough NFL experience to evaluate meaningfully). For obvious reasons, we included only players rostered as halfbacks for at least one full college season.

Right now, BackCAST is based on projecting each player's total NFL rushing and receiving yards over his first five years in the NFL. However, one change we may make in the future is to switch from measuring success by yards to measuring success by DYAR (Defense-adjusted Yards Above Replacement). Such a switch might be meaningful because total yards may make certain "compilers" seem successful simply because they received a lot of carries early in their careers. For example, Trent Richardson has managed to amass nearly 3,000 total yards to date, which is more than the "average" drafted running back. In reality, however, Richardson is an epic flop whose aggregate numbers were entirely a function of the Browns and Colts essentially banging their heads against the same wall by continuing to feed Richardson the football.

The first two factors in BackCAST are the familiar ones: BackCAST preserves Speed Score's use of 40-yard dash time and weight. In terms of success in the NFL, 14.5 pounds is roughly equivalent to 0.1 seconds in the 40-yard dash (so a 214.5-pound player with a 4.5-second 40-yard dash has the same prospects of success as a 200-pound player with a 4.4-second 40-yard dash, all else being equal). We also looked at other combine metrics (especially the 3-cone drill) to see if they also correlated with success. However, the 40-yard dash was by far the strongest indicator of success of all the combine metrics, and none of the others were independently significant.
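The weight-for-speed trade-off can be sketched in a few lines of Python. This only illustrates the 14.5-pounds-per-0.1-seconds rule of thumb quoted above; it is not the BackCAST formula itself. The function name and the 200-pound reference weight are assumptions made for the example.

```python
# Rule of thumb from the article: 14.5 pounds of weight is worth roughly
# 0.1 seconds in the 40-yard dash. Everything else here is illustrative.
POUNDS_PER_TENTH = 14.5
REFERENCE_WEIGHT = 200.0  # hypothetical reference weight, chosen for the example


def weight_adjusted_forty(forty_time, weight, reference_weight=REFERENCE_WEIGHT):
    """Return the 40 time a reference-weight player would need to match this prospect."""
    extra_pounds = weight - reference_weight
    return forty_time - 0.1 * extra_pounds / POUNDS_PER_TENTH


# The article's example: 4.5 seconds at 214.5 pounds is treated like 4.4 at 200.
heavy = weight_adjusted_forty(4.5, 214.5)
light = weight_adjusted_forty(4.4, 200.0)
print(round(heavy, 2), round(light, 2))
```

Putting every prospect on the same reference weight makes the two combine inputs comparable as a single number, which is the spirit of the equivalence described above.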

The first new factor in BackCAST is a metric called yards over expected per game (YOE/G). YOE/G is a sophisticated way to look at college yards per attempt. Basically, YOE/G creates a baseline for the strength of each college team's running game and compares the performance of the prospect against that baseline. More specifically, YOE/G compares the player's yards per attempt during his entire college career to the yards per attempt of all other teammates to record rushing attempts during the player's career, as well as the year before the player started at the college. We use the year before in case the prospect so dominated his team's carries that the other players on his team did not provide a large enough sample size for a meaningful comparison.

For example, LaDainian Tomlinson recorded 5.7 yards per attempt for the Texas Christian Horned Frogs. That stat, in isolation, is not particularly spectacular for a collegiate running back. However, his numbers are extraordinary when placed in context. Texas Christian's running game was in poor shape. From 1996 to 2000, TCU players not named "LaDainian Tomlinson" carried the ball 1,078 times for 2,146 yards -- a cover-your-eyes awful 1.99 yards per attempt. For the Horned Frogs, the difference between Tomlinson and anyone else to run with the football was just over 3.7 yards per carry. Tomlinson carried the ball 943 times, and thus produced 3,510 yards (about 3.72 * 943; the 3.7 above is rounded) over the numbers expected from "other" Texas Christian players. Because Tomlinson took 45 games to reach that number, his YOE/G is 78.0 (3,510 / 45). Tomlinson's 78.0 YOE/G is the best of any prospect in BackCAST's database.
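The Tomlinson arithmetic can be written out as a short Python sketch. The function and parameter names are made up for illustration, and the inputs are the rounded figures quoted in the text rather than the underlying play-by-play data.

```python
def yards_over_expected_per_game(player_ypa, player_attempts, player_games,
                                 teammate_yards, teammate_attempts):
    """YOE/G as described in the text: yards gained beyond the teammate baseline, per game."""
    # Baseline: what the rest of the team averaged per rushing attempt.
    baseline_ypa = teammate_yards / teammate_attempts
    # Extra yards the prospect produced over that baseline across all his carries.
    yards_over_expected = (player_ypa - baseline_ypa) * player_attempts
    return yards_over_expected / player_games


# Figures quoted in the article for LaDainian Tomlinson at TCU.
tomlinson = yards_over_expected_per_game(
    player_ypa=5.7, player_attempts=943, player_games=45,
    teammate_yards=2146, teammate_attempts=1078,
)
print(round(tomlinson, 1))
```

Because 5.7 is itself a rounded figure, the result lands slightly below the 78.0 reported above. The gap comes from the rounded input, not from a different method.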

The aforementioned Chris Henry is an illustrative counter-example to Tomlinson. Henry averaged only 3.3 yards per attempt in college, which was only slightly better than the 3.0 yards per attempt averaged by his teammates (nearly every drafted running back averages more yards per attempt than his teammates). The difference is a paltry 0.4 yards per attempt (after rounding), and Henry only had 269 college attempts. The result is a weak YOE/G of 2.8 -- the worst number recorded by any player in our dataset who was drafted in the first or second round.

The beauty of YOE/G is that it can also identify players whose numbers are buoyed by a strong supporting cast or a run-friendly system. For example, YOE/G would have suggested caution on Trent Richardson (although he did score decently on other BackCAST metrics). Richardson recorded 5.8 yards per attempt, which is slightly better than Tomlinson. However, Richardson did not play for the early aughts Horned Frogs; he played for an absolutely dominant incarnation of the Crimson Tide. In fact, the Crimson Tide was not much better running the ball with Richardson than without him (5.1 yards per attempt).

The next new factor is the prospect's peak percentage of team rushing attempts. Specifically, BackCAST measures the prospect's share of his college team's rushing attempts in each of his college seasons, and picks the highest one. For example, if a junior running back had 5 percent, 25 percent, and 65 percent of his team's rushing attempts in his three college seasons, BackCAST would use the 65 percent.
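As a minimal sketch of that calculation, assuming each season is available as a pair of player and team rushing attempts (the function name and the attempt counts below are hypothetical):

```python
def peak_attempt_share(seasons):
    """Return the highest single-season share of team rushing attempts.

    `seasons` is a list of (player_attempts, team_attempts) pairs, one per
    college season. Seasons with no team attempts are skipped.
    """
    shares = [player / team for player, team in seasons if team > 0]
    if not shares:
        raise ValueError("need at least one season with team rushing attempts")
    return max(shares)


# Hypothetical attempt counts matching the 5 / 25 / 65 percent example above.
junior_back = [(25, 500), (125, 500), (325, 500)]
peak = peak_attempt_share(junior_back)
print(f"{peak:.0%}")
```

Taking the maximum rather than a career average is what lets a late bloomer or a player who lost a season to injury still get credit for the one year he carried the load.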

This measure recognizes that nobody knows the quality of a college running back better than his coaches, and that most coaches are smart enough to feed the ball to a player who is a true NFL-level talent. BackCAST recognizes that sometimes circumstances will dictate that a player may not receive as many attempts as his talent would suggest -- such as slow development, injuries, suspensions, or early mistakes in assessment by the coaching staff -- but for at least one season, he should dominate the backfield.

One objection to this metric might be that it downgrades a running back who shares his backfield with another elite NFL talent. However, this appears to be a feature, not a bug. The small sample of first- and second-round running back prospects who entered the draft alongside a college teammate who was also a first- or second-round pick is filled with disappointments:


Teammate RBs Drafted in First or Second Rounds, 1998-2014
Running Back       Year  Round  Pick  College
Darren McFadden    2008  1      4     Arkansas
Felix Jones        2008  1      22    Arkansas
Reggie Bush        2006  1      2     USC
LenDale White      2006  2      45    USC
Ronnie Brown       2005  1      2     Auburn
Cadillac Williams  2005  1      5     Auburn

However, this is one factor that we will be scrutinizing more closely as we develop BackCAST. There is a moderately strong relationship between peak attempts and YOE/G, and the two metrics might in fact be measuring the same thing. So it is possible that one of these factors may be dropped, or that one or both could be reconfigured. Either way, the final version of BackCAST will include at least one metric that accounts for efficiency and usage.

BackCAST's final new factor is each prospect's receiving yards per game over his college career. Receiving yards per game does not correlate at all with the prospect's rushing yards in the NFL, but, as intuition would suggest, it does correlate strongly with NFL receiving yards.

BackCAST is expressed in terms of the percentage that the running back is projected to over-perform or under-perform the average running back prospect. For example, a player who has a +50% BackCAST score is expected to be 1.5 times as productive as the average drafted running back. Conversely, a player with a BackCAST score of -50% is expected to be only half as productive as the average drafted running back. Below is a table of the top 25 BackCAST scores from 1998-2015:


Top 25 BackCAST Scores, 1998-2015
Name                 Year  Rnd  Pick  College            BackCAST
LaDainian Tomlinson  2001  1    5     TCU                +182.1%
Ricky Williams       1999  1    5     Texas              +177.9%
T.J. Duckett         2002  1    18    Michigan St.       +155.1%
Matt Forte           2008  2    44    Tulane             +131.7%
DeAngelo Williams    2006  1    27    Memphis            +130.4%
Ron Dayne            2000  1    11    Wisconsin          +128.1%
Steven Jackson       2004  1    24    Oregon St.         +124.6%
Rudi Johnson         2001  4    100   Auburn             +122.5%
Luke Staley          2002  7    214   Brigham Young      +119.6%
Michael Turner       2004  5    154   Northern Illinois  +114.6%
Edgerrin James       1999  1    4     Miami (FL)         +112.6%
Garrett Wolfe        2007  3    93    Northern Illinois  +112.0%
LaMont Jordan        2001  2    49    Maryland           +107.5%
DeMarco Murray       2011  3    71    Oklahoma           +99.3%
Anthony Thomas       2001  2    38    Michigan           +96.1%
Toby Gerhart         2010  2    51    Stanford           +92.3%
Chris Johnson        2008  1    24    East Carolina      +90.5%
James Starks         2010  6    193   Buffalo            +90.2%
Jerome Harrison      2006  5    145   Washington St.     +88.4%
Mewelde Moore        2004  4    119   Tulane             +88.1%
Brian Leonard        2007  2    52    Rutgers            +87.4%
Adrian Peterson      2007  1    7     Oklahoma           +87.4%
Amos Zereoue         1999  3    95    West Virginia      +87.0%
Charles Sims         2014  3    69    West Virginia      +85.9%
Todd Gurley          2015  1    10    Georgia            +85.0%
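To read these scores as multiples of the average drafted back, the percentage convention described before the table can be sketched as follows. The helper name is hypothetical; the conversion simply restates the +50% and -50% examples.

```python
def backcast_multiplier(score_percent):
    """Convert a BackCAST score (in percent) to a multiple of the average drafted back."""
    return 1.0 + score_percent / 100.0


# The two examples from the text, plus the top score in the table.
for label, score in [("+50%", 50.0), ("-50%", -50.0), ("Tomlinson", 182.1)]:
    print(label, round(backcast_multiplier(score), 3))
```

On this reading, a score of -100% corresponds to a projection of zero production, which is the floor of the scale.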

BackCAST also includes an output to measure the type of running back the prospect is likely to become -- whether the player is likely to be a ground-and-pound two-down back, a player who catches passes out of the backfield more often than he takes handoffs, or something in between.

Coming up with a metric to predict the prospect's likely usage patterns required a somewhat convoluted process. First, we took the historical NFL data and measured, for each prospect, the number of receiving yards under or over the amount that would be expected based on the player's NFL rushing yards. For example, take Alfred Morris. Morris has more receiving yards than the average drafted running back. However, nobody would fairly call Morris a receiving back, because he has an extremely low number of receiving yards for a running back who has seen the field as much as he has during his career. Once we had a measure of how many receiving yards the player gained relative to the number of rushing yards gained, we had a more accurate measure of who in the past turned out to be a "receiving" back in the NFL and who was more of a ground-and-pounder.

Second, we transformed the metric into a Z-score. For lack of a better term, we call this the prospect's "RecIndex." It turns out that there are a few running backs who were almost exclusively pass catchers, so the median running back actually has a slightly negative RecIndex.

Finally, we ran a regression to predict which players are likely to have a high (receiving back) or low (rushing-focused back) RecIndex. The two factors that are significant in predicting RecIndex are receiving yards per game in college and weight, as smaller players are more likely to be receiving backs.
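The first two steps above can be sketched in Python. The article does not say how "expected" receiving yards were modeled, so the straight-line fit below is an assumption, and the yardage totals are invented for four hypothetical backs. The final regression step (college receiving yards per game and weight predicting RecIndex) is not shown.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rec_index(rushing_yards, receiving_yards):
    """Z-score of receiving yards over or under what rushing yards would predict."""
    # Step 1: receiving yards relative to the amount expected from rushing yards.
    intercept, slope = fit_line(rushing_yards, receiving_yards)
    residuals = [rec - (intercept + slope * rush)
                 for rush, rec in zip(rushing_yards, receiving_yards)]
    # Step 2: standardize the residuals into Z-scores.
    center, spread = mean(residuals), pstdev(residuals)
    return [(r - center) / spread for r in residuals]


# Invented five-year NFL totals for four hypothetical backs (not real players).
rushing = [4000, 3500, 1200, 800]
receiving = [400, 900, 300, 700]
scores = rec_index(rushing, receiving)
print([round(s, 2) for s in scores])
```

In this toy data the first back has plenty of rushing yards but few receiving yards for that workload, so he comes out negative, while the second back, with a similar workload and far more receiving yards, comes out positive.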

Here are the top running backs available in this year's draft according to BackCAST:

Derrick Henry, Alabama

BackCAST Score: +63.1%
RecIndex: -0.41

I was a bit surprised to see Derrick Henry top the inaugural version of BackCAST, given the recent strength of the Alabama backfield. However, Henry has several points in his favor, and he is a little better than Trent Richardson as a prospect in most metrics that BackCAST cares about. Henry's YOE/G is approximately 50 percent better than Richardson's. Henry was also more of a focal point in the offense, as 61.5 percent of Alabama's rushing attempts went to Henry. Henry is a bit slower than Richardson, but he more than makes up for the difference by being larger. Also, unlike Richardson, he is more appropriately valued as a late first- or early second-round pick, rather than a top-five prospect. The one big downside to Henry is that his chances to shine as a receiver coming out of the backfield are low, so if it turns out that he is not effective on the ground, he is likely to be a useless player.

The fact that Henry is our top back, however, also shows the weakness of this class according to BackCAST. (This is an interesting counterpoint to using just Speed Score, in which this year's class was particularly impressive.) Last year's top back, Todd Gurley, had a +85.0% BackCAST score.

Ezekiel Elliott, Ohio State

BackCAST Score: +46.2%
RecIndex: -0.04

Ezekiel Elliott is a nice prospect who is probably overrated as a high first-round pick. On the positive side, Elliott has a nice size/speed combination -- he recorded a 4.45-second 40-yard dash at 225 pounds. Elliott also had a great 6.69 yards per attempt for the Ohio State Buckeyes. However, Ohio State's offense was highly prolific during Elliott's stay, and it averaged 5.61 yards per attempt on non-Elliott runs.

Interestingly, Elliott's projection also suffers because he only carried the ball on 51 percent of Ohio State's rushing attempts during his junior year, which was his best season. However, Ohio State gave few carries to running backs other than Elliott. Rather, Elliott lost carries to Ohio State's quarterbacks. We actually looked at re-calculating peak rushing percentage by subtracting out quarterback runs to see if it made the model stronger. However, it turns out that subtracting quarterback runs actually makes this factor (and the whole model) much weaker. It's interesting to think about why this might be so. It could be that running backs compete with quarterbacks for rushing attempts in the same way they compete with other running backs, or that rushing in a backfield with a running quarterback could cause the running back's numbers to be overstated in a way not captured by other metrics.

(Also, the rushing totals used here are based on official college stats and thus include sacks as runs, not passes; at some point, depending on how far back we can get clean data, we plan on analyzing these percentages with sacks removed from team rushing totals.)

Given his higher draft projection, Elliott is still probably the best bet to be the best back of this class. However, he lacks the statistical indicators that have been harbingers of elite running backs in the past. Especially given the relatively low value of the running back position in the modern NFL, a team picking in the top ten might be well advised to go in a different direction with its pick.

Devontae Booker, Utah

BackCAST Score: +23.4%
RecIndex: +0.67

Devontae Booker should be one of the best values in this draft at the running back position. He has a 27.5 YOE/G, which is higher than that of any other running back invited to the NFL combine this year. He was also a highly productive receiver out of the backfield, averaging 27 receiving yards per game for the Utah Utes. So if Booker does not work out as a traditional running back, he might at least be useful as a third-down back.

The biggest knock on Booker is speed. Booker actually has not run the 40-yard dash in pre-draft workouts. However, scouts believe that Booker's 40-yard dash would be in the mid to high 4.5s, and BackCAST downgrades him accordingly. (BackCAST presently uses the projected 40-yard dash times from NFLDraftScout.com for those prospects who do not run the 40-yard dash in pre-draft workouts.)

C.J. Prosise, Notre Dame

BackCAST Score: +18.2%
RecIndex: +0.64

C.J. Prosise was a playmaker at Notre Dame. Prosise averaged 6.9 yards per attempt during his career, even though Notre Dame otherwise averaged only 4.6 yards per run.

If anything, BackCAST underrates Prosise because of the somewhat unusual circumstances of his college career. Prosise was rostered as a wide receiver his first two years and he was not pushed into action at running back until Tarean Folston suffered a season-ending injury against Texas. After that point, Prosise dazzled until suffering his own injury late in the season, which limited his carries on the year.

Prosise has such a small sample size that his strong play could simply be a mirage. BackCAST is conservative on his prospects, although perhaps overly so. However, Prosise is generally considered a mid-round prospect and is well worth that price.

The following table provides the BackCAST and RecIndex numbers for all of the halfback prospects invited to this year's NFL combine.


2016 RB Prospects by BackCAST Score
Name                School          40    Weight  YOE/G  PeakATT  RcYds/G  BackCAST  RecIndex
Derrick Henry       Alabama         4.52  242     15.2   61.5%    7.3      +63.1%    -0.41
Ezekiel Elliott     Ohio State      4.45  225     18.3   51.0%    12.8     +46.2%    -0.04
Devontae Booker     Utah            None  212     27.5   51.7%    27.0     +23.4%    +0.67
C.J. Prosise        Notre Dame      4.43  220     12.0   32.8%    28.0     +18.2%    +0.64
Jordan Howard       Indiana         4.59  225     21.5   56.8%    8.2      +18.1%    -0.23
Kenneth Dixon       Louisiana Tech  4.56  212     27.1   51.6%    20.6     +16.6%    +0.40
Daniel Lasco        California      4.40  210     11.6   47.7%    11.2     +15.2%    +0.02
DeAndre Washington  Texas Tech      4.46  198     22.1   53.0%    22.3     +14.2%    +0.58
Tyler Ervin         San Jose State  4.36  178     24.5   55.4%    17.0     +8.6%     +0.52
Paul Perkins        UCLA            4.53  210     20.3   51.6%    18.9     +8.1%     +0.34
Wendell Smallwood   West Virginia   4.42  202     18.3   38.3%    16.3     -2.9%     +0.30
Kenyan Drake        Alabama         4.31  210     7.0    20.0%    13.9     -8.6%     +0.13
Devon Johnson       Marshall        None  202     23.3   36.9%    9.8      -16.3%    +0.02
Josh Ferguson       Illinois        4.48  195     18.7   35.4%    33.9     -18.7%    +1.10
Alex Collins        Arkansas        4.64  218     18.4   52.9%    4.4      -22.6%    -0.34
Keith Marshall      Georgia         4.31  202     4.7    22.3%    6.8      -29.8%    -0.10
Jonathan Williams   Arkansas        None  223     12.7   37.9%    9.6      -39.0%    -0.16
Tra Carson          Texas A&M       None  202     0.6    47.9%    6.6      -40.4%    -0.11
Tre Madden          USC             None  225     3.8    25.8%    10.4     -41.2%    -0.14
Kelvin Taylor       Florida         4.60  205     5.9    50.8%    5.1      -58.4%    -0.20
Brandon Wilds       South Carolina  4.54  202     3.7    29.1%    13.2     -77.2%    +0.17
Peyton Barber       Auburn          4.64  225     -17.5  40.4%    6.2      -88.4%    -0.32
Shad Thornton       N.C. State      4.75  202     17.5   32.7%    14.0     -100.0%   +0.20

Posted by: Nathan Forster on 26 Apr 2016

21 comments, Last at 18 May 2016, 1:13am by tuluse

Comments

1
by Aaron Brooks Go... :: Tue, 04/26/2016 - 5:08pm

I see who BackCast liked (and it's not an overwhelming list), but who that got drafted did it hate? How did those guys turn out?

2
by tuluse :: Tue, 04/26/2016 - 5:33pm

I would have hoped that Ron Dayne would have been a prospect an advanced stat would warn about, not endorse.

12
by dryheat :: Thu, 04/28/2016 - 10:57am

For the love of Jim Brown, this. If a predictive system spits out Duckett, Dayne, and Staley in the top 10, there's garbage going in.

15
by osoviejo :: Sun, 05/01/2016 - 3:56pm

This has a long way to go to become a useful predictive tool.

It was mentioned as a possible change, but I don't know why FO would be interested in anything other than using the tool to find future productive DYAR and DVOA backs. Otherwise, what's the point?

A chart showing that historical BackCAST results correlate well with future DYAR/DVOA production (on both the top and bottom) would be fantastic.

3
by Dan :: Tue, 04/26/2016 - 5:50pm

Does YOE/G also include quarterbacks? It seems like that could be misleading - QBs who take a lot of sacks would have a low (or negative) YPC, and QBs who are great scramblers would have a high YPC.

16
by Dan :: Sun, 05/01/2016 - 11:43pm

For example, Devontae Booker has a 4.95 career YPC (2014-15). Non-Booker Utes had 3.97 YPC from 2013-15, giving Booker a 0.98 YPC edge over his teammates. But non-Booker RBs on Utah had 4.41 YPC, giving Booker only a 0.55 YPC edge over his fellow RBs. Utah's QBs had only 3.48 YPC (including sacks), and accounted for 43% of Utah's non-Booker carries.

Another issue is that RBs have different roles. For example, this year Christian McCaffrey had 6.0 YPC and Stanford's #2 RB Remound Wright had 2.9 YPC. But Wright also had 13 touchdowns on 82 carries (vs. McCaffrey's 8 TDs on 337 carries). Wright was a short yardage specialist who was in there to pick up first downs & touchdowns, not yards. McCaffrey got pulled in those short yardage situations which inflated his YPC (and that's a negative indicator for his NFL prospects, not a positive one).

4
by Dan :: Tue, 04/26/2016 - 11:16pm

A lot of players are listed at the wrong weight in the table. Devon Johnson, Keith Marshall, and Tra Carson all weigh a lot more than 202 lbs.

6
by Dan :: Wed, 04/27/2016 - 1:27am

Also, what's your source for 40 times? nfldraftscout has Kenyan Drake with a 4.45 (not a 4.31) and CJ Prosise with a 4.48 (not a 4.43). There may be other discrepancies beyond those.

5
by Subrata Sircar :: Wed, 04/27/2016 - 12:53am

Ezekiel Elliott was a dominant college back enhanced by Urban Meyer's spread-to-run-power offense and stable of good-to-great linemen. It is not clear to me how he'll do at the next level; there's nothing he didn't do in the college ranks, but there's not necessarily anything he did that some other good collegiate back couldn't have done.

7
by bsims :: Wed, 04/27/2016 - 1:43pm

I'd be interested to see what BackCAST says about him with Barrett vs Jones at QB. Forster references the role of quarterback runs in the model; I wonder if looking at his YOE/G and other metrics with a good running quarterback vs a mediocre one would give us a clearer, or at least more robust, picture of Elliott as a back. Might even provide insight into the role of quarterback runs in general.

Although this seems like an area where CFB's sacks-as-rushing-attempts thing could really mess with your findings.

8
by speedegg :: Wed, 04/27/2016 - 4:08pm

Interesting. Derrick Henry's pro comparison is DeMarco Murray, and both need a running start and "space" in the backfield to be most effective. They don't have that lateral agility and acceleration, so that limits their usage/scheme but would do well with a team like Dallas.

9
by roy_miami :: Wed, 04/27/2016 - 4:22pm

Melvin Gordon's score?

10
by Scott C :: Wed, 04/27/2016 - 6:07pm

YOE/G probably needs some work.

I suspect a few things not mentioned in the article:

1. It is likely quite different to average 1 yard over your teammates when it is 7 vs 6 yards than 3 vs 2. The former RB is probably better than his teammates in space and in long runs, but it says less about making something out of nothing. While the latter RB is probably great at making something out of nothing -- a bigger indicator of success in the NFL where you don't have OLs dominate defenses like you can in college.

2. The quality of the other RB teammates is going to matter, but it's not going to be easy to tease apart. Perhaps we can look at those teammates in years before or after the dominant years of the RB in question.

13
by bsims :: Thu, 04/28/2016 - 11:28am

Point 1 is interesting to me, and I wonder what scaling will end up working. Extending your hypothetical, what if instead of 1 yard better, the back in question is 3 times better. How does a back who gets 6 yards to his teammates' 2 compare to one who gets 15 for his teammates' 5?

New metrics are an exciting frontier.

14
by tuluse :: Thu, 04/28/2016 - 11:48am

I'm not sure your assumption in your hypotheticals is true. A back who averages 7 ypc when his teammates get 6 could be gaining those yards at any level, not just the final level. He could be avoiding rare tackles in the backfield or turning small holes his teammates get say 4 yards from into 6 yard runs. Remember even with really high ypc, it's still an average there are plenty of plays below that number.

20
by TheSportistician :: Tue, 05/17/2016 - 9:15pm

Have we considered looking at % difference over absolute difference? To use the example, 3 is 50% more than 2, while 7 is only ~17% more than 6. Perhaps this is a simple way to capture the difference.

21
by tuluse :: Wed, 05/18/2016 - 1:13am

On the other hand, each additional yard is probably harder to earn. So this is probably not what you want to measure.

11
by Karl Cuba :: Wed, 04/27/2016 - 8:12pm

Please sir, can we have some more?

17
by Keep Chopping Wood :: Sun, 05/08/2016 - 2:19pm

I love the concept, but what could be missing that accounts for guys like Ron Dayne, T.J. Duckett, Rudi Johnson and Toby Gerhart all being ranked significantly higher than Adrian Peterson?

"Nobody in football should be called a genius. A genius is a guy like Norman Einstein."
-Joe Theismann

18
by LionInAZ :: Sun, 05/08/2016 - 10:52pm

These are the wrong questions. Players underperform or overperform for reasons that can't be explained by simple models. Dayne, Duckett, and Gerhart were all high draft picks for reasons that had nothing to do with FO projections. You have to expect some random variations. More important is whether BACKCAST is a better predictor of success than Speed Score, which was unfortunately (or conveniently) not discussed here.

19
by tuluse :: Wed, 05/11/2016 - 12:37pm

Both are interesting questions. What biases does Backcast have? Can they be eliminated or adjusted for? Can another model be equally predictive with different biases?