The Value of Mock Drafts
Guest column by Benjamin Robinson
It's April 26, 2018, and I'm at a college friend's apartment in Pittsburgh, Pennsylvania, watching the first night of the NFL draft as we have done pretty much every year since 2008, almost as if it were a ritual. He, a Steelers fan, and I, a Bengals fan, can coexist, because tonight anything seems at least a little possible. The draft unfolds in surprising fashion as the Cleveland Browns select Baker Mayfield, the Heisman Trophy-winning quarterback from Oklahoma, over the wunderkind prospect Sam Darnold from USC.
The NFL draft is an event full of optimism and hope. Each team has had some time since the end of the season to regroup, refresh, and put the previous season in the rearview mirror. In some ways, growing up as a Cincinnati Bengals fan during the depths of their futility in the 1990s, the draft and training camp were about the only things I could look forward to until Carson Palmer came on the scene in 2003.
Halfway through the first round, I sit back and lob a question over to my friend after a questionable take from the table of NFL Network analysts comes across our television screen. "How do we know if a draft pick is really a reach or a steal?" We quickly agree that you can only determine that a pick is a reach or a steal if you have an objective standard by which to judge a player and where they might be expected to be drafted. As analytically minded people, we wonder what kind of data someone could bring to bear to answer this question. This was the start of my mock draft data journey.
Over the years, mock drafts have been derided as unrealistic at best and meaningless at worst by fans and the media, but I could think of no other source of data that could adequately answer what I was trying to get at that first night of the 2018 draft. So I did what any analytically minded person does when they don't have data: I collected it myself.
Over my years following the draft from afar, I had seen the proliferation of a new generation of "armchair analysts" online. These analysts mostly focused on using a combination of film study, athletic measurements, and anecdotal comparisons to other players to project which college players were better prospects than others and to guess where a player might get drafted and by which team. Rather than rely on any one of them, I took a cue from decades of established social science research showing that aggregated forecasts often perform better on average than any single forecast alone.
Using this crowd-sourced method would allow me to calculate summary statistics for the value of players appearing in mock drafts and to use historical data to make predictions about the multitude of ways in which the draft might play out. In this way, mock draft data provides a way to estimate how the draft could shake out and offers insight into how the draft marketplace at large reacts to events like college all-star games, the scouting combine, injuries, and media reports.
Starting with the 2018 draft and continuing each year since, I have collected data on thousands of mock drafts with the goal of generating expected draft positions (from here on out referred to as EDP) for each draft-eligible prospect with at least 10 unique mock drafts from at least five different mock drafters. Using this data, we can generate a range of likely outcomes for individual players and get a sense of where they might land on draft day.
Let's begin by looking at how Baker Mayfield's draft stock changed over time. Using mock drafts, we can see the story of Mayfield's draft process and his steady rise up the boards all the way to the fourth-ranked player by EDP in the 2018 class. Each mock draft used in a player's EDP model receives multiple weightings, mainly accounting for the date of the mock draft (with mocks made closer to the draft weighted higher than mocks made further away) and the accuracy of the mock drafter in their final mock drafts in each draft year in my dataset (with more accurate mock drafters given higher weights than less accurate ones).
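To make the weighting idea concrete, here is a minimal Python sketch. It is not the actual Grinding the Mocks model: the `MockPick` fields, the exponential recency decay, the 30-day half-life, and the 0-to-1 accuracy score are all assumptions chosen for illustration. Only the inclusion rule (at least 10 mocks from at least five drafters) and the two weighting factors (recency and drafter accuracy) come from the article.

```python
from dataclasses import dataclass


@dataclass
class MockPick:
    drafter: str             # who published the mock (hypothetical field)
    days_before_draft: int   # how far ahead of the draft the mock was published
    pick: int                # slot where the mock placed the player
    drafter_accuracy: float  # assumed 0-1 score based on past final mocks


def qualifies(picks, min_mocks=10, min_drafters=5):
    """Inclusion rule from the article: >= 10 mocks from >= 5 different drafters."""
    unique_drafters = {p.drafter for p in picks}
    return len(picks) >= min_mocks and len(unique_drafters) >= min_drafters


def weighted_edp(picks, half_life_days=30.0):
    """Weighted mean pick slot.

    The recency weight halves every `half_life_days` (an assumed functional
    form) and is multiplied by the drafter's accuracy score.
    """
    numerator = 0.0
    denominator = 0.0
    for p in picks:
        recency = 0.5 ** (p.days_before_draft / half_life_days)
        weight = recency * p.drafter_accuracy
        numerator += weight * p.pick
        denominator += weight
    return numerator / denominator
```

With this form, a mock published on draft day counts twice as much as an otherwise identical mock published 30 days earlier; any monotone decay would express the same idea.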
From there we compare a player's logarithmically adjusted EDP with a similarly adjusted actual draft position to evaluate the predictive power of EDP, weighting by a player's market share of mock drafts to ensure we aren't overweighting EDP that derives from a relatively small sample size. Using this methodology, EDP alone explains about 80% of the variation in actual draft selections. This relationship is modeled on a logarithmic basis to account for the nonlinear relationship between EDP and actual draft position and to value earlier picks higher than later picks, and it results in drastic improvements in measures of model fit.
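One way to picture this comparison is a weighted least-squares fit of log actual pick on log EDP. The sketch below is an assumption about the general shape of such a model, not the author's exact specification: the function name, the base-10 logarithm, and the use of each player's share of mocks as the regression weight are illustrative choices.

```python
import math


def weighted_log_fit(edp, actual, weights):
    """Weighted least squares of log10(actual) on log10(edp).

    Returns (slope, intercept, weighted R^2). `weights` stands in for each
    player's share of mock drafts (an assumed weighting scheme).
    """
    x = [math.log10(v) for v in edp]
    y = [math.log10(v) for v in actual]
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Weighted residual and total sums of squares give the share of
    # variation in log actual pick explained by log EDP.
    ss_res = sum(w * (yi - (intercept + slope * xi)) ** 2
                 for w, xi, yi in zip(weights, x, y))
    ss_tot = sum(w * (yi - y_bar) ** 2 for w, yi in zip(weights, y))
    return slope, intercept, 1.0 - ss_res / ss_tot
```

A returned R^2 near 0.8 would correspond to the "about 80% of the variation" figure quoted above, though the article does not say exactly which fit statistic was used.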
Now that we've calculated EDP, we can begin to evaluate how it performs along a number of dimensions. First, we'll explore how EDP fares over rounds, and then across positions to see which are over- or undervalued, and finally which teams have been able to reap the most surplus value in the draft.
EDP by Round
Residuals are the differences between the actual and predicted values from a statistical model. Residuals can help us understand how the model's predictions perform across different dimensions. We use the median residual in this case to account for the low number of draft picks on a per-team basis over two draft years, and because averages can be influenced by outliers in smaller samples.
Here we introduce the concept of draft surplus value. Draft surplus value is simply the residual we just mentioned: the actual draft position minus EDP. A positive draft surplus value means that a player was drafted later than expected; a negative value means the player was drafted earlier than expected. Let's start digging into draft surplus value round by round.
When we look purely at the linear draft surplus value, the first, third, and seventh rounds have the lowest median residuals. This means that the midpoint of the distribution is lower for those rounds versus others. These trends persist when we look at logarithmic draft surplus value, with the exception of the second round.
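The round-by-round medians described here could be computed along the following lines. This is a sketch only: the `(round, edp, actual)` tuple layout and the function names are hypothetical, and the logarithm is assumed to be base 10.

```python
import math
from collections import defaultdict
from statistics import median


def surplus(edp, actual, log_scale=False):
    """Actual slot minus expected slot; positive means drafted later than expected."""
    if log_scale:
        return math.log10(actual) - math.log10(edp)
    return actual - edp


def median_surplus_by_round(picks, log_scale=False):
    """picks: iterable of (round, edp, actual) tuples (hypothetical layout).

    Returns {round: median surplus}, using the median rather than the mean so
    that a few outlier picks in a small sample do not dominate.
    """
    by_round = defaultdict(list)
    for rnd, edp, actual in picks:
        by_round[rnd].append(surplus(edp, actual, log_scale))
    return {rnd: median(vals) for rnd, vals in sorted(by_round.items())}
```

The same grouping works for positions or teams by swapping the first element of each tuple for a position or team label.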
A benefit of the logarithmic transformation is that it implicitly weights earlier picks in the draft more heavily. Gaps between picks that are the same size in linear terms become very different in logarithmic terms depending on where they fall in the draft. For example, the linear difference between four and one is the same as the linear difference between 104 and 101: three. However, the difference between the base-10 logarithms of four and one (0.60206) is larger than the difference between the base-10 logarithms of 104 and 101 (0.012712) by a factor of about 47, which accounts for the higher value of earlier picks in the draft.
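The arithmetic in this example can be checked directly. The short snippet below uses base-10 logarithms, which is the base that reproduces the 0.60206 figure; the variable names are just for illustration.

```python
import math

# Same three-pick gap at the top of the draft and in the fourth round.
early = math.log10(4) - math.log10(1)      # picks 1 vs. 4, about 0.60206
late = math.log10(104) - math.log10(101)   # picks 101 vs. 104, about 0.012712
ratio = early / late                       # how much more the early gap counts

print(f"early gap: {early:.5f}")
print(f"late gap:  {late:.6f}")
print(f"ratio:     {ratio:.1f}")
```

The ratio does not depend on the base of the logarithm, since changing the base rescales both gaps by the same constant.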
EDP by Position
Another question of interest is which position groups EDP predicted better than others. EDP did particularly well predicting the draft position of running backs, tight ends, offensive tackles, and defensive tackles. It did very poorly with special teams positions like punter and kicker, with fullbacks (most likely because there are so few of them and they don't appear in many mock drafts), and, surprisingly, on a logarithmic basis with quarterbacks. It seems that mock drafts have overrated quarterbacks the most overall, expecting them to go far earlier than they do in the actual draft. Only one position was underrated by mock drafts at the median: center (although that may owe more to the fact that only 12 centers qualified for inclusion in modeling than to anything else).
This finding has implications for future adjustments to EDP to make sure that it weights positions in a way that lines up with historic draft data and doesn't make poor out-of-sample predictions.
EDP by Team
We can extend the concept of draft surplus value to teams' drafting behavior as well. Teams that fall on the right tail of the distribution of draft surplus value tend to draft players later than their EDP; on the other hand, teams that fall on the left tail of the distribution tend to select players earlier than their EDP. Teams that have accrued the most draft surplus value on a per-pick basis (both linearly and logarithmically) have been the Tennessee Titans, Buffalo Bills, Dallas Cowboys, Baltimore Ravens, and New Orleans Saints. Teams that have accrued the least amount of draft surplus value on a per-pick basis have been the Cleveland Browns, Atlanta Falcons, Seattle Seahawks, San Francisco 49ers, and Tampa Bay Buccaneers.
I believe that there is an important lesson to be learned here: surplus value isn't everything when it comes to the NFL draft. Just like in fantasy football, players that are drafted above and below expectation can perform above and below expectations on the field, and having a heterodox drafting strategy can be beneficial if teams are using their drafts to capitalize on market inefficiencies.
This research is a first step in using large-scale data collection and analysis of mock drafts to craft an evidence-based framework surrounding the NFL draft process. Additionally, this methodology can be applied to drafts in other professional sports where a sufficient amount of data is available to be collected. In fact, other data scientists have already applied a variation of my approach to the 2019 NHL draft with intriguing results.
"Building an algorithm of all the mocks … allows teams to know and understand which players are considered in the top 100, top 200 and top 300," wrote long-time NFL front office executive Michael Lombardi in The Athletic last year. "But it's not foolproof."
Lombardi is right that aggregating mock drafts is not foolproof. However, in light of this evidence, it would be unwise to ignore this data as a tool to inform decision-making as a complement to the hard work of scouts and team personnel. Teams have limited resources, and it is in their best interest to gain the greatest possible efficiency and value out of their draft selections. As analytics and data science take hold in the NFL, teams that can utilize these tools in varied and innovative ways are more likely to find "edges" to give their organizations a stronger chance to succeed on draft day and eventually on the field.
Benjamin Robinson is a data scientist living in Washington, DC and the creator of Grinding the Mocks, a project that tracks how NFL prospects fare in mock drafts. You can follow him on Twitter @benj_robinson and find the Grinding the Mocks project at grindingthemocks.com.
22 comments, Last at 11 Apr 2020, 1:19pm
#1 by Aaron Brooks G… // Apr 07, 2020 - 11:58am
Over the years, mock drafts have been derided as unrealistic at best and meaningless at worst by fans and the media
I would argue that the worst-case is intentional misinformation -- teams seeding erroneous false picks in order to bias their opponent's projections.
Have you compared EDP to career value? This comparison is basically using "surplus" in the sense of where a pick falls relative to expected wisdom. But that's not actual value, that's just guessed-value.
#2 by Scott P. // Apr 07, 2020 - 12:10pm
But most mocks aren't measuring which player should be drafted at a position, but what player will be drafted. If every mock draft suggests Josh Rosen will be over-drafted, that's a win for the mocks, not a loss.
#3 by theslothook // Apr 07, 2020 - 12:29pm
That's the thorny issue. You would have to compare against Kiper's big board or whatever, but that has problems because it stops at 100 or so and it's a sample of one.
In any event, past studies (including my own) show that pick location tracks very strongly with career performance. Thus, one can surmise that mock drafts also track well, but not as well, with the pick slot. I think some of that variation loss is probably explained by the team itself, drafting scheme fits that are harder to predict via the mock draft.
#6 by Benjamin Robinson // Apr 07, 2020 - 12:56pm
Thanks for reading! I've thought of looking at aggregated Big Boards (like what Arif Hasan from the Athletic does) to see how the mocks differ from the Big Boards but I believe that mocks and big boards have different intentions so I don't include them. I agree that pick location tends to track pretty strongly with career performance but any little edge in the draft a team can get is valuable!
#20 by Duff Soviet Union // Apr 09, 2020 - 11:35pm
I wonder how much of this correlation is basically a self-fulfilling prophecy. If you're a low pick, you'll probably get one chance to prove you can play. If you're a high pick, you'll get a million chances to prove you can't.
#21 by theslothook // Apr 10, 2020 - 3:48pm
There's definitely some of that going on. A player like Trubisky, if he were drafted in the 2nd round, is probably not given a second season to start, let alone 3. But people spend a lot of time studying the NFL tape and doing analysis. It's hard to believe all of that has no bearing on successfully scouting a player.
#5 by Benjamin Robinson // Apr 07, 2020 - 12:42pm
I think it's a mix. Over the long haul of the draft process you get a mix of what they think teams should do and what they think teams will do. As you get closer to the draft, you'd like to think you see more of what they think teams will do. This allows the prior mock draft data to build up and provide different scenarios than just the modal ones we tend to see over and over, so I think it's healthy since we don't know if mocks based on what they should do are more or less accurate than the what-will-happen ones! It all depends on what your objective function is, and for me it's to try to get the draft slotting as close to correct as possible and provide some confidence intervals.
Thanks for reading!
#19 by Duff Soviet Union // Apr 09, 2020 - 11:34pm
Yeah, the guy I think of here is JaMarcus Russell.
He wasn't really on anyone's radar as a high pick until Oakland got the number 1. Then people started saying "god, it would be just like the Raiders to take JaMarcus first overall, wouldn't it?" Then mock drafters started putting him first overall, and the fact that he was first on everyone's mock draft was used to justify the Raiders taking him first overall.
#4 by Benjamin Robinson // Apr 07, 2020 - 12:34pm
That would definitely be a worst-case scenario. Not sure how much that happens or how we would even detect that! I haven't compared EDP to career value. This project started with the 2018 draft class so we only have two years of data to look at. It's definitely something I'd be interested in looking at over the off-season! I would expect that my findings wouldn't be too different than actual draft position since they're pretty correlated but it would definitely be interesting to look at.
Agreed that as I define it "draft surplus value" doesn't refer to actual on the field value. Thanks for reading!
#9 by Aaron Brooks G… // Apr 07, 2020 - 2:46pm
My sense of the more famous mocks is that they do talk to GMs and scouts that they know, and that filters into their projections.
I can absolutely see teams giving false answers to those questions trying to throw other teams off their intentions.
How did you handle trades? Or was this purely on the basis of who chose who in which draft slot?
#10 by Benjamin Robinson // Apr 07, 2020 - 5:12pm
Lots of the more well-known draft analysts have cultivated sources in the league over the years. You'll not see many of them admit that their intelligence was wrong (though I have seen Matt Miller from Bleacher Report admit that he's been lied to or misled before; who knows if it was accidental or purposeful).
I include trades because it's all about accumulating enough scenarios of what might happen. Ultimately, it's my most educated guess, and hopefully more often than not the players fall within their range of expected outcomes!
#7 by ChrisLong // Apr 07, 2020 - 1:12pm
Hi, good read! You mostly show the medians in this. Anything interesting from the CIs? Once accounting for these, were there teams that reliably drafted better or worse, or positions overdrafted or underdrafted?
#8 by Benjamin Robinson // Apr 07, 2020 - 1:45pm
Hi, Chris! Thanks for reading. I mainly show medians in this because the sample size is so low for these positions because I've only got a couple years of data. After this year's draft, I'll consider showing those CIs but for now I think they might just confuse folks!
#11 by Joseph // Apr 07, 2020 - 6:14pm
I can only really comment on my team, the Saints.
One thing to caution you about is how much surplus value you assign to a team that picks a player later than the mocks say he should go (or the reverse). For a simple example, everybody had Reggie Bush as the #1 overall pick in 2006--but then the Texans took Mario Williams, the Saints chose Bush, and both teams got the player they wanted. (It probably worked best for both teams that way.) Anyway, I don't think any team should be assigned too much surplus value in a case like this. Realistically, outside of maybe the top 5 picks overall, there should probably be a tolerance interval where if team A chooses the "consensus" 10th best prospect at #9, there is no credit or loss for that. They may have had a need at that position, and they really didn't "reach" for their player--esp. if the #9 overall player was at a position where the team already had good talent.
Another story is from ~10 years ago where I remember the Saints picked a CB from Penn St. in the 6th round who Kiper had rated the best available player left for 10-15 picks before the Saints picked him. I don't believe he made the team, or even the practice squad. Compared to Mel's big board, it looked like great value on TV. Of course, at that point in the draft, anybody who contributes to the team for a couple of years is good value.
Tangent: I remember several years ago the Football Perspective blog had an average of AV for draft slot X for the previous 20 years or so, with an example of a player that had accumulated that much AV during his career. It was to give you an idea of what kind of player your team could be expected to get at that pick slot. It was very enlightening how quickly the expected NFL production dropped after the top 10 or so.
#12 by theslothook // Apr 07, 2020 - 9:11pm
Just because the Saints won the SB doesn't mean they drafted the right pick. Mario Williams was better than Bush who is not quite a bust but probably a disappointment considering how his NFL career turned out.
#14 by Joseph // Apr 08, 2020 - 12:31pm
Not implying that the Saints pick was right or wrong, nor the Texans. I simply mean that the Saints really shouldn't get extra credit for drafting him at #2 when every mock I saw had him at #1. (Nor should the Texans get "dinged.") Simply pointing out that taking a player less than 5 slots from where he was projected by the consensus is probably not a team reaching for a player, nor a result of that player sliding.
However, I think a person could argue that the Saints made the right choice selecting Bush. While Bush never delivered on his potential, his selection was absolutely the right choice at that time (right after Hurricane Katrina--brought needed hope and excitement to the team and the region), and everybody would have panned any other choice. Looking at pfr, the best player (per career AV) selected in the first round of that draft was Haloti Ngata at #12; second was Jay Cutler at #11. (Overall, the best per AV is the guy the Saints took in the 4th round--G Jahri Evans; 2nd is T Andrew Whitworth, who just beats out Ngata for #2). So considering that the only person in the top-10 that has more career AV is Mario Williams, I think that it was at the very least quite defensible, looking at how the players' careers went. (Also, they would not have chosen Cutler, as they had just signed Brees; and Ngata, while a very good player, would have been a reach at #2 overall.)
Also, regarding the Saints draft--that year they selected Bush, Roman Harper, Evans, Rob Ninkovich (who ended up as a decent starter for NE), Zach Strief, and Marques Colston. All 6 played for 10 years in the league and participated in over 130 games. It's probably the 3rd best in their history, behind 1981 (HOF Rickey Jackson, George Rogers, + 3 other guys who played 13 years for the Saints) and 2017 (Kamara, Lattimore, Ramczyk, and others).
#15 by Benjamin Robinson // Apr 08, 2020 - 5:30pm
Some good points in here. For example, the Browns got "penalized" in log adjusted draft surplus value for drafting Baker Mayfield at 1 (EDP rank of 4) and Denzel Ward at 4 (EDP of 10). At the same time, if you think the log adjusted draft surplus value over-values top picks then you can look at the linear charts where each pick is valued the same.
I tried to be careful with my language in the article by saying that no pick is "wrong" and that one person's "reach" can be seen as another person's market inefficiency/mis-valuation! I also am not privy to each team's draft board and which players they even have on their boards. One team might not have a guy I have a high EDP for on their board so that's not even an option for them!
Your Kiper story is right on. We're biased by what we can see and ESPN and NFL Network analysts are for better or worse the "faces of the draft" for a lot of people. Which is why I try to get a lot of voices/opinions involved!
#17 by Joseph // Apr 09, 2020 - 10:12am
This is where I would tweak the formula with something like "one standard deviation away" for credit or penalty.
In other words, EDP of 10, pick between 1-6--penalty. Same player, picked between say, 7-13, no credit nor penalty. Pick him after 13, credit. Obviously, as the picks get higher numbered, further down the draft, that "no credit nor penalty" band becomes larger. IIRC, there is very little difference after about pick 200 (including UDFA's). Anybody who contributes more than 1-2 years, outside of special teams, is good value.
Regarding the Browns specifically for 2018, I would have Mayfield as a minor penalty (just outside of that 1 SD band), and Ward as just a bit more. For me, mathematically, both would be considered slight reaches. Personally, I don't think the Browns made bad choices. Time will tell.
#18 by Benjamin Robinson // Apr 09, 2020 - 4:15pm
This is definitely something that I've been looking into especially if a pick falls within the 95% confidence interval of the estimates. I mainly use the log difference in order to account for the value of picks like you say. I'll look into this at some point post-draft but I'm open to new ways of thinking about draft surplus value, it just might take a while!
#13 by Dan // Apr 08, 2020 - 4:13am
One interesting thing that could be done with this data set is to estimate a probability distribution for when each player will be drafted. A lot of models that project how good a career a player will have take draft position as an input (including FO's models like Playmaker & QBASE), so if we had a probability distribution for when Jalen Hurts will be drafted then those models could estimate a distribution for his post-draft expected career value.
Another interesting thing that could be done is to predict what players will be available at a particular draft slot. One way to set this up would be to let someone enter their draft board - so if we're looking at the Patriots' first round pick I'd enter my ranking of my top 23 players - and then the mock draft based joint probability distribution could tell us how likely it is that each of those 23 players will still be available at pick 23, and how likely it is that he'll be the BPA according to my ranking. Or simpler versions of that where we just look at (e.g.) who will be the best wide receiver available at pick 23, given some ranking of the WRs.
There are various websites that do simulated mock drafts where the user(s) pick for one or more teams and the computer picks for the rest. Another thing that could be done with this data set is to make a more realistic one of those, where it automatically makes the picks based on the mock draft based probability distribution, except for the picks that the user makes.
If this is going to be used to grade teams' drafts, then it seems important to do some analyses to see if the mock draft info still matters in predicting the player's career value after controlling for his actual draft position. Do players who are drafted earlier than they were mocked have worse careers (relative to their draft position) than players who are drafted later than they were mocked? How does that scale with the gap between mock pick and actual pick, and are there other variables that help predict it (such as school size, injury history, or variability in mock pick)? It would be nice to be able to look at excess value in a way that had that sort of empirical basis.
#16 by Benjamin Robinson // Apr 08, 2020 - 5:33pm
Thanks for the great comment, Dan! I agree with you on all points. My last "offseason" was focused on automating data collection to the extent I possibly could and this "offseason" will focus a ton more on R&D and algorithms as well as looking more into the relationship between EDP and future performance. I don't think I'm going to wade my foot into the mock draft simulator market at this time but would be open minded to working together with someone who would be interested in such a cause!
#22 by GwillyGecko // Apr 11, 2020 - 1:16pm
While during most of the mock draft season people had Darnold going first, it seemed that the day of the draft everyone realized that Cleveland was taking Mayfield and adjusted their final mocks accordingly if they updated in the afternoon that day.