Writers of Pro Football Prospectus 2008

FOOTBALL OUTSIDERS SIMILARITY SCORES

Similarity scores were first introduced by Bill James to compare baseball players to other baseball players from the past. The general idea was to start at 1000 points and subtract for the various differences between two players; the players closest to 1000 were the most similar. The method is all over the great Baseball Reference website and, just as UNIVAC eventually led to your Palm Pilot, can be seen as the ancient predecessor to advanced baseball projection methods like Nate Silver's PECOTA.

It was only natural that the idea would spread to other sports as statistical analysis did. NBA analyst John Hollinger has created his own version to compare basketball players, and we have created our own version to compare football players.

Similarity scores have a lot of possible uses, and we aren't the only football analysts who use them. Doug Drinen of the website footballguys.com has his own system that is specific to comparing fantasy football performances. The major goal of our similarity scores, however, is to compare career progressions and try to determine when players have a higher chance of a breakout, a decline, or -- due to age or usage -- an injury. Therefore we compare not only numbers like attempts, yards, and touchdowns, but also age and experience. Often we are looking not for players who had similar seasons, but for players who had similar two- or three-year spans in their careers.

Similarity scores have some important weaknesses. The method compares standard statistics like yards and attempts, which are of course subject to all kinds of biases from strength of schedule to quality of receiver corps. The database for player comparison begins in 1978, the year the 16-game season began and passing rules were liberalized (a reasonable starting point to measure the "modern" NFL). We also project statistics for 1982 and 1987 as if the strikes did not happen, although we cannot correct for players who crossed the 1987 picket line to play more than 12 games.

The method is subject to change in the future; we want to tweak it and perfect it. But here is how it works right now:

QUARTERBACKS

  • Subtract 15 points for each year difference in age between the two players
  • Subtract 10 points for each year difference in career experience between the two players (based on the year the player came out of college, not necessarily his first year in the NFL)
  • Subtract an additional 15 points for each year of difference in career experience as a starting quarterback between the two players, based on the first year in which a quarterback started at least six games
  • Subtract 5 points times the difference in games played
  • Subtract 20 points times the difference in games started
  • Subtract 0.225 points for each difference of 1 pass attempt
  • Subtract 2.5 points for each difference of 10 passing yards
  • Subtract 1.6 points for each difference of 0.1% in completion percentage
  • Subtract 3 points times the difference in passing touchdowns
  • Subtract 2 points times the difference in interceptions
  • Subtract 2 points times the difference in sacks
  • Subtract 150 points times the difference in yards per pass attempt
  • Subtract 1 point for each difference of 4 rushing attempts
  • Subtract 1 point for each difference of 10 rushing yards
  • Subtract 1 point for each difference in rushing touchdowns
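
The quarterback list above can be sketched as a weighted sum of absolute differences subtracted from 1000. This is only an illustration under stated assumptions, not our production code: the dictionary keys are hypothetical names, the stat lines are made up, and the sketch assumes partial steps are pro-rated (a 7-yard gap costs 1.75 points) rather than rounded to whole increments.

```python
# Points subtracted per one unit of absolute difference, transcribed from the
# quarterback list above. Key names are hypothetical, chosen for this sketch.
QB_WEIGHTS = {
    "age": 15.0,            # per year of age
    "career_exp": 10.0,     # per year since coming out of college
    "starter_exp": 15.0,    # per year since first season with 6+ starts
    "games": 5.0,
    "starts": 20.0,
    "pass_att": 0.225,
    "pass_yds": 0.25,       # 2.5 points per 10 yards
    "comp_pct": 16.0,       # 1.6 points per 0.1 percentage point
    "pass_td": 3.0,
    "interceptions": 2.0,
    "sacks": 2.0,
    "yds_per_att": 150.0,
    "rush_att": 0.25,       # 1 point per 4 attempts
    "rush_yds": 0.1,        # 1 point per 10 yards
    "rush_td": 1.0,
}


def qb_similarity(a, b, weights=QB_WEIGHTS):
    """Start at 1000 and subtract the weighted absolute differences.

    Assumption: differences are pro-rated, not rounded to whole steps.
    """
    penalty = sum(w * abs(a[stat] - b[stat]) for stat, w in weights.items())
    return 1000.0 - penalty


# Made-up stat lines, not real players.
qb_a = {
    "age": 26, "career_exp": 4, "starter_exp": 3, "games": 16, "starts": 16,
    "pass_att": 500, "pass_yds": 3500, "comp_pct": 62.0, "pass_td": 22,
    "interceptions": 12, "sacks": 30, "yds_per_att": 7.0,
    "rush_att": 40, "rush_yds": 150, "rush_td": 1,
}
# Same line, but one year older and with two more passing touchdowns.
qb_b = dict(qb_a, age=27, pass_td=24)

print(qb_similarity(qb_a, qb_b))  # 1000 - 15*1 - 3*2 = 979.0
```

Keeping the weights in one table makes it easy to see which stats dominate: a gap of a single game started costs as much as a gap of 80 passing yards.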

RUNNING BACKS

  • Subtract 15 points for each year difference in age between the two players
  • Subtract 10 points for each year difference in career experience between the two players
  • Subtract 5 points for each year difference in NFL experience between the two players
  • Subtract 15 points times the difference in games played
  • Subtract 5 points times the difference in games started
  • Subtract 6 points for each difference of 5 carries
  • Subtract 1 point for each difference of 5 rushing yards
  • Subtract 10 points times the difference in rushing touchdowns
  • Subtract 100 points times the difference in yards per carry
  • Subtract 1 point times the difference in receptions
  • Subtract 1.5 points for each difference of 10 receiving yards
  • Subtract 3 points times the difference in receiving touchdowns
  • Subtract 3 points for each inch difference in height
  • Subtract 10 points times the difference between the two players in Body Mass Index
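
The running back list adds height and Body Mass Index to the comparison. A minimal sketch follows, assuming BMI is computed with the standard formula from height in inches and weight in pounds (the text does not say how it is derived), that differences are pro-rated, and using hypothetical key names and made-up players.

```python
def bmi(height_in, weight_lb):
    """Body Mass Index from inches and pounds (standard imperial formula)."""
    return 703.0 * weight_lb / height_in ** 2


# Points per one unit of absolute difference, from the running back list above.
RB_WEIGHTS = {
    "age": 15.0,
    "career_exp": 10.0,
    "nfl_exp": 5.0,
    "games": 15.0,
    "starts": 5.0,
    "carries": 1.2,          # 6 points per 5 carries
    "rush_yds": 0.2,         # 1 point per 5 yards
    "rush_td": 10.0,
    "yds_per_carry": 100.0,
    "receptions": 1.0,
    "rec_yds": 0.15,         # 1.5 points per 10 yards
    "rec_td": 3.0,
    "height_in": 3.0,
    "bmi": 10.0,
}


def rb_similarity(a, b):
    """Derive BMI for both players, then subtract weighted differences from 1000."""
    a = dict(a, bmi=bmi(a["height_in"], a["weight_lb"]))
    b = dict(b, bmi=bmi(b["height_in"], b["weight_lb"]))
    penalty = sum(w * abs(a[stat] - b[stat]) for stat, w in RB_WEIGHTS.items())
    return 1000.0 - penalty


# Made-up players: identical seasons, but the second back is 10 pounds heavier.
rb_a = {
    "age": 24, "career_exp": 3, "nfl_exp": 3, "games": 16, "starts": 14,
    "carries": 280, "rush_yds": 1200, "rush_td": 9, "yds_per_carry": 4.3,
    "receptions": 35, "rec_yds": 280, "rec_td": 1,
    "height_in": 70, "weight_lb": 210,
}
rb_b = dict(rb_a, weight_lb=220)

print(round(rb_similarity(rb_a, rb_b), 1))
```

Weight enters only through BMI here, so two backs of different heights can carry the same weight and still be penalized on both the height and BMI lines.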

WIDE RECEIVERS and TIGHT ENDS

  • Subtract 15 points for each year difference in age between the two players
  • Subtract 10 points for each year difference in career experience between the two players
  • Subtract 5 points for each year difference in NFL experience between the two players
  • Subtract 3 points times the difference in receptions
  • Subtract 1 point for each difference of 2 receiving yards
  • Subtract 8 points times the difference in receiving touchdowns
  • Subtract 12 points times the difference in yards per catch
  • Subtract 1 point times the difference in carries (wide receivers only)
  • Subtract 3 points for each inch difference in height
  • Subtract 5 points times the difference between the two players in Body Mass Index
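
Once a score can be computed for any pair, finding comparables is just a matter of ranking a pool of past seasons by how close each one comes to 1000. The sketch below applies the receiver weights above and skips the carries line for tight ends. The function names, key names, and the tiny pool are all hypothetical, BMI is assumed to be supplied as a precomputed field, and differences are assumed to be pro-rated.

```python
# Points per one unit of absolute difference, from the receiver list above.
REC_WEIGHTS = {
    "age": 15.0,
    "career_exp": 10.0,
    "nfl_exp": 5.0,
    "receptions": 3.0,
    "rec_yds": 0.5,          # 1 point per 2 yards
    "rec_td": 8.0,
    "yds_per_catch": 12.0,
    "carries": 1.0,          # wide receivers only
    "height_in": 3.0,
    "bmi": 5.0,
}


def receiver_similarity(a, b, tight_end=False):
    """Subtract weighted differences from 1000; ignore carries for tight ends."""
    penalty = 0.0
    for stat, weight in REC_WEIGHTS.items():
        if stat == "carries" and tight_end:
            continue
        penalty += weight * abs(a[stat] - b[stat])
    return 1000.0 - penalty


def most_similar(target, pool, tight_end=False, top_n=3):
    """Return the top_n (score, label) pairs from pool, best match first."""
    scored = [
        (receiver_similarity(target, season, tight_end), label)
        for label, season in pool.items()
    ]
    scored.sort(reverse=True)
    return scored[:top_n]


# Made-up target season and comparison pool.
target = {
    "age": 25, "career_exp": 3, "nfl_exp": 3, "receptions": 70,
    "rec_yds": 1000, "rec_td": 7, "yds_per_catch": 14.3, "carries": 0,
    "height_in": 73, "bmi": 26.5,
}
pool = {
    "Receiver X": dict(target, receptions=72),   # two more catches
    "Receiver Y": dict(target, rec_td=12),       # five more touchdowns
    "Receiver Z": dict(target, age=28),          # three years older
}

for score, label in most_similar(target, pool):
    print(label, score)
```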

DEFENSIVE SIMILARITY SCORES

The defensive similarity scores system is a bit too complicated to explain fully. The coefficients are different for each position. The similarity scores measure a number of stats, basically split into three categories:

  • 1) Biographical facts, such as age, experience, height, weight, and BMI.
  • 2) Standard stats, such as sacks and interceptions.
  • 3) FO advanced individual defense stats, such as Stop Rate and Defeats. These are stats from play-by-play (PBP) data, not game charting, so they do not include defensive coverage metrics for defensive backs.

FOR ALL SIMILARITY SCORES

When measuring a two-year or three-year span, we use a mathematical method called the "harmonic mean." The harmonic mean is higher when the items being compared are closer together. We count the most recent year twice, then add either the previous year (for a two-year span) or the previous two years (for a three-year span). For example, over a two-year span, Player B with similarity scores of 900 and 900 will come out as more similar to Player A than will Player C with similarity scores of 950 and 850.
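
The multi-year combination can be sketched as follows. The text does not spell out exactly how counting the most recent year twice is combined with the harmonic mean, so this sketch assumes the most recent single-season score is simply entered twice before averaging. The function names are hypothetical, and all scores are assumed to be positive.

```python
def harmonic_mean(values):
    """Harmonic mean: the count divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / v for v in values)


def span_score(yearly_scores, recent_weight=2):
    """Combine single-season similarity scores over a two- or three-year span.

    yearly_scores is ordered most recent season first. Assumption: the most
    recent score is repeated recent_weight times before taking the mean.
    """
    weighted = [yearly_scores[0]] * recent_weight + list(yearly_scores[1:])
    return harmonic_mean(weighted)


# The example from the text, using the plain harmonic mean of the two scores.
print(round(harmonic_mean([900, 900]), 1))  # 900.0
print(round(harmonic_mean([950, 850]), 1))  # 897.2

# A three-year span with the most recent season (made-up scores) counted twice.
print(round(span_score([920, 880, 900]), 1))
```

The plain harmonic mean reproduces the example: two scores of 900 beat a 950 and an 850 even though both pairs have the same ordinary average. Under the double-counting assumption, Player C's result would also depend on which of his two seasons is the more recent one.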

Offensive player similarity scores are based on stats back to 1978. Defensive player similarity scores are based on stats back to 1997.