SCIENCE! - Brendon Thorne
Sports Illustrated fails Stats 101 and proceeds to elevate recruiting rankings far, far beyond what they deserve.
Going into signing day, Sports Illustrated dropped a statistical bomb on all of us with their lede college football story on Monday. It's a triumph of post-Nate-Silver data worship. Based on some iffy statistical comparisons, writer Ed Feng determines that Rivals recruiting rankings are accurate at determining whether teams are any good; kinda, sorta, maybe. Or at least, the article finds that they're nearly-pretty-close-to-as-good-as preseason rankings, which, as we all know, are basically science.
In an exquisite bout of foreshadowing, it's not exactly clear what method was used to arrive at the conclusions of the article. As best I can tell, the piece combines a weighted average of Rivals recruiting rankings with the writer's own ranking formula and then compares the result with the AP preseason Top-25. Then, it pits the two against each other based on their ability to predict the final AP Top-25.
To determine how recruiting rankings impact team performance, I ran a regression model on Rivals' recruiting ratings over a four-year period in order to predict each team's performance for the subsequent season.
To measure team performance for a given season, I used the team rating from The Power Rank algorithm . . . The team with the higher rating won 62.8 percent of bowl games over the last 11 years (219 out of 349), a higher mark than the 62.2 percent picked by the Vegas line.
The regression model on recruiting data assigns each team a rating that outputs an expected margin of victory against an average team, as alluded to above. Sorting schools by this rating creates a ranking system that can be compared with the preseason AP Poll.
I don't know how THE POWER RANK ALGORITHM factors in here, besides being MORE MATH THAN YOU WILL EVER BE. Nor can I figure out how they determined said algorithm, and the article never really explains. However, if this POWER RANK has as big a role as he obliquely suggests, then it completely invalidates any real comparison between Rivals and the preseason AP. You aren't representing the predictions of Rivals if you're inserting your own analysis into the mix. In order for the analysis to be remotely meaningful it has to truly represent the recruiting rankings as they stand, without contamination by data outside of the rankings themselves.
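For what it's worth, the one hard number in that excerpt is easy to sanity-check. Here's a rough sketch that uses only the figures quoted above (219 of 349 bowl games, 62.2 percent for Vegas). Treating every bowl game as an independent coin flip is my own simplifying assumption, not anything the article states, and the variable names are made up for illustration.

```python
import math

# Figures quoted in the excerpt above.
wins, games = 219, 349      # higher-rated team won 219 of 349 bowl games
vegas_rate = 0.622          # share of games the Vegas line picked correctly

power_rank_rate = wins / games

# Assumption (mine, not the article's): each game is an independent trial,
# so the textbook binomial standard error for a proportion applies.
std_error = math.sqrt(power_rank_rate * (1 - power_rank_rate) / games)

gap = power_rank_rate - vegas_rate   # edge over the Vegas line
gap_in_games = gap * games           # the same edge, counted in games

print(f"Power Rank hit rate: {power_rank_rate:.1%}")
print(f"Standard error:      {std_error:.1%}")
print(f"Gap over Vegas:      {gap:.1%} (about {gap_in_games:.0f} games)")
```

Under that assumption, the edge over Vegas works out to roughly two games out of 349, which sits well inside one standard error. It may be real, but these numbers alone can't tell it apart from noise.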
And even if this Power Rank weren't involved, he still resorts to overt cherry-picking to make his point. To wit:
[Rivals]. . . routinely underestimates the rankings of Boise State and TCU, smaller programs that have repeatedly busted the BCS over the last four years. During that span, Rivals never ranked Boise State higher than 61st or TCU higher than 34th. The preseason AP Poll has been more accurate seven out of eight times on these two schools, with TCU's 2012 season serving as the only exception.
Alright, so this particular claim is true. There's clearly something out there that consistently leads to a mismatch between recruiting rankings and non-AQ on-field performance. It's well documented, and we see it with mid-level members of the BCS conferences as well: Wisconsin, Baylor, Stanford, etc.
Let's see where he goes with this!
Excluding these two teams from the data set, the Rivals' model performed as well or better than the preseason AP Poll 45 of 92 times.
This is no big surprise; if you exclude all of the teams that invalidate your argument, it starts to look a helluva lot more valid. Ignore the shit in a shit-sandwich and it looks like a nice, tasty, regular sandwich. In the end, though, it's still a shit sandwich. The article stretches this carefully manicured, meaningless analysis past its breaking point, and after setting up this tremendously flawed comparison, concludes by saying:
In sum: Recruiting rankings are more than a distraction used to pass the time until spring practice. Despite their extremely limited scope, they're nearly as accurate -- and in some cases, more so -- as the preseason AP Poll in predicting future on-field success.
But look; recruiting rankings are just a distraction. That's okay. They don't need to be some profound window into the underlying, 'Capital T' Truth at the heart of College Football, like some sports fan Rosetta Stone. They're relatively good at what they are, which is a prediction of how well individual players will do at the next level. But as perfectly illustrated by SBNation's own Bill Connelly, when it comes to predicting the fate of entire teams it's simply foolish to try and build a model out of the Rivals top-100.
All of which is not to say that this final point is without merit. Certainly, recruiting is a fundamental part of success, not just in sports but in all facets of life. There's no denying that. Without a solid foundation of talent, even the best coaches and programs founder in mediocrity. But this piece doesn't show that. Instead, it's basically lazy data at its finest. Throw a bunch of numbers at the wall, and start scraping away until they say what you want them to.
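To put rough numbers on the scraping, here's a back-of-the-envelope sketch using only the figures SI itself printed: 45 of 92 with Boise State and TCU removed, and the AP Poll being more accurate on those two schools seven times out of eight. That the 92 kept cases plus the 8 dropped ones make up the entire data set is my assumption; the article never spells it out. The `share` helper is just something I made up for the illustration.

```python
# Figures quoted from the SI piece.
kept_hits, kept_total = 45, 92       # Rivals model as good or better, two schools excluded
dropped_hits, dropped_total = 1, 8   # AP was more accurate 7 of 8 times on Boise State and TCU

# Assumption (mine): the 92 kept cases plus the 8 dropped ones are the whole data set.
all_hits = kept_hits + dropped_hits
all_total = kept_total + dropped_total


def share(hits, total):
    """Fraction of cases where the Rivals model matched or beat the preseason AP Poll."""
    return hits / total


print(f"Without Boise State and TCU: {kept_hits}/{kept_total} = {share(kept_hits, kept_total):.1%}")
print(f"With them put back:          {all_hits}/{all_total} = {share(all_hits, all_total):.1%}")
```

Even the manicured version comes in under 50 percent, and if that assumption holds, putting the two schools back drops it to 46 of 100.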
With signing day upon us, it's tempting to overshoot the mark on questions of recruiting. In that sense I suppose it's forgivable to get excited by a model which almost-but-not-quite beats a bunch of dudes guessing based on whose jersey is most recognizable. Still, that's a remarkably shitty standard to meet, and Sports Illustrated should know better.