Raise your hand if you’ve read an article this offseason citing a team’s points per game, yards per game, or NCAA ranks using those measures. The vast majority of reviews and previews, right? This is all kinds of bad, and needs to change.

Do not accept these numbers at face value. Take those rankings using points per game and total yards and use them for kindling. These are bad ways to rank offenses and defenses, and most writers know it yet continue using them anyway, thinking that anything more complex will be too much for most readers (you’re smarter than that!). With limited sample sizes and so much variance in pace, possessions, and styles in college football, these counting stats become more misleading by the year.

Flaws of Points per Game

Pretend for a minute you’re a head coach in an insane situation – your family is held hostage by the mob, who is placing an enormous bet on your next game. They want you to score 80 points, or else (their betting systems are weird, you don’t ask questions). You don’t care about perceptions or fear it will be obvious something is up, all you care about is getting your family back safely, so you’ll do whatever it takes to get as many points as you need.

In that scenario, what would you do? I’d start by playing the fastest hurry-up possible – the clock is the enemy when I need to hit a ridiculous number of points in just 60 minutes of game time. It’s time to go for it on every 4th down, never punt, and onside kick every possession (and some of this aggression would be wiser than conventional football wisdom indicates, but that’s a discussion for another day). I’m also praying for my opponent to play as fast as possible – I want to score 80 points, I need the ball! If you’ve played enough hours of Madden or NCAA you’ve attempted this at some point – you decide to try to break every single-game record possible, and you don’t care if you give up 50 points in the process (or even lose).

The above scenario is obvious hyperbole, but some coaches play much closer to that style than others. These differences in tempo and aggression create stark contrasts that make comparisons of points and yards per game apples to oranges. Let’s use the Ohio State and Tulsa offenses in 2015 as examples:

Team | Points / Game | Points / Drive | S&P+ Rank | FEI Rank
Ohio State | 35.7 (28th) | 2.75 (22nd) | 14th | 10th
Tulsa | 37.2 (21st) | 2.39 (45th) | 51st | 45th

Using our traditional counting stat, points per game, Tulsa’s offense is a shade better than Ohio State’s. Already this seems like faulty logic based on the conclusion, but let’s roll with it for now. By making one simple change to account for pace differences, using points per drive instead of per game, we start to see separation and an indication that the Buckeyes are probably better. That’s without even incorporating strength of schedule, which we know will ding the Golden Hurricane.

Tulsa ran nearly 15 more plays per game thanks to an extreme no-huddle hurry-up (2nd fastest nationally) and also played a ton of high-possession games thanks to a woeful defense. This combination vastly inflates their offensive numbers on a points-per-game basis. The Buckeyes, in contrast, ran their offense at an average pace (73rd in FBS), and as a result had 20% fewer possessions over the course of the season.
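To make the pace effect concrete, here is a rough sketch that backs out how many drives per game each offense would need for its points per game and points per drive to line up. It uses only the figures from the table above; the function name is made up for illustration, and because published points-per-drive numbers may be filtered differently (garbage time, end-of-half drives), treat the result as a ballpark check and not an official possession count.

```python
def implied_drives_per_game(points_per_game: float, points_per_drive: float) -> float:
    """Back out drives per game from the two scoring rates (illustrative helper)."""
    return points_per_game / points_per_drive


# 2015 figures quoted in the table above.
teams = {
    "Ohio State": {"ppg": 35.7, "ppd": 2.75},
    "Tulsa": {"ppg": 37.2, "ppd": 2.39},
}

implied = {
    name: implied_drives_per_game(stats["ppg"], stats["ppd"])
    for name, stats in teams.items()
}

for name, drives in implied.items():
    print(f"{name}: ~{drives:.1f} implied drives per game")

# Share of Tulsa's implied drives that Ohio State did not get.
ratio = implied["Ohio State"] / implied["Tulsa"]
print(f"Ohio State: about {1 - ratio:.0%} fewer implied drives than Tulsa")
```

On these inputs the implied gap comes out a little under the 20% quoted above, which is about what you would expect if the underlying drive counts are filtered slightly differently. Either way, the direction is the point: Tulsa needed a lot more possessions to post a slightly higher points-per-game number.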

That’s significant, but you’d never know any of it looking at points per game. If for icing on the cake you were to add a few filters that S&P+ and FEI utilize, like taking out garbage time and incorporating strength of opponents to adjust for the Big Ten vs. AAC, now you’re working with data that are much more indicative of actual quality and strength.

Flaws of Yards Per Game

The same context issues apply to total yards, with an even more exaggerated impact when you aren’t accounting for garbage time. Teams that are consistently playing from behind appear like they have strong passing offenses, and defenses easing off the gas pedal with big leads see their numbers diluted. And counting stats used to justify the “run the ball to win” argument get put on steroids by great teams building untouchable early leads and then running out the clock for huge swaths of game time.

Yes, being good at running the ball is better than not being able to. But correlation isn’t causation, and it’s then easy to use this as evidence to support the theory that “great teams win because they run the ball / stop the run” (with a strong implication that the same idea doesn’t apply, or at least not as strongly, with passing). These counting run stats become a self-fulfilling prophecy for any team that’s consistently winning, especially by large margins.

Why do almost all great teams have those outstanding rushing yard totals and limit their opponents? Because they’re better than virtually everyone else to start with, and that means they stomp a lot of teams, so they get to run the ball the entire second half. Their objective changes – it’s no longer as important to score and increase the lead as it is to run out the clock (limiting time for a potential comeback or threat) to finish the game. Likewise, their opponents are then battling the clock and must throw to have any chance of coming back.

Let’s look at an example of these implications for counting stats in the run and pass categories – Alabama has been the paragon of defensive excellence for years under Nick Saban. I don’t think it’s controversial to claim that the Tide’s pass defenses by any objective or subjective measure should consistently be close to the best in the country, right?

Well, look what happens when you use passing yards per game as the metric to assess if this is true or not – Bama ranked 24th in 2016, 30th in 2015, 59th in 2014, and 11th in 2013. Maybe next year Saban will get his act together and his secondary can crack the top ten.

On the opposite side of the ball, 2016 Baylor is a decent example of empty rushing yards. The Bears finished 14th in rushing yards per game, on paper a dangerous running attack. A closer look at the schedule shows huge rushing totals built up in blowouts over Northwestern State, SMU, Rice, Kansas, and a close win over defense-optional (105th in S&P+ Rushing Defense) Iowa State.

In those five games, Baylor averaged 312 yards per game, which would put them 5th behind only run-based option offenses if applied to the season. In the remaining seven games against conference opponents, the Bears averaged 206 yards per game, a number that would put them 36th nationally. Factor in weak Big 12 defenses and garbage time, and S&P+ ranks them as the 52nd ranked rushing offense – an average rushing attack but nothing special. I’ll confess to not watching a lot of Baylor last year, but I think film would back up the idea that their running game was much closer to S&P+’s ranking than how yards per game would make it appear.
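A quick way to see how much those five games prop up the season number is to recombine the two splits as a game-weighted average. This is a minimal sketch that uses only the two splits quoted above (five games at 312 yards, seven at 206), so it covers just those twelve games and will not exactly match an official season figure.

```python
# (number of games, rushing yards per game) for each split quoted above.
splits = [
    (5, 312.0),  # the five games against weaker run defenses
    (7, 206.0),  # the remaining seven games
]

total_games = sum(games for games, _ in splits)
total_yards = sum(games * yards_per_game for games, yards_per_game in splits)

# Game-weighted average across both splits.
combined_ypg = total_yards / total_games
inflation = combined_ypg - 206.0

print(f"Combined: {combined_ypg:.1f} rushing yards per game over {total_games} games")
print(f"That is {inflation:.1f} yards per game above the seven-game split")
```

Five games are enough to lift the blended average by more than 40 yards per game over what the offense managed in the other seven.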

We intuitively judge things in the same manner advanced stats do

The funny thing about critiques of advanced stats is that in real life we regularly employ the same type of logic that these metrics use. We frequently question numbers at their face value to figure out if they’re misleading or not when we don’t agree with them.

Whether it’s rec league basketball or fantasy football, you’re forced to ask more detailed questions when you try to assess the quality of a team or player based on small sample sizes. Who have they played – strong teams or the weakest? How much have they won and lost by? Did one player score the most points simply because he shot it almost every time he touched the ball but wasn’t actually that good? Was a final score misleading because the other team let up and put all their bench players in at the end? Are the results of some games more indicative than others?

But once the tag “advanced stats” or “sabermetrics” gets applied, some people tune it out instantly as nerd math, bloggers in basements, etc. Many times it’s because this evidence is unfamiliar, or more commonly disagrees with their initial opinion. And while advanced stats can definitely be cherry-picked out of context to support a questionable conclusion, almost all of them have some kind of overarching ranking (S&P+, FEI, Massey-Peabody, Sagarin) of offense and defense that rolls up more specific stats into measures for meaningful comparison.

A Final Example

Let’s take an example near and dear to our hearts with the 2016 Irish defense. A quick look at passing defense reveals … a top 25 passing defense! It’s true – at 196.4 pass yards allowed per game, Notre Dame ranks 21st nationally, just ahead of Alabama and LSU.

You can already guess some of what’s behind this – having Army and Navy comprise 1/6 of the schedule will do wonders for your traditional passing stats. Even a simple move away from counting stats, going from yards per game to yards per attempt, is illuminating. It’s super simple to adjust for how often opponents attempted passes against you, and what do you know? The Irish defense yielded 7.5 yards per passing attempt, which drops their national rank all the way to 82nd.
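The per-attempt adjustment is simple enough to sketch in a few lines. The Notre Dame figures below are the two quoted above; the helper name and the two comparison defenses are hypothetical, invented purely to show how a schedule full of run-heavy opponents can flatter a yards-per-game number.

```python
def yards_per_attempt(pass_yards: float, pass_attempts: float) -> float:
    """Passing yards allowed per attempt faced (illustrative helper)."""
    if pass_attempts <= 0:
        raise ValueError("pass_attempts must be positive")
    return pass_yards / pass_attempts


# Figures quoted above for the 2016 Notre Dame defense.
nd_yards_per_game = 196.4
nd_yards_per_attempt = 7.5

# Attempts faced per game implied by those two numbers.
nd_implied_attempts = nd_yards_per_game / nd_yards_per_attempt
print(f"Implied pass attempts faced per game: {nd_implied_attempts:.1f}")

# Hypothetical per-game lines for two made-up defenses (not real teams).
faces_few_passes = yards_per_attempt(pass_yards=180.0, pass_attempts=20)
faces_many_passes = yards_per_attempt(pass_yards=240.0, pass_attempts=40)

print(f"Defense facing 20 attempts: {faces_few_passes:.1f} yards per attempt")
print(f"Defense facing 40 attempts: {faces_many_passes:.1f} yards per attempt")
```

In the made-up comparison, the defense allowing fewer yards per game is actually the one getting torched on a per-throw basis, which is the same pattern the Irish numbers show.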

Now you can incorporate even more detailed adjustments – if you take a closer look, the Notre Dame defense didn’t face many good passing offenses. Under Brian VanGorder the secondary made Shane Buechele, Tyler O’Connor, and Duke’s Daniel Jones look pretty awesome! And when you layer in even more detail through a system like S&P+ – which incorporates strength of opponent, filters out garbage time, and takes a more detailed look at efficiency and explosiveness – ND finishes at 92nd in Passing S&P+ Defense. That feels a helluva lot more accurate than 21st, doesn’t it?

And that’s the point of all of this – advanced stats aren’t intended to be the final, perfect word on anything. The intent isn’t to replace what you see watching a game, or try to claim that win-loss record isn’t the most important thing as long as your F+ rating is strong, or manipulate numbers to validate an opinion. The value is in using these numbers to paint a more accurate picture of quality than most numbers that have traditionally been used. Like any statistics, advanced stats have their blind spots and weaknesses and need to be paired with what’s observed on the field.

S&P+, FEI and other systems are far from perfect, but they go a long way in helping validate (or invalidate) theories about what’s working and what’s not for different teams. They can also help prove or disprove theories about strengths and weaknesses of a team. If you think Brian VanGorder’s defenses weren’t fundamentally sound, IsoPPP (an explosiveness measure where BVG’s defenses were consistently close to the bottom of all of FBS) backs up that point. If you think depth and conditioning were issues in 2016, the fact that both the offense and defense had their worst S&P+ rankings in the fourth quarter would be a good supporting point.

But if you think that the offense is pathetic running the ball in obvious running situations, advanced stats indicate there may be some confirmation bias influencing your memory. In 2016 the ND offense ranked 16th in power run success rate, which measures how often a first down was gained on run attempts on 3rd/4th and 2 or less (goal-to-go runs on any down inside the opponent two are also included).

Pair the descriptive value of these advanced stats with their predictive value, and they do a much better job than traditional stats of both describing why a team won or lost past games and who is more likely to win future contests. So next time you see someone pulling counting stats from NCAA.com or ESPN, take a look at the advanced stats, or even just adjusted per-play or per-drive numbers, to see if the evidence cited really tells the whole story.