Commentary: How the Goose Egg Failed to Change Baseball (And How it Still Can)

Photo by Marcelo Cidrack

It’s no secret to many in the baseball community that the Save is one of the most problematic statistics in the game. Among other offenses, it has drastically overvalued the role of the closer, led to frequent misuse of top relievers, helped to consistently undervalue non-closers, and caused a fair number of losses due to poor bullpen strategy. But this article is not going to be me complaining about the Save; there are plenty of other places you can go for that. Instead, I’d like to talk about why we still use it and why it still controls the narrative for relief pitchers, even with much better alternatives on the table.

 

On April 17, 2017, Nate Silver of FiveThirtyEight presented his solution to the save problem, a new statistic he dubbed the Goose Egg. The idea is that a relief pitcher receives a Goose Egg any time they throw a scoreless inning in the seventh or later with their team either tied or ahead by no more than two runs (you can find a full description here). Unlike the save,

 

  • Multiple pitchers can record an Egg in the same game

  • Pitchers can record multiple Eggs in the same game

  • Pitchers in tie ballgames are recognized for keeping the game tied

  • There is no room for any runs to be given up

  • Eggs can be recorded in the seventh or eighth innings
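
For readers who like to see a rule written out, here is a minimal sketch of the definition as summarized above. It covers only the simplified version given in this article (scoreless inning, seventh or later, tied or ahead by no more than two runs), not every condition in Silver's full description. The `ReliefInning` record, its field names, and the sample game are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReliefInning:
    """One full inning thrown by a reliever (hypothetical record layout)."""
    pitcher: str
    inning: int         # inning number, starting at 1
    lead_at_start: int  # pitcher's team's runs minus opponent's runs when the inning began
    runs_allowed: int   # runs scored against the pitcher during the inning


def is_goose_egg(inn: ReliefInning) -> bool:
    """Simplified rule from this article: scoreless, 7th or later, tied or up by at most 2."""
    return inn.inning >= 7 and 0 <= inn.lead_at_start <= 2 and inn.runs_allowed == 0


def count_goose_eggs(innings: List[ReliefInning]) -> Dict[str, int]:
    """Tally Goose Eggs per pitcher across a list of relief innings."""
    totals: Dict[str, int] = {}
    for inn in innings:
        if is_goose_egg(inn):
            totals[inn.pitcher] = totals.get(inn.pitcher, 0) + 1
    return totals


# Invented example game, not real data.
game = [
    ReliefInning("Setup man", 7, 1, 0),  # scoreless with a one-run lead: Egg
    ReliefInning("Setup man", 8, 1, 0),  # same pitcher earns a second Egg
    ReliefInning("Closer", 9, 1, 1),     # allowed the tying run: no Egg
    ReliefInning("Long man", 10, 0, 0),  # kept a tie game tied: Egg
]

print(count_goose_eggs(game))  # {'Setup man': 2, 'Long man': 1}
```

Note how the sample game hits several of the bullet points at once: two different pitchers record Eggs, one pitcher records two, and a pitcher in a tie game gets credit for keeping it tied.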

 

These differences give the Egg a lot of advantages over the save: it de-emphasizes the role of the “closer”, provides a better comparison point for all relief pitchers, and more accurately recognizes the entire bullpen for its contributions. It is, without a doubt, a much better statistic for evaluating relief pitcher performance, and it had the potential to completely change how managers utilize their bullpens. So the question is, why didn’t anyone ever seriously use it?

 

It’s been over three years since Silver put out his original article on the subject, and despite being a high-quality stat introduced on a widely visited website by one of the most famous data scientists in the world, the Egg has barely made a dent in the baseball community. Aside from a mention on MLB Now and a few articles spread across 2017, talk about the statistic has largely disappeared, and the Egg seems far from reaching the baseball mainstream.

 

Now, this doesn’t mean that the Egg is a bad statistic, but it does show that there is sometimes a difference between a great statistic and a useful one. While the Egg is a much better tool for determining the utility of relief pitchers and solves many of the problems brought about by the save, it is just not useful in its current iteration. This has nothing to do with the actual design of the statistic; rather, it’s about the name that Silver chose for it. Any metric is only as useful as it is understandable, and that understanding starts with an accessible name. For a statistic to truly be useful and impact the game around it, people must be able to figure out what it’s measuring from the name alone. Take WAR for example:

 

With the way we talk about it now, it’s hard to remember that most variations of WAR are barely a decade old. Today, almost no discussion of player greatness goes without mentioning it: Rookie of the Year, MVP, Hall of Fame, you name it. The way that WAR has come to dominate our baseball discourse in such a short time speaks not only to how great a stat it is, but to its usefulness as well. Of course, similar statistics that measured a player’s total performance – Linear Weights, for example – had been around since the 1980s, so what was it about WAR that allowed it to break into the mainstream? What made it so useful?

 

In a word, comprehensibility. It didn’t matter how tedious the calculations were to reach the number, and neither did the minutiae of figuring out what a “replacement-level player” looked like. What mattered to the public was that you could bring up WAR in a conversation with casual fans, and they would be able to understand what the numbers represented without a lengthy explanation. It was all there in the name: Wins Above Replacement.

 

Looking back through baseball history, we see that many of the statistics fans care about – the ones we see on the back of baseball cards and on TV – have passed this “Name Test”. These include old-school staples like Batting Average and Runs Batted In, as well as more recent adoptions like WAR and Defensive Runs Saved. Under this logic, it’s easy to see how the Save was so effective in gaining its monopoly on relief pitcher analysis.

 

The usefulness of the save comes from its simple definition and its ability to create a narrative. You don’t have to know the specifics of the stat or how it is recorded to figure out that if someone “saved” a game, they probably had a big role in the team’s win. With this logic, it’s easy to see a pitcher who records a lot of Saves as a very valuable ballplayer, one whose value to the team comes from being the guy “saving” the day at the end of the game. This was how the baseball community came to base its ideas of bullpen usage around the Save, creating a system of dedicated roles that has tied the hands of managers for almost forty years now.

 

In contrast, the Goose Egg faces a major identity problem. When I looked up the term on Google, I found a lot of things. I learned that one goose egg is roughly equal to the size of two large chicken eggs; I saw a lot of pages describing the goose’s reproductive system; and I figured out that there are many recipes that you can make using goose eggs. But unfortunately, I did not learn a lot about baseball. In fact, Silver’s original article on the Egg is buried about halfway down the second page of results.

 

Most people would not associate the term with legendary reliever Goose Gossage, for whom the statistic is named. Putting aside the irony of a sabermetrician naming a stat after him (seriously, just look up “Goose Gossage sabermetrics”), there are probably quite a few casual fans of baseball today who don’t actually know who Goose Gossage is. Besides that, the name doesn’t give us any insight as to what the statistic actually represents. For all the casual observer knows, it could be measuring how many lumps on the head a player received during a season.

 

Thus, if the Goose Egg is to reach its potential as a meaningful statistic, it must be rebranded to make it more recognizable and useful to the baseball community. Luckily for the Egg, there is an obvious replacement name that is uniquely suited for this purpose: the Quality Inning (QI).

 

Unlike the original name, this one lets people figure out the gist of what a Quality Inning is without needing to wade through online recipes or take a refresher course in baseball history. The fact that the term can only be used in a baseball context and is clearer about what it’s measuring will make it much easier for analysts, TV hosts, and fans to use it when discussing relief pitchers. Plus, the name creates a nice parallel with the Quality Start. What the QS has been for starters, the QI can be for relievers, which could completely change the way we look at relief appearances from now on.

 

A perfect example of how this could happen can be found in one of the greatest moments in the 2018 postseason: Nathan Eovaldi’s performance in Game Three of the World Series. After taking the ball in the 13th inning, Eovaldi gave up only 3 hits and 1 earned run over the course of 6 innings and 97 pitches. During that time, he shut down a dangerous Dodger lineup while simultaneously keeping the Red Sox in the game and saving the only two pitchers Boston had left. However, when looking at the box score after the game, the only figure that seemed to define his performance was the L next to his name. Why? Because the stats we currently use are not enough to capture just how special his performance was on paper. 

 

During his six innings of work, he recorded four Quality Innings, a feat that has only happened a handful of times in MLB history and has grown more uncommon in the modern age of baseball. With his performance, Eovaldi joined names like: 

 

  • Orel Hershiser, who recorded seven in a game in 1989

  • Juan Agosto, who did it twice—recording four in the same game as Hershiser and seven in a game in 1984

  • Yusmeiro Petit, one of the only other pitchers to do it in the playoffs, who recorded six in Game Two of the 2014 NLDS

  • Steven Wright, the most recent Boston pitcher to collect three in a game, who did so against the Yankees in 2015 (The starting pitcher for New York that day? Nathan Eovaldi).

 

An outing like Eovaldi’s requires not only the perfect circumstances (you can only record up to three quality innings in a nine-inning game, since only the seventh, eighth, and ninth are eligible), but also flawless execution over a number of high-leverage innings. Throwing a four-QI performance is hard enough on its own, but to do so on the biggest stage in the world is a testament to Eovaldi’s grit, determination, and incredible stuff. Putting his outing in terms of the QI allows us to appreciate all of this in its proper context, despite the fact that he was given the loss.

 

This is why having a measurement like the QI is so important. Relievers have always been the most undervalued and underappreciated parts of a ballclub, and when you look into the numbers, it’s easy to see why. Starters have wins, closers have saves, and hitters have basically every offensive metric known to man; however, relief pitchers don’t have a statistic they can call their own, or at least one that puts their success in its proper context (some may argue for the hold, but it allows too much room for error, gives no credit for long outings, and doesn’t take into account tie ballgames). Because of this, most young relievers are viewed as either future starters or future closers, both of which are molds that many of them cannot and should not try to fit into.

 

However, it doesn’t have to be this way. With the bullpen revolution starting to kick into full swing, there is no better time than now to start giving these arms the respect they deserve. But before that can happen, we need a tool in place that can help us see their true worth. To this end, the Goose Egg has already had its shot, but perhaps a reintroduction as the Quality Inning is exactly what it needs to reach its full potential as a metric and fulfill Silver’s goal of fixing the way we view relief pitching. 

 

There’s only one way to find out for sure. Let’s hatch the Egg and see what happens.
