What's the Difference?

Visualizing Player Performance Using Indifference Curves

Those who know me best know that I am nerdy when it comes to baseball and very nerdy when it comes to economics. So when I came across this tweet from Jeremy Frank (@MLBRandomStats), it really piqued my interest:

Screen Shot 2020-07-20 at 9.45.15 AM.png

Jeremy’s tweet thread discusses an economic concept called the Production Possibilities Frontier, which is a graph showing the maximum amounts of two goods that can be produced with a limited amount of resources available. The line of the graph, sometimes referred to as the Pareto Frontier, marks the point where one cannot make more of one good without sacrificing some of the other good. An example of this is shown below:

Screen Shot 2020-07-20 at 9.45.48 AM.png

In Jeremy’s thread, he applies the basic logic of this concept to baseball, specifically when it comes to the trade-off between Home Runs (HR) and Stolen Bases (SB). He then goes on to cite the list of HR/SB seasons that currently make up the “Pareto Frontier” in baseball:

Screen Shot 2020-07-20 at 9.46.11 AM.png
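For readers who want to see how a list like this can be built, here is a minimal sketch of picking out the "frontier" seasons from a set of HR/SB totals. It assumes a season is on the frontier when no other season matches or beats it in both categories while strictly beating it in one. The function name and the sample rows are made up for illustration and are not the seasons from Jeremy's thread.

```python
def pareto_frontier(seasons):
    """Return the (label, HR, SB) rows that no other row dominates.

    A row is dominated if some other row has at least as many HR and at
    least as many SB, and strictly more of one of the two.
    """
    frontier = []
    for name, hr, sb in seasons:
        dominated = any(
            (h >= hr and s >= sb) and (h > hr or s > sb)
            for _, h, s in seasons
        )
        if not dominated:
            frontier.append((name, hr, sb))
    # Sort by HR so the frontier reads left to right on a plot.
    return sorted(frontier, key=lambda row: row[1])


# Made-up (label, HR, SB) rows for illustration only; not real seasons.
example = [
    ("A", 5, 100),
    ("B", 20, 60),
    ("C", 18, 40),  # dominated by B
    ("D", 40, 30),
    ("E", 60, 5),
    ("F", 35, 30),  # dominated by D
]

print(pareto_frontier(example))
# [('A', 5, 100), ('B', 20, 60), ('D', 40, 30), ('E', 60, 5)]
```

The pairwise comparison is quadratic in the number of seasons, which is fine for a short list like this one.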

Jeremy also mentions Christian Yelich, who at that point last season was on pace for a Frontier-type season of 60 HR and 35 SB. Curious how this would look on a graph, I created a simple data set with this information (minus the two seasons from the 1800s) and plotted it:

Screen Shot 2020-07-20 at 9.46.39 AM.png

Now, there were two things that stood out to me about this graph. The first was how insanely well Yelich was playing compared to these other incredible seasons (look how far beyond the Frontier he is!), and the second was that the graph had a convex curve.

This is important because Pareto Frontiers of this kind are normally concave, bowing outward from the origin. It’s the law of limited resources: the more we produce of one good, the greater the share of resources dedicated to that good, and the less we can produce of the other. What we see here is that this logic doesn’t necessarily apply 1:1 when it comes to home runs and stolen bases, largely because the same resources aren’t necessarily used to make both.

Stealing a record number of bases requires a lot of speed, and hitting a ton of homers requires a lot of strength. This “frontier” comes from the fact that speed and strength are usually antithetical to each other. Speedy players are usually very light, thin, and athletic, but as a result don’t have much pop in their bats. In the same way, power hitters are usually massive, muscular, and lumbering, traits that usually don’t translate well to speed.

But here’s where things stray away from the “Pareto” framework. While it’s rare to have both strength and speed, these traits bring greater results once they are specialized. This is why you see the graph tail away at both ends: the speedsters give up on the long ball and start stealing record numbers of bases, while the sluggers give up running the bases in an effort to jog around them as much as possible. As such, any graph of this kind will usually have a convex shape—in other words, not a true Pareto Frontier. But luckily for us, there is another economic concept that does fit nicely with these principles: the Indifference Curve.

Indifference Curves are convex curves that represent different combinations of two goods that give someone the same amount of utility. In this case, we could think of it as the combinations of Home Runs and Stolen Bases that result in the same amount of production for the team. Multiple indifference curves are usually shown next to each other, each one representing a different level of utility. As utility increases, the curves move further up and to the right, as in the example below:

Screen Shot 2020-07-20 at 9.47.27 AM.png
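One way to make "convex" concrete is to look at the slopes between neighboring points on a curve: on a convex curve the slopes never decrease as you move to the right, so a downward-sloping line starts steep and flattens out. The sketch below checks that condition. The helper names and both sets of points are hypothetical, and the check assumes no two points share the same x value.

```python
def segment_slopes(points):
    """Slopes between consecutive points after sorting by x."""
    pts = sorted(points)
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]


def is_convex(points):
    """True if the slopes never decrease from left to right."""
    s = segment_slopes(points)
    return all(later >= earlier for earlier, later in zip(s, s[1:]))


# Made-up (HR, SB) points: steep at first, then flattening out.
indifference_like = [(5, 100), (20, 60), (40, 30), (60, 5)]
# Made-up points shaped like a textbook production frontier.
frontier_like = [(0, 100), (20, 90), (40, 60), (60, 0)]

print(is_convex(indifference_like))  # True
print(is_convex(frontier_like))      # False
```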

Let’s go back to our graph from earlier, this time looking at it as an indifference curve for HR and SB with Wins Above Replacement (WAR) as the measure of utility. Of the ten seasons that were graphed, the average WAR was about 8.5. Thus, if a player were to have a combination of HR and SB that approached that curve, you would expect that player to be bringing a utility of around 8.5 extra wins to his team. However, we can explore this concept in even more depth.

In order to do this, I scraped the season data on FanGraphs for every qualified player from the past 50 seasons (1968-2018) and put it in a single data set. I then broke up this data set into smaller ones based on WAR levels, creating separate sets of seasons with 0.9-1.1 WAR, 1.9-2.1 WAR, 2.9-3.1 WAR, and so on. This gave me seven data sets in total with the following sample sizes:

Screen Shot 2020-07-20 at 9.47.45 AM.png
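The bucketing step can be sketched in a few lines. This is only an illustration under stated assumptions: the seven levels are taken to be 1 through 7 WAR, each season is a dictionary with a "WAR" key, and the function and parameter names are invented here rather than taken from the original analysis.

```python
def bucket_by_war(seasons, levels=range(1, 8), half_width=0.1):
    """Group seasons whose WAR falls within half_width of a whole-number level.

    A tiny tolerance is added because values such as 1.1 - 1 are not
    exactly 0.1 in floating point.
    """
    buckets = {level: [] for level in levels}
    for season in seasons:
        for level in levels:
            if abs(season["WAR"] - level) <= half_width + 1e-9:
                buckets[level].append(season)
                break
    return buckets


# Made-up rows for illustration only.
sample = [
    {"HR": 12, "SB": 30, "WAR": 0.9},
    {"HR": 25, "SB": 5, "WAR": 1.1},
    {"HR": 18, "SB": 10, "WAR": 1.5},  # falls between levels, so it is dropped
    {"HR": 30, "SB": 8, "WAR": 2.9},
    {"HR": 40, "SB": 20, "WAR": 7.1},
]

buckets = bucket_by_war(sample)
print({level: len(rows) for level, rows in buckets.items()})
# {1: 2, 2: 0, 3: 1, 4: 0, 5: 0, 6: 0, 7: 1}
```

Seasons that land between the bands, such as the 1.5 WAR row, are left out, which keeps each set tightly centered on its level.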

I graphed each of these sets as a scatter plot with Stolen Bases on the y-axis and Home Runs on the x-axis. Using these plots, I then removed any outliers that seemed out of place, usually about three or four for each set. Once the data was all organized, filtered, and cleaned, it was time to put the theory to the test. Using a data visualization program, I mapped each data set as a smoothed trendline and put those lines together on the same plot. The result of this is shown below:

Screen Shot 2020-07-20 at 9.47.59 AM.png

As you can see, the result looks almost exactly as you would expect. Each line has a convex shape, the lines go farther up and to the right as utility (WAR) increases, and they (loosely) run parallel to each other. While the lines ended up a bit closer together than I originally thought they would, the output was pretty clear: each level of WAR had its own distinct line on the plot, and those lines generally acted like typical indifference curves. But even though my hypothesis turned out to be correct, I still felt that this model could be improved in order to be a better measurement of total output.
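The exact smoothing method behind these trendlines is not stated, so the sketch below swaps in a simple stand-in: average the Stolen Base totals within fixed-width Home Run bins and connect the bin centers. The function name, the bin width of 5, and the sample points are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def binned_trend(points, bin_width=5):
    """Return (bin center, mean y) pairs for fixed-width x bins."""
    bins = defaultdict(list)
    for x, y in points:
        bins[int(x // bin_width)].append(y)
    # Empty bins are skipped, so sparse regions simply have no point.
    return [((b + 0.5) * bin_width, mean(ys)) for b, ys in sorted(bins.items())]


# Made-up (HR, SB) pairs standing in for one WAR bucket.
one_bucket = [(2, 40), (4, 30), (7, 20), (9, 22), (12, 10)]

print(binned_trend(one_bucket))
# [(2.5, 35), (7.5, 21), (12.5, 10)]
```

Running this once per WAR bucket and drawing the resulting lines on one set of axes gives a rough version of the layered plot, though a charting tool's built-in smoother will generally produce a softer curve.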

One thing that really surprised me was how poor Home Runs tended to be as a predictor of WAR totals, particularly toward the right side of the plot. On the left side, the gaps between Stolen Base averages were generally bigger as more utility was produced; however, going to the right tells the opposite story. In fact, this data suggests that at a level of about 35 Home Runs, it would be hard to pick out a two-WAR player from a six-WAR player based on Home Run and Stolen Base totals alone. Thus, contrary to intuition, it appears that homers are not an effective indicator of total production at the plate.

With all of this being said, I had to find a better way of capturing that production. With Home Runs still on my mind, it seemed that Slugging Percentage would be the best place to start. So, I went back to my original data sets and plotted the data in the same way, this time using Slugging Percentage as a replacement for Home Runs. The plot for this is shown below:

Screen Shot 2020-07-20 at 9.48.22 AM.png

For the most part, this plot shows the same features as the last one; the main difference is the clearer separation between the lines. In the other plot, I had to omit the three- and five-WAR lines because they made the plot way too crowded. Here, though, each level of output generally has its own lane and there are fewer overlaps in the lines. The reason for this is that Slugging Percentage better accounts for production in at-bats that don’t end with a ball over the wall. Thus, as you move toward the right side of the SLG plot, you are likely to see better levels of various other hitting indicators – such as a lower Strikeout Percentage – that are associated with higher offensive production. These differences in hitting ability are what cause this plot to spread out where the last one was condensed.

This puts us more on the right track. However, just to be more thorough, I decided to add On-base Percentage to the plot by making the x-axis a measure of OPS. The result is shown below:

Screen Shot 2020-07-20 at 9.48.47 AM.png

Now, that’s more like it.

Here, we still see a clear progression from left to right, except this time there is none of the overlap we saw from the previous plots. This plot is also much more uniform in its spacing (each increase in WAR corresponds with about a 25-50-point increase in OPS) and slopes (the rate of substitution for each line is roughly 1 Stolen Base for 10 points in OPS). The result is something that resembles a set of real-life indifference curves—parallel lines representing increasing levels of utility based on the combination of two quantities of goods.
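To put the rate of substitution in concrete terms, here is a small sketch that treats each WAR line as a straight line with the slope quoted above (1 Stolen Base per 10 points of OPS). The straight-line treatment is a simplifying assumption, as are the function names; OPS itself is simply On-base Percentage plus Slugging Percentage.

```python
OPS_POINTS_PER_SB = 10  # from the text: roughly 1 SB per 10 points of OPS


def ops_points(obp, slg):
    """OPS expressed in points (thousandths): .350 OBP + .450 SLG -> 800."""
    return round((obp + slg) * 1000)


def equivalent_sb(ops_pts, sb, new_ops_pts):
    """Stolen bases needed at new_ops_pts to stay on the same straight line."""
    return sb + (ops_pts - new_ops_pts) / OPS_POINTS_PER_SB


baseline = ops_points(0.350, 0.450)
print(baseline)                         # 800
print(equivalent_sb(baseline, 20, 850))  # 15.0
print(equivalent_sb(baseline, 20, 750))  # 25.0
```

Under these assumptions, a hypothetical .800 OPS hitter with 20 steals sits on the same line as an .850 OPS hitter with 15 steals or a .750 OPS hitter with 25.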

Of course, there are plenty of other factors that go into WAR (that’s what makes it such a useful metric), but the fact that we were able to create a model this neat using only these two stats says a lot about their usefulness as predictors of utility. Sometimes, it can be difficult to explain what exactly makes a six-win player so much better than a three-win player, especially when they hit the same number of home runs or have similar batting averages. Here, though, we can see the statistical difference plain as day, even when the eye test betrays us.

Could this model be improved upon? Absolutely. There are plenty of more advanced stats than OPS to measure offensive production – wRC+, DRC+, even OPS+ – but I’m not sure how these 100-average stats would affect the shape of the model. I also did not use any other stats besides Stolen Bases for the y-axis, which could also be a source of improvement; BsR comes to mind as a potential replacement on the baserunning side, and a fielding-centric stat such as Defensive Runs Saved could also make a difference. These are all different combinations I will have to look into going forward, but for now I think the OPS/SB model is certainly the one to beat.

The bottom line, ultimately, is this: even after 150 years, there is still much to learn about baseball and how to understand it. As much as those in the old guard like to complain about the analytical revolution and the infusion of stats and economics into the game, those things have no doubt strengthened our overall knowledge of baseball and who the best players and teams are. It’s an ever-evolving process, coming to new conclusions about the game by drawing inspiration from how we measure the “real world”–things like economic theory, statistical analysis, and predictive modeling. It’s one of the things I love most about baseball: perhaps more than any other game, it reflects – and learns from – the world around it. Right now, I think that spirit of the game is more alive than ever, and I hope that this new model plays at least a small role in this tradition going forward.
