OPS: The Weirdest Stat Conundrum in Baseball

In 2003, Michael Lewis published Moneyball, which detailed the reasoning behind the Oakland Athletics’ seemingly improbable success in the 1999-2002 seasons. As many fans know, the book emphasized the team’s focus on a relatively new field, sabermetrics. The field was originally pushed forward by Bill James in a series of books called the Baseball Abstracts in the early 1980s, but teams were only then catching on. At the heart of this new baseball philosophy lie three things: the importance of On-Base Percentage, the importance of Slugging Percentage, and the unimportance of batting average. Until then, batting average was regarded as the golden stat, the tell-all for a batter’s level of skill. The only problem was that this statistic, invented in 1887 (Yes, it is that OLD!), told teams very little about the value of a player. Through their analysis, numerous statisticians found that other established statistics such as OBP and SLG correlated more closely with a team’s success than the outdated BA.


On-Base Percentage, simply defined, is the percentage of the time a player reaches base when he appears at the plate. Unlike batting average, it is not limited to at-bats: walks and hit-by-pitches count too, and every way of reaching base counts equally. Slugging Percentage is also easy to understand; its main goal is to factor in how much power a hitter possesses. For the calculation, each type of hit is weighted by the number of bases it is worth (Single = 1x, Double = 2x, etc.), and the total is divided by the player’s number of at-bats. Recognizing the importance of both stats, baseball author Pete Palmer invented On-Base Plus Slugging (OPS), which simply adds slugging and on-base percentage into a single statistic; no weights are involved. And to an extent, this new stat was very predictive of team success. To measure this, analyst Keith Law calculated the correlation coefficient between several stats and team runs for the 2011 through 2015 seasons. If you’re not familiar with the concept, it measures the strength of the relationship between two sets of data on a scale from 0 to 1 (or from -1 to 0 for negative correlations), with 0 being no correlation and 1 (or -1) being a perfect correlation. Below are his findings:



As is obvious in the data, OPS was the most successful of these select stats at predicting the number of runs a team would score. So we should just regard OPS as the new “golden stat,” correct? Not exactly.


In evaluating this stat, I will start with one individual part of the equation - slugging percentage. On its surface, it looks great, providing a conveniently tidy weight for each base. A home run is treated as worth 4x as much as a single. Yet that is where it is flawed. Recent baseball studies estimate that a home run is only worth about 2.28x a single (as of 2021 measures) in value produced. That is a gargantuan difference. And it is not the only mismatched weight: the standard slugging math overstates the value of every extra-base hit relative to a single. Slugging, frankly put, fails to weigh hits correctly, which in turn brings down OPS as a whole. And this is not even the only large fault.
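To make the mismatch concrete, here is a minimal Python sketch that expresses each hit type as a multiple of a single, once under the slugging weights and once under estimated run values. It assumes the estimated values are the 2021 wOBA coefficients quoted later in this article (they reproduce the 2.28x figure); the function and variable names are illustrative only.

```python
# Slugging weights are simply the number of bases.
SLG_WEIGHTS = {"1B": 1.0, "2B": 2.0, "3B": 3.0, "HR": 4.0}

# Assumption: the "estimated" values are the 2021 wOBA coefficients
# quoted later in this article.
WOBA_2021 = {"1B": 0.879, "2B": 1.242, "3B": 1.568, "HR": 2.007}


def relative_to_single(weights):
    """Express every weight as a multiple of the single's weight."""
    single = weights["1B"]
    return {hit: round(value / single, 2) for hit, value in weights.items()}


slg_ratios = relative_to_single(SLG_WEIGHTS)
est_ratios = relative_to_single(WOBA_2021)

for hit in ("1B", "2B", "3B", "HR"):
    print(f"{hit}: SLG treats it as {slg_ratios[hit]:.2f}x a single, "
          f"estimated value is {est_ratios[hit]:.2f}x")
```

Under these assumed weights, a double comes out at roughly 1.41x a single and a triple at roughly 1.78x, against the 2x and 3x that slugging assigns.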


The second critique of OPS focuses more on the formula as a whole. In laying this issue out, I want to explicitly show the formula to the reader.


SLG = (1×1B + 2×2B + 3×3B + 4×HR) / AB

OBP = (H + BB + HBP) / (AB + BB + HBP + SF)

OPS = SLG + OBP
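For readers who prefer code to formulas, the three definitions above can be sketched in Python as follows. The function names and the sample stat line are hypothetical and only there to show the arithmetic; they do not describe a real player.

```python
def slg(singles, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def obp(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """On-base percentage: times on base divided by the OBP denominator."""
    return (hits + walks + hit_by_pitch) / (
        at_bats + walks + hit_by_pitch + sac_flies
    )


def ops(on_base, slugging):
    """OPS is a plain, unweighted sum of the two."""
    return on_base + slugging


# Hypothetical season line (made up for illustration).
singles, doubles, triples, home_runs = 90, 30, 5, 25
hits = singles + doubles + triples + home_runs  # 150
at_bats, walks, hit_by_pitch, sac_flies = 500, 60, 5, 5

player_slg = slg(singles, doubles, triples, home_runs, at_bats)
player_obp = obp(hits, walks, hit_by_pitch, at_bats, sac_flies)
player_ops = ops(player_obp, player_slg)

print(f"SLG {player_slg:.3f}  OBP {player_obp:.3f}  OPS {player_ops:.3f}")
```

For this made-up line the result is roughly a .530 SLG, a .377 OBP, and a .907 OPS.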


The statistic is basic addition - add one valuable baseball stat to another and assume good results. That line of thinking assumes that a point of Slugging and a point of On-Base are worth the same. Not to spoil the fun, but they are not. For 2021, the average MLB batter had a .317 OBP and a .411 SLG. According to Fangraphs, 1 point of OBP is worth 1.8x as much as 1 point of SLG. One simply cannot add two different types of numbers and expect the sum to be an accurate conclusion about the nature of the subject. Yes, teams with high OBP and SLG will likely score more runs than those without, as shown in the chart above. But that does not mean that adding the two numbers together will give you a more accurate measurement. If anything, they should be regarded as separate factors. A great OBP and/or a great SLG will lead to great success; put together as a raw sum, they are truthfully meaningless.


To address this issue, baseball analyst Tom Tango, author of The Book: Playing the Percentages in Baseball, invented wOBA (Weighted On-Base Average). wOBA, as the name states, is a weighted average of the ways a player reaches base, with each one weighted by its actual value. Designed in response to the flaws of the widely accepted OPS, wOBA attempts to give people a better understanding of how much value a player produced. The formula is a bit more complicated than the OPS calculation; the 2021 version is below:


wOBA = (0.692×uBB + 0.722×HBP + 0.879×1B + 1.242×2B + 1.568×3B + 2.007×HR) / (AB + BB – IBB + SF + HBP)
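The same formula can be sketched in Python. The coefficients are the 2021 values given above; the function name, the argument names, and the sample stat line are hypothetical, and uBB is taken to mean unintentional walks (BB minus IBB), which matches the denominator.

```python
# 2021 wOBA coefficients as quoted in the formula above.
WOBA_2021 = {
    "uBB": 0.692,
    "HBP": 0.722,
    "1B": 0.879,
    "2B": 1.242,
    "3B": 1.568,
    "HR": 2.007,
}


def woba(singles, doubles, triples, home_runs, walks, intentional_walks,
         hit_by_pitch, at_bats, sac_flies, weights=WOBA_2021):
    """Weighted on-base average for one stat line."""
    unintentional_walks = walks - intentional_walks  # uBB = BB - IBB
    numerator = (
        weights["uBB"] * unintentional_walks
        + weights["HBP"] * hit_by_pitch
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * home_runs
    )
    denominator = at_bats + walks - intentional_walks + sac_flies + hit_by_pitch
    return numerator / denominator


# Hypothetical season line (made up for illustration).
player_woba = woba(
    singles=90, doubles=30, triples=5, home_runs=25,
    walks=60, intentional_walks=5, hit_by_pitch=5,
    at_bats=500, sac_flies=5,
)
print(f"wOBA {player_woba:.3f}")
```

Passing the weights as a parameter is a design choice for the sketch: the coefficients are published per season, so a different year only needs a different dictionary.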


As the formula demonstrates, each method of reaching base is weighted according to its worth, and the results are compiled into one number. The result is scaled so that it can be read like OBP. This calculation puts a deep twist on OPS and seems like the obvious winner. Yet some studies say otherwise.


In a 2013 blog post, Cyril Morong, an economics professor at Northeast Lakeview College, published some interesting data that tested Tango’s theory of the inaccuracy of OPS. Studying the correlation coefficients (sorry, math is back) between runs produced and different offensive statistics, he published the following data for the 2003 through 2012 seasons:



This data was then backed up by Baseball Prospectus. OPS, while seemingly an unsophisticated brute calculation, actually correlated with runs scored better than the complicated wOBA. The margin is only 0.00305 correlation points, but sabermetric fans would have you believe that wOBA translates into runs much better - that is clearly not the case here. As is common, success lies in the middle ground. The most successful statistic in this data set was a weighted OBP added to slugging. As I mentioned earlier, OBP is estimated to be worth 1.8x as much as SLG, and weighting it accordingly fixes that issue. It correlated better than OPS by 0.0013 and better than weighted on-base average by 0.00435. Who would have thought? Although this is an interesting finding, it is worth noting that these are very minute differences. In practice, correlation differences this small would have very little impact on a person’s interpretation of the relationships between such numbers. With correlations between 0.95 and 0.96, any measure involving some combination of on-base percentage and slugging relates strongly to the number of runs.
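For anyone who wants to run this kind of comparison themselves, here is a minimal Python sketch of the method: compute a Pearson correlation coefficient between team runs and plain OPS, then between team runs and a weighted version. The team numbers are made up for illustration and are not Morong's data, and the exact form 1.8×OBP + SLG is my assumption about how the weighted stat is built.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up team seasons: (OBP, SLG, runs scored). NOT real data.
teams = [
    (0.300, 0.380, 620),
    (0.310, 0.400, 680),
    (0.317, 0.411, 715),
    (0.325, 0.430, 760),
    (0.330, 0.405, 735),
    (0.340, 0.450, 830),
]

runs = [r for _, _, r in teams]
plain_ops = [on_base + slugging for on_base, slugging, _ in teams]
# Assumed form of the weighted stat: OBP counted 1.8 times as heavily as SLG.
weighted_ops = [1.8 * on_base + slugging for on_base, slugging, _ in teams]

r_ops = pearson_r(plain_ops, runs)
r_weighted = pearson_r(weighted_ops, runs)
print(f"OPS vs runs:          r = {r_ops:.5f}")
print(f"1.8*OBP+SLG vs runs:  r = {r_weighted:.5f}")
```

With real team-season data in place of the made-up list, the same two lines of output would show whether the weighting changes the correlation by more than the tiny margins discussed above.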


I admit that I came into this article with a clear bias. Before I started writing, this piece was called "The Ultimate Failure of OPS". Hearing supposed statheads yammer on about the necessity of more complicated measurements had started to take effect, and I had not considered whether actual data supported this. It turns out the case against OPS was not so clear-cut. The stat was more effective than everyone thought - after all, it beat a meticulously crafted value evaluator with nothing but two common stats added together. Even so, I think it would be a terrible mistake to dismiss wOBA altogether and focus on OPS. Weighted on-base average is completely sound in its logic, far more so than OPS, as I emphasized earlier. The value of hitting for extra bases and the value of getting on base are completely different, and that cannot be forgotten. Perhaps the near-equal results have something to do with the weights involved. With the Statcast era only beginning in 2015, it is possible that the lack of advanced information led to less trustworthy weights. Maybe certain aspects of hitting are included that shouldn’t be (although wOBA’s inputs are very similar to OBP’s, so I highly doubt this is the case). There could be several explanations, but as of now, it is not certain why.


While the advanced metric just mentioned may not be the best predictor of runs, it is still quite accurate. And as more statisticians continue to study baseball, these sabermetric calculations will only get more accurate. They already have, as anyone who has followed the history of sabermetrics can see. Yet most of these new numbers go unmentioned. A simple statistic like OPS, while accurate, will continue to dominate broadcasts and print media everywhere, even as more accurate advanced indicators are invented. Most fans are still not even aware that such numbers exist, nor would some care to find out. It is human nature to fight change - baseball fans are no different. As Moneyball emphasized over and over, A’s General Manager Billy Beane hated complacency, which is what drove him to find alternative methods of winning. While Tom Tango’s solution may not be better than the standard yet, time might prove otherwise.


Sources:

Baseball-reference.com

Cybermetric.blogspot.com

Fangraphs.com

MLB.com

Moneyball (2003) by Michael Lewis

Smart Baseball (2017) by Keith Law

The Bill James Historical Baseball Abstract (1985) by Bill James

The Book (2006) by Tom Tango