When we were children, coaches and parents would often tell us that players who competed in Home Run Derby-type competitions would mess up their swings. This has become dogma around the sport, with many fans urging their favorite players to opt out of participating to avoid the “Home Run Derby Curse”. They seem to remember vivid examples of players not swinging correctly afterward and failing to produce. In 2021, Juan Soto challenged that notion, stating in a postgame interview, “I was thinking about it, and it (the Home Run Derby) really helped me a little bit get that feeling of how to put the ball in the air and everything.” With the dogma and a superstar's opinion opposing each other, it is easy to get confused about the real effect of the Derby. Therefore, let’s dig into the truth behind this baseball tale and answer the question: does the Home Run Derby affect player performance?
Defining the Sample
To separate fact from fiction, the answer needs to be based on hard data, not anecdotes. As entertaining as those stories may be, their legitimacy is highly questionable, making them untrustworthy sources to base a claim on. For this data set, the focus is on players who competed in the Derby during the Statcast era, spanning from 2015 to 2021. As 2020 did not have a Home Run Derby, it is excluded from the set. This sample includes 43 players, with 5 players (Alex Bregman, Giancarlo Stanton, Joc Pederson, Pete Alonso, and Todd Frazier) participating in multiple years. Counting each appearance separately, the sample becomes 48 player-seasons - large enough to draw reasonable conclusions.
Because a sizable number of statistics is needed to detect a clear change, BB%, K%, OPS, ISO, wOBA (Weighted On-Base Average), and wRC+ (Weighted Runs Created Plus) are all included in the determination. Large improvements or declines in these numbers relative to the average can indicate a bigger issue, allowing for a better answer. With a clear idea of the players being investigated, let’s look at the actual data.
Analyzing the Data
To make the Derby's effects clear, first- and second-half stats must be compared to each other. Each player's season from the six years in the sample has been split into a before- and an after-Home Run Derby portion. Offense is constantly changing, so conclusions drawn from these findings could differ from those drawn from past or future seasons. It is doubtful that a major deviation exists within 5 years on either side of the sample, but it is worth noting. With that in mind, the range for each statistic is in the table below:
Looking over the sample, the data do not appear to be evenly distributed, suggesting that the median should be used instead of the mean. Because a median is far less sensitive to a few oversized outliers, it gives a more accurate picture of the data in cases such as this one. The median changes for these statistics are pictured in the chart:
While the raw half-over-half changes are interesting, they mean little without a control group to compare against. There were many possible criteria for this control group, but a composite of the top 50 players (by wRC+) from each year seemed to be the best match for the first-half performance of Home Run Derby participants. This control allows the study to differentiate, at least roughly, between the effect of the Derby and plain regression. If the Derby group experienced a change similar to the control group's (a sub-5% difference), then the change can likely be attributed to regular regression for a high-performing player. If the Derby group experienced a change that was much different from the control group's (above a 5% difference), then the Derby could plausibly be the cause. Pictured below is a chart of each year for the control group that will be compared.
While some variance is evident year to year, all six statistics worsened for the control group in the second half. When these are compared to the sample, an effect (if one exists) should appear.
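The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `likely_source` is hypothetical, the "5% difference" is interpreted here as the absolute gap between the two groups' percentage changes, a gap of exactly 5 is treated as regression (the article does not specify that case), and the input numbers are invented.

```python
def likely_source(derby_change_pct, control_change_pct, threshold=5.0):
    """Apply the article's 5% rule to one statistic.

    Assumption: the "difference" is the absolute gap, in percentage
    points, between the two groups' second-half changes.
    """
    gap = abs(derby_change_pct - control_change_pct)
    # A gap above the threshold is the only case attributed to the Derby.
    if gap > threshold:
        return "possible Derby effect"
    return "ordinary regression"


# Hypothetical inputs, not figures from the study.
print(likely_source(-12.0, -10.5))  # gap of 1.5 -> ordinary regression
print(likely_source(-20.0, -9.0))   # gap of 11.0 -> possible Derby effect
```

Writing the rule down this way makes it explicit that a large drop on its own proves nothing; only a drop that differs meaningfully from the control group's counts as evidence.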
The median change in walk rate was a positive 0.2% over the sample, while the control group experienced a -0.3% median change. Both are minute. Dogma suggests that a player will swing for the fences more after getting used to the Home Run Derby, which would lead to a lower walk rate. In the study, walk rates got better for our Home Run Derby swingers and worse for the control. That said, both numbers are so small that any real effect is unlikely. Therefore, I can conclude that the Home Run Derby has little to no effect on walk rates.
After the Home Run Derby, the median strikeout rate dipped 0.3% for those involved, while the control hitters' median rate rose 1.2%. Once again, both of these are very small changes. According to fan theory, strikeout rates should worsen for these players as a result of both a “messed-up swing” and an attempt to hit more baseballs out of the park. Instead, they got better, with regular regression hurting the control group more. These sub-5% changes still are not enough to mean anything - the Home Run Derby has a negligible effect on strikeouts.
The median drop in On-Base Plus Slugging after the sample participated in the Home Run Derby was 80 points, while the controls dipped 103 points. An 80-point drop is nothing to scoff at - but it appears that these players actually improved relative to players that did not participate. The sample's median OPS was .892 in the first half - a drop of this size means that a player's offensive performance decreased by 9% in the second half. That may seem extreme, but the control experienced a similar loss of 7.8%. Hence, the Home Run Derby cannot be solely blamed for this drop in performance, going against common fan logic.
ISO subtracts AVG from SLG, meaning that it tries to measure a player's power alone. Common conceptions assume that power would drop after the Derby. For the players who participated in the Home Run Derby, median Isolated Power was 38 points lower in the second half than in the first half. For comparison, the control group experienced a 41-point drop. In this instance, the sample's first-half median was .261. A 38-point drop-off equates to 14.6%, which is an astonishing downturn in performance... but not out of the ordinary. The control experienced a 17.1% drop, which in theory would mean that players did better after the Derby. But a 2.5% difference is not enough to prove that claim and is likely owed to variance - the Home Run Derby had little to no effect on ISO, again going against traditional logic.
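As a quick check on the arithmetic above, here is a minimal Python sketch of the ISO definition and the percentage-drop calculation. The helper names `iso` and `pct_drop` are made up for this example, and the .550/.289 slash line is a hypothetical pair chosen only because it produces a .261 ISO; the 38-point drop and .261 median are the figures quoted in the text.

```python
def iso(slg, avg):
    """Isolated Power: slugging percentage minus batting average."""
    return slg - avg


def pct_drop(drop_points, baseline_points):
    """Express a drop (in points) as a percentage of the first-half baseline."""
    return 100 * drop_points / baseline_points


# Hypothetical slash line that happens to yield a .261 ISO.
print(round(iso(0.550, 0.289), 3))   # 0.261

# Derby sample from the text: 38-point drop from a .261 median ISO.
print(round(pct_drop(38, 261), 1))   # 14.6
```

Working in whole points (38 and 261 rather than .038 and .261) keeps the division free of small floating-point surprises.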
Weighted On-Base Average is meant to be a more accurate version of OPS, and a change in its median will be weighed very heavily in the conclusion. According to the conventional fan hypothesis, wOBA should decrease after the Home Run Derby. In this sample, the median change in the players' wOBA was -33 points, or an 8.7% loss in production when considering the sample's median of 0.378. The control exhibited a comparable loss of 28 points, equating to a 7.5% loss. The sub-5% difference between the control and the Derby participants is not enough to prove any type of causation - the effect of the Home Run Derby on players' wOBA is inconsequential.
Weighted Runs Created Plus is considered to be a tell-all statistic regarding offensive performance. It accounts for league averages and ballparks to give the most accurate description of run production. Like wOBA, it will be considered heavily in the final answer to the question of the Derby's effects. For the first half of the season, the sample had a median 137 wRC+ (considered to be about 37% above average). During the second half, that fell 22 points, accounting for a 16.1% decrease in performance. This did not differ much from the control group - they had a median 141 wRC+ in the first half that fell 20 points, or 14%. Both groups of players remained above league average with their second-half numbers but were much less productive. Since the gap is nowhere close to the 5% difference needed to possibly justify causation over regression, the Home Run Derby likely did not cause players to produce less in wRC+ terms.
Keeping these findings in mind, we can confidently assess how much players actually struggle after the Home Run Derby.
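The wRC+ comparison is the one case where the text gives first-half medians for both groups, so the whole calculation can be reproduced. The sketch below assumes the 5% threshold is applied to the gap between the two groups' percentage drops; the variable names are illustrative only.

```python
# First-half median wRC+ and second-half drop, as quoted in the text.
DERBY_MEDIAN, DERBY_DROP = 137, 22
CONTROL_MEDIAN, CONTROL_DROP = 141, 20
THRESHOLD = 5.0  # assumed to be measured in percentage points

derby_pct = 100 * DERBY_DROP / DERBY_MEDIAN        # about 16.1
control_pct = 100 * CONTROL_DROP / CONTROL_MEDIAN  # about 14.2
gap = abs(derby_pct - control_pct)                 # about 1.9

print(f"Derby drop:   {derby_pct:.1f}%")
print(f"Control drop: {control_pct:.1f}%")
print(f"Gap: {gap:.1f} points, under threshold: {gap < THRESHOLD}")
```

The unrounded control figure comes out slightly above the 14% quoted in the text, which does not change the outcome: the gap stays well under 5.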
In a prior study from 2010, the Society for American Baseball Research (SABR) attempted to answer this question, citing HR% and OPS changes as evidence that the effects of this competition were negligible on Major League performance. While I can acknowledge the importance of these statistics, considering just those factors can only take one so far. The Home Run Derby is supposed to have implications all across a player's game - not just his ability to hit home runs. Using OPS to account for the other factors also dates the research, as statistics such as wOBA and wRC+ are considered to be more telling of the ability to add offensive value.
Seeing as I’ve identified the shortcomings of the original work, the reader may start to believe that I went overkill with my criteria. After all, K% and BB% are somewhat factored into wOBA and wRC+. Yet I believe K% and BB% were necessary to isolate individual parts of the game and see if there was a change, which is the same reason ISO was included. Using non-specific offensive statistics is great for answering the original question on a single basis. Using a mixture of specific and non-specific statistics is great for fully identifying the effects of the Derby and answering the question with in-depth scrutiny. I prefer the latter - hence my need to provide extra points of data.
As is evident in the data, players that participated in the Home Run Derby did experience a major drop-off in performance. But as statistics professors are so keen to remind us: "Correlation does not mean causation." The two factors, at first glance, appear to be correlated. But when compared to a similar control group (a group of the 50 best-performing first-half players from each year), it becomes obvious that this correlation is not causation. The Home Run Derby participant group failed to exceed the 5% difference threshold in any statistic when compared to the control group. The sudden dip in performance can likely be attributed to a factor that is well known in baseball statistics - regression. When players are performing as some of the best in the league, it is likely that they are overperforming. Regression to the mean, over the course of a season, will likely take them back toward their true level. This means that sharp differences between first- and second-half splits may not be as alarming as they appear; rather, they are just the natural law of averages being restored.
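Regression to the mean can be demonstrated with a small simulation. The sketch below is purely illustrative and uses no data from the study: the league size, talent spread, and luck spread are assumed values on a wRC+-like scale. Nothing in it models a Derby, yet the top 50 first-half performers still fall off in the second half.

```python
import random
from statistics import mean

rng = random.Random(2021)  # fixed seed so the run is repeatable

N_PLAYERS = 500   # hypothetical league size
TOP_N = 50        # mirrors the top-50 control group
TALENT_SD = 15    # assumed spread of true talent
NOISE_SD = 20     # assumed half-season luck

players = []
for _ in range(N_PLAYERS):
    talent = rng.gauss(100, TALENT_SD)
    # Each half is true talent plus independent luck; no Derby effect exists here.
    first_half = talent + rng.gauss(0, NOISE_SD)
    second_half = talent + rng.gauss(0, NOISE_SD)
    players.append((first_half, second_half))

# Select the best first-half performers, as the control group does.
top = sorted(players, key=lambda p: p[0], reverse=True)[:TOP_N]
top_first = mean(p[0] for p in top)
top_second = mean(p[1] for p in top)

print(f"Top {TOP_N} first half:  {top_first:.1f}")
print(f"Top {TOP_N} second half: {top_second:.1f}")
```

Because the top group is selected partly on good luck, and luck does not carry over, its second-half average lands closer to the league mean. This is why a control group of elite first-half hitters is needed before blaming any decline on the Derby.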
Having looked over these numbers, I can state that participation in the Major League Baseball Home Run Derby does not lead to decreased performance, contradicting traditional baseball's suspicions about the event. With every statistic experiencing relatively normal drop-offs, it is hard to argue that the Derby had any effect that couldn't just be owed to variance or regression.
However, it is worth mentioning that the tiny differences between the control and the Derby groups could be owed to a Home Run Derby effect. This is highly unlikely, but not impossible, as no study of this kind can answer the question with 100% certainty. If such an effect did exist, its size would make it unimportant. Individual players may vary slightly, but for the group as a whole, it is negligible. Given that, fans should stop trying to talk their favorite players out of participating in the event for the sake of their performance - it will not make a difference either way.