BARRIERS TO BASEBALL
Each variable was then categorized as continuous or qualitative. Total Contract Value,
Average Annual Value, Age, Initial Year, Prior Year WAR, and Career WAR were deemed
continuous variables, while Team, Position, Player Agent, Country of Origin, and Ethnicity
were categorized as qualitative variables. Continuous variables could be left in the data
table unaltered, but qualitative variables had to be recoded before they could be used in
regression models and tested for significance. As such, the data analysis here uses dummy
variables, which transform each qualitative variable into one or more binary variables that
each correspond to a particular qualitative value.
For the Team variable, each division’s data tables contained five columns, one for each
team; a 1 in a column signified that the player in that row had signed with that team, while a
0 indicated that the player had signed with another team in the division. This allows the
model to identify significant differences between teams within a division, which are generally
located in the same geographic region of the United States (and Canada for the AL East).
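The Team encoding described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the data tables: the team abbreviations, the player names, and the function name `team_dummies` are all hypothetical.

```python
# Hypothetical team abbreviations for one division (AL East); illustrative only.
DIVISION_TEAMS = ["BAL", "BOS", "NYY", "TB", "TOR"]


def team_dummies(signed_team, teams=DIVISION_TEAMS):
    """Return one 0/1 indicator per team: 1 for the signing team, 0 otherwise."""
    if signed_team not in teams:
        raise ValueError(f"{signed_team!r} is not in this division")
    return {team: int(team == signed_team) for team in teams}


# Made-up rows standing in for free-agent signings.
rows = [
    {"player": "Player A", "team": "NYY"},
    {"player": "Player B", "team": "TOR"},
]

# Append the five indicator columns to each row.
encoded = [{**row, **team_dummies(row["team"])} for row in rows]

for row in encoded:
    print(row)
```

Each encoded row carries exactly one 1 across the five team columns, which matches the description of a 1 for the signing team and a 0 for every other team in the division.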
The Position variable was reduced to a binary variable that separates pitchers from
position players, with a 1 identifying a position player (1B, 2B, 3B, SS, OF, or C) and a 0
identifying a pitcher (SP or RP). This distinction was made to identify whether a
discrepancy exists between contract values for position players and pitchers, leaving open the
possibility of future in-depth analysis of statistical differences between individual positions.
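A minimal sketch of the Position recoding, using the position codes listed above, might look as follows. The function name `position_dummy` is hypothetical, and rejecting codes outside the two listed groups is an assumption made for the example rather than something stated in the text.

```python
# Position codes as listed in the text.
POSITION_PLAYERS = {"1B", "2B", "3B", "SS", "OF", "C"}
PITCHERS = {"SP", "RP"}


def position_dummy(position):
    """Return 1 for a position player and 0 for a pitcher."""
    if position in POSITION_PLAYERS:
        return 1
    if position in PITCHERS:
        return 0
    # Assumption: any code not listed in the text is treated as an error.
    raise ValueError(f"unrecognized position code: {position!r}")


print([position_dummy(p) for p in ["SS", "SP", "C", "RP"]])
```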
For the Player Agent variable, players are categorized based on whether they are
represented by a top-six sports agency or by another party. Brett Knight at Forbes identifies
CAA, Wasserman, WME Sports, Excel Sports Management, Octagon, and Boras Corporation as
the six most valuable sports agencies (Knight, 2022). Players represented by these agencies
are denoted with a 1, and those not represented by one of the six are given a 0. The purpose of
this distinction was to identify whether larger sports agencies make a significant difference in
bargaining for higher-value contracts for their clients compared with other agents.
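The Player Agent indicator can be illustrated the same way. The sketch below assumes that agency names are recorded as the strings given above, ignoring case and surrounding whitespace; the function name `agent_dummy` and the example agency "Small Agency LLC" are hypothetical.

```python
# The six agencies named in the text (Knight, 2022).
TOP_SIX_AGENCIES = {
    "CAA",
    "Wasserman",
    "WME Sports",
    "Excel Sports Management",
    "Octagon",
    "Boras Corporation",
}

# Assumption: names are matched case-insensitively after trimming whitespace.
_NORMALIZED = {name.casefold() for name in TOP_SIX_AGENCIES}


def agent_dummy(agency):
    """Return 1 if the agency is one of the six listed, otherwise 0."""
    return int(agency.strip().casefold() in _NORMALIZED)


print(agent_dummy("Boras Corporation"), agent_dummy("Small Agency LLC"))
```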
The Country of Origin variable seeks to explore statistical differences that may arise in
the value of contracts between players born inside and outside the United States. In Major
League Baseball, the public generally thinks of non-US-born players as originating from Latin
American countries such as the Dominican Republic and Cuba, but in the period from 2017 to
2023, players came from thirteen countries outside the United States, including the
aforementioned countries as well as others such as the Netherlands, Australia, and Japan. In this