The Evolution of Racehorse Clusters in the United States: Geographic Analysis and Implications for Sustainable Agricultural Development

Abstract: Sustainability is frequently defined as the need to place equal emphasis on three societal goals: economic prosperity, environment, and social equity. This “triple bottom line” (TBL) framework is embraced by practitioners in both corporate and government settings. Within agriculture, the horse-racing industry and its breeding component are an interesting case study for the TBL approach to local development. The sector is to some extent a “knowledge industry”, agglomerating in relatively few regions worldwide. In the USA, choices made by breeders or owners are likely affected by sudden changes in specific state policies, especially those related to gambling. Both of these conditions (unusual for agriculture at least) have been playing out against a background of national decline in the number of registered racehorse breeding stock. This study traces changes, between 1995 and 2017, in the geographic distribution of registered Thoroughbred and Standardbred stallions. We find that isolated, scattered registered stallions have largely disappeared, strengthening one or more core states (or counties) that had an initially high percentage of stallions. The gainers and losers among previously core regions appear to be heavily influenced by state-level policies. It follows that such policies can influence the conservation of agricultural landscapes as well as racing revenues.


Conceptual Frameworks: Sustainability, Economics, Geography
Viewed as a normative framework, the word sustainability is often used to mean that public policies should place equal emphasis on economic, social equity, and environmental outcomes. This definition has been called, variously, the three pillars of sustainability, the three-legged stool of sustainability, or any number of other metaphors that exist in triplicate [1,2]. Business leaders prefer the phrase "triple bottom line" (TBL), which suggests that decision-makers should pursue social and environmental objectives with the same single-mindedness that managers have long applied to profit-making [3].
Agricultural development seems like a policy strategy that can achieve all three TBL objectives simultaneously. Profitable farms generate income and jobs while at the same time preserving landscapes, scenic views, aquifer recharge, and at least some form of wildlife habitat. One could also make the

Applying Cluster Theory to Agriculture and to Racehorses
Industry cluster theory is rarely used in the study of agriculture because of the assumption that climate and soil are the only important location factors. However, this issue is not either-or: path dependent geographic clustering can co-exist with biophysical constraints on agricultural location. The existence of business-driven clustering behavior has been well-established for such diverse agricultural commodities as floriculture [19], beets [20], canola [21], mushrooms [20], and cranberries [21].
It is generally hypothesized that an industry will cluster in space if inter-firm communication is important to success; if "tacit knowledge" that cannot be codified and transmitted in written form is important to production; and if the product is highly differentiable [10,22,23]. It also helps if the product is able to earn economic rents from this differentiation. These rents can compensate for the higher labor and land costs we would expect in an agricultural setting that employs a highly specialized labor force and is dense with cluster participants.
All of these factors leading to a prediction of cluster behavior apply unambiguously to the breeding of racehorses. Racehorses vary widely in quality: producing the best will earn you significant economic rents in purses and stud fees. The skills required to raise and train a winner are not widely distributed, so they command a premium in the labor market [24]. Racehorse breeding is essentially a craft industry where tacit knowledge and casual conversations at the racetrack backstretch are more important than written documents, like the patents that are used in other knowledge industries. There is competition within the industry, but also cooperation because of the respect for the well-being of the animals, the need to prevent epidemics, and regulatory issues in which breeders have a shared interest [24]. The agricultural industry that racehorse breeding resembles most is wine grapes and wineries [25]. Wine has been the subject of numerous studies designed to test industry cluster theories [26][27][28].

Studies on the Triple-Bottom-Line Benefits of Equine Activity in Particular Places
Scholarly articles and industry-sponsored reports have long recognized the unique role that a thriving equine industry can play in local economies and quality of life. Several case studies, for example, have been written about equine operations at the peri-urban boundary. These works were driven in part by the mid 20th-century rise in the number of horses kept by hobbyists in the suburbs [29]. When discussing the equine industry's role in local sustainability, these studies make frequent mention of its unique assets, like picturesque landscapes and the way that nonfarmers participate directly in the industry as consumers of recreation [29][30][31].
With funding from horse breeding and agricultural trade associations, the U.S. states of New Jersey, Virginia, and Kentucky have conducted studies of the role that the equine industry plays in state and sub-state economies. One of these studies focused on the agricultural base of the Fayette County economy in Kentucky [32]. This study quantified the dominant economic role played in this county by its world-class equine sector. A survey of local economic development leaders also noted that "the quality of life in Lexington was consistently identified as a source of Lexington's economic vitality, with landscape being a key aspect." "If the landscape and the horse farms disappeared," said one interviewee, "we would have no brand" [32].
Since the turn of the century, the three states named above have also conducted state-level surveys designed to get a true estimate of equine inventory by county. (The U.S. Census of Agriculture does not count most pleasure horses). All three states contracted out this survey to the same agency that administers the Census of Agriculture, and all three used the IMPLAN© program to quantify the industry's positive economic impacts. The studies varied in the detail that they provided on breeds and uses, and in the way they handled environmental and amenity benefits.
The first of the three studies, written by researchers at Rutgers University in New Jersey, estimated the impact that New Jersey's horses have on acres devoted to the production of hay. Together with the equine operations themselves, it was estimated that the equine industry supported about one-fifth of the state's total agricultural acreage [33]. Using a similar chain of logic incorporating feed and bedding, the Virginia impact study estimated that the equine industry preserves about 2.6% of that state's total land area [34]. Elgåker [30] elaborates on the synergistic benefits of equine hay production for the rest of local agriculture: "The production of cereal grains is often combined with fodder production for horses. This has a stabilizing effect on the economy, decreasing the sensitivity to price changes and providing a good crop." The last of the three equine survey studies was conducted in Kentucky [35,36]. This study includes a discussion of equine-related tourism from out of state, which is more important here than elsewhere in the U.S. It also reports data from a separate survey of Kentucky residents, in which they expressed a positive willingness to pay to preserve the state's equine industry. This preference was based not only on economic benefits, but also on the perceived benefits of "(making) Kentucky a nicer, more beautiful place to live" and preserving the state's "culture, heritage, and history" [35].

Three Place-Based Case Studies of Particular Interest
Three case studies in three different countries confirm the relationships we have identified among geographic clustering in the horse-racing industry, economic outcomes, and environmental outcomes.
First, McManus [37] presents a case study of the Upper Hunter Valley region of Australia. In his description of the region's three main industries, coal mining, wine, and Thoroughbred breeding, he implicitly acknowledges cluster theory, defined as the local concentration of several elements of each industry's supply chain, leading to success in export markets. McManus states that "The Upper Hunter Valley is reputed to be the second most significant Thoroughbred breeding region in the world after the Bluegrass Region of Kentucky, USA." In contrast to other studies relating local equine activity to sustainability, McManus's perspective is very much environment first. His goal is to define the concept of "sustainable region" and provide guidance to policy makers about how to cultivate and monitor this objective.
Although McManus does not formally address industry cluster theory in connection with Thoroughbred breeding in Hunter Valley, this has been done by other authors who have observed the same three industry clusters in this same region, and who view them as being of national, not merely local, importance [38,39]. A second case study profiles the leading Thoroughbred region in the world, the region surrounding Lexington, Kentucky, USA [24]. If McManus focused primarily on sustainability theory for a particular equine region, the Garkovich study focuses primarily on industry cluster theory for its study region. Indeed, its goal, similar to what we have done briefly above, is to prove that the Kentucky racehorse cluster has all of the attributes of modern cluster theory, especially those laid out by influential economic development consultant Michael Porter [7,8]. This study provides a comprehensive mapping of the racehorse supply chain in north-central Kentucky. It uses sales data to make our point about the significant rents that are earned by Kentucky breeders. It describes levels of skill and wages that are higher in this cluster than they are in other agricultural industries (or locations).
Although the Kentucky case study does not dwell on environmental benefits, it does mention the need to preserve Lexington's "cultural landscape," which is described as both cause and effect of Kentucky's successful racehorse breeding cluster (see also [32]). Finally, the article describes what appear to be the very dramatic effects on the industry of particular state-level policies. Lack of any alternative gaming revenue in support of horse racing in Kentucky, for example, is blamed for a collapse in the number of Standardbred stallions from the mid-1980s to the mid-2000s [24,40,41]. Enabling legislation and the extension of a new breeding incentive to Quarter Horses had the opposite effect. It worked very quickly from 2006, but once the market reached equilibrium many of the stallions left Kentucky [42]. It should be noted that Kentucky began operating its first historical horse racing machines in 2012 and has supplemented breeder incentive awards with tax dollars since 2005, through the Kentucky Thoroughbred Breeders Incentive Fund and the Kentucky Standardbred Breeders Incentive Fund. This article, with its United States context and stories about state policy, provides a key motivation for our own empirical work below.
Finally, in a study of the North Wessex Downs equine cluster in the UK, Parker and Beedell [43] place equal emphasis on environmental sustainability and the economic competitiveness of the cluster. "The paper demonstrates," they write, "how a long-standing land-based rural economic cluster provides significant economic and environmental benefits." Parker and Beedell [43] proclaim the key economic benefit of clustering for this region: "The critical mass of facilities and activity helps the cluster to retain its strength and internal competitiveness." Equally important for our purposes, these authors stress the environmental benefits of this particular cluster, taking pains to compare actual environmental outcomes to a presumed counter-factual: "It is likely that the presence of the industry has contributed to the overall biodiversity value of the AONB and promoted grassland managed at relatively low intensity in a landscape characterised by large-scale arable farming. The cultivation of certain crops used by the feed industry, such as Lucerne, is also likely to add to the biodiversity of the area."
Note: AONB stands for "Area of Outstanding Natural Beauty." It is an official regulatory designation in the UK.
Our unique contribution to this literature will be geographic. Instead of profiling particular equine clusters, as was done in McManus [37], Garkovich, et al. [24], and Parker and Beedell [43], we will map several clusters at once on a map of the United States. Our primary units of spatial observation will be the 48 contiguous U.S. states plus Puerto Rico. One reason to stay with this simple unit of geography is that critical policies are implemented at the state level. We will also "zoom in" on a set of Northeast/Mid-Atlantic states to explore changes in the number of standing stallions across their counties. (Some cluster studies use empirical cutoffs to identify clusters on a map, as groups of adjacent jurisdictions with high concentrations of operations in the same industry [44,45]. The present article will identify and discuss some multi-state clusters that are apparent from map inspection).

Dataset Assembly
Specific geographic location data across the horse population in the United States is difficult to obtain, and in many breeds is not collected for all horses (or potentially any registered horses). It is more common for breed associations to retain location data (i.e., zip code) for an owner than for the animal itself. The purpose of the current study was to evaluate geographic clustering within the breeding sector of the horse racing industry. Within the population of breeding animals in the U.S., comprehensive and accurate location data is most available for stallions (as compared to mares or foals). Additionally, stallions represent a critical asset in this industry due to the potential genetic impact a stallion has because of the large number of offspring it sires in any given year.
Therefore, for this study of the U.S. racehorse breeding population, registered Thoroughbred and Standardbred stallions were selected for evaluation. Thoroughbred data were sourced from the national registry database maintained by The Jockey Club (Lexington, KY, USA). The dataset included all registered Thoroughbred stallions in the United States for the periods 1995-1999 and 2012-2017. For each year that a stallion appeared in the dataset, the zip code of the physical location of that stallion was provided. Standardbred data were obtained from the United States Trotting Association (USTA; Columbus, OH, USA), and the dataset included all registered Standardbred stallions in the United States across the fifteen-year period from 2002 to 2017. Unlike The Jockey Club, the USTA only retains location data at a state-level resolution. Therefore, zip code data had to be obtained on a state-by-state basis from organizations or regulatory bodies tasked with such record-keeping responsibilities at the state level. The initial year in the dataset for each breed (Thoroughbreds, 1995; Standardbreds, 2002) represents the earliest year in which complete digitized records were available from The Jockey Club and USTA, respectively. The geographic location provided by all of the national and state breed registries is where the stallions lived in the reference year and where mating/sperm collection was done.
For Standardbreds, organizations working to document and retain stallion location records are not active in all states. Even in states with higher Standardbred populations, where more than one established organization, agency, or other body may exist, records of location data were, for some states, incomplete. Therefore, comprehensive zip code data for registered Standardbred stallions could not be obtained for all 50 states. The Northeast/Mid-Atlantic region was selected for a more in-depth case study at the county level, as complete zip code data was available in each of the following states: Delaware (Delaware Standardbred Breeders' Fund), Maryland (Maryland Standardbred Race Fund), New Jersey (Standardbred Breeders and Owners Association of New Jersey), New York (New York Racing Commission), Pennsylvania (Pennsylvania State Horse State Gaming Commission), and Virginia (Virginia Harness Horse Association). Another reason to focus on the Mid-Atlantic region is that the states are relatively small and competition for the gambling dollar is high: racing customers and stallions can both move easily across the region's borders.
The datasets also included lifetime performance data for each stallion and its offspring. Multiple performance metrics are collected within an individual breed, and the individual breed association was consulted in order to determine the metric that would be most informative when evaluating performance quality. For Thoroughbreds, The Jockey Club provided records for races won in specific race classes (Grade 1 (G1), Grade 2 (G2), Grade 3 (G3), Black-Type (BT)). The highest race class in which a stallion won a race was used as the metric to designate the stallion's own performance quality. Progeny performance metrics included the number of graded and/or black-type winners sired by each stallion. For Standardbreds, USTA supplied lifetime stallion and progeny earnings records. To separate stallions by performance level, a performance index was created using the above-mentioned metrics. Because the performance parameters provided by each breed registry differed (race level for Thoroughbreds vs. money earned for Standardbreds), the performance indices for each were constructed separately (Table 1, Thoroughbreds; Table 2, Standardbreds). For the Thoroughbred data, the stallion's own performance record was scored as follows, with the race level representing the highest race class won by an individual stallion: 1 = G1; 2 = G2; 3 = G3; 4 = BT; and 5 = no recorded graded or BT wins. A second score was assigned for lifetime progeny performance. Progeny performance data provided by The Jockey Club indicated if a stallion had sired graded stakes or BT winners. However, the grade level (1-3) won by the offspring was not specified in the dataset. Therefore, progeny performance quality scores were defined as: 1 = sired graded stakes winners; 2 = sired BT winners, but no graded winners; 5 = did not sire any graded or BT winners. These stallion and offspring scores were then summed to determine the final combined performance quality score (CQS).
This CQS was then used to rank stallions based on level of performance quality. In the final performance index used for further data analysis in this study, Level One stallions had a CQS of 2-3. Level Two stallions had a CQS of 4-6. A CQS of 7-9 placed stallions in Level Three, while the lowest level, Level Four, had a score of 10, which represented stallions that neither won a graded stakes or BT race themselves nor sired any offspring winning at either of these levels. The total number of stallions assigned to each of the final performance index levels for both the initial year (1995 or 2002) and the final year (2017) for Thoroughbreds and Standardbreds can be found in Table A1.
Table 1. Performance index for Thoroughbred stallions. Combined Quality Score (note 4) by level: Level One, 2-3; Level Two, 4-6; Level Three, 7-9; Level Four, 10. Performance data for registered Thoroughbred stallions were provided by The Jockey Club. G = graded stakes race (1, 2, 3 indicate grade level); BT = black-type race.
Note 1: Each registered stallion's own lifetime race performance.
Note 2: Lifetime progeny performance for each stallion.
Note 3: Highest race class won by any offspring of each stallion (lifetime). Progeny performance data provided by The Jockey Club did not specify level of graded race won (grade 1-3), only if a graded, black-type or no graded/black-type winners had been sired by each stallion. Therefore, only scores of 1, 2, or 5 were assigned.
Note 4: The performance scores assigned for stallion performance and progeny performance were added to generate the combined quality score (CQS) for each stallion. The CQS was then used to rank stallions based on level of performance quality.
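The Thoroughbred scoring rules described above can be sketched in a few lines of Python. This is only an illustration of the published scoring scheme, not the SAS code used in the study; the function names and the string labels for race classes (`"G1"`, `"BT"`, `"graded"`, `None` for no wins) are assumptions introduced here.

```python
# Illustrative sketch of the Thoroughbred combined quality score (CQS).
# Labels and function names are hypothetical; score values follow the text.

# Stallion's own record: highest race class won (None = no graded or BT win).
STALLION_SCORE = {"G1": 1, "G2": 2, "G3": 3, "BT": 4, None: 5}

# Progeny record: best class of winner sired (grade level is not available).
PROGENY_SCORE = {"graded": 1, "BT": 2, None: 5}


def combined_quality_score(best_own_win, best_progeny_win):
    """Sum the stallion score and the progeny score (possible range 2 to 10)."""
    return STALLION_SCORE[best_own_win] + PROGENY_SCORE[best_progeny_win]


def performance_level(cqs):
    """Map a Thoroughbred CQS to performance Level One (1) through Four (4)."""
    if 2 <= cqs <= 3:
        return 1
    if 4 <= cqs <= 6:
        return 2
    if 7 <= cqs <= 9:
        return 3
    if cqs == 10:
        return 4
    raise ValueError(f"CQS outside the 2-10 range: {cqs}")


if __name__ == "__main__":
    cqs = combined_quality_score("G2", "BT")  # 2 + 2
    print(cqs, performance_level(cqs))
```

A dictionary lookup keeps the scoring table visible in one place, which makes it easy to compare against Table 1.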
It should be noted that any performance score for stallions whose location is specified for the earlier year of our trend analysis (1995 or 2002 depending on breed) incorporates information on subsequent performance of the stallion or its progeny that we possess, but which the breeder in that year did not. This is not a serious problem for making inferences about geographic behavior using the performance data that we have available. First, the final racing record for the stallion itself was known to the breeder in the majority of cases. Older animals dominate the equine age distribution in any given year, and very few members of these two breeds race after being put out to stud. Also known by the breeders in real time was the quality of the bloodline based on the performance of the stallion's ancestors. Finally, given the detail that our analysis requires on date, horse quality, and geographic location, no alternative to the lifetime performance data provided by the two breeding organizations exists.
Table 2. Performance index for Standardbred stallions. Combined Quality Score (note 3) ranges for the four performance levels, in table order: 2-5; 6-9; 10-13; 14. Performance data for registered Standardbred stallions were provided by the United States Trotting Association.
Note 1: Each registered stallion's own lifetime race performance.
Note 2: Lifetime progeny performance for each stallion.
Note 3: The performance scores assigned for stallion performance and progeny performance were added to generate the combined quality score (CQS) for each stallion. The CQS was then used to rank stallions based on level of performance quality.
After the CQS was calculated for each registered stallion, the number of stallions was summed over the geographic units of interest. For each breed, this aggregation task was done for the earliest and latest years for which complete data were available. For Thoroughbreds, geographic subtotals were calculated for the years 1995 and 2017. For Standardbreds, geographic subtotals were calculated for the years 2002 and 2017. All necessary match-merge operations and subtotal calculations were done using the SAS Statistical Software program (SAS Institute, Cary, NC, USA). The total number of registered stallions in each year of the evaluated periods (1995-2017, Thoroughbreds; 2002-2017, Standardbreds) were examined to confirm that the above specified years selected for analysis did not deviate from global trends for the stallion population within each breed. A representation of these total stallion trends for Thoroughbreds and Standardbreds can be found in Figures A1 and A2, respectively.
The Jockey Club's data on Thoroughbred stallions contained zip codes for each stallion-year combination. In this dataset, each stallion is located in only one zip code in a given year. Each zip code in the 1995 Thoroughbred file was matched to a unique county and state using the Geographic Correspondence Engine of the Missouri Census Data Center (MCDC) [46]. This online software tool provides zip code-county crosswalk files for the census years 1990 and 2000. The crosswalk files for both years were used in order to maximize the probability of finding a match (zip codes are created and retired over time). If a stallion's zip code of residence was split between two counties, the county that had the largest share of the zip code's census population was selected and the second county was ignored. After a county Federal Information Processing Standards (FIPS) code was assigned to each zip code in the 1995 file of registered stallions, a state code was assigned using state-county crosswalk files from the US Census Bureau. This second geographic crosswalk file is stable compared to the zip code files.
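The matching rule described above (take the county holding the largest share of a split zip code's population, and consult more than one crosswalk vintage) can be sketched as follows. This is a minimal illustration, not the SAS match-merge used in the study. The row layout `(zip_code, county_code, population)`, the function names, and the rule that the first crosswalk supplied takes precedence are all assumptions, and the codes in the usage example are placeholders rather than real zip or FIPS codes.

```python
# Hypothetical sketch: assign each zip code to a single county.

def dominant_county(crosswalk_rows):
    """Return {zip_code: county_code}, keeping the county with the largest
    population share when a zip code is split across counties."""
    best = {}
    for zip_code, county_code, population in crosswalk_rows:
        if zip_code not in best or population > best[zip_code][1]:
            best[zip_code] = (county_code, population)
    return {z: county for z, (county, _) in best.items()}


def assign_counties(zip_codes, *crosswalks):
    """Look up each zip code in several crosswalks, in the order given.

    Assumption: the first crosswalk containing the zip code wins. Zip codes
    found in no crosswalk map to None so they can be reviewed by hand.
    """
    lookups = [dominant_county(rows) for rows in crosswalks]
    result = {}
    for z in zip_codes:
        result[z] = next((lk[z] for lk in lookups if z in lk), None)
    return result


if __name__ == "__main__":
    # Placeholder codes, not real geography.
    older = [("Z1", "C1", 800), ("Z1", "C2", 200), ("Z2", "C3", 50)]
    newer = [("Z3", "C4", 10)]
    print(assign_counties(["Z1", "Z3", "Z9"], older, newer))
```

Returning `None` for unmatched zip codes, rather than dropping them, makes it easy to count how many records fall out of the match.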
The same procedure was used to assign standing Thoroughbreds to states in the year 2017. There is one difference, however. Instead of using zip code-county crosswalk files from the Missouri Census Data Center, we used the equivalent file for the first quarter of 2017 from the US Department of Housing and Urban Development (HUD) [47]. The Department maintains a continuous series of geographic crosswalk files for recent years, while the MCDC is the best source for older geographic files.
The data on national Standardbred stallions included a code for state of residence for each year. Therefore, it was not necessary to match zip codes to higher-level geography before creating subtotals for U.S. states. For the case study on six Northeast/Mid-Atlantic states, however, Standardbred data were collected at the zip code level (see above). These data were matched to county FIPS codes using a procedure similar to that described above for Thoroughbreds, except that the year 2000 MCDC crosswalk file was used for the year 2002 data, and the 2017: Q1 HUD crosswalk file was used for the year 2017 data [46,47].
For the six Northeast/Mid-Atlantic states, it was therefore possible to analyze the geographic distribution of both breeds across counties, not just across states. The Northeast/Mid-Atlantic region is the only part of the U.S. for which county geography is the geographic unit of analysis in this study.

Descriptive Data Analysis
The primary method of summarizing the data will consist of maps of the numbers or percentages of standing stallions in each areal unit in the earliest available year, which varies by breed, and in the most recent year, which is 2017 for both breeds. Subtotaled data were match-merged into the ESRI ArcMap GIS program for this purpose. In our maps of the 48 contiguous states and Puerto Rico, data bars for the early year and for 2017 are shown side by side for each state. These bars represent each state's percentage of the total number of stallions standing in each year. Because we use percentages and not raw counts, these national scale maps depict the change in the geographic distribution of stallions: the eye is not misled by the significant decline over time in the absolute number of registered stallions across all states.
In contrast, the county-level maps for the six Northeast/Mid-Atlantic states show actual head counts of stallions in each year, not percentages of the six-state total. Each year appears on its own map. The data were mapped this way in order to use graduated circles instead of bars. Graduated circles are easier to read when a high number of geographic units are clustered together in a small space. Notwithstanding the display of raw counts for the years side by side, these Northeast/Mid-Atlantic maps were designed to focus attention on cross-area distributions and not on the decline in absolute numbers of stallions over time.
The maps described above can tell interesting stories on their own. It is sometimes useful, however, to quantify geographic phenomena. In particular, we wish to quantify the extent to which stallions are concentrated in fewer states (counties) as opposed to many states (counties). We would like a single national metric in order to test the hypothesis that key business assets (registered stallions in our case) concentrate into a smaller number of geographic locations in response to a decline in total numbers.
The Herfindahl-Hirschman index (HHI) is the simplest measure of the concentration of countable objects across cells, categories, or jurisdictions. It is calculated as the sum of the squares of the shares that each jurisdiction has of the total national count. Because we use decimals, not percentages, to calculate the index, our HHIs are all less than or equal to 1.0, which is the index's maximum value if all stallions were to locate in a single jurisdiction. The data required as inputs to the HHI are precisely what we are mapping with our vertical bars shown on national maps. The HHI provides a single measure of spatial concentration for the entire U.S. or for the entire six-state Northeast/Mid-Atlantic Region.
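As a concrete illustration of the definition above, the raw HHI can be computed from a list of head counts, one per jurisdiction. This is a minimal sketch with a hypothetical function name; the counts in the usage example are invented for illustration and are not data from this study.

```python
def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared shares (decimal form).

    `counts` holds one head count per jurisdiction; jurisdictions with zero
    stallions contribute nothing to the sum.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one stallion is required")
    return sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    # Invented counts: four jurisdictions with equal numbers of stallions,
    # then the same total concentrated in a single jurisdiction.
    print(hhi([25, 25, 25, 25]))
    print(hhi([100, 0, 0, 0]))
```

With equal counts in four jurisdictions each share is 0.25 and the index is 0.25; with everything in one jurisdiction the index reaches its maximum of 1.0.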
An index such as the HHI must be used with caution when one wishes to compare the degree of concentration across cases where the overall count is very different. Imagine two scenarios in which there is never more than one stallion per county. In the 1995 scenario, assume that there are exactly 310 stallions distributed across the 310 counties in the six Northeast/Mid-Atlantic States. In the 2017 scenario, however, imagine that there are only 100 stallions, which are distributed across these same 310 counties at a rate of at most one stallion per county. In 1995, the smallest possible HHI that could exist would be $310 \times (1/310)^2 = 1/310 \approx 0.0032$. In 2017, however, the smallest possible HHI that could exist would be $100 \times (1/100)^2 = 1/100 = 0.01$. The second of these numbers is larger than the first, in spite of the fact that both scenarios represent the maximum possible dispersion of a set of indivisible units, horses, across a set of 310 counties.
This total size problem is not serious in our dataset, because the number of stallions for which we calculate HHI never falls below the number of cells over which they might hypothetically be distributed. The closest we get to this threshold is 69 stallions distributed over 50 states.
In order to be conservative with any conclusions we draw based on the HHI, we will adjust each calculated index to control for total head count at the national or regional level. Specifically, we rescale the HHI so that it can take any value between 0 and 1, but 0 is set equivalent to the HHI minimum that is specific to each national/regional total. The formula for this adjusted HHI is as follows:
$$\mathrm{HHI}_{adj} = \frac{\mathrm{HHI}_{raw} - \mathrm{HHI}_{min}}{1.0 - \mathrm{HHI}_{min}}$$
where $\mathrm{HHI}_{raw}$ is calculated without any adjustment and 1.0 is the universal HHI maximum. The minimum corresponds to the most even possible distribution of whole stallions across jurisdictions:
$$\mathrm{HHI}_{min} = \frac{N(1 - F)\,W^2 + N F\,(W + 1)^2}{T^2}$$
where N = number of jurisdictions over which the stallions could be distributed; T = total number of stallions; W = T/N rounded down to the nearest whole number; F = T/N − W, the fractional part of T/N.
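The rescaling described above can be sketched in Python under one stated assumption: the HHI minimum for a given total is the index of the most even possible allocation of whole stallions, in which every jurisdiction holds either W or W + 1 animals. The function names are hypothetical, and this is an illustration rather than the authors' implementation.

```python
def hhi_min(total, n_cells):
    """Smallest HHI attainable when `total` indivisible stallions are spread
    as evenly as possible over `n_cells` jurisdictions (assumed definition)."""
    w, remainder = divmod(total, n_cells)  # w = T/N rounded down
    # `remainder` cells hold w + 1 stallions; the other cells hold w.
    squares = (n_cells - remainder) * w ** 2 + remainder * (w + 1) ** 2
    return squares / total ** 2


def hhi_adjusted(counts):
    """Rescale the raw HHI so that 0 is the size-specific minimum and 1 is
    the universal maximum."""
    total, n_cells = sum(counts), len(counts)
    raw = sum((c / total) ** 2 for c in counts)
    lowest = hhi_min(total, n_cells)
    if lowest >= 1.0:
        raise ValueError("index is undefined when only one stallion exists")
    return (raw - lowest) / (1.0 - lowest)


if __name__ == "__main__":
    print(hhi_min(310, 310))  # one stallion in every county
    print(hhi_min(100, 310))  # 100 stallions, at most one per county
    print(hhi_adjusted([5, 0, 0, 0]))
```

The two `hhi_min` calls reproduce the 310-county scenarios from the text: 1/310 for 310 stallions and 0.01 for 100 stallions.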

Causal Data Analysis
We conducted two sets of analyses to test hypotheses that relate our geographic measures to: (1) the economic rents earned by different classes of horses, or across different states; and (2) state-level policies that incentivize local breeding and help to retain stallions over time. Our hypotheses about the relationship between geographic clustering and economic rents are addressed largely by comparing HHI results for high performing stallions to the results for lower performing stallions. We also use Census equine sales data for the largest Thoroughbred states to explore the relationship between economic rents and success at retaining Thoroughbred stallions over our study period. This analysis is preliminary, and the method is described in the Results section. We describe here the method that we used to test our hypothesis linking supportive state policies to stallion retention over a very challenging couple of decades.
A statistical test on the effect of state policies requires data on changes in stallion numbers (see above), as well as a judgement call on the extent to which each state's portfolio of policies is supportive of local breeders. To make the policy characterization portion of this research project manageable, we focused only on the top ten states for each breed in the earliest available year, 1995 or 2002. Some states ranked in the top ten for both breeds, bringing the combined number of states examined to fifteen.
We collected information on policies related to breeder incentive programs, presence of casinos and racinos, gambling revenues, gaming activities at the track, and state subsidies. We classified as "supportive" (S) all states that (1) funded breeding incentive programs or purse supplements using monies other than the pari-mutuel handle, and (2) implemented these incentive programs early enough in our study period to affect the trends we measure in this article. States that did not meet both of these conditions were classified as "not supportive" (NS). Two states, Kentucky and Illinois, met our supportiveness criteria for Thoroughbreds, but not for Standardbreds (Kentucky supports Standardbreds, but at a much lower percentage of the relevant revenue stream, 15%). In the analysis below where geographic data for both breeds are combined, these two states will be classified as "mixed." With these policy categorizations in hand, we then created two numeric measures of each state's degree of success or failure at retaining stallions over the study period. We calculated the change in each state's national rank between the earliest year in the data and the latest year, using number of head as the ranking criterion. For each breed we also calculated the percentage point change in each state's national share between the earliest available year and the latest available year. (These are the numbers shown in the right-most columns of Tables A2 and A5.) These state-specific success variables were compared across policy categories using box plots and t-tests on the differences between means, although the feasibility of the t-test varied by breed. We were unable to conduct hypothesis tests on the ten-state Thoroughbred sample, because only two states fall into the NS category. In contrast, the top ten Standardbred states are divided evenly between the S and NS policy categories.
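The two retention measures described above can be sketched as follows. This is an illustration only: the function names are hypothetical, ties in rank are broken alphabetically as a simplification, and the sign convention (a positive rank change means the state moved up the list) is an assumption, since the text does not specify one. The state labels and counts in the usage example are invented.

```python
def ranks(counts):
    """Rank jurisdictions by head count, 1 = most stallions.
    Ties are broken alphabetically (a simplification)."""
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {state: i + 1 for i, state in enumerate(ordered)}


def retention_measures(early, late):
    """Return {state: (rank_change, share_change_pp)}.

    rank_change     = early rank minus late rank (positive = moved up).
    share_change_pp = late national share minus early national share,
                      in percentage points.
    """
    states = sorted(set(early) | set(late))
    e = {s: early.get(s, 0) for s in states}
    l = {s: late.get(s, 0) for s in states}
    e_total, l_total = sum(e.values()), sum(l.values())
    e_rank, l_rank = ranks(e), ranks(l)
    return {
        s: (e_rank[s] - l_rank[s],
            100 * l[s] / l_total - 100 * e[s] / e_total)
        for s in states
    }


if __name__ == "__main__":
    # Invented counts for three placeholder states.
    early = {"A": 50, "B": 30, "C": 20}
    late = {"A": 30, "B": 5, "C": 15}
    for state, (d_rank, d_share) in retention_measures(early, late).items():
        print(state, d_rank, round(d_share, 1))
```

In the invented example, state A loses 20 head yet gains 10 percentage points of national share, which is the pattern the share-based measure is designed to capture during a national decline.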
Table 3 below shows the full sample of states used for our analysis of the effects of state breeder incentive policies, together with our judgement of the degree of policy supportiveness by state and breed. It lists the top ten states (by number of stallions) in the early year for each breed and also shows the change in each state's rank for that breed on the list of all 49 states. When all fifteen states are used, for example in a t-test, both breeds' changes in national share are pooled in the dataset: any distinction between breeds is lost.

Results and Discussion
The figures and tables discussed in this section will chronicle changes in the national and regional distribution of registered stallions.
Depicted in Figure 1 is the map comparing the distribution (as a percentage of the total) of Thoroughbred stallions in the United States in 1995 and 2017. In 1995 there were 5203 Thoroughbred stallions in the U.S., while in 2017 that number was reduced to 1710, a decline of 67%. A numeric version of the map in Figure 1 may be found in Table A2.
There were increases from 1995 to 2017 in the percentage of Thoroughbred stallions standing at stud in Florida, Indiana, Kentucky, Louisiana, Ohio, and West Virginia. While Kentucky was not the sole dominant state in 1995, it was the big relative gainer over the 22-year period, in the face of national decline. Increases in the percentage of standing Thoroughbred stallions are also evident in the neighboring states of Indiana, Ohio, and West Virginia, where the purse and breeder incentive structure is supported by revenue from alternative gaming such as slot machines and table games [48][49][50]. Our analysis found that a significant percentage of Indiana and Ohio Thoroughbreds are standing near the Kentucky border, suggesting that there is a three-state cluster centered on the cultural heartland of the breed in Fayette County, Kentucky. In contrast, more than two-thirds of West Virginia's Thoroughbreds are standing far to the east, in a county that is almost completely surrounded by Maryland and Virginia. They are therefore part of a cluster of Thoroughbreds in those states. Shown in Figure 2a,b are maps comparing the distribution (as a percentage of the total) of top-tier versus lower-tier Thoroughbred stallions in the U.S. in 1995 and 2017. For these maps, top-tier is defined as Level 1 or 2 based on our CQS, while lower-tier is defined as Level 3 or 4. Detailed state-by-state data for each tier can be found in Table A3 (top-tier) and Table A4 (lower-tier). Although lower-tier Thoroughbred stallions are located in most states throughout the U.S., it is evident that Kentucky is dominant when it comes to standing top-tier Thoroughbred stallions in the U.S. for both years.
had no registered Standardbred stallions in either year. Attraction of at least one member of this breed was possible over the study period, however, so the observation of no change in national share conveys information. For these states, we assume that the supportiveness category for Thoroughbreds applies for Standardbreds. The results of our formal hypothesis tests are, in any case, fully robust to the inclusion or exclusion of these two observations (see results section).

Kentucky not only dominates the nation for top-tier Thoroughbreds, it is also the only state that significantly increased its share of these stallions in the face of national decline. Looking at Figure 2b for lower-tier Thoroughbred stallions, the neighboring states of Kentucky, Ohio, Indiana, and West Virginia have done well over time, but so have the states of Florida, Louisiana, and Arkansas.
While Kentucky historically has been a leader in the breeding of Thoroughbred racehorses in the U.S. and has been able to survive the lack of alternative gaming revenue support for horseracing, this has not been the case for the Standardbred breeding business in the Commonwealth. The relative decline of the Kentucky Standardbred industry is detailed in Garkovich [24] and is also shown on the map that follows.
Figure 3 compares the cross-state distribution (as a percentage of the total) of Standardbred stallions in the United States in 2002 and 2017. In 2002 there were 1082 registered Standardbred stallions in the U.S., while in 2017 that number was reduced to 681, a reduction of 37%. It is evident from Figure 3 that Standardbred stallions are more clustered nationally than Thoroughbred stallions: they are primarily located east of the Mississippi. There were increases in the percentage of Standardbred stallions in Delaware, Maryland, Ohio, Pennsylvania, and Iowa, where purse and breeder incentive structure is supported by revenue from alternative gaming such as slot machines and table games [51][52][53][54][55][56].
The same is not true in the states of Kentucky, New Jersey, and Michigan, which experienced marked declines in their share of Standardbred stallions over the 2002-2017 period [16,40,57,58]. A complete numeric comparison between distribution by state is displayed in Table A5.
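The national declines quoted in this section (67% for Thoroughbreds, 37% for Standardbreds) follow directly from the reported head counts. A small check of the arithmetic is given here; the function name is illustrative only.

```python
def percent_decline(early, late):
    """Percentage decline from the early-year count to the late-year count."""
    return 100.0 * (early - late) / early


# Head counts as reported in the text.
thoroughbred = percent_decline(5203, 1710)  # 1995 vs. 2017
standardbred = percent_decline(1082, 681)   # 2002 vs. 2017

print(round(thoroughbred))  # 67
print(round(standardbred))  # 37
```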
Selecting a more elite group at the top end shows more clearly the difference in geographic change over time by quality level. Ohio was the biggest gainer overall for this breed, but its relative success over the study period is even more dramatic when the analysis is restricted to Level 1 stallions (Figure 4a). A full numeric summary of top- and lower-tier Standardbred stallions can be found in Tables A6 and A7, respectively.
We have argued that the geographic tendency in racehorse breeding, especially under conditions of secular decline, is toward a winner-take-all outcome. If Kentucky was the big national winner in Thoroughbreds from 1995 to 2017, Ohio was the big national winner in Standardbreds. Like Kentucky, it had a special edge in retaining the top-tier stallions, suggesting that agglomeration is even more important when the economic returns are high. This could not have happened if Ohio had not implemented purse and breeding incentives funded from alternative gaming.
We also examined states in the Northeast/Mid-Atlantic region to assess county-level changes in the stallion population for both Thoroughbreds and Standardbreds. Changes are depicted in Figure 5 for Thoroughbreds, and Figure 6 for Standardbreds. In both cases there is not only a decrease in the total number of stallions standing at stud, but also in the number of breeding farms still in existence in the first few years of the current decade [57].
Both breeds show a similar trend in the geographic distribution of stallions over time. First, small numbers of stallions standing isolated in western New York and Pennsylvania disappeared from the national registries. Instead, both breeds appear to have contracted to a high-density core in the Chesapeake-Susquehanna region. A few counties in New York's Hudson Valley region also held their own over the study period.
Specific state-level policy choices can help to explain these patterns. New York, Pennsylvania, and Delaware now have a lucrative purse and breeder incentive structure supported by alternative gaming [51,54,55,59,60]. The big loser over the study period was New Jersey, a state that lacks these incentives [16,57]. An important reason for this is that New Jersey pioneered legal gambling in stand-alone casinos [16]. Located in Atlantic City, New Jersey's casino industry has opposed alternative gaming at the state's racetracks, the key source of breeder incentives in other states. For this reason, Malinowski and Avenatti [57] predicted that New Jersey could lose much of its equine agribusiness, which generated USD 780 million of economic impact annually, USD 110 million in federal, state, and local taxes, and 57,000 acres of working agricultural landscape and open space. Judging by the latest figures on standing stallions, it appears that the Malinowski-Avenatti prediction was beginning to come true [16]. It should be noted that in 2019, subsequent to the time period examined in the current study, the New Jersey state legislature passed legislation authorizing an appropriation from the state budget of USD 20 million annually for a period of five years in support of Thoroughbred and Standardbred racing in the state [61]. Future study of the impact of this influx of capital into horse racing, and of its downstream effects on the breeding segment of the industry, is warranted.
Table 4 summarizes the change in the geographic distribution of all kinds of registered stallions over time, using the size-adjusted Herfindahl-Hirschman index (HHI) described in the section on materials and methods. As previously discussed, these changes in spatial concentration must be considered in light of the decline in the number of stallions standing nationwide, including the six states selected for county-level analysis. It is not inevitable that a measure of spatial concentration would increase under these conditions. An equal percentage decline in number of head across all geographic units, for example, would leave any concentration index unchanged.
The HHI results in Table 4 show that Standardbred stallions are consistently more concentrated than Thoroughbred stallions, as already stated above. This is also true across the 310 eastern counties, although the difference between breeds is less dramatic at this scale, especially in 2017. Ideally, one would like to compare pairs of HHIs using a test of statistical significance. Djolov [62], however, argues that the study of such a test is in its infancy, that calculations are complicated (requiring a conversion to the Gini coefficient), and that results in applied contexts are difficult to interpret.
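The point that a uniform percentage decline leaves a concentration index unchanged can be illustrated with the plain Herfindahl-Hirschman index (the sum of squared shares). The sketch below does not reproduce the size-adjusted variant used for Table 4, which is defined in the materials and methods section and not repeated here; all head counts are hypothetical.

```python
def hhi(counts):
    """Plain Herfindahl-Hirschman index: sum of squared shares (0 to 1)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("HHI is undefined when the total count is zero")
    return sum((n / total) ** 2 for n in counts)


# Hypothetical head counts across five geographic units in the early year.
early = [40, 30, 20, 5, 5]

# Scenario 1: every unit loses exactly half of its stallions.
uniform_decline = [n * 0.5 for n in early]

# Scenario 2: same national total as scenario 1, but the small units
# disappear and the largest unit retains most of its stock.
consolidated = [35, 15, 0, 0, 0]

print(hhi(early))
print(hhi(uniform_decline))  # equal to the early-year value
print(hhi(consolidated))     # higher: concentration has increased
```

Shares are unaffected when every count is scaled by the same factor, so only an uneven decline, as in the second scenario, raises the index.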
A second key finding from Table 4 is that top tier stallions are always more concentrated across US states than lower tier stallions, controlling for any bias that may be imparted by the total count. (This fact can be seen in the national maps as well.) This finding confirms the idea that observed agglomeration in this agricultural sector is correlated with value-added production generating significant economic rents.
We define economic rents as above-average sales prices and stud fees that owners can earn due to the genetic advantages certain horses have as racers, combined with expert training and management practices that allow these horses to perform at their maximum potential. Our definition of top-tier stallions is based on win-loss records, lifetime earnings, and the racing performance of progeny, all of which are likely to be correlated with stud fees. The top-tier categorization is therefore a proxy for economic rents. It follows that Table 4's rows for top-tier stallions support a hypothesis linking economic rents in the racehorse breeding industry to both static geographic concentration and to increases in concentration over time.
Additional information on economic rents may be found in the U.S. Census of Agriculture. The Census reports equine sales by state, and this figure can be averaged over the number of head sold. Agricultural Census data cover all breeds. We assume that, compared to Standardbreds, Thoroughbreds are numerous enough that their sale prices might affect the aggregate dollar sales reported in the Census of Agriculture. The Thoroughbred stallions and breeders in our sample might also benefit, via shared resources and knowledge-sharing, from economic rents earned by the much larger number of Census-enumerated horses that are sold near them, including all breeds, uses, and genders.
Table A1). The relationship between the two variables is positive and resembles a logarithmic function. Although the sample is quite small and Kentucky (far right) is a high-leverage observation, Figure 7 supports a hypothesis that links change in geographic concentration to state-level economic rents earned throughout the equine industry. Data on sales prices and stud fees for Thoroughbreds alone would be the best way to measure rents in this figure, but those data are not currently available for states.
The final column of Table 4 depicts a finding that is especially interesting in light of modern cluster theory. It shows that in response to overall decline, and when adjusted for total head count, concentration across spatial units increased uniformly across both breeds, and at both of the geographic scales examined. Instead of a uniform percentage decline in all pre-existing jurisdictions, smaller states and counties disappeared from the map (literally, in the case of Figures 5 and 6), while a small handful of jurisdictions with significant critical mass in the early year increased their share of stallions. If we restrict our attention to top-tier stallions, there is really only one winning jurisdiction for each breed.
Having said that, state policies and subtle differences in relative attractiveness must have played a role in which specific jurisdictions would gain share. In the case of all Thoroughbreds, for example, Kentucky emerged in 2017 with most of the share (if not the actual animals) that had been ceded by all of the small states, while California did not. This is true even though California had a larger share of all Thoroughbreds than did Kentucky in 1995.
For Standardbreds, we see that Ohio emerged as more of a winner in 2017 than Indiana, in spite of their similar starting points in 2002. This may be attributable to the timing of the influx of revenue to horseracing from alternative gaming. Ohio opened its first racino more recently, in 2012, whereas Indiana opened its first racino in 2008, and it is possible that the peak in reinvestment is currently occurring in Ohio but has waned in Indiana [63,64]. Similarly, there were more Standardbred stallions standing in York County, Pennsylvania in 2017 than in a past leader in Standardbred breeding, Monmouth County, New Jersey. This is most likely because New Jersey does not have a lucrative purse structure and breeder incentive program comparable to that of the neighboring state of Pennsylvania [16,57].
An analysis of the relationship between state policies and geographic outcomes for breeding stallions can, of course, be more systematic and less anecdotal than the stories told immediately above. Figure 8 below shows the change in national rank for the number of Standardbred stallions, by the Standardbred policy category reported in Table 3, for the top ten Standardbred states in 2002. (The vertical axis is an inverted scale for change in rank, ensuring that Figures 8-10 have the same interpretation for the success variables, lower on the graph being worse). The change in national rank for each policy category, each containing exactly five states, is shown using box and whisker plots. In spite of the small sample size for each category, the means of the success variables for each policy category are different from each other according to a simple t-test: the p-value of this test is 0.04.
Figure 9 presents an analysis identical to Figure 8, except that the success variable is the change in national share of Standardbred stallions, rather than change in national rank. The means of the two policy categories are also significantly different from each other according to a t-test, with a p-value of 0.03. As stated in Section 2.3, we did not repeat Figures 8 and 9 and the related t-tests for Thoroughbreds.
Figure 10 below shows the boxplot results for both breeds together. Total sample size is fifteen states that ranked high on head count for either breed (see Table 3). The changes in national share over the study periods, 1995-2017 for Thoroughbreds and 2002-2017 for Standardbreds, are pooled together in this figure. This means that an analysis of the difference between means in the success variable could be based on as many as 15 × 2 = 30 observations. More specifically, the S category in Figure 10 includes 18 observations, the MIX category includes four observations, and the NS category includes eight observations. It makes most sense to conduct a t-test on the difference between the means of the S and NS categories in Figure 10, giving a total sample size of 26. This t-test of change in national share for the two breeds combined is significant with a p-value of 0.01. (As stated in note 4 for Table 3, removing the Standardbred observations for Washington and Oklahoma does not change this fundamental result, increasing the p-value by only 0.004).
Figures 8-10 and their associated t-tests allow us to conclude that the relationship between supportive state policies and success at retaining breeding stock is more than anecdotal. A more complete empirical analysis of this issue would require a more detailed dataset on state policies, including supplemental appropriations measured in dollars. For example, Neibergs and Thalheimer compared the effectiveness of supplements to non-restricted purses to local breeder subsidies, both of which could be classified as supportive [65]. One could also conduct the study using more states, although we would be concerned that the smaller states are idiosyncratic. Most U.S. states are not truly engaged in the racing industry at the "national level": they may not engage in the interstate competition for breeding stock that is implied by the program impact analysis presented here.
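The comparison of means between policy categories can be sketched as follows. The text says only that a "simple t-test" was used, so the Welch (unequal-variance) form below is an assumption, and the share changes are hypothetical values, not data from Table 3. Only the category counts (18, 4, and 8) come from the text. The standard library has no t-distribution, so the sketch stops at the test statistic and degrees of freedom; a statistics package would be needed to obtain a p-value.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical percentage point changes in national share, five states each.
supportive = [1.5, 0.8, 2.1, 0.3, 1.0]
not_supportive = [-1.2, -0.4, -2.0, 0.1, -0.9]
t_stat, dof = welch_t(supportive, not_supportive)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")

# Observation counts for the pooled analysis, as reported in the text.
pooled_counts = {"S": 18, "MIX": 4, "NS": 8}
print(sum(pooled_counts.values()))                # 30 = 15 states x 2 breeds
print(pooled_counts["S"] + pooled_counts["NS"])   # 26 used in the S vs. NS test
```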

Conclusions
In this study, we cited several references arguing that a thriving equine industry can generate side benefits related to landscape conservation, ecosystem services, and rural amenities. If a region wants to achieve this environmental definition of sustainability, however, its equine industry must be sustained.
Racehorses sit at the high-value end of the industry, so their presence requires a dense agglomeration of high-quality suppliers. For a number of reasons, racehorse breeding is likely to exhibit agglomerative behavior that benefits from increasing returns to scale. Virtuous cycles of growth or vicious cycles of decline may set in, and they can be independent of fixed place characteristics, like the quality of the local bluegrass. Registered stallions standing at stud are key assets in this industry: they can be used as a bellwether of cluster success or failure.
We have shown in this study that under conditions of secular decline, measures of overall concentration tend to increase. A previously large region can become a lower-risk refuge, and can significantly increase its share of stallion assets. If there is more than one jurisdiction that operated at scale before decline, however, the ultimate geographic winner will be highly contingent on state-level policies. Revenue streams for purses and breeder incentive programs appear to be significant determinants of which states became high-share refuges, and of the decline that can occur in states that lack sufficient revenue streams for these programs. For the most part, racehorse clusters in the U.S. are fragile and can only be sustained using programs that reward breeders financially on the basis of state of residence. The horse racing sector is worth saving nationwide, not only because of its long and prominent history in U.S. sport, but because it is an economic engine of the entire U.S. horse industry and contributes to quality of life in the form of agricultural working landscape.
This study confirmed several hypotheses arising from industry cluster theory; most notably, a correlation between the level of geographic concentration of stallions and the magnitude of economic rents and returns to tacit knowledge, holding other things equal. The study has begun an interesting inquiry into a subject not often studied by cluster theorists: change in geographic concentration when an industry with more than one cluster at the national level begins to decline. It would be useful to contrast this industry with another declining industry, like automotive. There, too, state policy plays an important role in the observed distribution, as there have been fewer plant openings recently in union states than in states that have passed right-to-work laws. State policy makers may have more control over the three elements of the local TBL than they realize.
Appendix C