Selection signatures in four German warmblood horse breeds: Tracing breeding history in the modern sport horse

The study of selection signatures helps to find genomic regions that have been under selective pressure and might host genes or variants that modulate important phenotypes. Such knowledge improves our understanding of how breeding programmes have shaped the genomes of livestock. In this study, 942 stallions were included from four exemplarily chosen German warmblood breeds with divergent historical and recent selection focus and different crossbreeding policies: Trakehner (N = 44), Holsteiner (N = 358), Hanoverian (N = 319) and Oldenburger (N = 221). These breeds are nowadays bred for athletic performance and aptitude for show-jumping, dressage or eventing, with a particular focus of the Holsteiner on the first discipline. Blood samples were collected during the health exams of the stallion preselections before licensing and were genotyped with the Illumina EquineSNP50 BeadChip. Autosomal markers were used for a multi-method search for signals of positive selection. Analyses within and across breeds were conducted using the integrated Haplotype Score (iHS), cross-population Extended Haplotype Homozygosity (xpEHH) and Runs of Homozygosity (ROH). Oldenburger and Hanoverian showed very similar iHS signatures, but breed specificities were detected on multiple chromosomes with the xpEHH. The Trakehner clustered as a distinct group in a principal component analysis and also showed the highest number of ROHs, which reflects their historical bottleneck. Besides breed-specific differences, we found shared selection signals in an across-breed iHS analysis on chromosomes 1, 4 and 7.
After investigation of these iHS signals and shared ROH for potential functional candidate genes and affected pathways including enrichment analyses, we suggest that genes affecting muscle functionality (TPM1, TMOD2-3, MYO5A, MYO5C), energy metabolism and growth (AEBP1, RALGAPA2, IGFBP1, IGFBP3-4), embryonic development (HOXB-complex) and fertility (THEGL, ZPBP1-2, TEX14, ZP1, SUN3 and CFAP61) have been targeted by selection in all breeds. Our findings also indicate selection pressure on KITLG, which is well-documented for influencing pigmentation.


Introduction
Since the early onset of domestication, humans have shaped livestock species according to their purposes and current needs. Especially since the establishment of studbooks and the definition of explicit breeding goals and programmes, selection pressure has increased [1].
Regardless of whether horses (Equus caballus) were used for warfare, transportation, farming or sports, the emphasis has first and foremost been on physical performance. During the 20th century, warmblood horses have increasingly been used and bred for competitive sports disciplines such as show-jumping, dressage and eventing.
For these three disciplines, the World Breeding Federation for Sport Horses annually releases rankings of the internationally most successful studbooks. The German warmblood breeds Holsteiner, Hanoverian, Oldenburger and Trakehner have consistently belonged to the top segment in at least one discipline. In Germany, the Hanoverian and Oldenburger studbooks are the two largest breeding associations in terms of the number of registered broodmares and sires, whilst the Holsteiner and Trakehner studbooks rank 4th and 6th [2]. Taken together, the four breeds account for two thirds of the warmblood horse breeding population in Germany.
Currently, the different warmblood horse breeds in Germany essentially share selection goals regarding conformation, locomotion and aptitude for different sport disciplines. However, every breed has, in the course of time, been subjected to specific selection pressures. Thus, the four breeds Holsteiner, Trakehner, Oldenburger and Hanoverian serve as representatives of modern sport horses with divergent breed histories.
From the very start, the Trakehner breeding goal was to create riding horses, initially for cavalry, and the breed has not undergone a change in utilization like the other three. The Trakehner also went through a severe bottleneck shortly after the Second World War when the population shrank from over 25,000 to about 1,500 breeding animals [3]. Compared with the other three breeds, Trakehner horses have been close to purebred for 250 years. Foreign sires are only seldom accepted into the studbook and generally English thoroughbreds and Arabians are used for refinement [4]. The proclaimed Trakehner breeding goal is a multitalented leisure and sport horse. The breed has a longstanding tradition in cross country riding and eventing and the breeding programme includes (optional) special performance tests for this discipline [4].
Hanoverian horses were originally bred primarily for use in agriculture and secondarily for military purposes. After the Second World War, the breeding orientation changed towards a lighter riding horse, and therefore Thoroughbreds and Trakehner were increasingly included in the breeding scheme [5].
The Oldenburger breed was primarily intended for carriage driving, and heavier warmblood horses were favoured [6] in the early 20th century. In contrast to the Hanoverian, the Oldenburger studbook remained closed and practised pure breeding for a relatively long time and started breeding for lighter riding horses only in the 1950s [7,8]. Since then, the Oldenburger breeding goal has been a powerful high-performance sport horse with aptitude for all kinds of disciplines [9], analogous to the Hanoverian studbook that selects for an aptitude for show-jumping, dressage, eventing or carriage driving [10].
Nowadays, Hanoverians and Oldenburger both have a specialised breeding programme for show-jumping, although their formats differ. The Hanoverian studbook opened a specialised jumping programme in 1993 that promotes the pairing of broodmares and sires with proven suitability for this discipline [5].
In 2001, the studbook Oldenburg International was founded, which is oriented towards show-jumping [11], so that the original studbook can predominantly breed for dressage aptitude. Both studbooks can operate independently from one another but belong to the Oldenburger breeding association.
In contrast to Trakehner, Hanoverian and Oldenburger accept sires from a number of different warmblood horse breeds for refinement, as long as their selection criteria are met. English thoroughbreds and Arabians are also acceptable breeds for refinement.
Historically, Holsteiner horses have been primarily used as draught horses in agriculture and transportation and have rarely been selected for riding. In the middle of the 20th century the breeding goal shifted from a use in agriculture to sports and today they have an explicit focus on show-jumping. To refine the breed, English thoroughbreds, Arabians and French warmblood horses may be accepted and, in case of special aptitude for jumping, also sires from other warmblood breeds [12].
Particularly in the Holsteiner and Hanoverian breeds, the intensive use of a few sires in the 20th century possibly gave rise to popular sire effects [13,14].
Considering the clearly sports-oriented current breeding programmes of all four studbooks in question, we hypothesized that selection pressure on genes relevant for athleticism and suitability for one of the major disciplines (show-jumping, dressage, eventing) should be reflected on a molecular genetic level. Sorbolini et al. [15] demonstrated in cattle that breeds, in spite of similar phenotypes and breeding goals, still have divergent selection signatures due to historic differences. We expected to see a similar phenomenon in sport horse breeds, potentially due to historically divergent main breeding goals.
When an advantageous allele is favoured in the selection process it usually segregates together with neighbouring, so-called hitchhiking alleles. Selective sweeps occur when such genomic segments spread over generations throughout the population due to artificial or natural selection, consequentially bringing about a reduction of genetic variation in those parts of the genome [16]. The study of selective sweeps can therefore give insights into the historical development of populations and is valuable for the unravelling of the functional, genetic background leading to phenotypic variation [17].
Many approaches based on intra- and inter-population statistics have been successfully applied to humans [18,19] as well as domesticated animals [20].
Runs of Homozygosity (ROH) refer to continuously homozygous segments in the genome and have already led to the identification of genomic regions and putative candidate genes that are under selection in domestic animals [21-23]. In Haflinger horses this method has also been applied to assess breed history and development [24,25]. A previous ROH study comprising divergent horse breeds, which have been subjected to very different degrees of selection pressure, suggested genes to be targeted that influence metabolic, developmental and neurological processes as well as pigmentation and fertility [26].
The integrated Haplotype Score (iHS) and the cross-population Extended Haplotype Homozygosity (xpEHH) are two other methods for the detection of selection signatures based on haplotype information. The iHS is particularly suitable to detect incomplete sweeps within populations, whereas the xpEHH can better be used to detect (nearly) complete sweeps, i.e. sites that are still polymorphic in one population but are fixed in another [16]. Both approaches have been applied in different horse breeds such as Asian [27] and Shetland ponies [28], where growth, height, feed efficiency and fat deposition related genes appeared to have been under selective pressure. Furthermore, racing performance and locomotion have been targeted in gaited breeds and Quarter Horses [29].
In thoroughbred horses, the search for selection signatures revealed regions that harbour genes associated with muscle strength, energy pathways, insulin signalling, and lipid metabolism, which reflects their breeding for racing performance [30]. The selection for athletic performance in Quarter Horse populations also appears to have put selective pressure on metabolism, next to skeletal muscle development and the central nervous system [31].
Clearly, selection for athletic performance has left traces in the genome of different horse breeds and we hypothesized that similar developments have occurred in the European warmblood horse.
The aim of this study was to identify genomic regions under positive selection within and across warmblood horse breeds. We further sought to elucidate whether differences in breed histories can be detected through selection signatures. Based on the detected selection signatures we intended to present candidate physiological processes and putative candidate genes for phenotypic traits that have been of special interest to breeders.
[...] Table 1). The study made exclusive use of existing data collected for a previous project and no specific sampling was conducted. Blood samples were taken by licensed veterinarians as part of the mandatory health and parentage check in the licensing procedure for stallions in Germany. Since the health and parentage checks are legally mandatory for stallion licensing, no ethical approval procedure was necessary. All animals had passed an initial inspection, but the sampling was independent of passing the health check subsequent to the initial inspection and of the final licensing decision. To pass the first inspection, stallions need to be free of deficiencies in conformation and movement and need to have a pedigree that fits the individual studbook requirements. Stallions presented for preselection are generally 2.5 to 3 years of age. The sample includes stallions that stem from show-jumping and dressage lines as well as stallions with a presumed aptitude for eventing. Stratification due to breeding lines for show-jumping or dressage aptitude can be neglected [32]. The EDTA-stabilized blood samples were used as sources of DNA for genotyping on the EquineSNP50 BeadChip (Illumina Inc., CA). In the Illumina Genome Studio used for the analyses, SNPs were filtered out if they had a minor allele frequency (MAF) < 0.01, a call frequency < 0.9 or p(χ²) < 0.00001 for Hardy-Weinberg equilibrium. After filtering, 48,410 SNPs (overall genotype call rate of 99.879 percent) on 31 autosomal chromosomes remained for statistical analysis.
Allosomes were not considered, because no Y chromosome data were available and allosomes would not enable homozygosity based analyses in male individuals.
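The three SNP filters described above can be illustrated with a minimal Python sketch. It is not the Illumina Genome Studio procedure: the function name `snp_passes_qc`, the 0/1/2 genotype coding and the use of a one-degree-of-freedom χ² goodness-of-fit test are assumptions made for illustration only.

```python
import math

def snp_passes_qc(genotypes, maf_min=0.01, call_min=0.9, hwe_p_min=1e-5):
    """Return True if one SNP survives the three filters.

    genotypes: alternative-allele counts per animal (0, 1 or 2), None = no call.
    Thresholds mirror the text; everything else is an illustrative assumption.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    n = len(called)
    p = sum(called) / (2 * n)          # alternative-allele frequency
    q = 1 - p
    maf = min(p, q)
    if call_rate < call_min or maf < maf_min or maf == 0:
        return False
    # Chi-square goodness-of-fit test against Hardy-Weinberg proportions.
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_hwe = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return p_hwe >= hwe_p_min
```

A SNP with genotype counts in perfect Hardy-Weinberg proportions passes, whereas a monomorphic SNP or one without any heterozygotes at intermediate frequency is removed.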

Data processing and statistical analysis
For the detection of selection signatures three methods were applied: ROH, iHS and xpEHH. Before statistical analyses, haplotypes were derived and missing genotype calls were imputed chromosome-wise for all samples together across breeds in Beagle 4.0 [33] while neglecting pedigree information. Given the very high average call rate (99.879 percent), the proportion of imputed genotypes in the final dataset was extremely low (0.121 percent). Since the Trakehner sample comprised fewer than 50 animals, which is usually considered a lower limit for quality imputation, we performed the imputation across all breeds together. To capture population structure, a principal component analysis (PCA) of the genotype dataset was done with the software Genome-wide Complex Trait Analysis (GCTA), version 1.91.7beta [34,35]. A genomic relationship matrix was built from the genotype information and used to calculate the first 20 eigenvectors and all eigenvalues.
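The idea of deriving principal components from a genomic relationship matrix can be sketched in plain Python as follows. This is not GCTA itself: the allele-frequency standardisation is a common choice that we assume here, the function names are hypothetical, and power iteration is swapped in to obtain only the leading eigenvector.

```python
import math

def genomic_relationship_matrix(X):
    """X: list of animals, each a list of 0/1/2 allele counts (no missing calls)."""
    n, m = len(X), len(X[0])
    freqs = [sum(row[j] for row in X) / (2 * n) for j in range(m)]
    keep = [j for j in range(m) if 0 < freqs[j] < 1]  # drop monomorphic SNPs
    Z = [[(row[j] - 2 * freqs[j]) / math.sqrt(2 * freqs[j] * (1 - freqs[j]))
          for j in keep] for row in X]
    return [[sum(a * b for a, b in zip(Z[i], Z[k])) / len(keep)
             for k in range(n)] for i in range(n)]

def leading_component(G, iterations=200):
    """Leading eigenvalue and eigenvector of G by power iteration."""
    n = len(G)
    v = [1.0 + 0.01 * i for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        w = [sum(G[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(G[i][k] * v[k] for k in range(n))
                     for i in range(n))
    return eigenvalue, v
```

With two groups of animals that differ strongly in allele frequencies, the entries of the leading eigenvector carry opposite signs for the two groups, which is the kind of separation a PCA plot of the first components visualises.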

Runs of Homozygosity
ROH and their clusters, i.e. homozygous segments shared by multiple individuals, were analysed chromosome-wise using the SNP & Variation Suite v.8.8.1 [36]. ROH clusters were analysed within and across breeds. Across- and within-breed ROH clusters were defined as segments shared by at least a third of the individuals. The minimum run length was set to 500 kb and 15 SNPs, and no missing or heterozygous SNPs were accepted. The lower density limit was set to 1 SNP per 100 kb and we allowed for a maximum gap distance of 1,000 kb [37].
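How these thresholds act on the genotypes of a single animal can be sketched as follows. This is an illustrative scan and not the algorithm of the SNP & Variation Suite; the function name `find_roh`, the genotype coding and the way the density limit is evaluated are assumptions.

```python
def find_roh(positions, genotypes, min_snps=15, min_length_bp=500_000,
             max_gap_bp=1_000_000, max_kb_per_snp=100):
    """Scan one chromosome of one animal for runs of homozygosity.

    positions: sorted SNP positions in bp.
    genotypes: 0/1/2 allele counts, None for a missing call.
    Returns a list of (start_bp, end_bp, n_snps).
    """
    runs = []

    def close(first, last):
        n = last - first + 1
        length = positions[last] - positions[first]
        dense_enough = length / 1000 / n <= max_kb_per_snp
        if n >= min_snps and length >= min_length_bp and dense_enough:
            runs.append((positions[first], positions[last], n))

    start = None
    for i, g in enumerate(genotypes):
        if g not in (0, 2):              # heterozygous or missing ends a run
            if start is not None:
                close(start, i - 1)
            start = None
        elif start is None:              # first homozygous SNP opens a run
            start = i
        elif positions[i] - positions[i - 1] > max_gap_bp:
            close(start, i - 1)          # gap too large: split the run
            start = i
    if start is not None:
        close(start, len(genotypes) - 1)
    return runs
```

Clusters would then follow by counting, per genomic segment, how many animals carry an overlapping run and keeping the segments shared by at least a third of them.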

Haplotype-based analyses
Voight et al. [19] introduced iHS as a modification of the Extended Haplotype Homozygosity (EHH) previously developed by Sabeti and colleagues [38]. The EHH captures the decay of homozygosity with increasing distance from a core allele. An allele under strong selection will usually be embedded in an unexpectedly long homozygous haplotype, in contrast to the unfavoured allele. The iHS describes this difference between ancestral and derived alleles and corresponds to the standardised log ratio of the integrals under the EHH curves of the ancestral and the derived allele.
A large positive value hence indicates that an ancestral allele is under positive selection and has increased in frequency but has not yet obtained fixation. A large negative value results from selection for the new, derived allele [16]. The iHS-computations were done per chromosome for individuals within and across breeds. By applying the iHS across all breeds, we aimed to pick up selection signals that affect the group as a whole. The pooling of all four breeds together treats them as the sport horse population as a whole and provides a more comprehensive perspective.
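The computation behind the unstandardised score can be sketched on phased haplotypes as follows. This is a simplified illustration and not the REHH implementation: the function names and the 0 = ancestral, 1 = derived coding are assumptions, the EHH curve is integrated to the end of the supplied region without a cut-off, the subsequent standardisation is omitted, and both alleles are assumed to be carried by at least two haplotypes.

```python
import math
from collections import Counter

def ehh(haplotypes, core, allele, site):
    """Share of haplotype pairs carrying `allele` at `core` that are
    identical at every SNP between `core` and `site`."""
    lo, hi = min(core, site), max(core, site)
    carriers = [tuple(h[lo:hi + 1]) for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return 0.0
    counts = Counter(carriers)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def integrated_ehh(haplotypes, positions, core, allele):
    """Trapezoidal area under the EHH curve on both sides of the core SNP."""
    total = 0.0
    for step in (1, -1):
        previous, i = 1.0, core
        while 0 <= i + step < len(positions):
            i += step
            current = ehh(haplotypes, core, allele, i)
            total += 0.5 * (previous + current) * abs(positions[i] - positions[i - step])
            previous = current
    return total

def unstandardised_ihs(haplotypes, positions, core):
    """ln(iHH_ancestral / iHH_derived); positive if the ancestral allele
    sits on the longer shared haplotypes."""
    ihh_ancestral = integrated_ehh(haplotypes, positions, core, 0)
    ihh_derived = integrated_ehh(haplotypes, positions, core, 1)
    return math.log(ihh_ancestral / ihh_derived)
```

In a toy data set where all ancestral haplotypes are identical and the derived ones diverge next to the core SNP, the score is positive, in line with the interpretation given above.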
When a selected allele has reached fixation within one population but is still polymorphic in another, the xpEHH as described by Sabeti et al. [18] has a very high statistical power to detect such differences between populations. Hence, it successfully discovers complete selective sweeps within a specific breed [16]. The xpEHH is derived from pairwise breed comparisons. We compared each breed individually ("case population") to the total of the other three breeds combined ("control population").
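Written out, with symbols chosen here for illustration only, the unstandardised statistic at a focal SNP compares the integrated EHH of the case population A with that of the control population B:

```latex
\mathrm{xpEHH}_{\mathrm{unstd}} = \ln\!\left(\frac{I_A}{I_B}\right),
\qquad
I_P = \int \mathrm{EHH}_P(x)\,\mathrm{d}x \quad (P \in \{A, B\})
```

A positive value indicates longer shared haplotypes in the case population; the genome-wide values are subsequently centred and scaled before p-values are derived.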
For iHS and xpEHH, information on the allele status is required, defining alleles as ancestral and derived. SNP data from a domestic ass (Equus asinus), serving as outgroup, were used to deduce the putative allele status (http://geogenetics.ku.dk/publications/middle-pleistoceneomics, accessed 13 July 2016).
For comparison with the general caballoid state, the reference genome EquCab2.0 [39] was used. This follows the assumption that the donkey still possesses ancestral alleles while new "derived" alleles have emerged through mutation events in the modern horse and have then increased in frequency through domestication or breed formation. This approach is commonly used in selection signature studies, e.g. chimps are used as outgroup for humans and bison, yak or buffalo for cattle [19,40].
A total of 48,410 SNPs were entered in the iHS and xpEHH analyses. Calculations of both iHS and xpEHH were executed in R Statistical Software using the tailored package REHH 2.0.0 [41] with default options. A linkage disequilibrium evaluation (r² ≥ 0.8), based on phased and imputed data and executed in Haploview 4.2 [42], resulted in 7,739 tag SNPs across all autosomes. We therefore assumed a conservative significance threshold of p = 0.0001 (-log10(p-value) = 4.0), equivalent to 10,000 independent tests, to account for multiple testing.
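To make the threshold concrete, the following sketch converts a standardised score into a -log10(p-value), assuming that the scores are approximately standard normal and that a two-sided p-value is used. We believe this corresponds to how REHH reports p-values, but that correspondence is an assumption, and the function names are hypothetical.

```python
import math

def neg_log10_p(z):
    """-log10 of the two-sided normal p-value of a standardised score."""
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    if p == 0.0:                           # numerical underflow for huge |z|
        return math.inf
    return -math.log10(p)

def is_significant(z, threshold=4.0):
    """Apply the study's threshold of -log10(p-value) = 4.0."""
    return neg_log10_p(z) >= threshold
```

Under these assumptions, a SNP reaches the threshold once its standardised score exceeds roughly 3.9 in absolute value.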

Screening for candidate genes
For functional analysis, regions covering selection signatures were scanned for annotated genes in the equine reference assembly EquCab2.0 using the online tool Biomart from Ensembl (https://www.ensembl.org/biomart/martview, accessed April 2018, Ensembl release v92). Breed-overlapping iHS signatures were checked 1 Mb up- and downstream from the significant SNP. With regard to ROH clusters, the positional resolution of the beadchip is comparatively low, and in order to avoid too many false positives the scanning for annotated genes was done conservatively within the margins of each particular ROH stretch.
For the functional interpretation of the signatures, the assumption was made that signals were due to artificial or natural selection pressures and not due to demography.
To identify putative candidate genes under selection pressure we took into account (A) which Quantitative Trait Loci (QTL) fell into selection signatures, (B) which genes have a potential functional link to the pronounced breeding goals of these horse breeds, (C) which important biological pathways were identified through an enrichment analysis, and (D) which genes have been reported in relevant literature.
A. For results from the across-breed iHS and ROH as well as the xpEHH, we checked for intersection of these selection signatures with known QTL in horses downloaded from the animalgenome.org database (https://www.animalgenome.org/cgi-bin/QTLdb/EC/summary, accessed February 2019, release 37). The intersection of QTL regions and selection signatures was done with bedtools intersect [43], filtering for a complete overlap. Analogously to the scanning for annotated genes (see above), the selection signatures of iHS and xpEHH were extended by 1 Mb up- and downstream for this analysis, while no margin adjustment was done for the ROH.
B. We paid special attention to genes related to growth, fertility, conformation, pigmentation, metabolism, athletic performance and locomotion since these aspects are part of the more detailed selection criteria in the statutes of the studbooks.
C. To see which biological pathways might have been targeted across breeds, we used the list of annotated genes within ROH and iHS selection signatures for an enrichment analysis in the functional annotation tool DAVID 6.8 [44,45] (https://david.ncifcrf.gov/, accessed 14 February 2019). The gene lists were analysed for the species Equus caballus against the matching background. The Benjamini and Hochberg [46] test was used to correct for multiple testing.
D. We thoroughly crosschecked with literature which genes have been found or suggested as targets in previous selection signature or association studies in horses and other domestic species. For instance, a PubMed search in the National Center for Biotechnology Information (NCBI) database yielded 26 hits for the keywords "horse selection signatures" and 43 hits for "domestic animals selection signatures". These and other topic-related publications, such as the studies fed to the HorseQTLdb (https://www.animalgenome.org), were considered for the determination of candidate genes.
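The interval logic of step A can be sketched in Python as follows. It is a stand-in for the bedtools run, not a reproduction of it: we assume that "complete overlap" means a QTL lies entirely inside a (possibly extended) selection signature, and the function name and tuple layout are hypothetical.

```python
def qtl_within_signatures(signatures, qtls, margin_bp=0):
    """Return the QTL that lie completely inside a selection signature.

    signatures: list of (chromosome, start_bp, end_bp).
    qtls:       list of (chromosome, start_bp, end_bp, trait).
    margin_bp:  extension on both sides of each signature, e.g. 1_000_000
                for iHS and xpEHH signals and 0 for ROH.
    """
    hits = []
    for chrom, start, end, trait in qtls:
        for s_chrom, s_start, s_end in signatures:
            inside = (max(0, s_start - margin_bp) <= start
                      and end <= s_end + margin_bp)
            if chrom == s_chrom and inside:
                hits.append((chrom, start, end, trait))
                break                      # count each QTL only once
    return hits
```

Calling the function with `margin_bp=1_000_000` for haplotype-based signals and with the default for ROH mirrors the different margins described under A.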

Principal component analysis
A plotting of the first two principal components of the genotype data resulted in a tentative separation of the dataset into the four breeds (Fig 1). The Trakehner cohort forms a distinct subgroup and nests next to Oldenburger and Hanoverian, which mostly overlap. Holsteiner cluster more separately from the other three breeds.

Selection signatures intersecting with QTL
When considering across-breed iHS and xpEHH selection signatures (both ±1 Mb) and ROH shared by at least a third of all samples, these overlap with 44 QTL known in horses. Out of the 2,023 equine QTL listed in the animal QTL database, 1,975 are on autosomes and have a physical position in base pairs. The 44 QTL we found to fall within selection signatures belong to a total of 12 different traits (Table 2). Since some traits are represented with a much higher number of QTL in the database than others, we set the number of overlapped QTL in relation to the known total per trait. Four traits were identified for which over 10 percent of the listed QTL fall into selection signatures: cannon bone circumference, coat texture, hair density and sperm count.

Runs of Homozygosity
The search for ROH clusters, i.e. homozygous segments shared by multiple individuals, yielded selection signals within and across breeds. The across-breed approach (N = 942) revealed 37 ROH (Table 3). Up to 43 percent (N = 404) of the sampled horses shared a particular ROH segment. Breed-specific analyses detected a plethora of 149 ROH in Trakehner horses, while the other breeds had comparatively lower numbers. We found 58 ROH in Holsteiner, 39 in Hanoverian and 38 in Oldenburger (S1 Table).

Determination of allele status
For the donkey, 46,747 out of the equine 48,410 SNPs could be identified after alignment to the equine reference genome EquCab2.0 and assigned an allele status: derived or ancestral. The donkey was homozygous for 46.6 percent of the caballoid alternative alleles and for 53.0 percent of the caballoid reference alleles. Alleles where the donkey was homozygous were treated as ancestral and the opposite alleles were categorised as the new, derived alleles. For 0.4 percent (173 SNPs) of the SNPs the donkey was heterozygous and the reference allele of the horse was then assumed to be the ancient one. The remaining 1,663 of the 48,410 SNPs were randomly assigned to either of the two allele status categories. We did not leave them out of subsequent analyses, because we searched for selection events and not selection direction, meaning that we focussed on if and where selection has occurred and not which allele was favoured over its alternative counterpart.
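The decision rules for the allele status can be summarised in a short sketch. The function name and the representation of the donkey genotype as a pair of alleles are assumptions; treating a donkey allele that matches neither horse allele like an unidentified SNP is also our assumption, as this case is not described in the text.

```python
import random

def ancestral_allele(ref, alt, donkey_genotype, rng=random):
    """Pick the putative ancestral allele of one SNP.

    ref, alt:        horse reference and alternative allele (EquCab2.0).
    donkey_genotype: pair of donkey alleles, or None if the SNP could not
                     be identified in the donkey.
    """
    if donkey_genotype is None or not set(donkey_genotype) <= {ref, alt}:
        return rng.choice([ref, alt])      # random assignment
    first, second = donkey_genotype
    if first == second:
        return first                       # homozygous donkey allele
    return ref                             # heterozygous: horse reference
```

The remaining allele of each SNP is then labelled as derived.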

Cross-population Extended Haplotype Homozygosity
Analogous to the iHS analyses, SNPs with a -log10(p-value) ≥ 4.0 were considered to be significant. We compared each breed ("case population") to the other three breeds together ("control population"). Trakehner exhibited significant breed-specific selection signatures on 4 different chromosomes, Holsteiner also on 4, and Hanoverian on 5 (Fig 4, Table 5). The Oldenburger breed showed numerous significant signals on 12 different chromosomes with the highest values on ECA 19 (52.3-53.9 Mb). Despite similar iHS signals for Oldenburger and Hanoverian, those two breeds showed many differences when directly compared (Fig 5).

Enrichment analysis
The enrichment analysis based on genes located within across-breed iHS signatures (91 gene IDs recognised by DAVID out of 104 genes) identified the GO terms around nucleus, (tropo-)myosins, motor activity, insulin-like growth factor (IGF) and ATP binding to be enriched at p < 0.05 (Table 6). When taking genes falling into ROH stretches as input for the enrichment analysis (388 gene IDs recognised by DAVID out of 444 genes), the pathways IGF I and II binding were again detected, as well as IGF receptor signalling. Other nominally significant GO terms were intermediate filament, embryonic skeletal system morphogenesis and chondrocyte differentiation (Table 7). When combining the genes from iHS and ROH signatures (461 unique IDs recognised out of 523 genes), the analysis for annotation clusters yielded four clusters with at least one individual GO term enriched at p < 0.05. The first cluster revolves around embryonic development, whereas the second one is based on IGF binding and cell growth. The third cluster focusses on cell proliferation, differentiation and fate, whereas the fourth cluster focusses on metabolism and glycolytic processes (Table 8). Here, embryonic skeletal system morphogenesis is the only biological process to pass the Benjamini-Hochberg (BH) test.
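The difference between nominal and corrected significance can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. The function name is hypothetical and the code is not taken from DAVID.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the order of the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A GO term with a nominal p-value below 0.05 can thus end up with an adjusted value above 0.05 when many terms are tested at once.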

Discussion
In this study we looked for signatures of selection within important warmblood horse breeds. In spite of their common relevance to sport horse breeding, their official current breeding focus differs with respect to sporting discipline. In addition, the four breeds Trakehner, Holsteiner, Hanoverian and Oldenburger historically underwent different breeding policies regarding pure and cross-breeding and had divergent primary foci of utilization. When seeking to evaluate and interpret this study's results it should be kept in mind that the analysed sample set was preselected, since only young stallions were included that had passed the studbooks' first inspection and were sampled during the preselection's health check. The sample is thus representative of the potential pool of sires of future generations and reflects the associations' respective current breeding goals.
According to breeding documents, Trakehner and Holsteiner have most consistently pursued pure-breeding over the past century, which is clearly reflected in the PCA clustering as well. The separation of Holsteiner from the other three breeds might also stem from their clear and relatively early focus on show-jumping. The sample set used in our study was already included in a study on the genomic prediction of breed assignment, in which an eigenvector analysis resulted in a very similar clustering [32]. Oldenburger and Hanoverian show a very similar clustering pattern in the PCA and also exhibit very similar iHS selection signatures. This concordance could originate both from shared breeding goals and from the occasional common use of sires since the 1950s [48]. However, the differences detected in the xpEHH analysis show that both breeds still have unique features that distinguish them from one another and might reflect historical differences in breed formation. The xpEHH allows for pairwise breed comparisons and detects selection sites that are close to or have achieved fixation in one breed but remain diverse in another. Hence, it picks up signatures that are no longer detectable with the iHS or only result in weak signals.
Reduced local genetic variation is indicative of ongoing or past selection processes. This idea is implemented in the screening for ROH, which refer to continuously homozygous segments in the genome. For the Trakehner horses, we found by far the highest number of breed-specific ROHs. On the one hand, the severe population bottleneck shortly after the Second World War is a possible cause for this phenomenon. On the other hand, it is possible that simply more genomic sites have been under selective pressure compared to the other three breeds. The length of ROH can also shed light on the age of selection signatures and on the extent to which inbreeding is recent or dates further back. However, the average length of the ROHs within breed was not significantly different for any pairwise breed comparison, presumably because of the thresholds for SNP density and ROH assignment that were set for the ROH screening. A higher SNP resolution than used here would be necessary to obtain informative data on the precise length of the ROHs and thus an indication of recent or historical selection events. However, results from the within-breed iHS analysis demonstrate the substantially divergent haplotype pattern and indicate distinct selection signatures in the Trakehner breed compared to the three others. This is in agreement with the reported divergent historical selection focus and breeding policy.
When searching for candidate genes under selection in our sample populations, we relied on overlaps of selection signatures with QTL, enriched pathways, functional candidacy and findings reported from other studies. The four breeds we investigated in this study mostly select for conformation, locomotion, athleticism and aptitude for one of the major disciplines show-jumping, dressage or eventing. Capability of reproduction, i.e. fertility, is also listed as a criterion by these studbooks [4,9-12]. The results from the enrichment analyses in DAVID should be considered carefully. Although nominally significant (p < 0.05), only two pathways were significantly enriched after a correction for multiple testing (Benjamini-Hochberg test). Two ROH shared by at least a third of all individuals overlapped with QTL for hair density and coat texture (ECA 11). The enrichment analysis based on genes within ROH stretches showed an enrichment of the Gene Ontology (GO) term intermediate filament (GO:0005882). This was mostly driven by the keratin complex (ECA 11). The xpEHH analysis between Hanoverian and the other three breeds also detected signatures spanning the keratin complex. Keratin is known to influence skin [49] and hair [50] quality and is the major component of the equine hoof [51]. A missense variant in the coil1A domain of the KRT25 gene, which is located within our selection signatures, has previously been associated with the curly hair phenotype in horses [52]. In addition to the keratin complex, we suspect the gene KIT ligand (KITLG) to be under selective pressure. This gene has a well-documented effect on skin pigmentation and thereby coat colour in cattle [53] and pigs [53,54] and is located within a ROH stretch on chromosome 28. Metzger et al. [26] also observed homozygous segments around this locus and suggested KITLG as a selection target. Related to KITLG is KIT (tyrosine kinase receptor), which we found to be very close to a ROH signature on ECA 3 that also overlapped with a QTL for white markings [55]. KIT has been linked to dominant white syndrome in horses [56] as well as other coat colour phenotypes [57]. Throughout history, different coat colours have been favoured and targeted by selection in horses [58] and apparently this feature continues to be of relevance and under selection pressure [59].
Next to coat colour, size is a typical example of artificial selection in domestic animals [60]. Height of withers is a highly heritable [61] trait in horses that is easily measured and today a plethora of QTL is available for this trait [47]. We found QTL overlaps with ROH on ECA 3 and 8 [62] as well as overlaps with xpEHH selection signatures in all four breeds on multiple [...]
[Displaced table content, gene list: HECW1, FAM96A, TRIP4, ONECUT1, IKZF1, CSNK1G1, USP3, FIGNL1, PGAM2, RPS27L, STK17A, PSMA2, GABPB1, MAPK6, PPIB, GCK, ZPBP, POLM, LEO1, GNB5, ...]
[...] [63] was located within a ROH stretch on ECA 18 in our analysis. The functional annotation of genes from ROH signatures resulted in an enrichment of the GO term regulation of cell growth (GO:0001558), which comprises the candidate genes IGFBP (insulin-like growth factor binding protein) 1, 3 and 4, which also feature in the biological process of regulation of IGF receptor signalling (GO:0043567) and the molecular function of IGF I and II binding (GO:0031994, GO:0031995).
Concluding from our results, we propose IGF binding proteins as new candidate genes for height of withers in horses, considering that IGFBP4 has been associated with height in humans already [64]. IGFBP1, 3 and 4, which we found in selection signatures (ECA 4 and 11), can bind to IGF1 and IGF2, which are important for growth in early childhood [65].
An important factor for growth and body height in adolescence is organismal development in earlier stages of life. Genes located in ROH and iHS signatures were enriched in an annotation cluster that revolved around prenatal development and specifically comprised the GO terms embryonic skeletal system morphogenesis (GO:0048704) and anterior/posterior pattern specification (GO:0009952). The HOXB gene cluster, which essentially underlies the enrichment of these pathways, is very likely to be under selective pressure. HOXB genes are homeobox genes that are crucial for correct patterning of embryonic structures along the body axis, morphogenesis and neural development [66]. Interestingly, the pathway for chondrocyte differentiation (GO:0002062) was part of annotation cluster 3 in our enrichment analysis (see Table 8). The gene BMP2 (bone morphogenetic protein 2), located within a ROH stretch on ECA 22, belongs to this pathway and has previously been associated with body size and development in sheep and goats [67,68].
Table 7. Top 10 enriched pathways determined with DAVID from genes falling in across breed Runs of Homozygosity (ROH) selection signatures in four warmblood horse breeds.
When looking at the biological background of athleticism, two components are of particular relevance: energy metabolism and muscle functionality [69,70]. Our results give reason to assume that both components have been subject to selective pressure. To our knowledge, no association studies have been conducted in horses for metabolic or related traits. However, as mentioned before, the results from our enrichment analysis highlight IGF I and II binding and the regulation of IGF receptor signalling. Besides regulating growth, IGF binding proteins influence metabolism by binding to IGFs, thereby modulating glucose and insulin levels, and are central players in diabetes, obesity and other metabolic diseases [71].
Both IGFBP1 and 3 are related to insulin levels and fat accumulation (73) and have been linked to the metabolic syndrome (74), which also affects equids (75). Naturally, many genes act in different pathways and may therefore be of special interest in breeding. Both IGFBP4 and the gene AEBP1 (adipocyte enhancer-binding protein 1) seem to play a dual role in metabolism and muscle function. AEBP1 falls within an across-breed iHS signature and an xpEHH signature in Oldenburger on ECA 4 and is reportedly involved in diet-induced obesity and energy homeostasis in mice, where it was upregulated in adipose tissue [72]. However, it is also a strong candidate for cardiac function and has been found to be highly expressed during the differentiation of aortic smooth muscle cells [73]. IGFBP4 is a component of the canonical WNT signalling pathway, which is necessary for cardiogenesis and in which it exerts an inhibitory function [74].
Racing ability is one of the few performance traits analysed in association studies in horses. One QTL each colocalised with a ROH stretch on ECA 17 [75] and ECA 18 [76], and an additional QTL for racing ability on ECA 28 [77] colocalised with a Hanoverian-specific xpEHH signature. A ROH on ECA 22 spanned RALGAPA2 (Ral GTPase activating protein catalytic alpha subunit 2), which was previously found in a selective sweep in Asian thoroughbreds and is reported to be associated with racing performance [78]. We assume that these regions harbour genes that contribute not only to racing ability but to athletic performance in general.
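The notion of colocalisation used throughout this discussion amounts to an interval overlap test on the same chromosome. The following minimal sketch illustrates it under stated assumptions: intervals are closed and given in base pairs, and all coordinates and labels are hypothetical placeholders, not the positions of the signatures or QTL reported here.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # first base pair of the region (inclusive)
    end: int    # last base pair of the region (inclusive)
    label: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed intervals on the same chromosome share a position."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def colocalised(signatures, qtls):
    """List (signature label, QTL label) pairs whose intervals overlap."""
    return [(s.label, q.label) for s in signatures for q in qtls if overlaps(s, q)]


# Hypothetical coordinates, for illustration only.
signatures = [
    Interval("ECA17", 1_000_000, 2_500_000, "ROH_a"),
    Interval("ECA18", 5_000_000, 6_000_000, "ROH_b"),
    Interval("ECA28", 9_000_000, 9_400_000, "xpEHH_c"),
]
qtls = [
    Interval("ECA17", 2_400_000, 3_000_000, "QTL_1"),
    Interval("ECA18", 6_500_000, 7_000_000, "QTL_2"),
    Interval("ECA28", 9_100_000, 9_200_000, "QTL_3"),
]

hits = colocalised(signatures, qtls)
print(hits)
```

The pairwise comparison is adequate for a handful of regions; for genome-wide sets, a sorted sweep or an interval tree would be the more efficient choice.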
Sarcomeres are the contractile units at the histological core of the muscle and comprise the two basic modules actin and myosin [79]. Many of the genes we found within or in proximity to selection signatures encode actin-binding proteins, which hints at their importance for performance-oriented breeding. The genes TPM1 (tropomyosin 1) and TMOD2 and TMOD3 (tropomodulin 2 and 3) were found in across-breed iHS signals on ECA 1. Both tropomyosin and tropomodulin bind actin and function as stabilizers of actin filaments. Mudry and colleagues [80] reported that TPM1 interacts with tropomodulin and helps to maintain and control actin filament length, and is therefore important for cell structure and stability. The importance of TPM1 for muscle functionality is further emphasized by findings in transgenic mice, where isoforms of TPM1 were shown to govern performance in cardiac and skeletal muscle [81].
While athletic performance is a trait that is clearly driven by artificial selection pressures, fertility is likely to be subject to natural selection processes. Sperm quality in stallions correlates with pregnancy rates in mares [82], and it can be extrapolated that stallions with low sperm quality will generally produce fewer or no offspring. Analogously, mares with genetic predispositions for reproductive failure will produce fewer offspring or remain barren.
In contrast to height, fertility has a much lower heritability [83,84], and few GWAS have been performed for this trait. Unsurprisingly, we found only a single QTL overlap, for sperm count [85], with a selection signature in our study and could not detect functional enrichment for a directly related pathway. Yet, functional candidate genes are present in ROH and iHS selection signatures, such as ZPBP1 and ZPBP2 (zona pellucida binding protein 1 and 2) and SUN3 (Sad1 and UNC84 domain containing 3) on ECA 4 and 11. SUN3 belongs to an interactive protein complex and is involved in sperm head formation in mammals [86], while the two known zona pellucida binding proteins, ZPBP1 and ZPBP2, play a crucial role in acrosome formation and morphological sperm development. Inactivation of either gene led to partial or full loss of fertility in mice [87], and mutations in ZPBP1 have also been detected in infertile men [88]. ZPBP is assigned to the GO term nucleus (GO:0005634), for which we found an enrichment based on genes localised in across-breed iHS selection signatures.
Other genes possibly associated with male fertility are THEGL (testicular haploid expressed repeat spermatid protein like) and TEX14 (testis expressed 14) in ROH stretches on ECA 3 and 11. Whilst THEGL has been found to be mainly expressed in testis and the ductus deferens in mice [89], TEX14 plays a role in spermatogenesis [90]. Metzger et al. [26] proposed an additional gene as a selection candidate for male fertility in horses: CFAP61 (Cilia and flagella associated protein 61) on ECA 22. Since we also detected a ROH across this gene, the results from our study support this hypothesis.

Conclusion
This study revealed selection signatures in warmblood horse breeds that share athletic performance as their current main breeding goal but differ in historical breeding policy and selection focus. Despite breed-specific differences, shared signals were found across the entire genome. Considering our findings and the analysis of annotated genes in regions under selective pressure, we conclude that the candidate genes predominantly play a role in development and growth, metabolism, muscle development and functioning, as well as fertility. We suggest follow-up studies that combine comprehensive phenotypes of warmblood sport horses with genomic information in order to validate whether the proposed candidate genes and genomic regions are indeed causal for variation in traits such as athletic performance.
Supporting information S1 Table