Insight into the genetic composition of South African Sanga cattle using SNP data from cattle breeds worldwide

Understanding the history of cattle breeds is important because it provides the basis for developing appropriate selection and breed improvement programs. In this study, patterns of ancestry and admixture in Afrikaner, Nguni, Drakensberger and Bonsmara cattle of South Africa were investigated. We used 50 K single nucleotide polymorphism genotypes that were previously generated for the Afrikaner (n = 36), Nguni (n = 50), Drakensberger (n = 47) and Bonsmara (n = 44) breeds, and for 394 reference animals representing European taurine, African taurine, African zebu and Bos indicus. Our findings support previous conclusions that Sanga cattle breeds are composites between African taurine and Bos indicus. Among these breeds, the Afrikaner breed has significantly diverged from its ancestral forebears, probably due to genetic drift and selection to meet breeding objectives of the breed society that enable registration. The Nguni, Drakensberger and Bonsmara breeds are admixed, perhaps unintentionally in the case of Nguni and Drakensberger, but certainly by design in the case of Bonsmara, which was developed through crossbreeding between the Afrikaner, Hereford and Shorthorn breeds. We established patterns of admixture and ancestry for South African Sanga cattle breeds, which provide a basis for developing appropriate strategies for their genetic improvement.


Background
South Africa is richly endowed with indigenous cattle breeds, among which are the Afrikaner, Nguni and Drakensberger breeds. These breeds played important roles in the social, cultural and economic development of the country [1]. Previously, Makina et al. [2] described these breeds as being genetically distinct from the European Bos taurus breeds (Angus and Holstein) and as having genomic regions associated with tropical adaptation [3]. Therefore, they may hold potential for production in harsh and fluctuating South African environments based on their adaptation to the nutritional, parasitic, and pathogenic challenges they are faced with. These breeds are not endangered and have reasonable effective population sizes [4][5][6]. Given their adaptive characteristics, they are potentially valuable to breeding programs in other regions that face biological stresses such as famine, drought or disease epidemics [7]. Furthermore, there is a worldwide drive for the effective management of indigenous genetic resources, which includes these breeds [7].
Afrikaner and Nguni cattle were brought to Southern Africa by the Khoi-Khoi people who migrated southwards from the African Great Lakes region between 600 and 700 AD [1]. Summers [8] postulated that ancestors of Afrikaner cattle migrated very quickly along the eastern side of Southern Africa to the current Western Cape and western parts of the Northern Cape. Ancestors of Nguni cattle are believed to have moved southward in the African continent at a much slower pace [8]. Afrikaner, Nguni and Drakensberger are classified as Sanga cattle and are thought to result from crossbreeding between thoracic-humped Lateral Horned zebu and humpless Egyptian Longhorn cattle [9][10][11]. The initial admixture probably occurred when African taurine cattle migrated south from Egypt and the Sudan, and indicine cattle migrated to the eastern seaboard of Africa from Arabia and India [9,11]. However, Bisschop [12] suggested that Sanga cattle originated from crosses between humpless Egyptian Longhorn and short-horned B. taurus brachyceros. The cross-section of the horns of Egyptian Longhorn is oval, which is similar to that of Afrikaner cattle, while that of B. taurus brachyceros is round, as in Nguni cattle [12].

Open Access. Genetics Selection Evolution. *Correspondence: somakina@gmail.com. 1 Agricultural Research Council-Animal Production Institute, Private Bag X 2, Irene 0062, South Africa. Full list of author information is available at the end of the article.
Based on analyses with microsatellite markers, Hanotte et al. [11] and Freeman et al. [13] predicted that Sanga cattle resulted from the crossbreeding of African taurine and zebu cattle around 700 AD, which was confirmed by studies based on single nucleotide polymorphisms (SNPs) [14][15][16][17][18]. However, in spite of these studies, the genetic composition of South African Sanga cattle remains uncertain [19]. The genetic distance between cattle breeds appears, at least in part, inversely related to the geographic proximity of their origin [20]. Hanotte et al. [11] and Freeman et al. [13] also found that the extent of genetic introgression of zebu cattle across the African continent decreases from eastern to western Africa. MacHugh et al. [21] reported that the cattle breeds from the tsetse-infested areas of West and Central Africa had limited or no zebu ancestry, which concurs with their susceptibility to trypanosomiasis [21].
Genomic characterisation of South African Sanga cattle is a first step towards the development of appropriate breeding and selection strategies for these breeds. Makina et al. [2] characterized the relationships between the Afrikaner, Nguni, Drakensberger and Bonsmara breeds using Angus and Holstein as reference breeds, without including any other indicine or African taurine cattle. The limited number of breeds analyzed precluded detection of patterns of co-ancestry or admixture in the South African Sanga breeds. Thus, the aim of our study was to provide a more precise analysis of patterns of admixture and ancestry in the Afrikaner, Nguni, Drakensberger and Bonsmara cattle of South Africa using a subset of data that were generated for cattle breeds worldwide [14][15][16][17].
These data were merged in PLINK [25] and autosomal SNPs that were common to all datasets were retained. This resulted in 35,155 SNPs and 548 individuals after removing SNPs with a MAF lower than 0.005, a call rate lower than 0.98 and individuals with more than 5% missing genotypes.
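The quality-control thresholds above can be illustrated with a minimal sketch. This is not PLINK and not the authors' pipeline: the function names, the 0/1/2 genotype coding with `None` for missing calls, and the order of the filters (SNPs first, then individuals) are all assumptions made for illustration.

```python
# Illustrative QC filter (hypothetical helper names, not PLINK's implementation).
# Genotypes are coded as 0/1/2 copies of one allele; None marks a missing call.

def snp_call_rate(column):
    """Fraction of individuals with a non-missing genotype at one SNP."""
    return sum(g is not None for g in column) / len(column)

def minor_allele_frequency(column):
    """Minor allele frequency at one SNP, computed from called genotypes only."""
    called = [g for g in column if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)

def individual_missingness(row):
    """Fraction of missing genotypes for one individual."""
    if not row:
        return 0.0
    return sum(g is None for g in row) / len(row)

def qc_filter(genotypes, min_maf=0.005, min_call_rate=0.98, max_ind_missing=0.05):
    """Return (kept individual indices, kept SNP indices).

    genotypes: list of rows (individuals), each a list over SNPs.
    Assumed order: SNP filters first, then the per-individual filter.
    """
    n_snps = len(genotypes[0])
    columns = [[row[j] for row in genotypes] for j in range(n_snps)]
    keep_snps = [j for j, col in enumerate(columns)
                 if snp_call_rate(col) >= min_call_rate
                 and minor_allele_frequency(col) >= min_maf]
    filtered = [[row[j] for j in keep_snps] for row in genotypes]
    keep_inds = [i for i, row in enumerate(filtered)
                 if individual_missingness(row) <= max_ind_missing]
    return keep_inds, keep_snps

# Toy data: 4 individuals x 4 SNPs (made-up values).
toy = [
    [0, 1, 0, None],
    [1, 2, 0, 0],
    [2, 1, 0, 1],
    [1, 0, 0, 2],
]
kept_individuals, kept_snps = qc_filter(toy)
```

In the toy data, the third SNP is monomorphic and the fourth has a call rate of 0.75, so both are dropped under the thresholds given in the text.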

Genetic relationships and population structure
Patterns of admixture and relationships among South African Sanga cattle in relation to the 20 reference breeds were determined using principal component analysis [26] implemented in the SNP & Variation Suite (SVS 8.1; Golden Helix Inc., Bozeman, Montana) and variational Bayesian inference as implemented in fastSTRUCTURE [27]. Values of K from 2 to 20 were tested to estimate ancestry proportions from K ancestral populations, assuming a simple non-informative prior. The $K^{*}_{\varepsilon}$ and $K^{*}_{\varnothing^{C}}$ metrics from fastSTRUCTURE were used to determine the appropriate values of K for the population structure explained by the dataset. $K^{*}_{\varepsilon}$ is the value of K that maximizes the log marginal likelihood lower bound, and $K^{*}_{\varnothing^{C}}$ is the minimum value of K that explains almost all of the ancestry in the dataset. Outputs from fastSTRUCTURE [27] were plotted using the GENESIS software [28]. To further test for evidence of admixture in South African Sanga cattle, an ancestry graph [29] and the three-population ($f_3$) [30,31] and four-population ($f_4$) [30,32] tests implemented in TreeMix [29] were also used. The maximum likelihood tree (ancestry graph) [29] was first built for all 24 populations (see Additional file 2: Figure S1), after which migration events were sequentially added to the tree until no further meaningful increase in the proportion of variance explained was observed (see Additional file 3: Table S2).

Principal component analysis
The principal component assessment agreed with previous findings, which partitioned bovine breeds into three distinct groups representing European taurines, African taurines and indicines [14-18, 23, 24] (Fig. 1). Afrikaner and Nguni cattle were situated on the gradient between the indicine and African taurine breeds, but more towards the latter. The Bonsmara and Drakensberger breeds clustered towards the centre of the triangle, which suggests that they have three ancestries (European taurine, African taurine and indicine).
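To make the idea behind the principal component plot concrete, the following sketch computes the first principal component of a small genotype matrix. It is not the SVS implementation: the per-SNP standardisation by sqrt(p(1 − p)), the use of power iteration, and all function names are assumptions chosen to keep the example dependency-free.

```python
import math

def standardize(genotypes):
    """Centre each SNP and scale by sqrt(p * (1 - p)); drop monomorphic SNPs.

    genotypes: list of rows (individuals) with complete 0/1/2 calls.
    """
    n, m = len(genotypes), len(genotypes[0])
    columns = []
    for j in range(m):
        col = [genotypes[i][j] for i in range(n)]
        mean = sum(col) / n
        p = mean / 2  # estimated allele frequency
        if p <= 0 or p >= 1:
            continue  # monomorphic SNPs carry no information
        scale = math.sqrt(p * (1 - p))
        columns.append([(g - mean) / scale for g in col])
    if not columns:
        raise ValueError("no polymorphic SNPs left")
    return [[c[i] for c in columns] for i in range(n)]

def first_principal_component(genotypes, iterations=200):
    """Leading eigenvector of the individual-by-individual similarity matrix."""
    x = standardize(genotypes)
    n, m = len(x), len(x[0])
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) / m for k in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # fixed, deterministic start vector
    for _ in range(iterations):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# Toy data: two individuals from each of two clearly different groups.
toy = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [2, 2, 2, 2],
    [2, 1, 2, 2],
]
pc1 = first_principal_component(toy)
```

In this toy example the two groups fall on opposite sides of zero on the first component, which is the same kind of separation that places breeds at the corners of the triangle, or between them, in a PCA plot.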

Population structure analysis
Allowing for three ancestral populations (K = 3) (Fig. 2) supported the classification of bovine populations into three distinct groups, i.e. European taurine, African taurine and Bos indicus. This analysis predicted that the composition of Afrikaner cattle was approximately 70% African taurine and 28% indicine, while that of Nguni was 60% African taurine, 30% indicine and 10% European taurine. Predicted compositions of Bonsmara and Drakensberger were 41 and 46% European taurine, 42 and 38% African taurine, and 16 and 15% indicine, respectively.
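Clustering software of this kind reports ancestry proportions per individual, so breed-level percentages such as those above are typically summaries over animals. The sketch below assumes the summary is a simple mean of the per-individual proportions; the text does not state how the breed values were obtained, and the labels and numbers in the example are made up.

```python
def mean_ancestry_by_breed(breeds, q_rows):
    """Average per-individual ancestry proportions within each breed.

    breeds: one breed label per individual.
    q_rows: one list of K ancestry proportions per individual (each sums to ~1).
    """
    sums, counts = {}, {}
    for breed, row in zip(breeds, q_rows):
        if breed not in sums:
            sums[breed] = [0.0] * len(row)
            counts[breed] = 0
        sums[breed] = [s + q for s, q in zip(sums[breed], row)]
        counts[breed] += 1
    return {b: [s / counts[b] for s in sums[b]] for b in sums}

# Hypothetical labels and proportions for K = 3 (not values from the study).
labels = ["breed_1", "breed_1", "breed_2"]
q = [
    [0.70, 0.30, 0.00],
    [0.72, 0.26, 0.02],
    [0.60, 0.30, 0.10],
]
breed_means = mean_ancestry_by_breed(labels, q)
```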
Increasing K from 3 to 5 assigned Afrikaner individuals into a single cluster and suggested that 97% of the Afrikaner genome was not shared with any of the reference breeds. The remaining 3% ancestral portion of the Afrikaner genome was shared with the African zebu breeds (1.6%) and with the African taurine and indicine reference breeds (<1%). Also at K = 5, Nguni, Drakensberger and Bonsmara remained admixed with a distinct genome component that was shared with African zebus (ZBO, ZEB, ANKW, SHK and BOR) and Kuri (a hybrid between African taurine and indicine populations [14]), but absent from indicines (BR, NEL, GIR, with the exception of BAG < 1%) and African taurines (NDAM, LAG, SOM and BAO). We note that the distinct component in BAG was only observed in a few individuals, which suggests that it may have been introduced through unsupervised crossbreeding.

Ancestry graph
The ancestry graph with 10 migration edges inferred using TreeMix [29] is shown in Fig. 3. This graph revealed introgression from NDAM or NDAM relatives into the Nguni and Afrikaner cattle. This finding agrees with the results from the cluster analyses (K = 3), which indicated that the Afrikaner and Nguni cattle received approximately 60 to 70% of their ancestry from African taurine cattle (Fig. 2). In addition, we observed an admixture edge between Shorthorn and Bonsmara that was consistent with the history of the development of this breed. The eight other admixture processes modelled by network edges were previously characterized by Gautier et al. [14]; The Bovine HapMap Consortium [15]; Matukumalli et al. [16]; Decker et al. [17]; Gautier et al. [23]; and Decker et al. [24].

Formal tests of admixture
The three-population statistic $f_3$(A; B, C) [30,31] tests whether populations are related by a bifurcating tree; a significantly negative value of $f_3$ implies that population A is admixed and is a mixture of populations related to B and C [30,31]. In agreement with results from the cluster analyses, we detected strong evidence of admixture in Drakensberger, Bonsmara and Nguni cattle. Examining Nguni in conjunction with any of the populations related to African taurines, indicines, African zebus, European taurines or Afrikaner yielded significant tests. One example of a significant test was $f_3$(DRA; ANG, AFR) (Z-score = −12.02). The $f_3$ statistic failed to detect admixture within the Afrikaner cattle, which agrees with the cluster analyses at K = 5 as described above.
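The logic of the three-population test can be sketched from allele frequencies. The code below uses the basic form of the statistic, the average over SNPs of (a − b)(a − c), with a delete-one-block jackknife for the Z-score. It is a simplified illustration and not TreeMix's implementation: it omits refinements such as finite-sample corrections that dedicated software may apply, and the function names, block size and frequencies are assumptions.

```python
import math

def f3(a, b, c):
    """Basic f3(A; B, C): mean over SNPs of (a - b) * (a - c)."""
    return sum((x - y) * (x - z) for x, y, z in zip(a, b, c)) / len(a)

def jackknife_z(stat, freqs, block_size):
    """Estimate and Z-score of `stat` via a delete-one-block jackknife.

    freqs: tuple of per-population allele-frequency lists (same SNP order).
    Blocks of consecutive SNPs stand in for linked genomic regions.
    """
    n = len(freqs[0])
    estimate = stat(*freqs)
    leave_one_out = []
    for start in range(0, n, block_size):
        kept = [f[:start] + f[start + block_size:] for f in freqs]
        leave_one_out.append(stat(*kept))
    g = len(leave_one_out)
    mean = sum(leave_one_out) / g
    variance = (g - 1) / g * sum((t - mean) ** 2 for t in leave_one_out)
    return estimate, estimate / math.sqrt(variance)

# Made-up frequencies: population A is an exact 50:50 mixture of B and C.
pop_b = [0.1, 0.2, 0.8, 0.9, 0.3, 0.6]
pop_c = [0.9, 0.7, 0.2, 0.1, 0.8, 0.1]
pop_a = [(x + y) / 2 for x, y in zip(pop_b, pop_c)]

f3_value, z_score = jackknife_z(f3, (pop_a, pop_b, pop_c), block_size=2)
```

Because A lies exactly between B and C at every SNP in this toy example, each term (a − b)(a − c) is negative or zero, so both the statistic and its Z-score are negative, which is the signature of admixture described above.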
The four-population test $f_4$(A, B; C, D) [30,32] tests whether (A, B) and (C, D) represent distinct clades in a population tree. Significant non-zero values indicate the presence of gene flow between the (A, B) and (C, D) groups in the tree [30,32]. Combining Afrikaner with any of the African zebu or African taurine breeds yielded the most significant values (see Additional file 4: Table S3), which suggests gene flow from African zebu breeds or African taurines into the Afrikaner cattle. For Nguni, Bonsmara and Drakensberger, the most significant non-zero values were obtained when they were combined with any of the indicine, European taurine, African taurine and African zebu breeds (see Additional file 4: Table S3), which indicates the presence of gene flow between these breeds.
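In its basic form, the four-population statistic is the average over SNPs of (a − b)(c − d), the covariance of allele-frequency differences between the two pairs. The sketch below computes only this point estimate from made-up frequencies. It is not TreeMix's implementation, the function name is an assumption, and the significance assessment (in practice, a standard error obtained by resampling blocks of SNPs) is omitted.

```python
def f4(a, b, c, d):
    """Basic f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    return sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)) / len(a)

# Made-up allele frequencies for four populations at five SNPs.
pop_a = [0.10, 0.40, 0.75, 0.20, 0.55]
pop_b = [0.15, 0.35, 0.70, 0.30, 0.50]
pop_c = [0.80, 0.60, 0.25, 0.90, 0.45]
pop_d = [0.70, 0.65, 0.30, 0.85, 0.40]

f4_value = f4(pop_a, pop_b, pop_c, pop_d)
```

If (A, B) and (C, D) are true clades without gene flow between them, the frequency differences a − b and c − d are uncorrelated and the statistic is expected to be zero; a consistent correlation across SNPs pushes it away from zero.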

Discussion
This study aimed at unravelling patterns of ancestry and admixture in South African Sanga cattle. Cluster analysis (K = 3) and the principal component analysis were mutually supportive and highlighted the presence of three genetic backgrounds for the populations studied. The classification of the bovine populations into a triangle-like structure is well-established [14-18, 23, 24]. As already mentioned, the Nguni and Afrikaner cattle were observed on the gradient between the indicines and African taurines, but more towards the African taurines, which indicates that the genomes of these breeds include more African taurine genetic background than indicine genetic background. This was supported by the proportions inferred by the admixture analysis at K = 3, which indicated that about 70 and 60% of the Afrikaner and Nguni ancestries were derived from African taurines. The detection of a migration edge from NDAM or NDAM relatives into Nguni and Afrikaner cattle in the TreeMix analysis provided further support. This larger proportion of ancestry shared between African taurines and South African Sanga (Afrikaner and Nguni) at K = 3 was consistent with the hypothesis of selection against susceptibility to trypanosomiasis, which may have led to a reduction in the extent of indicine ancestry in the surviving cattle, since indicine cattle are susceptible to trypanosomiasis [21,33,34].
As expected, Bonsmara and Drakensberger clustered towards the centre of the triangle in the principal component analysis, which suggests that these breeds have three ancestries (European taurine, African taurine and indicine). This was supported by the results of the cluster analysis at K = 3, which indicated that the Bonsmara and Drakensberger breeds were 41 and 46% European taurine, 42 and 38% African taurine, and 16 and 15% indicine, respectively. This was further confirmed by the detection of strong evidence for admixture by the three- and four-population tests ($f_3$ [30,31] and $f_4$ [30,32]) when these breeds were examined in conjunction with European taurines, African taurines and indicines. These results are consistent with the histories of the development of these breeds. The Bonsmara breed was developed at the Mara and Messina research stations between 1937 and 1963 under the guidance of the late Professor Jan C. Bonsma [35]. The aim was to produce a local breed that was well adapted to sub-tropical environments and had superior production compared to Afrikaner cattle. Five types of crosses were tested with Afrikaner cattle that included Red Aberdeen Angus, Hereford, Red Poll, Shorthorn and Sussex breeds. Pilot trials revealed that breed compositions including 5/8 Afrikaner and 3/8 Hereford or Shorthorn ancestries resulted in increased calving percentages and weaning weights with reduced calf mortalities relative to the purebreds [36]. Introgression of European taurine into the Drakensberger breed could have occurred due to the association of this breed with European settlers [1]. In 1837, several "Voortrekker" families (settlers) left Cape Province and traveled north with herds of similar black oxen that were then called "Vaderland" cattle. Most of these trekkers settled along the Drakensberg Mountain range and among them, the Uys family is believed to have played a significant role in the development of the "Uys-cattle" through strong selection within their closed herd; the "Uys-cattle" was later referred to as the Drakensberger breed by the Department of Agriculture owing to their prevalence in the pastures of the Drakensberg Mountains [36].
Allowing for five ancestral populations partitioned the African taurine and indicine admixture ancestry that was observed in the Nguni, Drakensberger and Bonsmara breeds (K = 3). With the exception of the African zebus (ZBO, ZEB, ANKW, SHK and BOR) and the Kuri (hybrid between African taurine and indicine populations [14]), this component was unique to the Nguni, Drakensberger and Bonsmara breeds and absent in indicine (BR, NEL, GIR) and African taurine (NDAM, LAG, SOM and BAO) cattle. Thus, the indicine component of African zebu and Sanga cattle seems to differ from that observed in other modern indicine cattle (NEL, BR and GIR). We hypothesize that strong selection within the African context caused the significant divergence between the indicine genome present in African admixed cattle and the genome of other modern indicine cattle. Alternatively, the founders of indicine cattle that migrated into Africa may have differed from those of the modern indicines used in the analysis.
Afrikaner cattle appear to have diverged from their ancestral populations and are recognized as a distinct breed (K ≥ 5), which is likely due to the effects of genetic drift after admixture and strong selection of animals to conform to the standards and breeding objectives of the breed society. Afrikaner is the oldest indigenous cattle breed in South Africa and was the first indigenous breed to form a breed society in 1902 [1]. These results are consistent with the higher levels of inbreeding postulated by Coetzer and Van Marle [37] and detected by Makina et al. [2].
In agreement with the $f_3$ statistics, the Nguni cattle were predicted to be admixed (K = 5 and 7), their genetic makeup being predominantly African zebu with traces of indicine, African taurine and European taurine ancestry. The higher proportion of African zebu ancestry within the Nguni cattle is in agreement with the previous report by Makina et al. [3], who detected shared signatures of selection between Nguni and African zebu cattle. Historically, Nguni cattle were reared in extensive communal grazing systems in the presence of numerous other cattle representing various breeds and their crosses [38], which may also explain their admixture. The production potential of Nguni was only recognized in the early 1980s after the introduction of beef cattle recording schemes and the publication of results on the characterization of their productivity [39]. The Nguni breed society was established in 1986; prior to this date, Nguni cattle were bred for various practical purposes and mated at random, which potentially led to admixture due to their close association with the indigenous people of South Africa and the communal husbandry they practiced [38].
In summary, our analyses support the view that Sanga cattle are composites of African taurine and Bos indicus [9,11]. The Afrikaner breed clearly diverged from its ancestral forebears, probably due to genetic drift and alternative breeding objectives. The Nguni, Drakensberger and Bonsmara breeds are admixed, which was perhaps unintentional for Nguni and Drakensberger, but was certainly by design in the case of Bonsmara, which was developed through crossbreeding of Afrikaner, Hereford and Shorthorn.

Conclusions
This study presents a comprehensive genome-wide characterization of South African Sanga cattle and confirms that South African Sanga cattle originated from African taurine and Bos indicus. The hybrid origin of Bonsmara cattle was confirmed and is consistent with the history of its development. Thus, genome-wide characterization of these populations has accurately recapitulated the history of the breeds' formation [16]. These results improve our understanding of the composition and origins of South African Sanga cattle.