Quantitative evaluation of hybridization and the impact on biodiversity conservation

Abstract Anthropogenic hybridization is an increasing conservation threat worldwide. In South Africa, recent hybridization is threatening numerous ungulate taxa. For example, the genetic integrity of the near‐threatened bontebok (Damaliscus pygargus pygargus) is threatened by hybridization with the more common blesbok (D. p. phillipsi). Identifying nonadmixed parental and admixed individuals is challenging based on the morphological traits alone; however, molecular analyses may allow for accurate detection. Once hybrids are identified, population simulation software may assist in determining the optimal conservation management strategy, although quantitative evaluation of hybrid management is rarely performed. In this study, our objectives were to describe species‐wide and localized rates of hybridization in nearly 3,000 individuals based on 12 microsatellite loci, quantify the accuracy of hybrid assignment software (STRUCTURE and NEWHYBRIDS), and determine an optimal threshold of bontebok ancestry for management purposes. According to multiple methods, we identified 2,051 bontebok, 657 hybrids, and 29 blesbok. More than two‐thirds of locations contained at least some hybrid individuals, with populations varying in the degree of introgression. HYBRIDLAB was used to simulate four generations of coexistence between bontebok and blesbok, and to optimize a threshold of ancestry, where most hybrids will be detected and removed, and the fewest nonadmixed bontebok individuals misclassified as hybrids. Overall, a threshold Q‐value (admixture coefficient) of 0.90 would remove 94% of hybrid animals, while a threshold of 0.95 would remove 98% of hybrid animals but also 8% of nonadmixed bontebok. To this end, a threshold of 0.90 was identified as optimal and has since been implemented in formal policy by a provincial nature conservation agency. Due to widespread hybridization, effective conservation plans should be established and enforced to conserve native populations that are genetically unique.


| INTRODUCTION
One of the goals of conservation is to conserve current biodiversity as well as the ecological circumstances and evolutionary processes that support it. Increasingly, biodiversity is being adversely affected by human actions. Anthropogenic hybridization is increasing worldwide and is a threat to the conservation of species (Todesco et al., 2016).
This human-mediated hybridization may occur due to the changes in the abundance and distribution of species, the removal of barriers that cause isolated or restricted species to expand, and/or the uncontrolled diffusion with domestic species (Allendorf, Leary, Spruell, & Wenburg, 2001;Allendorf, Luikart, & Aitken, 2013;Rhymer & Simberloff, 1996).
Molecular markers have been successfully used over the past decades to identify the rates of hybridization with high accuracy and low cost (e.g., Costa et al., 2013;Cullingham et al., 2011;Stephens, Wilton, Fleming, & Berry, 2015). Managing interspecific hybridization, after it is identified, is a more difficult task (Laikre, Schwartz, Waples, & Ryman, 2010;Piett, Hager, & Gerrard, 2015). As of yet, molecular marker data are rarely integrated with management decisions in a quantitative, predictive framework (Hoban et al., 2013).
In South Africa, wildlife species are extensively translocated outside of their historic distribution ranges onto private land as a part of wildlife management and commercial breeding (Spear & Chown, 2009). Due to private ownership of wildlife in South Africa, there is frequent trade in commercially profitable species which has led to the occurrence of multiple species on the same property outside their natural ranges. Thus, the incidence of hybridization has increased due to the scarcity of conspecific mates (Vaz Pinto, Beja, Ferrand, & Godinho, 2016) and loss of reproductive barriers between previously isolated evolutionary lineages. The potential negative impacts of hybridization are rapidly becoming a concern for South African conservation agencies. Hybridization may disrupt adaptive gene complexes or may result in genetic incompatibilities. Hybridization is known to reduce fitness in at least one species pair in South Africa, namely kudu (Tragelaphus strepsiceros) and nyala (T. angasii) (Dalton et al., 2014). An additional threat of hybridization is complete swamping, as in the rediscovered Giant sable antelope (Hippotragus niger variani), where natural hybridization with roan antelope (H. equinus) was proceeding rapidly and would have led to a complete hybrid swarm without intervention (Vaz Pinto et al., 2016). Consequences of anthropogenic hybridization include reduced fertility in the rare taxon, genetic swamping, or assimilation (Levin, Francisco-Ortega, & Jansen, 1996;Malukiewicz et al., 2014), which may lead to eventual extinction (Wolf, Takebayashi, & Rieseberg, 2001). In some cases, hybrids persist at low levels, while in other cases hybrid swarms effectively replace the original species (Allendorf et al., 2001). Hybridization rates can increase over time, sometimes very rapidly (Huxel, 1999;Schwartz, Luikart, & Waples, 2007). A recent review of 62 anthropogenic hybridization cases found that nearly all had identified negative consequences, especially for mammals (Piett et al., 2015).
The fate of hybrid animals is controversial. Management approaches could include culling all hybrid animals, isolation of hybrid herds, certification of nonadmixed herds, planned breeding, and/ or legislation to restrict the movement of hybrids and nonadmixed species to prevent future hybridization events (Grobler et al., 2011).
These interventions should be considered if introgression is widespread, with hybrid offspring being fertile where one or both taxa are threatened (Allendorf & Luikart, 2007;Laikre et al., 2010;Piett et al., 2015;Schwartz et al., 2007). Hybridization is currently threatening the genetic integrity of numerous ungulate taxa in South Africa such as blue wildebeest (Connochaetes taurinus) and black wildebeest (C. gnou) (Grobler et al., 2011), black-faced impala (Aepyceros melampus petersi) and common impala (A. melampus) (Green & Rothstein, 1998), Grevy's zebra (Equus grevyi) and plains zebra (E. quagga) (Cordingley et al., 2009), and bontebok (Damaliscus pygargus pygargus) and blesbok (D. p. phillipsi) (Lloyd & David, 2008). This article will focus on two subspecies of Damaliscus pygargus, nl. bontebok (D. p. pygargus) and blesbok (D. p. phillipsi). These two taxa are South African endemics (Vrba, 1979) with the common ancestor being historically distributed from the southwestern Cape to the southern boundary of Zimbabwe (Vrba, 1979). Fossil evidence indicates that past climatic and habitat changes resulted in the separation of D. pygargus into two allopatric groups (Skead, 1980;Skinner & Smithers, 1990), which are now classified as separate subspecies (Essop, Lloyd, Van, & Harley, 1991;Van der Walt, Nel, & Hoelzel, 2001). Historically, the blesbok occurred mostly in the grassland biomes in Gauteng, Eastern Cape, Mpumalanga, and the Free State Provinces (Figure 1, Skinner & Smithers, 1990). The bontebok had a more restricted distribution to the low-lying, grassy coastal plains within the fynbos biome of the Western Cape Province, where the population has declined and was driven to near extinction due to hunting and human intrusion in the 1800s (Van der Merwe, 1986). The two subspecies had non-overlapping ranges within different ecosystems ( Figure 1). Translocations to wildlife farms and reserves outside the former distribution ranges have brought the two subspecies in artificial, secondary contact. These events have led to documented hybridization between the two subspecies (Van der Walt et al., 2001;Van Wyk, Kotzé, Randi, & Dalton, 2013).
The primary threat to bontebok in the Western Cape Province is low availability of suitable habitat, thus limiting population expansion.
Previous studies revealed low genetic diversity (Van Wyk et al., 2013), population fragmentation, and the deliberate and/or accidental hybridization with blesbok. The total number of bontebok in South Africa is estimated between 6,500 and 7,000 animals with less than 1,000 individuals occurring within its former distribution range (Unpublished, CapeNature, 2014). The bontebok is listed as near-threatened on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species.
A photographic test to distinguish between hybrids, nonadmixed bontebok, and blesbok was developed by Fabricius, van Hensbergen, and Zucchini (1989). The characteristics chosen as criteria for distinguishing between bontebok and blesbok were described by Bigalke (1955), emphasizing the importance of the white buttocks, upper legs, and belly. The photographic test has some shortfalls that require human interpretation, and in certain cases, hybrids could not be identified (especially in the F2 and subsequent generations). Notwithstanding, bontebok purity certificates were issued for tested populations based on photography. Due to the difficulties in characterizing hybrids based on morphological characteristics, a more accurate DNA test using a model-based Bayesian approach was developed that could be used to identify nonadmixed individuals and hybrids (Van Wyk et al., 2013).
The extent of hybridization across the species range is as yet unknown but must be determined prior to conservation intervention. The Western Cape Provincial Conservation Agency, CapeNature, has been mandated to develop the Bontebok Conservation, Translocation and Utilization Policy (BCTUP, 2014) as a regulatory mechanism to direct the implementation of a genetic purity test for bontebok prior to any approval of translocations.
Sophisticated algorithms for molecular marker data have been used extensively to identify hybrids. The selection of a threshold Qvalue (hybridization or admixture index from clustering algorithms like STRUCTURE) can affect the classification of nonadmixed animals and hybrids. Performance of hybrid identification should be a balance of both accuracy and efficiency (Vähä & Primmer, 2006), reflecting an offset between errors of inclusion (identifying all hybrids at the expense of including some nonadmixed individuals) and omission (omitting hybrids to ensure that nonadmixed individuals are not mistaken as hybrids). In the literature, Q-values ranging from 0.7 to 0.99 are commonly used (Hoban, McCleary, Schlarbaum, Anagnostakis, & Romero-Severson, 2012;Lepais et al., 2009;Sanz, Araguas, Fernández, Vera, & García-Marín, 2008;Valbuena-Carabaña, González-Martínez, Hardy, & Gil, 2007). There is a trade-off between accuracy and efficiency and between focusing on the nonadmixed species or on hybrids. The optimal Q-value depends on the application of the test and the target species. This value can be determined via simulations (Cullingham et al., 2011;Lepais et al., 2009).
Simulations can then also be used to determine the consequence of culling animals deemed to be hybrids (Hoban, 2014;Huxel, 1999).
In this study, we describe species-wide hybridization rates using close to 3,000 bontebok blood, tissue, or hair samples. Our aims are to (1) accurately describe species-wide and local hybridization rates and assess the use of different software for identifying hybrids, (2) determine a threshold for the classification of hybrid and nonadmixed animals using simulated data, and (3) quantify how many animals would be removed based on the selected threshold.

| Sample collection
The study formed part of an ongoing registered project entitled "Detection of hybridization and determination of the level of genetic

| Admixture analysis
Identification and classification of hybrid individuals were conducted using four different methods: principal component analysis (PCA), assignment of individuals based on species-specific alleles, and two Bayesian clustering software programs. A PCA of a pairwise, individual-by-individual, covariance matrix was calculated using multivariate ordination methods without spatial components in GenAlEx 6.5 (Peakall & Smouse, 2006 and adegenet (Jombart, 2008;Jombart & Ahmed, 2011) to examine the relationships among individuals without prior grouping. This method was used to provide information with regard to the divergence among groups and identify individuals that clustered separately from nonadmixed populations, thus indicating hybrid origin.
Assignment of individuals was also determined using speciesspecific alleles (e.g., individuals with blesbok alleles) as an alternative to model-based clustering methods (Metcalf, Siegle, & Martin, 2008 the probability of an individual belonging to a cluster; an intermediate value for both clusters is evidence of recent genetic admixture) for each individual as well as overall assignment rates to each of the hybrid categories, to determine the degree to which these settings influenced our results. STRUCTURE with the number of a priori clusters (K) set to 2 (under the assumption that both species contribute to the gene pool of the individuals) was used to estimate the admixture status for each individual. STRUCTURE is a model-based method that assumes two separate gene pools that had recent contact. Analysis was thus performed using both "uncorrelated" and "correlated" frequencies. All runs were performed assuming the admixture model that allows for hybrid offspring. For each analysis, to determine the appropriate Markov chain Monte Carlo length and burn-in, we tested "short" (total length = 1e5, burn-in = 1e4), "long" (total length = 1.2e6, burn-in = 2e5), and "very long" (total length = 2.4e6, burn-in = 4e5) runs. We performed a minimum of five replicates under each setting. For all runs, the genetic ancestry of the "reference" sets (option LocPrior) was not provided as prior information to STRUCTURE. Jeffrey's prior (Gelman, 2009). However, to rule out the possibility of bias due to low frequencies of alleles, the uniform prior analysis was also run. We performed a minimum of three replicate runs for each prior, for both moderate (burn-in of 1e4, total run length of 1e5) and long (burn-in of 2e5, total run length of 1.2e6) runs.

| Simulation analysis
Simulated data were created using HYBRIDLAB (Nielsen, Bach, & Kotlicki, 2006). This program creates hybrids from two parental population allele frequency pools as determined from frequencies calculated from our reference dataset. We created a test dataset of 4,000 simulated individuals in which the ancestry of all individuals is known. The test dataset consisted of 500 each of nonadmixed bontebok, nonadmixed blesbok, F1 hybrids, F2 hybrids (backcross to bontebok or blesbok or F1), backcross to bontebok, backcross to blesbok, double backcross to bontebok and double backcross to blesbok.
The simulated dataset was subsequently analyzed with STRUCTURE and NEWHYBRIDS, with settings as described above to calculate admixture values (as in Vähä & Primmer, 2006;Cullingham et al., 2011;Hoban et al., 2012). The advantage of using simulated data is that we know which individuals belong to each class, so we are able to assess whether Bayesian assignments are correct, and to quantify how well these programs assign individuals. To determine the error rates of these methods (Vähä & Primmer, 2006), simulated individuals of known "nonadmixed" or hybrid status (Hoban et al., 2012;Lepais et al., 2009) at a range of thresholds were analyzed. This allowed us to determine which admixture probability threshold would maximize the accuracy of assignments.
Using the simulated data, we calculated the accuracy and efficiency of the two software programs (Vähä & Primmer, 2006), testing different admixture thresholds to conclude an individual is a hybrid (Hoban et al., 2012;Lepais et al., 2009). "Efficiency" is defined as the number assigned to a category of the total number simulated in that category; low efficiency indicates that the clustering software failed to identify many individuals it attempted to assign (e.g., a missed call). "Accuracy" is defined as the number correctly assigned to a category of the total number assigned to that category, for example, the proportion that belongs there; low accuracy means that many individuals were assigned to a category incorrectly. These two values reflect erroneous omissions and inclusions, respectively. Vähä and Primmer (2006) multiply these two statistics to summarize overall "performance." For STRUCTURE, we calculated efficiency, accuracy, and performance for thresholds from 0.86 to 0.99 (at intervals of 0.01, e.g., 0.86, 0.87), with individuals above the threshold being assigned as nonadmixed bontebok. The program NEWHYBRIDS, which produces outputs of a probability of being in one of six categories rather than one of two clusters, we tested two sets of thresholds.
First, we tested a threshold of 0.50-0.99, with individuals being assigned to one of the six categories with a probability higher than the threshold. Individuals with no assignment probability above the threshold are considered "other" (unassigned). Secondly, we tested a thresholdof 0.86-0.99, in which we only assigned individuals as bontebok, blesbok, or hybrid (with hybrids being those below the threshold). The second approach is more similar to the threshold described for STRUCTURE. We calculated efficiency, accuracy, and performance for the full simulated dataset, and for a subset without the double backcross individuals, to quantify the degree to which double backcross individuals are problematic. Lastly, using simulated data, we quantified the consequences of different thresholds on the number of bontebok and hybrids removed, for thresholds between 0.86 and 0.99.

| Spatial analysis of the dataset and management options
Using the optimized hybrid thresholds, we examined the hybridization rates at each sampling location, considering locations with at least 10 individuals per location (n = 76 locations) in order to test whether hybridization rates were similar across sampling locations. We tested this hypothesis as hybridization may be associated with particular anthropogenic or environmental factors and can vary across a species range (Costa et al., 2013;Hoban et al., 2012). Finally, we evaluated the management consequences of implementing a hybrid culling policy at different admixture thresholds with the goal to maximize the removal of hybrid individuals and blesbok while minimizing the removal of bontebok.

| Genetic diversity
All individuals were typed at 12 microsatellite loci, and overall, missing data were less than 2.2%. No departures from HW proportions or linkage disequilibrium were detected in each subspecies. In addition, null alleles were not observed by MICRO-CHECKER. As previously reported by Van Wyk et al., 2013, we found that reference blesbok have significantly more variation than reference bontebok (He = 0.543 compared to 0.321, p = .025). This is partly due to near homozygosity at three markers (BB10, BB05, and BB08) in bontebok. Exclusion of these three markers still resulted in the observed lower heterozygosity levels in bontebok compared to blesbok but not significantly so (He = 0.422 compared to 0.531, p = .211).

| Identification and classification of hybrids
Both the species are clearly distinct on PCA, with hybrids separated between the bontebok and blesbok (Figure 2). The blesbok appears to be more scattered which may indicate higher genetic diversity. results, we also found relatively similar consistency in assignments to each class (Table 5). Also, the number of individuals with different assignments between the two software programs is small, close to 2%. Thus, using both methods, our unknown individuals consisted of approximately 1% blesbok, 75% bontebok, and 24% hybrid (Table 4).
According to NEWHYBRIDS (Table 4), hybrids were primarily backcrossed to bontebok and F2 individuals, with some "other" (no strong assignment to one category), and no backcrosses to blesbok or F1 hybrids were identified. We found that the proportion of hybrids at each collection locality varied (Figure 3). Only 23 locations had only nonadmixed bontebok (30.2%). We also tested the hypothesis that smaller populations would have more hybrid individuals because of fewer mate choices available to breeding individuals. However, this hypothesis was not supported (p = .54, linear model).

| Analysis of simulated data
Using simulated data, we determined the threshold values that could be applied to assign individuals as nonadmixed or admixed in an empirical dataset. We tested the efficiency and accuracy of the two Bayesian methods using thresholds ranging from 0.85 to 0.99 (Tables S1 and S2). For assignment to each nonadmixed species, efficiency was maximized at low thresholds, while accuracy was maximized at high thresholds, while for hybrids, efficiency was maximized at high thresholds and accuracy was maximized at low thresholds (for both programs, see Table S1). A subjectively determined balance appears to be achieved at a threshold between 0.90 and 0.95 for STRUCTURE. Double backcrosses individuals decreased both the F I G U R E 2 Principle component analysis (PCA) indicating the relationships among bontebok (1), blesbok (2), and hybrid (3) individuals without priori grouping T A B L E 1 Mean number of individuals whose Q-values increased or decreased by more than 0.05 when compared to another run (mean of 4 runs) accuracy and the efficiency of both methods, as they were difficult to detect, and will change the recommended threshold. For example, with NEWHYBRIDS, maximum accuracy of identifying nonadmixed individuals was achieved with a threshold between 0.8 and 0.88 without double backcross, but 0.93 and 0.97 with them. The two programs had similar results regarding accuracy, efficiency, and

Run length Correlated Uncorrelated
performance.
Lastly, we tested the threshold for removal on the simulated data ( Figure 4a,b) to quantify the proportion of bontebok (black line) and the proportion of the other categories that would be incorrectly assigned (colored lines). As the threshold increased, especially beyond 0.94 for STRUCTURE (Figure 4a), a large proportion of nonadmixed bontebok were incorrectly removed, to capture the backcrosses and double backcrosses. Similar patterns were seen for both programs, with the main difference being that for NEWHYBRIDS (Figure 4b) fewer bontebok were removed, but slightly more hybrids were included at higher thresholds. Note that NEWHYBRIDS performs poor at assigning individuals to each particular hybrid category, for example, assigning an individual that really is an F2 to the F2 category rather than classifying as a general "hybrid" category (data not shown). We then applied the threshold to our actual dataset to quantify how many individuals will be removed in a culling program to retain only nonadmixed bontebok ( Figure 5). A low-to-moderate threshold (0.85-0.92) implies that approximately 20%-30% of all living animals will need to be culled (with similar results for each software), while at the highest thresholds between 40% and 60% of animals will be culled (depending on the software used).

| DISCUSSION
In this study, we examined hybridization rates between bontebok and blesbok in nearly 3,000 animals sampled from across South Africa.
Overall, approximately 25% of nonreference animals are hybrids, which is slightly lower than previously estimated in a smaller study (33% estimated from 121 animals; van Wyk et al. 2013). Importantly, admixed individuals were found in two-thirds of the locations included here. Thus, few populations on private land can be considered as nonadmixed. As the game breeding and hunting industry advances, translocation and hybridization (intentional and unintentional) rates will likely increase. Unless hybrids are removed, this increase combined with the current substantial overall numbers of hybrids and the higher number of locations with hybrids could ultimately result in swamping of the native gene pool as has been reported in other species (Huxel, 1999;Vaz Pinto et al., 2016).
Both the admixture software programs STRUCTURE and NEWHYBRIDS performed equally well and provided similar hybridization rates (Table 5). Other settings such as the model chosen affect the designation of a very small number of individual animals (Tables 2-4).
Thus, the uncertainty regarding any given individual animal's assignment is relatively low. Overall, our conclusion about hybridization rates is robust. We also found that blesbok private alleles can be used to assist in identifying hybrids (classifying nearly 84% of hybrids); however, the probability-based approaches in STRUCTURE and NEWHYBRIDS had a higher efficiency and accuracy in detecting hybrid individuals (>95%, depending on thresholds selected).
We    was not at such a low number already, a higher threshold and thus removal of more bontebok may be acceptable. It is acknowledged that identification of more complex hybrid categories (e.g., crosses between F1 and backcrosses), as well as further generations of crossing may be possible in the future by using more markers (e.g., SNPs). Developing more markers to achieve such resolution is recommended. A formal framework to incorporate data from Q-values, species-specific alleles, and morphology would be imperative to ensure the long-term survival of bontebok.
The results from this study indicate that hybridization in these subspecies is historical and that hybridization is not occurring between nonadmixed bontebok and nonadmixed blesbok but rather between nonadmixed bontebok/nonadmixed blesbok and hybrids, as no F1 hybrids were identified in this study using the method of NEWHYBRIDS (which assigns individuals to different hybrid categories rather than the Q-values of STRUCTURE). Interestingly, a study of polecats in Britain also found no F1 hybrids in a survey of 345 animals; however, extensive admixture rates were identified (Costa et al., 2013) as observed in this study. In addition, we identified no backcrosses to blesbok. One possible explanation is that management strategies on private land include backcrossing with bontebok in order to obtain nonadmixed bontebok populations.
Bontebok was a conservation flagship species for the former (pre-1994) Cape Province and strict regulatory measures to protect bontebok from deliberate and/or accidental hybridization with imported blesbok were implemented in the late 1980s. These measures included the demarcation of natural distributional ranges based on magisterial (administrative district) boundaries to inform translocation through permits. The limited size and availability of habitat within the "natural distributional range" led to the demarcation of an additional area, an "extended distribution range," to which bontebok may be translocated to enable population expansion. The consolidated area was buffered with a region where no introductions of either bontebok or blesbok would be permitted in order to secure the bontebok population through separation. Bontebok populations on private land were assessed using a computer-based method developed to distinguish between nonadmixed bontebok, blesbok, and hybrid populations based on the measurements of the rump patch taken from photographs as per the method described by Fabricius et al. (1989).
Populations comprised of nonadmixed bontebok were certified, and bontebok purity certificates were issued to land owners as incentives to promote population expansion. These certificates are required for any imports of bontebok hunting trophies into the United States.
Nonadmixed bontebok individuals have become a novelty, and the demand for stocking of this subspecies outside its indigenous range has increased substantially. Game species have been legally imported and exported onto private land, outside the indigenous range of the subspecies, resulting in an increase in the demand for bontebok sourced from its indigenous range.
To preserve remaining native bontebok populations, policy and management need to be implemented quickly. Our molecular data and modeling results have helped in this effort. The Western Cape Provincial Conservation Agency, CapeNature, developed the Bontebok Conservation, Translocation and Utilization Policy (BCTUP, 2014) as a regulatory mechanism to implement the genetic test described in this article to advise on permitting translocations. The result from this research was directly used in order to establish the optimal threshold of admixture in this policy. All bontebok that are translocated must be tested and permanently marked (microchipped). Test results will be stored in a centralized database at the National Zoological Gardens of South Africa. BCTUP contains a collection protocol specifically aimed at the implementation of a forensic sampling procedure and the maintenance of custody samples. At present, identified hybrids must be kept in isolation, they may not be translocated alive, and must be culled. We hope that our quantitative assessment of hybrids in this system is an exemplar for future studies of hybridization in southern Africa, and furthers the development of science-based, quantitative conservation policies.