Molecular phylogeny and phylogeography of genus Pseudois (Bovidae, Cetartiodactyla): New insights into the contrasting phylogeographic structure

Abstract Blue sheep, Pseudois nayaur, is endemic to the Tibetan Plateau and the surrounding mountains, which are the highest‐elevation areas in the world. Classical morphological taxonomy suggests that there are two subspecies in genus Pseudois (Bovidae, Artiodactyla), namely Pseudois nayaur nayaur and Pseudois nayaur szechuanensis. However, the validity and geographic characteristics of these subspecies have never been carefully discussed and analyzed. This may be partially because previous studies have mainly focused on the vague taxonomic status of Pseudois schaeferi (dwarf blue sheep). Thus, there is an urgent need to investigate the evolutionary relationship and taxonomy system of this genus. This study enriches a previous dataset by providing a large number of new samples, based on a total of 225 samples covering almost the entire distribution of blue sheep. Molecular data from cytochrome b and the mitochondrial control region sequences were used to reconstruct the phylogeny of this species. The phylogenetic inferences show that vicariance plays an important role in diversification within this genus. In terms of molecular dating results and biogeographic analyses, the striking biogeographic pattern coincides significantly with major geophysical events. Although the results raise doubt about the present recognized distribution range of blue sheep, they have corroborated the validity of the identified subspecies in genus Pseudois. Meanwhile, these results demonstrate that the two geographically distinct populations, the Helan Mountains and Pamir Plateau populations, have been significantly differentiated from the identified subspecies, a finding that challenges the conventional taxonomy of blue sheep.

However, the validity and geographic characteristics of these subspecies have never been carefully discussed and analyzed. This may be partially because previous studies have mainly focused on the vague taxonomic status of Pseudois schaeferi (dwarf blue sheep). Thus, there is an urgent need to investigate the evolutionary relationship and taxonomy system of this genus. This study enriches a previous dataset by providing a large number of new samples, based on a total of 225 samples covering almost the entire distribution of blue sheep. Molecular data from cytochrome b and the mitochondrial control region sequences were used to reconstruct the phylogeny of this species. The phylogenetic inferences show that vicariance plays an important role in diversification within this genus. In terms of molecular dating results and biogeographic analyses, the striking biogeographic pattern coincides significantly with major geophysical events. Although the results raise doubt about the present recognized distribution range of blue sheep, they have corroborated the validity of the identified subspecies in genus Pseudois. Meanwhile, these results demonstrate that the two geographically distinct populations, the Helan Mountains and Pamir Plateau populations, have been significantly differentiated from the identified subspecies, a finding that challenges the conventional taxonomy of blue sheep.

K E Y W O R D S
biogeography, blue sheep, genus Pseudois, mitochondrial DNA, phylogenetic relationship

| INTRODUCTION
As a unique species in genus Pseudois popularly known as "blue sheep," Pseudois nayaur is endemic to the Tibetan Plateau and the surrounding mountains (Mts.) (Figure 1a,b). The Tibetan Plateau is considered the highest ecosystem in the world and the largest high-elevation ecosystem, with an average elevation of more than 4,000 m (Guo et al., 2006;Figure 1b). Unfortunately, due to the inaccessibility of the habitat of blue sheep, sample collection is particularly difficult, and related studies are limited. Classical morphological taxonomy has been utilized to suggest two identified subspecies in genus Pseudois (Bovidae, Artiodactyla), Pseudois nayaur nayaur and Pseudois nayaur szechuanensis. However, the validity of this taxonomy and the geographic characteristics of the two subspecies have never been carefully discussed and analyzed. Previous studies mainly focus on the vague taxonomic status of dwarf blue sheep, whose body size is notably smaller than that of common blue sheep (Cao, Wang, & Fang, 2002;Feng, Lajia, Taylor, & Webster, 2001;Liu et al., 2011). Intraspecific differentiation has rarely been addressed in previous studies due to the great difficulty in sample collection, so the composition of species and subspecies in genus Pseudois is continuous controversial. Different geographically restricted populations of blue sheep (Pseudois nayaur), a Central Asian ungulate with unclear variation in morphology and phylogeographic structure, have been widely debated. Moreover, the ungulate herbivores have strong cascading effects on the history and evolution of ecosystems (Pringle, Young, Rubenstein, & McCauley, 2007) and blue sheep is the most widely distributed of Central Asian ungulate.
Consequently, there is an urgent need to investigate the evolutionary relationship and taxonomic status of this genus.
As for the currently established subspecies of blue sheep, P. n. nayaur is found throughout the Tibetan Plateau, Bhutan, Nepal, northern Pakistan, and northern India, as well as probably in Tajikistan (Harris, 2014;Figure 1a). Before 2012, the molecular and morphological data of this subspecies were always poorly documented. In our previous study, we showed that the subspecies P. n. nayaur was recognized in our molecular analyses for the first time. Nevertheless, only one sample of this subspecies was included in that study. Consequently, this result must be strengthened by accumulating more morphological and molecular data. The other subspecies, P. n. szechuanensis which is distributed in several provinces in western China (Figure 1a), has been the research interest of previous studies. In accordance with the previous morphometrics study (Yang, 2001), the skull of Ningxia (Helan Mts.) blue sheep is remarkably different from those of Sichuan, Gansu, and Qinghai sheep. The Helan Mts. population is considered to be the eastern peripheral population of P. n. szechuanensis. As shown in Figure 1a, this population has been completely geographically isolated from other populations. In 2001, an accumulation of data showed that the characters of the blue sheep population from Helan Mts. differed from those from other localities in behavior, ecology, and morphology (Wu, 2003;Yang, 2001), suggesting that this population never belonged to P. n. szechuanensis. Based on two mitochondrial (mt) genes, Zeng, Xu, Yue, Li, and Zou's (2008)finding subsequently indicated that the Helan Mts. population could likely be classified as a new subspecies according to the recognized 75% rule, which has been frequently F I G U R E 1 (a) Overview of the sampling localities for this study and the approximate geographic distributions of blue sheep (in gray). More specific locality data are shown in Table 1  used for the identification of subspecies (Amadon, 1949;Mayr, 1963).
More samples should be accrued to investigate the phylogenetic relationship of this proposed new subspecies. So far, there has been little information about the molecular phylogeny and phylogeography of this species, and no new species and subspecies have been precisely designated. In this study, we collected a large cohort of samples (56) covering the entire distribution of blue sheep, specifically a large number (42) from the Helan Mts. population and three first-time collected samples from Yunnan Province (Figure 1a). Along with the available sequences obtained from the previous studies, this study has three primary aims: The first is to discuss the genetic differentiation and diversity among the different populations of blue sheep, the second is to evaluate the level of genetic variation, species or subspecies within the geographically distinct population of blue sheep based on additional samples, and the third is to infer the phylogeography and diversification of geographically distinct populations in line with molecular dating data and historical biogeography results.

| Ethics statement
We only used wild blue sheep which died a natural death in this study.
We collected these samples from the remains ourselves and it is confirmed that they had sacrificed in the field. All samples were obtained following the regulations for the implementation of China on the protection of terrestrial wild animals (State Council Decree [1992] No.

| Sample collection and DNA extraction
In this study, we expanded previous datasets by providing a large number (56) of new samples, including some areas (new six sampling locations) that had not been investigated previously (Table 1). We T A B L E 1 (Continued) laboratory. MtDNA was extracted from muscle and skull tissues according to the protocol detailed in the Genomic DNA Extraction Kit (Tiangen, China); skulls were first pulverized and soaked in EDTA for 72 hr.

| Amplification and sequencing
The entire Cyt b gene sequences were amplified using the primer pairs L14724 and H15915 (Zeng et al., 2008) and CYF and CYR (Tan et al., 2012), while the primers D-LOOP_L2 and D-LOOP_H2 were used to amplify the 5′ section of the mitochondrial control region (Wang, Cao, Liu, & Fang, 2006), which includes 554 bp of the hypervariable segment (Eythorsdottir & Tapio, 2001). Amplification was performed in a total volume of 50 μl containing 50 mm KCl, 10 mm Tris-HCl, 1.5 mm

| Phylogenetic analyses
The obtained Cyt b and D-loop sequences were blasted using the software MEGA v.6.06 (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013) and aligned with the previous sequences from GenBank (Table 1) using the Clustal W algorithm (Higgins et al., 1994) with default parameters and no deletions, insertions, or stop codons present in this alignment. Haplotypes were detected using DnaSP v.5.10.01 software (Librado & Rozas, 2009), and duplicate sequences were not used for construction of phylogenetic trees. For the phylogenetic analyses, we chose maximum likelihood (ML) and Bayesian inference (BI) using the programs jModeltest software v.0.1.1 (Posada, 2008) to determine the appropriate model of molecular evolution in a likelihood ratio test framework based on the Akaike information criterion (Posada & Buckley, 2004). To generate robust trees, the sequences of goat (Capra hircus) and aoudad (Ammotragus lervia) were chosen as outgroups. ML analysis was performed with PhyML software v.3.0 (Guindon & Gascuel, 2003) allowing four substitution rate categories, and the final tree was computed from 1,000 bootstrap replicates. BI analysis was carried out using MrBayes software v.3.1.2 (Huelsenbeck & Ronquist, 2001), and the Markov chain Monte Carlo (MCMC) sampling approach was used to calculate the Bayesian posterior probabilities. The program was started with random trees and sampled every 100 generations for a total of 3,000,000 generations. Then, the first 25% of trees sampled were discarded as burn-in. To ensure searching for the stationary parameter of Markov chain, Tracer software v.1.4  was employed to verify that sampled values of log likelihood plotted against generation time reached a stable equilibrium. Trees in this analysis that biased the equilibrium were not used to calculate the posterior probabilities. The retained trees were then used to generate a consensus tree in combination with the Bayesian posterior probabilities.

| Molecular dating
Divergence dates were estimated using a Bayesian analysis of Cyt b sequences implemented using BEAST software v1.7.5 . The program BEAST is known as a unique program to simultaneously build phylogenetic trees and estimate di- (ii) The emergence of Bovinae tribes dates back to 11.5-19.7 Mya (Hill et al., 1985; also see Hassanin & Douzery, 2003; Node B in Figure 2).
(iii) The calibration of 5-7 Mya for the sheep-goat split was derived from the fossil record (Carroll, 1988;Savage & Russell, 1983; Node F in Figure 2). In BEAST, a Yule process speciation prior and an uncorrelated lognormal model of rate variation were employed in each analysis. In the MCMC analysis, following a discarded burn-in of 1,000,000 steps, the posterior samples were drawn at every 1,000 steps of a total of 10,000,000 steps. This analysis was repeated, and the samples from these two runs were combined. Convergence was assessed in Tracer v1.5 , and the effective sample sizes of parameters sampled from MCMC in our run were greater than 200. The means between the upper and lower bounds of the 95% highest posterior density interval for divergence times were calculated using the program TRACER. FigTree v1.4.0 (Rambaut, 2012) and TreeAnnotator v1.7.5  were used to assess the tree topologies.

| Biogeography
To infer the historical biogeography in this study, we performed a Bayesian binary MCMC (BBM) analysis implemented in Reconstruct Ancestral States in Phylogenetics v2.1 beta (RASP) to reconstruct geographic areas at nodes in the phylogeny (Yu, Harris, Blair, & He, 2015). This method calculates the frequencies of ancestral distributions, ranges, or other traits at each node and averages them over all sampled trees. We used the two Cyt b trees obtained from former phylogenetic analyses and geographic area codes. The BBM was run in parallel for 5 million generations with 10 chains under the Jukes-Cantor fixed-state frequencies model and among-site variation (Gamma) set to equal.

| Genetic diversity and mtDNA sequence characteristics
In addition to the marked phylogeographic structure described above, all the populations exhibited relatively high levels of haplotype and nucleotide diversity (Table 2). Among the 153 sampled blue sheep, we found 62 Cyt b haplotypes, with none shared by major clades. Nucleotide diversity was similar among these populations (Table 2)  and Sichuan populations, which simultaneously excluded the effect of background selection (Fu, 1997;Hull & Girman, 2005). We also observed relatively high levels of sequence divergence among the four previously established populations in the genus Pseudois (Table 3). The Hengduan Mts. and Qilian Mts. populations were attributed to the Sichuan population, leading to a low divergence between them.  The results (see Table 5) of biogeographic analysis based on Bayesian

| Molecular dating and historical biogeography
Binary MCMC showed only the most likely states at each node in Figure 3b, which were used in the subsequent discussion.

| DISCUSSION
Only two mitochondrial genes were previously used to study the molecular phylogeny of the genus Pseudois: Cyt b gene and mitochondrial control gene (Li, Liu, Wang, & Huang, 2012;Tan et al., 2012;Zeng et al., 2008).  T A B L E 3 Genetic distances within and between different populations included in this study F I G U R E 2 Divergence time estimates of family Bovidae and blue sheep based on Cyt b gene (Table 4). Numbers above branches refer to the posterior probabilities (PP) and letters are node names as in Table1 Genetic distance is usually used as a standard for species distinctiveness based on the already recognized concepts of major species, such as the evolutionary and biological concepts of species (Mayr, 1963;Wiley, 1978). The main criterion of some phylogenetic species concepts was identified as the evidence of genetic isolation among different monophyletic clades (Mishler & Theriot, 2000). The results of the comparisons of the genetic distances among different monophyletic clades were in congruence with the Cyt b phylogenetic tree topology. The mean sequence divergence among monophyletic clades in the Cyt b phylogenetic tree was 3%-4% (Table 3), while the results were much lower than the mean value (6.3%-12.6%) of interspecific divergence in Bovidae (Schreiber, Seibold, Nötzold, & Wink, 1999).
Meanwhile, these means fell in the range of Cyt b divergence values among subspecies in other related species of Artiodactyla, Capra ibex (4.258%-7.814%), Moschus berezovskii (2.18%), and Sylvicapra grimmia (2.8%) (Li et al., 2012). Thus, these results strengthened the reliability of our identified subspecies in phylogenetic trees, especially for the two suspected putative subspecies. As for the identified subspecies in genus Pseudois, P. n. nayaur was the earliest-diverging lineage, which corresponds to the inference that blue sheep originated from the Tibetan Plateau and then dispersed northeastwardly (Schaller, 1998).
Tibetan Plateau lies between the Himalayan Mts. to the south and the western Kunlun Mts. to the north (Figure 1b), with an average altitude of approximately 4,500-5,500 m. The plateau surface is complete with many interior river systems, lakes, and a wide meadow, forming an alpine tundra environment and a unique terrain that reduces species diversity. Based on limited data, the haplotype diversity of this population is much lower than the others. The significantly positive Fu's Fs and Tajima's D suggested that the Tibet population underwent balancing selection differently from the other populations (Table 2).
As this population lived at such high elevations, they must have experience certain extreme environmental conditions, including hypoxia, high levels of UV radiation, and dramatic daily changes in temperature (Sun et al., 2015). In the present study, we collected wool samples  (Bunch, Nadler, & Simmons, 1978), one from Qinghai Province (Li, 1999), and a dwarf blue sheep from Sichuan Province (Bunch, Wang, Zhang, Liu, & Lin, 2000) have been proven to share the same chromosome number of 2n = 54, whereas a blue sheep from Gansu Province has been uniquely found to have a different chromosome number, 2n = 56 (Bunch et al., 2000). These results support our molecular phylogenetic analysis, which showed that there is a significant genetic divergence between the two different populations of the Sichuan subspecies. One possible explanation for these results is the difference in the environments of their habitats. The Hengduan Mts. region is topographically complex: The elevation of the valley extends higher than 1,000−2,500 m, and the weather is typical of a monsoon climate.  (Table 4). Through the long-term gene flow and ecologically adaptive behavioral changes, this isolated population has developed a dramatic divergence from the other populations. Moreover, the data accumulated from these individuals presented differences in behavior, ecology, and morphology (Wu, 2003;Yang, 2001 Blue sheep originated from the Tibetan Plateau and then dispersed northeastwardly (Schaller, 1998 Zeng et al., 2000). As illustrated in Figure 1b, the south edge of the Tibetan Plateau was constituted by the Himalaya Mts., which formed an insuperable obstacle for blue sheep. For this reason, we speculated that this species migrated to the peripheral mountains of Tibetan Plateau, which have been reported to be zones of high seismic activity. Later, some geographically isolated populations, such as the Pamir Plateau and Helan Mts. populations, gradually formed a new subspecies.
One concern with regard to this species is its distribution range.
A recent survey on large mammals in Central Asia discovered the existence of a blue sheep population in the Altai Mts., outside the recognized distribution of this genus. In our present study, an exception involving the Subei County (SB2) haplotype formed a group sister to the lineage of other individuals from the Qilian Mts. and the Hengduan Mts., while a haplotype from the same location (SB1) clustered with the individuals from the Qilian Mts. population. The K2P distance between SB1 and SB2 reached 2.2% (Table 3), which was a relatively high value for two haplotypes from the same location. This result was probably an artifact of our small set of genes available for these populations. Subei County is an interesting place composed of two separate areas (Figure 1c) with major differences in terrain and environment. Figure 1c shows that the southern part of Subei County was located in the northern Qilian Mts., while the northern part lies in the eastern Altai Mts. In this context, we were able to deduce that these two haplotypes might be collected from different geographic  (Gu & Gao, 1989;Luo & Gu, 1991). We would like to raise doubt about the recognized distribution range of blue sheep and classification of the genus Pseudois. Because these areas are famous as "no man's land" and because it is difficult to obtain samples there, these populations of blue sheep are still largely unknown. Consequently, we highly recommend that more attention should be paid to these distant regions to maintain genetic diversity and protect this rare species. Geographic area codes are as follows: A = Helan Mts., B = Hengduan Mts., C = Qilian Mts., D = Tibet Plateau, E = Pamir Plateau, F = Maduo County (three samples named MD in Figure 1a). Nodes are shown in Figure 2a.
T A B L E 5 Biogeographic analysis using Bayesian binary MCMC, showing only the most likely states at each node