Genetic diversity and phylogeography of Siberian roe deer, Capreolus pygargus, in central and peripheral populations

Abstract Current understanding of the phylogeographical structure and genetic diversity of Siberian roe deer remains limited, mainly because of small sample sizes and/or limited geographical coverage in previous studies. Published data suggest at least two phylogroups: western (Ural Mountains and Western Siberia) and eastern (east of Lake Baikal, including the Korean Peninsula), but their phylogenetic relationship remains unclear. Combined sequences of cytochrome b (1140 bp) and the mtDNA control region (963 bp) were analyzed from 219 Siberian roe deer from 12 locations in Russia, Mongolia, and South Korea, which cover a large part of the species' range, to assess genetic diversity and phylogeographical status. Special emphasis was placed on the demographic history and genetic features of central, peripheral, and isolated populations. Results of median-joining network and phylogenetic tree analyses indicate that Siberian roe deer from the Urals to the Pacific Ocean are genetically diverse and that the geographical distribution and composition of haplogroups coincide with previously described ranges of the subspecies Capreolus pygargus pygargus and Capreolus pygargus tianschanicus. We found that peripheral populations in the northwestern part of the species range (Urals), as well as the isolated population from Jeju Island, are genetically distinct from those in the core part of the range, both in terms of genetic diversity and quantitative composition of haplogroups. We also found that northwestern (Urals) and northern (Yakutia) peripheral populations share the same haplogroup and fall into the same phylogenetic clade as the isolated population from Jeju Island. This finding sheds light on the taxonomic status of the Jeju Island population and leads to hypotheses about the discordance of morphological and genetic evolution in isolated populations and specific genetic features of peripheral populations.


| INTRODUCTION
The roe deer (Capreolus, Gray 1821) is one of the most widespread artiodactyl genera. It includes two species: the European roe deer (Capreolus capreolus) and the Siberian roe deer (Capreolus pygargus; Fig. 1). The Siberian roe deer is widely distributed in continental Asia and parts of Eastern Europe, from the Khoper River and Don River bend to the Ural Mountains and across southern Siberia. It is found through northern Mongolia and east to the coasts of the East Sea and the Yellow Sea, including the Korean Peninsula (Danilkin, 1999).
It ranges geographically from the West Siberian Plain south to Lake Balkhash, and east from there well into Kazakhstan without reaching the Aral Sea. It also ranges from Manchuria through Northern and Central China to the western half of the left bank of the Yangtze River and into eastern Tibet (Bannikov, 1954; Sokolov, Danilkin, & Dulamtseren, 1982; Dulamtseren, Tsendjav, & Avirmed, 1989; cited by Danilkin, 1999). Records from further south as far as northeastern Myanmar require confirmation. It also occurs on Jeju Island in South Korea.
Data on the genetics of Siberian roe deer are scarce compared to those on the European roe deer. Randi, Pierpaoli, and Danilkin (1998) concluded that Siberian roe deer can be divided into two major clusters, namely the eastern cluster from the Amur region of the Russian Far East and the western cluster from the Kurgan region of Western Siberia. Molecular genetic studies on the taxonomic status of the Siberian roe deer from Jeju, South Korea, and on the genetic structure of populations from Northern Eurasia have been reported (Koh & Randi, 2001; Koh et al., 2013; Tokarskaia et al., 2000). Based on RAPD markers, Petrosian, Tokarskaia, Danilkin, and Ryskov (2002) confirmed the appropriateness of assigning the western and eastern groups to the subspecies C. p. pygargus and C. p. tianschanicus, respectively. On the other hand, Xiao, Zhang, Fu, and Koh (2007) argued that the Siberian roe deer found in Northeastern China belongs to the subspecies C. p. manchuricus, based on morphological differences from the other subspecies. Sheremetyeva et al. (2010) presented a complex phylogenetic structure of roe deer populations in the Russian Far East based on genetic analysis of a short fragment of the mtDNA control region.
Based on a combined alignment of the control region and cytochrome b, Zvychainaya, Danilkin, Kholodova, Sipko, and Berber (2011) found three haplogroups among 79 Siberian roe deer sampled from 23 regions of Asia, including Russia and Kazakhstan. Individuals from the Russian Far East, northeastern Russia, and Trans-Baikalia formed a single haplogroup, whereas specimens from the Urals, and Western and Central Siberia shared two distinct haplogroups. Lorenzini, Garofalo, Qin, Voloshina, and Lovari (2014) suggested that three haplogroups of Siberian roe deer are distributed throughout the entire range of this species, including Western Russia, Kyrgyzstan, Northeastern China, Central-eastern China, and Eastern Russia, but found no geographical structuring of the species lineages.
Most of the above-mentioned studies are based on relatively small sample sizes (but see Xiao et al., 2007), and this could be one reason for the uncertainty in the phylogeographical patterns reported (particularly by Zvychainaya et al., 2011). Together, published data suggest the existence of at least two or three phylogroups, but their phylogenetic relationships remain unclear, particularly in Central Siberia where the geographical ranges of C. p. pygargus and C. p. tianschanicus may overlap (Sheremetyeva et al., 2010).
The picture is especially complex for peripheral populations of Siberian roe deer. For example, Zvychainaya et al. (2011) reported that roe deer from the Urals and Trans-Urals regions (Sverdlovsk and Kurgan regions, close to the western periphery of the species' geographical range) are represented by two haplogroups, each occupying a distal position on the phylogenetic tree. Likewise, recent data on the genetic features of roe deer from Yakutia, at the northern periphery of the species' geographical range, put them into the Far Eastern clade (Zvychainaya et al., 2011). Thus, the phylogeographical structure of the Siberian roe deer remains ambiguous, and many authors emphasize the need for extensive studies of the species in a number of regions.
In this study, we report original data from the Urals, Western and Central Siberia, the Russian Far East and Korea, including both mainland Korea and the isolated population on Jeju Island, covering most of the geographical range of Siberian roe deer. We used a sufficient number of samples (at least 20 from most regions) to provide reliable estimates of genetic diversity and analyzed phylogeographical patterns of the Siberian roe deer across Northern Asia, with special interest in genetic differentiation among central, peripheral, and isolated populations. In particular, we tested the hypothesis suggested by Lorenzini et al. (2014) that there is a lack of geographical structuring of genetic lineages in Siberian roe deer. We also address the question of the taxonomic status of the Siberian roe deer on Jeju Island, based on data collected from across the species' entire geographical range.

FIGURE 1 A male Siberian roe deer (Capreolus pygargus) in the Urals, Russia (Sverdlovskaya oblast'). Source: Alexey Gurin

| Sample collection and DNA analysis
We obtained tissue, blood and skin samples from 219 individuals of C. pygargus from 12 locations in Russia, Mongolia, and South Korea [...] and H15915 (5′-AAC TGC AGT CAT CTC CGG TTT ACA AGA C-3′) (Kocher et al., 1989). PCR conditions were as follows: 94°C for 4 min; 35 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 1 min; and finally 72°C for 5 min. A 923-bp fragment of the mtDNA control region was amplified using primers L15775 (5′-ACA TGA ATT GGA GGA CAA CCA GT-3′) (Irwin, Kocher, & Wilson, 1991) and H651 (5′-AAG GCT AGG ACC AAA CCT-3′) (Kocher et al., 1989). PCR conditions were as follows: 94°C for 5 min; 35 cycles of 94°C for 45 s, 55°C for 45 s, and 72°C for 1 min 30 s; and finally 72°C for 5 min. Amplification was carried out in a 20 μl reaction volume containing 10-100 ng template DNA, 100 μmol/L of each dNTP, 10 pmol of each primer, 1.5 mmol/L MgCl2, 1 unit i-star Taq™ DNA polymerase (iNtRON Biotechnology Inc, Korea), and 1 × PCR buffer. PCR products were purified using the Zymoclean™ Gel DNA Recovery Kit (ZYMO RESEARCH, USA). Purified PCR products were sequenced using an ABI Prism™ 377 automated sequencer (Applied Biosystems Inc, USA). The sequencing primers for both mtDNA regions were the same as those used for amplification, except that for the control region a supplementary inner primer was also used: L-362 (5′-AAT CAC CAT GCC GCG TGA AAC C-3′) (Douzery & Randi, 1997).
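As a quick check on the two cycling programs described above, the following minimal sketch adds up the programmed hold times. The function name `pcr_runtime_seconds` is hypothetical, and the totals ignore ramping between temperatures, which depends on the thermocycler and is not stated in the text.

```python
def pcr_runtime_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times: initial denaturation,
    repeated cycle steps, and final extension (ramp times ignored)."""
    return initial_s + cycles * sum(step_seconds) + final_s


# First program in the text: 94°C 4 min; 35 x (30 s, 30 s, 1 min); 72°C 5 min
first_program = pcr_runtime_seconds(4 * 60, 35, (30, 30, 60), 5 * 60)

# Control region program: 94°C 5 min; 35 x (45 s, 45 s, 1 min 30 s); 72°C 5 min
control_region = pcr_runtime_seconds(5 * 60, 35, (45, 45, 90), 5 * 60)

print(first_program / 60, control_region / 60)  # 79.0 115.0 (minutes)
```

Under these assumptions the longer denaturation, annealing, and extension steps of the control region program add roughly 36 minutes of hold time.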

| Data analysis
The sequences derived in this study were identified as being from Capreolus species through BLAST searches (Altschul et al., 1997).
Haplotype diversity (h) and nucleotide diversity (π) for each geographical sample were estimated with DNASP version 5.1 (Librado & Rozas, 2009). ARLEQUIN 3.1 (Excoffier, Laval, & Schneider, 2005) was used to calculate mismatch distributions and pairwise FST values to compare genetic differentiation among geographical regions. Mismatch distributions were analyzed using the sudden expansion model (Rogers & Harpending, 1992), and goodness-of-fit tests of the observed to the estimated mismatch distributions were computed. The possible occurrence of historical demographic expansions was examined with Tajima's D (Tajima, 1989) and Fu's Fs (Fu, 1997) neutrality tests using ARLEQUIN. Fu's Fs is sensitive to demographic expansion, which usually leads to large negative values (Fu, 1997).
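The two diversity statistics named above can be illustrated with a minimal sketch. It uses the standard definitions (h as the unbiased probability that two sampled sequences differ, π as the mean number of pairwise differences per site) and is not a reimplementation of DNASP. It assumes aligned sequences of equal length with no gaps or ambiguous bases, and the toy sequences are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """pi = mean number of pairwise differences, divided by sequence length."""
    n = len(seqs)
    length = len(seqs[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Toy alignment (hypothetical data): three haplotypes among four sequences
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

A sample can combine high h with low π when many haplotypes differ from one another by only one or two substitutions.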
Phylogenetic relationships between geographical samples were estimated by the median-joining network procedure using the program Network version 4 (http://www.fluxus-engineering.com/). Network analysis effectively portrays the relationships among sequences and allows inference of haplotype genealogies at the population level because networks explicitly allow for extant ancestral sequences and alternative connections (Bandelt, Forster, & Rohl, 1999).

FIGURE 2 Geographical distribution of the haplogroups revealed by Bayesian analysis. The proportion of color in each circle indicates the proportion of haplogroups from the respective phylogenetic clusters (A, B, C) and all the rest (D). Circle size is proportional to the number of samples from each region. See Table 1 for regional abbreviations
Phylogenetic trees to investigate evolutionary relationships were constructed using four methods: neighbor joining (NJ; Saitou & Nei, 1987) using Kimura's two-parameter distances (Kimura, 1980), maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI). We used the combined sequences (2,071 bp) without tandem repeats because phylogenetic trees were very similar among separate analyses of cytochrome b and the control region (data not shown). European roe deer (Capreolus capreolus) was used as the outgroup for phylogenetic tree construction. The NJ, MP, and ML trees were constructed using MEGA 5.05 (Tamura et al., 2011). The MP tree was obtained using Close-Neighbor-Interchange with random sequence addition and 10,000 bootstrap replicates.
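For readers unfamiliar with the Kimura two-parameter distance used for the NJ tree, a minimal sketch follows. It applies the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name and the rule of skipping sites with non-ACGT characters are assumptions for illustration, not a description of MEGA's internals.

```python
import math

# Purine-purine and pyrimidine-pyrimidine changes are transitions
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    # Assumption: sites with gaps or ambiguous bases are skipped pairwise
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    n = len(pairs)
    p = sum((a, b) in TRANSITIONS for a, b in pairs) / n  # transitions
    q = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs) / n  # transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition in ten sites: the corrected distance exceeds the raw p-distance of 0.1
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

The correction grows with divergence because multiple substitutions at the same site become more likely and are not visible as raw differences.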
The most appropriate models of sequence evolution for ML and Bayesian trees were selected with JMODELTEST 2.1.4 (Posada, 2008).
The best-fit model for the ML tree was the general time reversible model (GTR) with gamma-distributed rate variation (+G) and a proportion of invariant sites (+I). Consensus ML trees were obtained by Nearest-Neighbor-Interchange heuristic searches with 1,000 bootstrap replicates.

Bayesian inference and Bayesian posterior probabilities (BPPs) were estimated using MRBAYES v 3.2.2 (John & Fredrik, 2001). The Hasegawa-Kishino-Yano model (HKY) + G + I was selected as the best-fit model for the BI tree. Two Markov chains were run for 2,000,000 generations, and the tree was sampled every 100 generations with a burn-in of the first 500 data points. Nodes with bootstrap values (BS) >50% were regarded as sufficiently resolved (Hillis & Bull, 1993). Nodes with BPP >95% were considered statistically significant (Leaché & Reeder, 2002).
Divergence time (T) between mtDNA lineages was estimated among clades in the Bayesian tree. T was calculated as T = K/(2r) (Li, 1997), where K is the sequence divergence (substitutions/site), derived from the mean P-distance between groups computed with MEGA 5.2 (Tamura et al., 2011), and r is the average mutation rate of mtDNA (12.6 ± 3.2) as proposed by Pesole, Gissi, De Chirico, and Saccone (1999).
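The divergence-time formula above can be written as a short worked sketch. The factor of 2 reflects that substitutions accumulate independently along both lineages after the split. The numerical values below are hypothetical and chosen only to show the arithmetic; in particular, the units of r are an assumption (substitutions per site per year), because the text gives the rate without units.

```python
def divergence_time(k, r):
    """T = K / (2r): K in substitutions/site, r in substitutions/site/year
    (assumed units), giving T in years."""
    if r <= 0:
        raise ValueError("mutation rate r must be positive")
    return k / (2 * r)


# Hypothetical inputs, not values from this study:
# 2% sequence divergence and a rate of 1e-8 substitutions/site/year
t_years = divergence_time(0.02, 1e-8)  # about one million years
```

Because T is inversely proportional to r, the uncertainty reported for the rate carries over directly into the estimated divergence time.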

| Mitochondrial DNA diversity and genetic divergence
The combined alignment of the mitochondrial control region (923 bp [...]

| Phylogenetic analysis of mitochondrial haplotype
Phylogenetic trees using NJ, MP, ML, and Bayesian approaches generated similar patterns of major branches, and therefore, only the Bayesian tree is presented. The Bayesian tree revealed three major haplogroups with strong posterior probability values (>.99) and an additional 10 minor clusters and singletons (Fig. 3). These minor clusters and singletons mainly originated from eastern populations (SKM, RPRA, and RSMG) that do not differ from one another according to pairwise FST values. Thus, we regarded these clusters as "all the rest" rather than separating small clusters, and they are designated as "group D" in further analyses. Geographical analysis of haplogroup distribution [...]

TABLE 2 Estimates of genetic diversity of regional Siberian roe deer populations

TABLE 3 Pairwise estimates of genetic differentiation between Siberian roe deer populations

The median-joining network was star-shaped (Fig. 4) [...] (Table 6).
Although Fu's Fs for SKM and SKJ did not differ significantly, the one-sided bell shape of the mismatch distribution suggests more recent population growth than in other eastern populations (RPRA and RSMG) (Fig. 5).
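The mismatch distribution referred to above is the frequency distribution of the number of nucleotide differences between all pairs of sequences in a sample. The following minimal sketch computes the observed distribution only. It does not fit the sudden expansion model, which ARLEQUIN does, and the toy sequences are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Return {number of differences: proportion of sequence pairs}."""
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    # Include empty classes so the shape of the distribution is visible
    return {k: counts[k] / n_pairs for k in range(max(counts) + 1)}


# Toy alignment (hypothetical data)
observed = mismatch_distribution(["ACGT", "ACGT", "ACGA", "TCGA"])
```

Under the sudden expansion model of Rogers and Harpending (1992), a population that has expanded is expected to show a smooth, unimodal distribution, and the position of the peak shifts toward larger differences as more time passes since the expansion.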

| Genetic diversity and demographic history
In this study, we investigated and compared genetic parameters for populations from most parts of the geographical range of [...] Danilkin, 2013 [...] exhibited the highest levels of genetic diversity in most diversity estimates, except combined nucleotide diversity. This result may be attributable to mixed samples from historically split populations or to stable populations with large long-term effective population sizes (Grant & Bowen, 1998). The high nucleotide diversity of Siberian roe deer at the northern periphery of its geographical range in Yakutia (RYA) may reflect the former scenario, as the population there has been reported to be a mixture of two subspecies, C. p. pygargus and C. p. tianschanicus (Argunov, 2013; Boeskorov & Danilkin, 1998). Co-occurrence of two subspecies could result in high genetic diversity in a mixed sample.

TABLE 4 Distribution of haplotypes in each region among clusters (i.e., haplogroups) revealed by Bayesian tree analysis (Fig. 3). See Table 1 for regional abbreviations
The moderate level of genetic diversity in the western (RUKO) part of the range may reflect relatively stable populations. This suggestion is supported by the mismatch distribution and neutrality tests. The relatively low genetic diversity in the sample from Western Siberia and Western Altay regions (RARN) could be explained by the small sample size. Siberian roe deer from Jeju Island (SKJ) showed the lowest level of genetic diversity in this study, which is also low compared to other species. This is presumably due to geographical isolation (Jo, Kim, Choi, & Oh, 2012) and a small founder population originally colonizing Jeju Island. The mainland Korea (SKM) population showed relatively high haplotype diversity and low nucleotide diversity among the populations sampled. This may be attributable to a rapid population expansion after a period of low effective population size caused by a demographic bottleneck (Grant & Bowen, 1998).
Support for a population expansion is also indicated by the mismatch distribution, and by the Tajima's D and Fu's Fs values. [...] the two populations. According to Zvychainaya et al. (2011), there are two main haplogroups found in the area from the Urals to Baikal, and these groups probably correspond to haplogroups B and C in our study. The single haplogroup detected from the area of Lake Baikal to the Pacific Ocean in that study may correspond to haplogroup A in our study. As in our results, "eastern" and "western" haplogroups were found together in the Lake Baikal region. However, Zvychainaya et al. (2011) showed that the western part of the species range (the Urals and Western Siberia) harbored a haplogroup genetically unrelated to those of the Russian Far East and Yakutia. In contrast, our study demonstrated that three haplogroups and group D are found in central and eastern populations, and subsets of these haplogroups (B and C) are found in western populations. The geographical distribution of haplogroups coincides with the ranges of the subspecies C. p. pygargus (to the west of Lake Baikal) and C. p. tianschanicus (to the east of Lake Baikal), which were previously described based on morphological and cytogenetic traits (Danilkin, 1999; Groves & Grubb, 2011). [...] Meles meles, where one of the haplogroups was found in the northern and eastern part of the species range, Ireland, but not in nearby Britain (Frantz et al., 2014). Haplogroup A, which is common in the eastern and southeastern parts of the species range, was also found in Yakutia, Russia (RYA). Such a genetic composition of Yakutia populations could result from several "waves" of species expansion into these territories.

| Distribution scenarios of Siberian roe deer
There are several possible scenarios that could explain the observed distribution of genetic lineages of Siberian roe deer. One possible scenario is that all the analyzed samples originated from the same ancestral group, which was preserved in refugia during the periods of Central Asia followed by periods of glaciation (Matjushkin, 1982). In

| Taxonomic status of Siberian roe deer on Jeju Island
Our results raise questions about the taxonomic status of the Siberian roe deer inhabiting Jeju Island, a population that comprised only one haplogroup (cluster B). The genetic distinction (pairwise FST and haplotype distribution) of the Jeju Island population from all other populations does not allow classification of the Jeju roe deer as C. p. tianschanicus (Koh, Yang, Yoo, & Chun, 2000), nor as a distinct subspecies as suggested by Koh and Randi (2001) and Park et al. (2014). Siberian roe deer from Jeju Island are indeed different from those of mainland Korea (Lee et al., 2015), but they do not appear to represent a distinct phylogenetic clade, because they share the main haplogroup of western populations. On the other hand, roe deer on Jeju Island are much smaller than those inhabiting the western part of the range. Total body length and height at the shoulder are roughly 1.5 times smaller in Jeju roe deer (144 vs. 96 cm and 92 vs. 57.5 cm, respectively) (Danilkin, 1999; Park, Lee, Kim, & Oh, 2011). Genetic similarity combined with obvious morphological differences provides an example of discordance between genetic and morphological evolution in mammals. The lack of correlation between genetic and morphological traits is clearly related to the type of molecular marker employed in this study, mitochondrial DNA, because comparison of populations based on nuclear markers such as microsatellites revealed clear differences between the Jeju population and roe deer from the western and eastern parts of the geographical range (Lee et al., 2015). Therefore, additional and more comprehensive studies will be necessary to clarify the taxonomic status of roe deer on Jeju Island, Korea.

| CONCLUSION
Our data show that roe deer in the area from the Urals to the Pacific Ocean are genetically diverse and that the geographical distribution and composition of haplogroups support previously described ranges of the subspecies C. p. pygargus and C. p. tianschanicus. We

ACKNOWLEDGMENTS
We gratefully acknowledge Dr. Thomas W. Sappington for his valuable comments and revision of this manuscript. We greatly appreciate Mr. Alexey Gurin for providing us with pictures of roe deer. We would like to express our extreme gratitude to Mr.