The First Mitogenome of the Cyprus Mouflon (Ovis gmelini ophion): New Insights into the Phylogeny of the Genus Ovis

Sheep are thought to have been one of the first livestock to be domesticated in the Near East, thus playing an important role in human history. The current whole mitochondrial genome phylogeny for the genus Ovis is based on: the five main domestic haplogroups occurring among sheep (O. aries), along with molecular data from two wild European mouflons, three urials, and one argali. With the aim to shed some further light on the phylogenetic relationship within this genus, the first complete mitochondrial genome sequence of a Cypriot mouflon (O. gmelini ophion) is here reported. Phylogenetic analyses were performed using a dataset of whole Ovis mitogenomes as well as D-loop sequences. The concatenated sequence of 28 mitochondrial genes of one Cypriot mouflon, and the D-loop sequence of three Cypriot mouflons were compared to sequences obtained from samples representatives of the five domestic sheep haplogroups along with samples of the extant wild and feral sheep. The sample included also individuals from the Mediterranean islands of Sardinia and Corsica hosting remnants of the first wave of domestication that likely went then back to feral life. The divergence time between branches in the phylogenetic tree has been calculated using seven different calibration points by means of Bayesian and Maximum Likelihood inferences. Results suggest that urial (O. vignei) and argali (O. ammon) diverged from domestic sheep about 0.89 and 1.11 million years ago (MYA), respectively; and dates the earliest radiation of domestic sheep common ancestor at around 0.3 MYA. Additionally, our data suggest that the rise of the modern sheep haplogroups happened in the span of time between six and 32 thousand years ago (KYA). A close phylogenetic relationship between the Cypriot and the Anatolian mouflon carrying the X haplotype was detected. The genetic distance between this group and the other ovine haplogroups supports the hypothesis that it may be a new haplogroup never described before. Furthermore, the updated phylogenetic tree presented in this study determines a finer classification of ovine species and may help to classify more accurately new mitogenomes within the established haplogroups so far identified.


Introduction
The knowledge of species origin has always been an essential element for informed genetic diversity conservation. In particular, the evolution of domestic species has been intensively investigated during the last few decades. According to Vigne [1], archaeozoological findings suggest that the earliest detected domestications occurred in the Near East during the 11 th millennium before present (BP). In such a context, sheep and goats were the first livestock to be domesticated near the region known as Fertile Crescent [2]. In this area, the domestication of the Asian mouflon (Ovis gmelini) probably gave rise to the domestic sheep (O. aries) [1,[3][4][5][6][7], although the contribution of other wild species such as urial (O. vignei), argali (O. ammon) and European mouflon (O. a. musimon) has been suggested [8][9][10].
To date, five sheep haplogroups (HPGs) have been identified, including the last discovered HPG E [11]. The latter is the rarest haplogroup together with HPG D, whereas HPG C is the third more widespread, with samples retrieved in Asia, Fertile Crescent, Caucasus and the Iberian Peninsula. HPGs A and B are the most common ovine haplogroups; the first being more recurrent in Middle East, Asia, and Europe, while HPG B in European sheep [11]. Furthermore, molecular investigations based on the analysis of a partial sequence of the mitochondrial Dloop region [12] supported the occurrence of a new ovine haplotype, never observed before among domestic sheep; this was detected in Anatolian mouflons whose gene pool is composed of two different mtDNA haplotypes, one belonging to HPG A and one closely related to HPGs C and E, named by the authors as haplotype X.
Analyses based on comparisons of the whole mtDNA sequence, which significantly improve the resolution power of phylogenetic analyses if compared with single genes or small DNA fragments, clearly confirmed that neither urial nor argali sheep are the maternal ancestor of the domestic sheep [13]. In addition, the European mouflon should not be considered a truly wild sheep; instead, it more likely represents a remnant from early domestication events which has readapted to feral life [14][15][16].
Currently, the revised taxonomy of O. gmelini [20] differentiates the taxa into several "subspecies". The Armenian mouflon (O. g. gmelinii) from western Iran and easternmost Turkey, is considered the most probable ancestor of domestic sheep along with the Anatolian mouflon (O. g. anatolica) endemic to central Anatolia. A third subspecies has been assigned to the Cyprus mouflon (O. g. ophion), also known as agrino, a wild sheep found exclusively in the Mediterranean island of Cyprus. The agrino is the only wild representative of the Caprinae subfamily on the island, and the largest animal of the local wild fauna [21]. The mouflon population in Cyprus currently counts~3,000 individuals living in the mountainous area of the Paphos forest, a region of 620,000 hectares located in the North West of the island and classified as a Special Protection Area since 2005.
The IUCN [22] included the Cypriot mouflon in the list of species considered as "vulnerable" due to poaching, habitat loss and fragmentation caused by the presence of road networks, and infection by pathogens (see [23] for details). To make matters worse, climate models predict that climate change will lead to higher temperatures and less rainfall in the Eastern Mediterranean with deleterious impact on the agricultural system and animal health [24].
The presence of mouflon in Cyprus is believed to date back to around 10,000 years ago (YA) [2]. Recent paleontological findings [25] indicated that the first sheep (O. aries) were introduced in Cyprus during the first half of the 10 th millennium BP, then replaced 500 years later by bigger sheep presumably introduced in the island from Syria. Based on the shape and size of the horns of these prehistoric animals, these sheep were similar to the wild sheep of the nearby mainland [25][26].
To date, only small portions of the mtDNA and nuclear DNA of the Cyprus mouflon are available. These segments include the complete sequence of the mitochondrial Cytochrome b gene (Cyt B) (GB#FR873149) [23], and the entire sequence of the alpha 1 (GB#EU938070.1), alpha 2 (GB#EU938071) [27] and beta (GB#DQ352469) [28] nuclear globin genes. Based on partial Cyt B sequence analyses, O. g. ophion clustered with the ovine HPGs E and C, whereas its D-loop belonged to HPG B [12,29].
The aim of this study was to shed some light on the evolutionary pathway that led to the emergence of current sheep haplogroups including as a new resource the Cyprus mouflon mitogenome. Mitochondrial DNA was used to infer evolutionary history and phylogeographic relationships among current species of the Ovis genus. In such a context, 27 mitogenome sequences were used to obtain a whole mtDNA genome-based phylogenetic tree of Ovis species. Such analysis was combined to a molecular dating to infer the evolutionary divergence timing among Ovis species providing a possible coalescence time for the rise of the current sheep haplogroups. To achieve a more comprehensive picture of the phylogenetic relationships among ovine haplogroups we analysed 35 additional D-loop sequences. The D-loop of three agrinos were sequenced and compared with the homologous region from representative samples of the five ovine haplogroups identified so far.
Animal species domestication has been carried out through the cross-breeding persistence with wild population [30], including Cyprus mouflon. This process started after the last glaciations when the increasing temperature determined abrupt environmental changes with the extinction of some species and the success of others [31]. The Cyprus mouflon would well represent the species which adapted to new environment. The present study, providing new information on the genetics of the Cyprus mouflon, would help to improve the efficiency of selection on breed traits linked to reproductive performance and to maintain good productive skills in arid environments.

Materials and Methods
The Ethics Committee of the University of Sassari, Italy, approved this study.
Peripheral blood samples were obtained from five Sardinian, three Cypriot and two Corsican mouflons and two Sardinian and two Chios sheep. The dataset was implemented downloading from Genbank 26 whole mitogenomes and 21 D-loop sequences (see Table 1 for details).
Furthermore, the data matrix included ten mtDNA whole genome sequences from sheep representing the five main mitochondrial haplogroups (A, B, C, D, E) (two sequences per haplogroup). The water buffalo [32], the Tibetan antelope [33], the Rocky Mountain goat [34], the Southern antelope, the white screwhorn antelope, the nilgai antelope, the blue wildebeest, the waterbuck, the red lechwe, the Siberian musk deer [35] and the cattle sequences were used as outgroups for the phylogenetic tree analysis (see Table 1 for scientific names and Genbank accession numbers). Genomic DNA was extracted using the GenElute blood genomic DNA kit (Sigma-Aldrich) according to the manufacturer's protocol. Sample quality and DNA concentration were determined via spectrophotometry using a ND-8000 (NanoDrop Technologies, Thermo Fisher Scientific Inc., Wilmington, DE). The DNA mean concentration obtained was 125 ng/μL.

Mitogenome amplification and sequencing
To perform amplification experiments, specific primers were designed for the most conserved regions belonging to all Ovis species mitogenomes available in databases.
A primer pair was selected when the average size of the amplicon was 1,100 base pairs (bp) long to allow sequencing reactions by means of the same primer. Each adjacent fragment had at least 100 bp of overlap to ensure complete sequencing coverage and the average annealing temperature was approximately 56°C.
A total of 21 primer pairs were selected with an average overlapping of 212 bp long fragments; the amplifications involved also two nested PCR reactions (see S1 Table for details).
PCR and sequencing reaction were performed according to the protocols provided by Pirastru et al. [27] and Manca et al. [28]. The annealing conditions for each primer are specified in S1 Table. The failure of the sequencing reaction for some amplicons, required the design of 5 additional sequencing primers to complete the whole mitogenome sequence (S1 Table).
Raw sequencing data were processed by Sequencing Analysis Software 5.3.1 (Applied Biosystem) and the quality value of each base in the electropherograms was assessed by the KB base-calling algorithm. Processed sequences were visualized using FinchTV 1.4.0 (Geospiza Inc.) and assembled into contigs, after identifying overlapping areas on Clustal X 2 [36]. Gene HPG: haplogroup; GB#: Genbank accession number. $ X for the Anatolian mouflon O. g. anatolica represents an haplotype whose haplogroup was not described by Demirci et al. [12].
arrangement was identified by comparing the sequences to that of O. aries (GB# NC_001941). Double peaks of similar height, which may be interpreted as evidence of mitochondrial pseudogenes in the nucleus (Numts) or heteroplasmy, were not observed in any of the electropherograms. Protein coding genes were detected by means of ORF Finder software (http://www.ncbi. nlm.nih.gov/gorf.html) by setting the type of mitochondrial code to the vertebrate genetic code. The tRNA genes were identified by the online program tRNA-scan SE (http://lowelab. ucsc.edu/tRNAscan-SE) using the mito/chloroplast genetic code and the default search mode.
In order to assess the number and the length of tandem repeats in the control region, which differentiate ovine haplogroups, the D-loop sequences were tested using Tandem Repeat Finder 3.01 [37]. The physical map of the Cyprus mouflon mitogenome was generated by means of OGDraw 1.2 [38]. The geographical distribution of sheep samples analyzed in the present study is shown in Fig 1.

Phylogenetic analysis of whole genome
In order to carry out phylogenetic analysis in accordance with a clock-like model we used the concatenated sequences of the 12 protein-coding genes, 14 transfer RNA genes and two ribosomal RNA genes, all located on the H-strand. Hereafter, we will refer to these mitochondrial segments as a unique molecular marker called 28H. The D-loop sequence was analysed as a single marker due to its higher nucleotide substitution frequency. Additionally, mutations are not randomly distributed across the length of the locus, making rate heterogeneity an important issue in calculating divergence date estimates [39][40][41].  Sequences were aligned using the programme CLUSTAL W [42], implemented in the BioEdit 7.0.5.2 software package [43]. The genetic variation within the genus Ovis was assessed estimating the number of polymorphic sites (S), number of haplotypes (H), haplotype diversity (h), and nucleotide diversity (π) using the software package DnaSP 5.10 [44]. MEGA 6.06 [45] was used to choose the nucleotide substitution model. According to the lowest BIC scores (Bayesian Information Criterion), AICc value (Akaike Information Criterion, corrected), Maximum Likelihood value (lnL), and the number of parameters, the general timereversible (GTR) [46] was selected for the three codon positions of the dataset. Non-uniformity of evolutionary rates among sites was modeled by using a discrete Gamma distribution (+G) with 6 rate categories and by assuming that a certain fraction of sites are evolutionarily invariable (+I).
Phylogenetic relationships among individuals were investigated by means of Bayesian Inference (BI) and Maximum Likelihood (ML) analysis using the 28H segment. MrBayes 3.2.4 [47] was used for BI. In order to search for the optimal evolutionary model combination for each gene, the 28H region was split up in 28 different segments corresponding to the mitochondrial genes that compose it. Two independent runs, each consisting of four Metropolis-coupled reversible-jump Markov chain Monte Carlo chains were performed. A Dirichlet distribution with quasi-flat priors (1, 2, 1, 1, 2, 1) was assumed for estimating the substitution frequency parameters. Analyses were carried out for 5 million generations and the trees were sampled every 10 generations. Convergence of chains was checked by ensuring that the standard deviation of split frequencies reached and stabilized at a value < 0.01 and by verifying the stationarity of the generations/log probability graph [48]. For each run, the first 25% sampled trees were discarded.
A ML tree reconstruction was performed using a GTR model by TREEPUZZLE 5.2 [49]. A priori tests for the detection of a phylogenetic signal were performed using the likelihood mapping option and the reliability of each branch was estimated by bootstrapping (10,000 puzzling steps). Molecular dating was carried out assuming seven different calibration points (CPs) based on fossil records that provide ages for nodes inside Bovidae [50] (see Table 2 for details).
Estimates of divergence times were obtained using the Bayesian approach implemented in BEAST 1.7.5 [51].
Sequences were analysed under the GTR+G+I model of sequence evolution, with 4 gamma categories, a Yule process speciation rate, and empirical base frequencies. We assumed a lognormal relaxed molecular clock with uncorrelated rates [51], setting the priors for multiple calibration points used in this study (Table 2) as described in Bibi et al. [50]. Four independent runs were carried out with the following settings: 20,000,000 steps, drawing data to file every 2,000 steps in order to obtain 10,000 records and trees. The output of two independent runs was analysed using Tracer 1.5 [52], applying a post-processing burn-in of 10% thus discarding the first 1,000 records for each run. Analysis of log files indicated for all parameters convergence of independent runs and the combined effective sample size (ESS) was >200. Subsequently, a maximum clade credibility (MCC) tree was created by (i) using a logcombiner [51] to merge tree files from each independent run after removing 10% of initial trees as burn-in and subsequently resampling of states to obtain a final sample of 9,000 trees, and (ii) using a tree annotator to create the consensus tree. Time to the most recent common ancestors (TMRCAs) was also estimated using two other methods based on the Maximum Likelihood framework. The first one is the RelTime method implemented in MEGA 6.06 and does not assume any specific lineage rate of evolution [45]. Divergence times for all branching points in the topology were calculated using Maximum Likelihood based on GTR+G+I model, and allowing for all the possible local clocks. 95% confidence intervals around each estimate were computed following the method of Tamura et al. [42]. With the second method, which is implemented in the APE R-package (R Core Team 2015) [53], TMRCAs was estimated using the Penalised Likelihood and Maximum Likelihood algorithm developed by Paradis et al. [54]. The algorithm is an improvement of the penalized likelihood method developed by Sanderson [55] and is available in the function Chronos. Divergence times were estimated assuming the strict model for substitution rates along the tree and setting the smoothing parameter to the default value.

Phylogenetic analysis of D-loop region
The phylogenetic analysis of the mtDNA control region was performed on a dataset including 35 Ovis sequences (21 from wild and 14 from domestic sheep), 14 of which were obtained in this study (see Table 1B for details). Sequences were aligned and the genetic variation was assessed as above reported.
The data matrix, including ten sheep D-loop sequences representing the five main mitochondrial haplogroups (two per each HPG), was used to carry out both a Bayesian and Maximum Likelihood analyses according to methodologies above reported.
Due to the larger number of sequences available for this marker, the presence of a genetic structure among haplogroups was also assayed by the Bayesian model-based clustering algorithm implemented in BAPS 5.3 [56]. Clustering was performed using the module for linked molecular data and applying the codon linkage model, which is appropriate for sequencing data. The analysis was run ten times with a vector of K values = 2 to 22, each with six replicates. Haplotypes were organized into haplogroups according to the genetic structure evidenced by Bayesian clustering.
A 95% statistical parsimony network analysis was performed using the software package TCS 1.21 [57], aimed at searching for possible disconnections between groups of individuals, further inferring the genetic relationships among the haplotypes. Gaps were treated as a fifth character state.
Statistical parsimony is an alternative method for network construction that joins haplotypes within a parsimony connection limit, the latter being the maximum number of differences not due to reversion between haplotypes for which a 95% confidence exists.
The software package Network 4.5.0.1 (www.fluxus-engineering.com) was used to construct a median-joining network [58] to be superimposed on the sample map in order to infer the genetic relationships among the haplotypes and analyze the occurrence of discrete geographic genetic clusters. Transitions and transversions were equally weighted (default option).
The pairwise genetic distances corrected according to the Kimura two-parameter model (K2P) [59] were estimated between individuals by means of the software MEGA 6.06 [45] with 1,000 bootstrap replicates.

Results
We determined the complete nucleotide sequence of the mitogenome of the Cyprus mouflon (GB# KF312238) (Fig 2).
The  Table 3, together with the inferred start and stop codons as determined by comparison with the homologous gene sequences of domestic sheep.
The percentage composition in bases of the L-strand is 33.6% A, 25.8% C, 13.1% G, and 27.4% T, in accordance with those obtained for O. aries [60]. As commonly observed in mammal mitogenomes, ND6 and eight tRNA genes are encoded in the L-strand, and all protein-coding genes have the ATG start codon, except for ND2 and ND3 genes, that begin with ATA. Cyt B is the only gene that stops with AGA instead of the TAA codon which is incomplete in the ND2, ND4, COIII (T) and ND3 (TA) genes. The origin of the light strand replication (O L ) is located at the 5,164-5,195 nt region, and has the same sequence as the domestic sheep [60].
The mtDNA sequence of the Cyprus mouflon matches with those of other members of the Caprinae sub-family, and features like incomplete stop codons, overlapping coding regions, and different start codons lie within the range of mammalian mtDNA variation [60][61].

Whole genome (28H): phylogenetic analyses and molecular dating
The complete Cyprus mouflon mtDNA sequence was compared with the homologous sequences of the five sheep mtDNA haplogroups identified by Meadows et al. [13] (see Table 1 for details).
Overall, 27 sequences representative of bovid species were included into the dataset and the M. moschiferus sequence was used as outgroup (Fig 3). Among 15 individuals belonging to six species within the Ovis genus, 15 haplotypes, defined by 1,094 polymorphic sites (S), were found. Overall, high values of total mean haplotype and nucleotide diversity, were obtained (h = 1 and π = 0.016, respectively). A lower value of In the Gene column (L) indicates a gene encoded on the L-strand. * Incomplete stop signals.
nucleotide diversity (π = 0.006) was found among the individuals of the species O. aries. Estimates of genetic diversity for the whole genome are reported in Table 4.
The BI tree analysis identified two main clusters of domestic sheep (I and II in Fig 3) supported by high posterior probabilities (Prob 0.99). Cluster I comprised of the sequences of Cypriot mouflon and sheep HPGs C and E, the latter was included into a well-supported subcluster.
Cluster II included two well-supported main groups, one of them including HPG D, and the other both HPGs A and B together with the European mouflon, closely associated to HPG B. Among the other Ovis species, the most phylogenetically related to Cluster I and II was the urial.
The ML analysis (tree not shown) was consistent with BI, showing both the same topology and the same highly supported nodes at the main groups retrieved (for the corresponding bootstrap values, see Fig 3).
To quantify the divergence among haplogroups, pairwise genetic distances between groups were calculated under the K2P model (S2 Table).
The rate variation among sites was modeled with a gamma distribution (shape parameter = 0.05). The lowest level of genetic differentiation was observed between HPG C and E (0.003 ± 0.0005), closely followed by HPG A and B (0.005 ± 0.001). The genetic distance between O. g. ophion sequences and the other haplogroups range from 0.005 ± 0.001 (with HPG C and E) to 0.011 ± 0.001 (with HPG B and D).
The divergence times of splitting events within the genus Ovis, resulting from multiple point calibration (see Table 2 for further details), are reported in Table 5 and   Table 1.

D-loop region: features and phylogenetic analyses
A total of 14 complete D-loops from two Sardinian and two Chios sheep, two Corsican, three Cypriot and five Sardinian mouflons were here sequenced (GB# KR011770-82).
All samples from Sardinia, Corsica and Chios islands harboured a 1,179 bp long D-loop region, except for one indel occurring in a specimen of Corsican mouflon (1,180). Instead, the three Cypriot mouflons showed a 1,184 bp long D-loop segment, with four copies of a 76 bp long repeat motif, located within a fragment of 304 bp long ranging from the 15,654 nt to the 15,957 nt, referring to the whole mtDNA domestic sheep sequence (GB# NC_001941) [60]. Repeat units of 76 bp are typical of sheep belonging to the HPGs C and E [12].
As it was already described for O. aries [60], each repeat contains two octamer sequences of mirror symmetry (TTAATGTA, TACATTAA) which can form stable stem loops. The D-loop   Table 1 for details). Three Anatolian mouflon (O. g. anatolica) sequences, representative of each clade (haplogroup A and haplotype X) previously described for this species by Demirci et al. [12], were included into the dataset. The bighorn, the argali and the urial wild sheep sequences were used as outgroups.
Among 35 individuals belonging to seven species within the genus Ovis, 23 haplotypes, defined by 645 polymorphic sites (S), were found. Overall high values of total mean haplotype and nucleotide diversity, were obtained, h = 0.976 and π = 0.138, respectively. The Anatolian mouflon (O. g. anatolica) showed the lowest average value of haplotype diversity (h = 0.600), whereas the lowest nucleotide diversity values were found among individuals belonging to the two species O. a. musimon and O. g. ophion. Estimates of genetic diversity for the whole genome are reported in Table 4.
BI tree analysis identified two different maternal lineages in the radiation of domestic sheep (Fig 4).
Results were consistent with clusters (I and II) identified in the 28H BI and ML tree analysis (Fig 3). Accordingly, cluster I grouped HPGs C and E, while cluster II included HPGs A, B and D. In accordance with Demirci et al. [12], our results showed that cluster I exclusively grouped Within cluster I, two groups were observed. One including the HPG E, and the second including the HPG C together with some Anatolian and all Cypriot mouflons. In the latter, a well-supported genetic structure was evident between HPG C and the mouflons belonging to Anatolia and Cyprus. A further sub-structure was also evident between Anatolian and Cypriot specimens.
The Sardinian, Corsican and European mouflons grouped together within cluster II in a single group along with the domestic sheep from Sardinia and Chios islands and the HPG B sequences. Within the same cluster, the remaining Anatolian mouflons are included in a second group along with HPG A sequences. The ML analysis (tree not shown) was consistent with BI, showing both the same topology and the same highly supported nodes at the main groups retrieved (for the corresponding bootstrap values, see Fig 4). Consistently with the BI and ML tree, Bayesian assignment analysis evidenced the occurrence of four genetic groups hereafter named as G 1 (yellow in Fig 4 and S1 Fig), G 2 (red in Fig 4 and S1 Fig) Table; Fig 4 and S1 Fig).
Cluster G 1 included all mouflons from Cyprus, three Anatolian mouflons along with sequences belonging to sheep HPGs C and E. Cluster G 2 includes the remaining Anatolian mouflons (three individuals) along with sheep HPG A sequences. Cluster G 3 grouped all the mouflons from Sardinia, Corsica, and continental Europe, along with all the domestic sheep from Sardinia and Chios and sequences belonging to sheep HPG B. Cluster G 4 included individuals representative of HPG D exclusively.
The statistical parsimony network analysis retrieved seven disconnected clusters within the Ovis genus. Three main clusters (α, β, γ in Fig 4) encompassed 78% of the individuals, while four additional minor clusters correspond to one or two individuals representative of the domestic sheep HPGs C, E and D and of one Anatolian mouflon individual. Cluster α was exclusive of Anatolian and Cypriot mouflons. Cluster β encompassed domestic sheep haplogroup A individuals and some Anatolian mouflons. Cluster γ corresponded to the Bayesian genetic group G 3 above reported.
In accordance with the statistical parsimony network analysis, the median-joining network highlighted the occurrence of three main groups of haplotypes likely sharing a central common Rooted tree obtained by Bayesian inference for D-loop region and the corresponding 95% statistical parsimony networks. Bayesian groups inferred by the Bayesian assignment test are also represented through coloured boxes (G 1 in yellow, G 2 in blue, G 3 in red, G 4 in green). Nodal supports are indicated above the nodes (posterior probability for BI / bootstrap values for ML). Groups retrieved using the 95% statistical parsimony networks are shown on the right. Each black dot in the networks represents a point mutation. Haplotypes in the network are coloured according to the groups of individuals analyzed (see Table 1 for details). Sample codes are given in Table 1. ancestor (CCA) (Fig 5). A group (N1) exclusively encompassed haplotypes from western Europe and Aegean Sea (Chios) along with individuals representative of the domestic sheep HPG B. It corresponds to the Bayesian genetic group G 3 and the statistical parsimony network cluster γ, respectively, and diverged for 18 point mutations from the CCA. A further group (N2) encompassed domestic sheep haplogroup A individuals and some Anatolian mouflons. It corresponds to the Bayesian genetic group G 2 and diverged for 16 point mutations from the CCA. The last group (N3) encompassed all Cypriot and some Anatolian mouflons along with individuals representative of the domestic sheep HPGs C, E and D. It corresponds to the cluster I in ML and BI tree analysis and diverged for 6 point mutations from the CCA. Notably, this group diverged from the CCA for a number of point mutations (22) comparable with the other groups if considering only Cypriot and Anatolian mouflons along with HPGs C and E. Pairwise genetic distances between groups were calculated using the K2P correction model (S4 Table). The lowest level of genetic differentiation was observed between HPG C and E (0.018 ± 0.004), closely followed by HPG A and B (0.0345 ± 0.006). The genetic distance between Cypriot and Anatolian mouflons carrying the haplotype X was 0.013 ± 0.003. Into cluster II of ML and BI tree analysis, the average variability between the group composed of Cypriot and Anatolian mouflons carrying the haplotype X and the sheep HPG C and E was 0.0235 ± 0.004 and 0.0195 ± 0.0045, respectively. This value is higher than the distance between HPG C and E.

Discussion
The evolution and taxonomy of the genus Ovis is still a debated topic, and the relationship between the extant wild sheep and domestic sheep remains unsolved [63]. In the last decades, CCA indicate the likely common ancestor. The capital letters (A, B, C, D and E) inside white spots on the network correspond to the five sheep haplogroups used as references. Small red plots on the nodes, correspond to median vectors representing hypothetic connecting sequences, calculated with a maximum parsimony method. The long branches leading to isolated haplotypes were shortened and indicated with ''\\". The numbers of mutations between haplotypes that are greater than one are reported on the network branches. Sample codes are given in Table 1.
doi:10.1371/journal.pone.0144257.g005 different wild sheep have been proposed as potential ancestors of domestic sheep, such as the urial, argali and European mouflon. However, based on analyses performed by means of mtDNA sequences and endogenous retrovirus [10,16] none of them can be conclusively confirmed as being the living ancestor of sheep. The Asian mouflon (O. gmelini) is currently considered as the next closest extant species to domestic sheep [5][6][7]12].

Cyprus mouflon mitogenome
In this study the first Asian mouflon mitogenome from a Cypriot individual (O. g. ophion) was provided and compared with those from other Ovis species. The mtDNA main structural features (see [60][61] for details) did not differ between the Cyprus mouflon and other Caprinae species.
As far as control region variation is concerned, the Cyprus mouflons here analyzed were included in a cluster closely related to the Anatolian individuals carrying the haplotype X [12]. The only mitochondrial haplotype previously reported for O. g. ophion (HPG B- [29]) was not detected in this study. However, this may be due to the reduced number of Cypriot sequences analysed here.

Phylogeography patterns of distribution of sheep haplogroups
Phylogenetic analysis, carried out on the 28H and D-loop regions, evidenced the occurrence of two main phyletic lines emerging from a common ancestor as it was evidenced by ML and BI tree analysis. Domesticated sheep are included in both clusters thus suggesting the occurrence of a high genetic differentiation among domestic sheep mitochondrial lineages as possible consequence of the domestication of several phylogenetically related ancestors.
The first cluster spread exclusively in the Near East (cluster I in Figs 3 and 4), while the second one spread in the Near East and Europe (cluster II in Figs 3 and 4). Within the Near East cluster, the lowest level of genetic differentiation among haplogroups was detected between HPGs C and E, as inferred by genetic distances estimation.
On the basis of D-loop ML and BI analyses, individuals from Anatolia (mouflons) were included in a cluster (cluster II in Fig 4) grouping sheep HPGs A, B and D. Such findings were partially consistent with those provided by Demirci et al. [12]. With the exception of the three Cyprus mouflons, all the D-loop sequences obtained in the present study belonged to HPG B. The dominance of HPG B among European ovine breeds can be explained by the combined action of founder effects and genetic drift, occurred during the migration of the Neolithic farmers into Europe. Otherwise, assuming that the introduction of sheep into Europe may have followed two subsequent waves [16], the HPG B dominance could be explained by a massive presence of this lineage among the sheep populations that spread during the second wave.
Analyses performed on the D-loop by statistical parsimony network also indicated the presence of an additional well-supported genetic sub-cluster within the ML and BI cluster I (Fig 4) exclusively grouping the two species of Asian mouflons (O. g. ophion and O. g. anatolica). Genetic distances estimation supports the occurrence of a new haplogroup (HPG X). Indeed, as showed in the S4 Table, the distance between HPGs C and E (0.018 ± 0.004) is lower than those between HPG X and HPG C (0.0235 ± 0.004) or HPG X and HPG E (0.0195 ± 0.0045). Interestingly, the median-joining network analysis (Fig 5) suggested that sheep HPG D could represent the closest haplotype to the past common ancestor for sheep and mouflons.
Results obtained for the D-loop region (1,270 bp-long) were not always consistent with those obtained from the whole mitochondrial genome regions (28H) (14,426 bp-long). Such a discrepancy may be related to several reasons. Indeed, the 28H segment is more than ten times longer than the D-loop region, which occupies less than 7% of the mtDNA genome [64][65][66].
Furthermore, analysis inferred from the D-loop alone can be problematic given that this locus mutates rapidly and it is subjected to saturation due to excessive homoplasy [41,67]. In addition, several equally likely gene trees can often be inferred from D-loop sequences, particularly when large numbers of samples are analyzed [68][69].

Molecular dating
Based on the results obtained in this study, the group including domestic sheep and Cyprus mouflon diverged from urial around 0.89 MYA in accordance with previous analysis of retrotypes and morphological traits which dated this event around 0.8 MYA [16]. Furthermore, domestic sheep along with the Cyprus mouflon and urial diverged from argali around 1.11 MYA.
The rise of the extant sheep haplogroups is estimated to have occurred at around 5-35 KYA, which was about 25 thousand years before the first domestication event, as assumed based on fossil records and archaeozoological evidence [2] (Table 5). Divergence times estimated in this study encompassed the first domestication events suggested by either fossil records and archaeozoological evidence. However, our results do not rule out the chance that first domestication might have occurred earlier than 10 KYA.
Our estimates of the earliest radiation of the domestic sheep common ancestor (298 KYA) (see Fig 3 for details) are not consistent with Meadows et al. [13], who placed this event around 920 KYA. Such a discrepancy might be related to the different molecular marker used by these authors who obtained their estimates on the concatenated sequences of the 12 protein coding genes of the H strand (12H), which cover a smaller portion of the whole mitogenome if compared with the 28H segment used in the present study.
The coalescence estimates provided by Meadows et al. [13] might also be affected by the use of a single Cyt B gene-based calibration point not directly referred to a fossil reference. The use of a smaller molecular marker combined with the use of only one calibration point may affect the estimation of divergence times. However, we should underline that such a discrepancy might as well reflect the different method used to estimate this event. In fact, estimates obtained using APE, assuming a relaxed clock with correlated rates, were similar to those obtained by Meadows et al. [13] (0.89 versus 0.92 MYA). Noteworthy, the method used in the present study for estimating divergence times in APE is partly based on the Penalized Likelihood (data not shown). Based on our molecular dating, the five sheep haplogroups here retrieved originated between 5-35 KYA. This is consistent with a sympatric or allopatric differentiation event likely occurring in the Near East among wild sheep populations during the Pleistocene. In such a context, the current sheep haplogroups would not be the result of multiple, independent domestication events but rather represent the remnants of an ancient high genetic variability which spread to Asia and Europe during the Neolithic human migrations [30]. Indeed, it is estimated that goat and sheep domestication events occurred around 10.5-11 KYA in regions encompassing northern Zagros (North of Iraq) and south eastern Anatolia (South-East Turkey) [1,2,5,[25][26]70]. The coalescence estimates here provided for the rise of the Cypriot mouflon (190 KYA) set this event considerably earlier than the origin of sheep haplogroups .
The island of Cyprus is considered one of the first areas where domesticated sheep were introduced [16,25], thus offering a valuable insight into the evolution of its unique endemic mouflon. At the present time, the most probable hypothesis is that mouflons were brought to Cyprus by humans about 12 KYA [12], following an independent evolutionary history, most likely facilitated by the absence of potential predators/competitors as supported by fossil records [71][72].
However, an alternative hypothesis that large mammals, including mouflons, have seemingly arrived on Cyprus before the arrival of humans is also possible. During the period of minimum sea levels through Pleistocene glacial maxima, the sea dropped to at least 125 m below the current level, and the Cyprus shoreline expanded towards the mainland by several kilometres (see Fig 1 from [25] for more details). It has also been reported that between 25 and 18 KYA there were small mounts above sea level, forming three islands between Cyprus and the mainland that might have been used by animals as a stepping stone pathway for an average period of 10 KYA [25]. Cyprus Pleistocene fossil sites consist almost exclusively of pygmy hippopotamus (Phanouirios minutus) and pygmy elephant (Elephas cypriotes) [71][72]. The lack of sheep fossils could be consistent with the absence of mouflons and other wild sheep in the island during this period [71]. However, it is worth noting that the sheep's preference for mountainous habitats and the relative unfavourable conditions for fossil preservation [73] might have prevented the finding of fossil records.
Even considering such an alternative hypothesis relating to the early presence of mouflon in Cyprus, it appears more likely that the high genetic distance observed between the Cypriot mouflon and other sheep haplogroups is due to a limited crossbreeding with domestic sheep. Indeed, contrary to what happened in the other main Mediterranean islands of Corsica and Sardinia, sheep farming has never been crucial to the economy of Cyprus, where for a long period the "fat tailed" sheep was the only variety of domestic sheep on the island until the introduction of new breeds from Israel and Europe in the 1970s [74][75][76].
Fat-tail breeds are an important class of sheep breeds characterized by specific phenotypic traits, that are first documented as being present 5,000 YA [77][78]. These breeds are well adapted to arid regions thanks to the fat deposition stored in the tail that represent an energy reserve during times of drought and feed shortage. The fat-tail sheep are commonly found in a wide geographical range with subtropical to semi-arid climate, especially including the Middle East and North Africa [78]. The fact that the fat-tail breeds are now prevalent in the Fertile Crescent, where sheep were originally domesticated, while thin-tail sheep breeds are predominant in peripheral areas [78], and that the wild ancestor of sheep is thin-tail suggests that the first domesticated sheep had a thin-tail and the fat-tail was developed later [77].
These findings suggest that the Cypriot mouflon, which never experienced modern selection strategies, can be considered as a relic of the first wild domesticated populations [12,16] likely representing one of the closest descendants of the Palaeolithic Anatolian wild sheep. As a consequence of its geographic isolation, it presumably returned early to a feral or semi-feral state before the secondary event of domestication occurred in South-West Asia, involving in a second time Europe, Africa and the rest of Asia [16]. Some Cypriot primitive sheep might have survived the migrations of the secondary domesticated breeds from South-West Asia in areas without predators or by occupying sites less involved in trading contacts [16]. Future phylogeographic analyses on a larger number of Cypriot mouflons might be useful in depicting the population dynamics which affected the origin of the species on the island to better clarify the relationships between O. g. ophion and other ovine breeds, such as the Cypriot fat-tail sheep, thus providing useful insights into the management of the Cypriot distinct genetic conservation unit.
The threefold molecular dating methodologies used in this study enabled us to provide a well-supported estimate of the time of coalescence of the current ovine haplogroups setting their origin between 5-35 KYA. These lineages probably extended their range towards Europe only at around the 10-11 th millennium BP, when human populations moved to the continent from the Near East [79][80].
In conclusion, the Cyprus mouflon whole mtDNA sequence here provided improves the panel described by Meadows et al. [13] thus identifying a potential new sheep haplogroup which includes the Cypriot and Anatolian mouflons. However, since the whole mtDNA sequence of the Anatolian mouflon is not available, this hypothesis was inferred by the analysis performed on the D-loop sequencing data only and requires to be confirmed by means of deeper analyses on the whole mitogenome. Additionally, further analysis on a larger number of samples from Near East and Europe might shed light on the occurrence of new haplogroups never described before, resulting in a more complete overview of the phylogenetic relationships among ovine breeds from different geographic areas.
Finally, the molecular data provided in this study may also play an important role in functional genomics or functional pathways related to energy metabolism. Indeed, mtDNA have an important role in bioenergy production and thermogenesis and thus in climate adaptation [81][82]. Genomic research can provide additional knowledge on thermal impact effects on the energy metabolism. In such a context, the analysis of the Cyprus mouflon whole mitogenome sequence could be useful to identify mutations potentially related to mitochondrial heat production [82], improving the knowledge on the adaptation to a subtropical to semi arid climate similar to the climate of Cyprus. These new genetic tools will enable researchers to review the history of the species and the geographic distribution of haplogroups in the light of environmental change adaptation that occurred during evolution, and to increase the efficiency of breed traits selection related to the reproductive ability in semi-arid areas [83] in order to improve food security.  Table 1. (TIF) S1 Table. List of the primers used to carry out PCR and sequencing reactions to obtain the first Cyprus mouflon mtDNA sequence. bp: base pairs; Ta: annealing temperature. In the Forward and Reverse columns, numbers in parenthesis refer to the position of the 3' end L-strand. (PDF) S2 Table. Estimates of evolutionary divergence between whole genome (28H) sequences. Genetic distances are shown below the diagonal and standard deviations above the diagonal. Analyses were conducted using the K2P model. Sample codes are listed in Table 1. (PDF) S3 Table. Frequencies distributions of the four Bayesian genetic groups found in the present study for D-loop region. N: sample size; %: relative frequency of distribution. Sample codes are listed in Table 1. (PDF) S4 Table. Estimates of evolutionary divergence between D-loop sequences. The genetic distances are shown below the diagonal and standard deviations above the diagonal. Analyses were conducted using the K2P model. Sample codes are listed in Table 1. (PDF) Stavrostis Psokas Forest Station, as well as the late C. Papamichael, Head of the Game and Fauna Service of Cyprus for assisting with sample collection, and Dr. Fabio Scarpa for his invaluable help in statistical analysis. We also thank our native English speaker colleague Giustina Casu Finlayson for her help in revising the English form.