Positive selection and recombination shaped the large genetic differentiation of Beet black scorch virus population

Beet black scorch virus (BBSV) is a species in the genus Betanecrovirus, family Tombusviridae. BBSV infection is of considerable importance, causing economic losses to sugar beet (Beta vulgaris) field crops worldwide. Phylogenetic analyses using 3′UTR sequences divided most BBSV isolates into two main groups. Group I is composed of Iranian isolates from all Iranian provinces that have been sampled. Chinese, European, one North American and some other Iranian isolates from North-Western Iran are in Group II. The division of Iranian BBSV isolates into two groups suggests that numerous independent infection events have occurred in Iran, possibly from isolated sources in unknown host(s) linked through the viral vector Olpidium. The between-group diversity was higher than the within-group diversity, indicating the role of a founder effect in the diversification of BBSV isolates. The high FST among BBSV populations differentiates BBSV groups. We found no indication of frequent gene flow between populations in Mid-Eurasian, East Asian and European countries. Recombination analysis indicated an intra-recombination event in the Chinese Xinjiang/m81 isolate and an inter-recombination breakpoint in the viral 3′UTR of Iranian isolates in subgroup IranA in Group I. The ω ratios (dNS/dS) were used to detect positive selection at individual codon sites. Amino acid sequences were conserved, with ω from 0.040 to 0.229 in the various proteins. In addition, a small fraction of amino acids in proteins RT-ORF1 (p82), ORF4 (p7b) and ORF6 (p24) are positively selected with ω > 1. This analysis could increase the understanding of protein structure and function and of Betanecrovirus epidemiology. The recombination analysis shows that genomic exchanges are associated with the emergence of new BBSV strains. Such recombinational exchange analysis may provide new information about the evolution of Betanecrovirus diversity.


Introduction
Soilborne viruses, especially those persistently transmitted by plasmodiophorid and chytrid vectors, are economically important and cause considerable losses to sugar beet (Beta vulgaris) production worldwide [1]. Beet  in the greenhouse. Autoclaved potting soil was used as a control. Soil samples were mixed with equal parts of autoclaved sand to facilitate removal of roots from susceptible B. vulgaris cv. Jolge plants at harvest. Soil samples were placed in new 280-ml cups (used instead of pots) with holes in the bottom for drainage. The cups were placed in sterilized plastic saucers spaced on greenhouse benches to avoid cross-contamination due to water splash. The plants were harvested 8-10 weeks after planting and used for nucleic acid extraction and sequencing. Each sequence obtained represents a single virus isolate. The isolates used in this study were selected from soil samples collected from ten different sugar beet fields (Fig 1).

Enzyme-linked immunosorbent assay (ELISA)
Samples were prepared by washing the roots of seedlings from each pot to remove the soil. Root tissue samples (0.2 g from each root mass) were placed in extraction bags containing 2 ml of extraction buffer (0.05 M phosphate-buffered saline, pH 7.2, 0.5% Tween 20, 0.4% dry milk powder) and homogenized with a handheld roller press. Extracted sap was added to duplicate wells of a microtiter plate (100 μl per well). Each plate also contained controls, including sap from BBSV-infected and healthy beet roots. Double antibody sandwich ELISA reagents were purchased from LOEWE (Sauerlach, Germany) and were used to assay for BBSV. Purified IgG prepared against BBSV (1 mg/ml) was used to coat microtiter plates at a dilution ratio of 1:200 according to the manufacturer's instructions. Alkaline phosphatase-conjugated anti-BBSV IgG was added to wells (dilution 1:200). Alkaline phosphatase substrate (Sigma Chemical Co., St. Louis, MO) was used at a ratio of 5 mg in 8.3 ml of substrate buffer. Absorbance readings (A405nm) were recorded 30 min after the addition of substrate using a Bio-Tek ELx800 microplate reader (Winooski, VT). Samples were considered positive when their absorbance at 405 nm exceeded the mean of the negative controls plus three times the standard deviation of those controls.
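The positivity rule described above (mean of the negative controls plus three standard deviations) can be sketched in a few lines of Python. This is only an illustration: the absorbance readings are hypothetical, the function names are ours, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev

def elisa_cutoff(negative_ods, k=3.0):
    """Positivity threshold: mean + k * SD of the negative-control readings.

    Assumption: the sample (n-1) standard deviation is used.
    """
    return mean(negative_ods) + k * stdev(negative_ods)

def is_positive(sample_od, negative_ods, k=3.0):
    """True when a sample reading exceeds the negative-control threshold."""
    return sample_od > elisa_cutoff(negative_ods, k)

# Hypothetical A405 readings for healthy-root control wells.
negatives = [0.08, 0.10, 0.09, 0.11]
cutoff = elisa_cutoff(negatives)
print(f"cutoff = {cutoff:.3f}")
print(is_positive(0.45, negatives), is_positive(0.12, negatives))
```

With more control wells per plate the threshold becomes more stable, which is one reason to compute it per plate rather than reuse a fixed value.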

Immuno-capture reverse transcription-polymerase chain reaction (IC-RT-PCR)
Immuno-capture of BBSV virions was performed in 0.2 mL PCR tubes pre-coated with 50 μL of virus-specific IgG (2 μg/mL, in carbonate buffer pH 9.6), incubated for 3 hr at 37°C and subsequently washed three times with PBS-Tween. Then 50 μL of sap extract (0.1 g of root extract in 1 mL of PBS-Tween) was added to the tube and incubated at 4°C overnight, followed by two washes with PBS-Tween and one wash with DEPC-treated water. Complementary DNA (cDNA) was prepared in a final volume of 20 μL. For cDNA synthesis, 1 μL of reverse primer (5′-CTCCAATAGTTATGTATTGCGTCTTC-3′, nt 2587 to 2561) (20 pmol/μL), 2 μL of dNTP mix (10 mM) and 3 μL of DEPC-treated water were added to the tube, heated at 65°C for 5 min and immediately chilled on ice.

PCR, cloning, and sequencing
The full-length genome sequences of BBSV isolates were amplified by RT-PCR from total RNA extracted from infected sugar beet roots using Tri-reagent (Sigma, USA), and first-strand cDNA was synthesized using M-MuLV reverse transcriptase (Fermentas, Lithuania), according to the manufacturer's instructions. The full-length genome of BBSV isolates was amplified using three pairs of primers with at least 50-100 nt overlap: BBF1 (5′-AAGAAACCTAACCAGTTTCTCGTTGA-3′) and BBR3 (5′-TTGCATCTCCATGCCAGCCTGATC-3′); BBF3 (5′-TGCTGAGGAACATCTGTTCGA-3′) and BBR5 (5′-CATTTCAGAAGTGGAAATGTTGTGT-3′); and BBF5 (5′-AAGAARGAYATGGGTCCATCGG-3′) and BBR7 (5′-GGGCACCTGGAAYACCAGGTAT-3′). RT-PCR-generated DNA fragments, purified using the QIAquick Gel Extraction Kit (QIAGEN, Valencia, CA, USA), were used as templates for direct sequencing or cloned into Overlapping sequences sharing 99-100% nucleotide identity were assembled to ensure that they came from the same genome and not from different components of a genome mixture. Nucleotide sequences of the cloned fragments for each isolate were determined using three to five cDNA clones; when any difference was detected, the RT-PCR products were sequenced directly to determine which variant was the most common. Sequence data were assembled using BIOEDIT version 5.0.9 [14]. The sequences were compared with other sequences in GenBank using the BLAST program of the National Center for Biotechnology Information (NCBI).

Phylogenetic relationship and estimation of genetic distances
The full-length genome sequences of four BBSV isolates obtained in this study and all available BBSV sequences in GenBank were used for phylogenetic analyses (Table B in S1 File). Non-overlapping regions of the overlapping ORFs were used for genetic and population analysis. Phylogenetic trees for each ORF and the 3′UTR were estimated using the Maximum-Likelihood (ML) method in MEGA6 [15]. We estimated the model of nucleotide substitution that best fitted the data using the application BestModelTest implemented in MEGA6. For the ML analysis, we used Kimura's two-parameter (K2) model of nucleotide substitution with rate variation among sites modeled using a gamma distribution and a proportion of invariable sites (K2+I+G). The signals for virus replication are located in the promoters at the 3′UTR of the plus and minus strands; therefore, the phylogenetic tree based on the 3′UTR is shown. Branch support was evaluated by bootstrap analysis based on 1000 pseudoreplicates. The ML trees were compared using PATRISTIC [16]. Nucleotide distances and nucleotide diversity (mean nucleotide distance between two randomly selected sequence variants) were estimated by the maximum-composite-likelihood method with MEGA6 [15]. Pairwise synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dNS) were also calculated according to the Pamilo-Bianchi-Li (PBL) method based on Kimura's two-parameter model [17]. Standard deviations were calculated by the bootstrap method with 1000 repeats. Furthermore, pairwise genetic distances were analyzed by Kimura's two-parameter method implemented in Phylip 3.67 software [18] for each gene and the 3′UTR. DNASP version 4.10 [19] was used to estimate haplotype diversity. Haplotype diversity was calculated based on the frequency and number of haplotypes in the population.
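As a rough illustration of the distance measures named above, the following Python sketch computes Kimura's two-parameter distance between two aligned sequences and averages it over all pairs to obtain a nucleotide diversity. It is not the MEGA6 or Phylip implementation: the function names and toy sequences are hypothetical, sites with gaps or ambiguity codes are simply skipped, and the K2 distance stands in for the maximum-composite-likelihood estimate used in the study.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions. Sites containing gaps
    or ambiguity codes are ignored (a simplifying assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def nucleotide_diversity(seqs):
    """Mean pairwise distance over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(k2p_distance(a, b) for a, b in pairs) / len(pairs)

# Toy alignment: one transition separates the second sequence from the others.
toy = ["AAAAAAAAAA", "GAAAAAAAAA", "AAAAAAAAAA"]
print(round(k2p_distance(toy[0], toy[1]), 4))
print(round(nucleotide_diversity(toy), 4))
```

For closely related sequences the K2 distance is only slightly larger than the raw proportion of differing sites; the correction matters more as divergence grows.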

Detection of recombination
Relationships among the aligned genes, 3′UTR, and full-length genome sequences (Table B in S1 File) were calculated separately using the Maximum Likelihood (ML) method implemented in MEGA6 [15]. Recombination events, major and minor parental isolates of recombinants, and recombination breakpoints were analyzed using several methods implemented in RDP4 version 4.70 [20] with the default configuration and Bonferroni-corrected P-value cut-offs of 0.05 and 0.01. Putative recombinants found by RDP4 were confirmed using SISCAN version 2.0 [21].

Selection analyses
A ML method has been developed for detecting amino acids under positive selection [22]. This method originally employed 14 models that use statistical distributions to account for variable ω (dNS/dS) ratios among codon sites, but models M0, M1, M2, M3, M7, and M8 are sufficient for accurate selection analysis [23]. Models M0, M1 and M7 do not allow for the existence of positively selected sites. M0 calculates a single ω ratio (between 0 and 1) averaged over all sites; M1a (nearly neutral) accounts for neutral evolution by estimating the proportions of conserved (ω = 0) and neutral (ω = 1) sites; and M7 uses a discrete β distribution (between the same bounds) to model different ω ratios among sites. The shape of the β distribution is governed by the parameters p and q. Alternatively, models M2, M3 and M8 account for positive selection using parameters that can estimate ω > 1. Models M2 and M8 extend M1 and M7, respectively, through the addition of two parameters (p2 and ω2 for M2, and p1 and ω1 for M8) that have the potential to estimate ω > 1 for an extra class of sites. M3 provides the most sensitive test for positive selection by estimating an ω ratio for each of a predetermined number of classes. Three classes were used in this analysis (p0, p1, and p2), such that three corresponding ω ratios (ω0, ω1 and ω2) were estimated. The first step in the identification of amino acid sites under positive selection is to test whether sites exist with ω > 1 by comparing nested models using likelihood ratio tests (LRTs). M0 and M1 are both special cases of M2 and M3, while M7 is a special case of M8, and such nested models can be compared with LRTs. Three LRTs (M3 vs M0, M2a vs M1a and M8 vs M7) were used to assess the models' fit to the data, as described by Wong et al. [24]. Once positively selected sites have been shown to exist, the second step is to use Bayesian methods to locate their positions.
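The likelihood ratio tests between nested site models can be illustrated with a short Python sketch. The test statistic is twice the difference in log-likelihood, compared against a chi-square distribution. The log-likelihood values below are hypothetical, the function names are ours, and the degrees of freedom (4 for M3 vs M0 with three site classes, 2 for M2a vs M1a and for M8 vs M7) are the values commonly used for these comparisons; they are stated here as an assumption because the text does not list them.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!,
    which avoids any dependency outside the standard library.
    """
    if df <= 0 or df % 2:
        raise ValueError("this helper only supports positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a pair of nested models."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf_even_df(stat, df)

# Hypothetical log-likelihoods for one gene: (null, alternative, df).
comparisons = {
    "M3 vs M0": (-1000.0, -995.0, 4),
    "M2a vs M1a": (-998.0, -996.5, 2),
    "M8 vs M7": (-997.0, -992.0, 2),
}
for name, (null, alt, df) in comparisons.items():
    stat, p = likelihood_ratio_test(null, alt, df)
    print(f"{name}: 2dlnL = {stat:.2f}, p = {p:.4f}")
```

A significant result only indicates that a class of sites with ω > 1 improves the fit; identifying which sites belong to that class is the separate Bayesian step.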
Sites having high posterior probabilities (> 90%) of belonging to a site class with ω > 1 are good candidates for positively selected sites. Posterior probabilities are conditional on the observed data; that is, they give the probability that a site, given the data at that site, is from a particular site class. The methods and models described here were implemented using the CODEML program of the PAML package, version 3.0c [25].
B and C in S2 File) indicated that BBSV isolates fell into two main groups. In addition, using 3′UTR sequences, BBSV isolates clearly divided into groups GI and GII, which were further subdivided into five subgroups (Fig 2). Most of the Iranian isolates clustered in GI with two subpopulations, IranA (I-IranA) and IranB (I-IranB). Furthermore, four Iranian BBSV isolates from the North-West (Ir-Ksh7, Ir-Ksh8, Ir-Ksh9, and Ir-Ksh10) were grouped in the distinct subgroup IranC in GII (II-IranC). All Chinese isolates and one isolate from Spain (CR-Dm2) fell into subgroup Chinese of GII (II-Chinese), whereas the European isolates together with the USA isolate clustered in subgroup Europe (II-Europe) (Fig 2). Two-dimensional plot analysis of pairwise nucleotide distances also showed two main phylogenetic groups. BBSV isolates in GI are closely related to each other, which was confirmed by low nucleotide diversity (0.000 to 0.073; high similarity). However, higher pairwise nucleotide distances (0.073 to 0.141) were found for GII (Fig 2). The pairwise nucleotide distances for each subgroup were 0.000 to 0.049 (I-IranA); 0.049 to 0.073 (I-IranB); 0.073 to 0.098 (II-IranC); 0.098 to 0.122 (II-Chinese); and 0.122 to 0.141 (II-Europe) (Fig 2).

Patristic distance plots
We compared the maximum-likelihood trees of the distinct ORFs pairwise using the PATRISTIC approach. All pairwise plots of the distances in the trees deduced from RT-ORF1 versus ORF3, ORF4, and the concatenated ORFs (3+4) showed similar patterns. This is demonstrated by the plot of ORFs (3+4) against RT-ORF1 distances (Fig 3a), in which the three sets of distances show a linear correlation coefficient of 0.959 (p<0.001). The plot of the RT-ORF1 distances against ORF6 (CP), with a linear correlation coefficient of 0.986 (Fig 3b), and the plot of the ORFs (3+4) distances against ORF6, with a linear correlation coefficient of 0.944 (Fig 3c), have a similar pattern with three distinct subpopulations. The pairwise plots of the 3′UTR against the ORFs (RT-ORF1, Fig 3d, and ORF6, Fig 3e) also showed three subpopulations, but the linear correlation coefficients were lower than those of the ORF versus ORF comparisons.

Recombination analysis
Different methods were used for recombination breakpoint prediction, and they provided evidence for intra- and inter-recombination events across the BBSV genomes analyzed. Recombination analysis found that the Xinjiang/m81 sequence had two recombination sites, around nucleotide 668 in the RT-ORF1 gene and nucleotide 2714 in the ORF6 gene (event 1). This is a 'clear' intra  Table 1). In addition, a putative inter-recombination breakpoint (event 2) was detected using RDP4 in the 3′UTR region of Iranian isolates in subgroup I-IranA (Figure D in S2 File, Table C in S1 File), with the likely parental isolates Ir-Ksh9 (FN543419), belonging to subgroup II-IranC, and Ir-Ksh5 (FN543418), from subgroup I-IranB, as major and minor parents, respectively. However, recombination event 2 was not supported with a high degree of confidence (that is, by multiple different methods, each with a low associated P-value). The RDP4 results for two putative recombinant isolates, Ir-Kr1 and Ir-Gh1, which were also detectable by the SISCAN program, are shown in Table 1. No recombination site was found in ORF3 or ORF4, which are involved in cell-to-cell movement of the virus.

Mean nucleotide diversity and selection analysis
The mean nucleotide diversities for RT-ORF1, ORF3, ORF4 and ORF6 were 0.089, 0.056, 0.071 and 0.141, respectively (Table 2). In addition, the within-group diversity of BBSV genes was lower, from 0.008 to 0.042 (Table 2). We also estimated pairwise dNS/dS ratios using the PBL method [16]. When all isolates were included, the highest and lowest dNS/dS ratios were 0.147 for ORF4 and 0.020 for ORF3 (Table 2). The ML method implemented in PAML [22] was used to find variation in the ω ratio among sites. This method enables detection of individual codon sites under positive selection and rules out the alternative demographic explanations that confound other statistical tests of selection. Generally, the evolutionary constraint on ORF3 is stronger than that on the other ORFs, and no site in ORF3 was detected under positive selection (Table 3).
Model M0 was used to evaluate selection pressures within the maximum likelihood (ML) framework of codon substitution. The selection pressure values obtained were 0.056, 0.040, 0.229 and 0.041 for RT-ORF1, ORF3, ORF4 and ORF6, respectively (Table 3). Three models (M2a, M3, and M8) predicted a positively selected group of sites in the polymerase gene (RT-ORF1). M3 was not restricted in this way and estimated that 1.4% of sites are under weak positive selection (ω2 = 1.223). In addition, M8, which also estimated a small group of sites with a  (Table 3). Although M3 could not reject M2 in a likelihood ratio test (LRT), sites 39L, 641S, and 672A should still be considered candidates for positive selection because they were also detected by M8 (Table 3). A positively selected site was found in the ORF4 data set using M2, M3, and M8, which rejected models M0, M1 and M7 in LRTs (Table 3). M2 and M3 showed similar results and predicted a large set of conserved sites and a small set of positively selected sites (ω > 2). By Bayesian methods, all three models indicated that amino acid 8Q was under positive selection with a posterior probability of > 95%. No site was detected under positive selection for ORF3 (p7a). For ORF6 (CP), model M0 gave a likelihood similar to those of the models designed to detect positive selection (described above) and indicated that the sites are strongly conserved (ω = 0.041). However, values indicating positive selection were obtained in the CP alignment by M2a, M3, and M8. Models M3 and M8 predicted similar parameters and showed that 0.8% of the sites were under very strong positive selection pressure (ω = 3.836 and 4.081, respectively) (Table 3). Bayesian methods assigned two sites, 12S and 155R, to the positively selected group with posterior probabilities of >75% for 12S and >95% for 155R (Table 3).

Discussion
The objectives of this investigation were to better understand the sequence diversity and genetic structure of the BBSV population using different approaches. The four new Iranian BBSV isolates sequenced in this study (accession nos. MH705129 to MH705132) were all 3644 nt in length, with a genome organization identical to that of the type BBSV member. The sequence data presented are the most common variant within each sample. Phylogenetic analysis using the 3′UTR region grouped the BBSV isolates into two groups. Most of the BBSV isolates from China, Europe, and one from North America were in phylogroup GII. Group GI isolates from Iran fell into two subgroups. Almost all of the BBSV isolates from North-East Iran (Khorasan district, geographically the nearest region to Xinjiang province in China) were not clustered with Chinese or European isolates in Group II. The Iranian BBSV isolates in subgroup II-IranC, originating from North-West Iran (Kermanshah district, Fig 2), were clustered in Group II. Overall, the PATRISTIC plots using ML trees indicated that the coding and 3′UTR regions were closely linked and showed similar evolutionary patterns (Fig 3). Different methods were used for recombination breakpoint prediction. Two recombination loci at different genomic locations were identified. According to the adopted criteria, a clear  Table C in S1 File) was detected among Iranian isolates in subgroup I-IranA. However, this recombination event was not supported with a high degree of confidence (Table 1). Evolutionary comparisons of a large number of isolates from mid-Eurasia and East Asia with representative worldwide isolates would be required to determine the extent of recombination and genetic variability of BBSV. Differentiation is considered one of the key subjects in population genetics.
We have compared levels of diversification among BBSV subgroups. We used the FST statistic to measure the overall genetic variation between subpopulations. FST ranges from zero (complete sharing of genetic sequences) to 1.0 (populations completely isolated from each other) [26]. High FST values (> 0.6) were estimated among BBSV populations using DNASP [19]. Pairwise comparisons between Group I and Group II isolates for four genes and the 3′UTR are presented in Table D in S1 File. Haplotype and nucleotide diversity values were also compared to determine whether BBSV population expansions have occurred. The mean nucleotide diversity of each gene and the within-group diversity were estimated to be similar to those reported for other plant viruses (Table 2) [11]. This finding indicates that BBSV populations are genetically stable. In addition, these analyses showed that, although the population sizes vary between BBSV groups, the rates of evolution of the ORFs analyzed were alike. The highest nucleotide diversity was found in Chinese isolates. However, pairwise nucleotide identity, haplotype diversity, and nucleotide diversity revealed two subpopulations of closely related BBSV isolates in North-West and North-East Iran (Table 2).
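To make the FST scale concrete, the sketch below computes one common sequence-based estimator, FST = 1 - Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb the mean number between populations. This is an assumption for illustration only: DNASP offers several estimators and the text does not say which one was used, and the sequences and function names here are hypothetical.

```python
from itertools import combinations, product

def n_differences(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))

def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)

def mean_between(pop1, pop2):
    """Mean pairwise differences between sequences of two populations."""
    pairs = list(product(pop1, pop2))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)

def fst(pop1, pop2):
    """FST = 1 - Hw / Hb (an illustrative estimator, see lead-in)."""
    hw = (mean_within(pop1) + mean_within(pop2)) / 2.0
    hb = mean_between(pop1, pop2)
    if hb == 0:
        raise ValueError("no between-population differences")
    return 1.0 - hw / hb

# Toy populations: little variation within, fixed differences between.
group_one = ["AAAA", "AAAT"]
group_two = ["GGGG", "GGGC"]
print(fst(group_one, group_two))
```

In this toy case most of the variation lies between the two populations, so the estimate falls above the 0.6 level that the text treats as high differentiation.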
Nucleotide diversity estimates the average pairwise difference among sequences. Haplotype diversity is calculated based on the frequency and number of haplotypes in a sample. Estimates of nucleotide diversity can range from zero (no variation) to 0.1 (extreme divergence) between alleles, whereas haplotype diversity ranges between zero and 1.0 [27]. The haplotype and nucleotide diversity values for BBSV subpopulations are presented in Table 2. In most cases haplotype diversity values are high (from 0.666 to 1.0) and nucleotide diversity values are low (from 0.006 to 0.045). Generally, the combination of high haplotype diversity and low nucleotide diversity within individual subpopulations is consistent with a model of recent population expansion. However, given that evolutionary bottlenecks/founder effects [28] or strong selection pressures (e.g. due to host adaptation) would yield the same low genetic diversity, independent statistical tests of population differentiation are necessary to better understand the evolutionary forces that influence the BBSV population.
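Haplotype diversity as described above can be sketched with the widely used estimator Hd = n/(n-1) * (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sequences. We assume this is the form computed by DNASP; the sequences and the function name are hypothetical.

```python
from collections import Counter

def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Three sequences, two of them identical: two haplotypes at 2/3 and 1/3.
sample = ["ACGT", "ACGT", "ACGA"]
print(round(haplotype_diversity(sample), 3))

# Every sequence distinct: diversity reaches its maximum of 1.0.
print(round(haplotype_diversity(["ACGT", "ACGA", "ACGC", "ACGG"]), 3))
```

Because the statistic counts any single difference as a new haplotype, small samples of nearly identical sequences can still give high values, which is why it is read alongside nucleotide diversity.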
Selection pressure is an important evolutionary force, which accelerates the variation between homologous proteins [29]. The dNS/dS ratio for coding regions (Table 2) was similar to that of other plant RNA viruses, indicating that they are under negative (purifying) selection [11]. The dNS/dS ratios differed among phylogenetic groups in RT-ORF1 and ORF6, indicating that the isolates in subgroup I-IranB are probably under positive selection. However, by this analysis, the I-IranA and II-Chinese subgroups are under negative selection (Table 2). The dNS/dS ratio showed that ORF3 was under purifying selection in both of these subgroups, whereas ORF4 was under purifying selection in the I-IranA subgroup and positive selection in the II-Chinese subgroup (Table 2).
Most of the amino acid positions of functional proteins are considered to be conserved, while evolutionary fitness most likely affects only a few sites [30]. The proteins encoded by the BBSV genome are all presumed to be essential to viral function, and evolutionary constraints may well differ among them (Table 3). Coat proteins are multifunctional in plant viruses [31]. For example, in Tombusvirus, the CP is involved in nucleic acid binding and encapsidation [10]. Negative selection pressure was detected in the coat protein, with a few codons undergoing positive selection, indicating that variations in this gene can change viral fitness/infectivity [10,32,33].
BBSV has the highest sequence identity with Tobacco necrosis virus-TNV-D [34]. As previously reported for the CP of TNV-D [34], four conserved amino acids (117D, 120D, 179T, 232N) are involved in calcium binding. This four amino acid motif was also detected in BBSV-CP.
Based on a comparison of CP amino acid sequences, Mehrvar [35] proposed that pathogenesis of some Iranian BBSV isolates on sugar beet correlates with changes at CP amino acid positions 12, 145 and 158. In our data, CP gene sites are strongly conserved (ω = 0.041), and only 0.8% of the sites were found to be under strong positive selection (ω2 = 3.838) (Table 3). High ω ratios were detected for two of the amino acid residues (12S and 155R). Interestingly, 155R was conserved in all BBSV isolates whereas 12S was found only in Chinese isolates (Table 4).
Three overlapping ORFs (ORF3, ORF4 and ORF5) of BBSV are involved in cell-to-cell movement, accumulation of viral RNAs, and production of local lesions in Chenopodium amaranticolor [33]. Protein 7a (ORF3) is the only protein that does not show positive selection. Strong selective constraints on ORF3 can be attributed to its key role(s) in viral functions. In addition, the absence of positively selected sites in this gene suggests that host-associated selection is probably not a main factor affecting BBSV evolution. For protein 7b (ORF4), site 8Q was found to be under positive selection only in the II-Chinese subgroup (Table 4). Overall, this analysis indicates that purifying selection is acting to maintain the functional integrity of BBSV proteins (Tables 3 and 4).
The low ω ratios determined for RT-ORF1 indicate that most of the amino acids were under purifying selection (Table 3). This was expected because of the role of this ORF in virus replication. Members of the family Tombusviridae express their RdRp by a translational readthrough strategy. This process is stimulated by an RNA structure that is positioned immediately downstream of the recoding site (readthrough stem-loop, RTSL), and a sequence in the 3′UTR (distal readthrough element, DRTE). A base-pairing interaction between RTSL and DRTE is required for enhancement of readthrough. Any change in the RNA sequences and structures flanking either RTSL or DRTE may affect optimal translational readthrough and virus infectivity [36].
Differences in selection pressure on BBSV proteins may reflect diverse geographical origin of those proteins (later assembled into group-specific genomes by recombination); further selection pressure may arise after viral migration to different regions (Table 4). Chiba et al. [37] indicated that vigorous positive pressure on the p25 gene of Italian BNYVV isolates facilitates their ability to overcome Rz1-host resistance genes, when other geographically bounded BNYVV strains could not.
Positive selection acting directly on amino acids with important roles is rarely demonstrated because the adaptation-related phenotypic effects of particular amino acids are generally unknown. Therefore, the precision of site-specific tests of selection remains in question. Nevertheless, if any sites along a gene are positively selected, it is possible that these sites are involved in increasing fitness. Further study, particularly using reverse genetic approaches, is needed for a better understanding of the impact of the amino acid replacements. It would be especially interesting to determine the extent to which positive selection can be attributed wholly to the efficacy of pathogen-host interactions, or whether other forces result in positive selection on the BBSV population. This analysis is, to our knowledge, the first demonstration of the population structuring of BBSV in mid-Eurasian Iran. We have demonstrated effects of selection pressure and recombination in the evolution of BBSV. The phylogenetic relationships and comparisons between each virus group provide an understanding of evolutionary mechanisms. In addition, an understanding of the inter-specific diversification in the groups may be useful in developing strategies for controlling the disease and the spread of BBSV. In this respect, for future research projects on BBSV, we cloned the CP gene and generated antibodies against bacterially expressed recombinant CP (rCP) of an Iranian BBSV isolate. The polyclonal antibody was specific to BBSV and was used successfully for BBSV detection in sugar beet samples.

Conclusions
The evolutionary analysis of BBSV indicates that: a) Genetic differentiation has occurred between two original populations of BBSV (Table B in S1 File), one in the Middle East (Iran) and the other in East Asia. b) Differentiation has divided Iranian BBSV isolates into two groups. This suggests widespread dissemination of the virus in sugar beet growing areas in Iran. c) Three BBSV subpopulations have been observed in Iran (based on the analysis of 3′UTR sequences). The subpopulation in North-East Iran appears to have diverged most recently. d) Recombination has a considerable role in the evolution of viruses by decreasing mutational load, producing genetic diversification, and generating new strains. Our analysis using RDP4 [20] indicated a recombination event in the Chinese Xinjiang/m81 isolate and a putative inter-recombination breakpoint in the 3′UTR of Iranian isolates in subgroup I-IranA. e) Fitness to the host plant or to the chytrid vector may illustrate how diversifying selection influences different sites in the BBSV genome. Furthermore, the positively selected site(s) in different ORFs of BBSV indicate(s) differentiation among evolved subgroups (Tables 3 and 4). BBSV isolates belonging to subgroups I-IranA and I-IranB are dispersed in Iran, whereas, except for the USA isolate, all other isolates in BBSV subgroup II-Chinese were collected from China in East Asia. As previously described by Moury [38], different evolution patterns may have originated by biological variation and/or dispensation diversities. Beet has long been cultivated in Iran, and some parts of Iran are considered a point of origin of the domestic beet [39]. However, improved beet seeds were introduced to Iran about 120 years ago. At that time, each original population of BBSV might have passed through sequential bottleneck transmissions in different host varieties. These host changes may have selected for changes at various sites over several generations.
In addition, resting spores of O. brassicae can remain dormant in infested soil for long periods, as reported for Polymyxa betae, the vector of BNYVV [40]. Beet soil-borne viruses are transmitted primarily via the movement of soils containing viruliferous resting vector spores [40], so the transmission of BBSV from unknown natural hosts to sugar beet fields may well have been an important factor in this evolution.

Supporting information
S1 File. Occurrence of BBSV in soil samples (Table A). BBSV isolates analyzed in this study (Table B). Crossover sites in BBSV isolates detected using recombination detecting programs (Table C). Genetic differentiation analysis of BBSV isolates (Table D).