Molecular characterization and phylogenetic analysis of a Squash leaf curl virus isolate from Baja California Sur, Mexico

Background The begomovirus squash leaf curl virus (SLCuV) is one of the causal agents of squash leaf curl (SLC) disease, which is among the most destructive diseases of cucurbit crops in tropical, subtropical, and semiarid regions worldwide. This disease was first reported on the American continent and subsequently spread to the Mediterranean basin. Until now, SLCuV in Mexico had been detected only by PCR. This study provides the first complete sequence of a Mexican SLCuV isolate from Baja California Sur (BCS). In addition, the genome of the virus was characterized and its phylogenetic relationship with other SLCuV isolates was established. Methods The full genome (DNA-A and DNA-B) was amplified by rolling circle amplification, cloned and sequenced, and the open reading frames (ORFs) were annotated. Virus identification was performed according to the International Committee on Taxonomy of Viruses (ICTV) criteria for begomovirus species demarcation. To infer evolutionary relationships with other SLCuV isolates, phylogenetic and recombination analyses were performed. Results The SLCuV-[MX-BCS-La Paz-16] genome (DNA-A and DNA-B) had 99% identity with SLCuV reference genomes. The phylogenetic analysis showed that SLCuV-[MX-BCS-La Paz-16] is closely related to SLCuV isolates from the Middle East (Egypt, Israel, Palestine and Lebanon). No evidence of interspecific recombination was found, and iterons were 100% identical in all isolates in the SLCuV clade. Conclusions SLCuV-[MX-BCS-La Paz-16] showed low genetic variability in its genome, which could be due to a local adaptation process (isolate environment), suggesting that SLCuV isolates from the Middle East could have derived from the southwestern United States of America (USA) and northwestern Mexico.


INTRODUCTION
Viruses of the genus Begomovirus (family Geminiviridae) are devastating pathogens that affect a variety of agronomic crops worldwide (Rojas et al., 2018). Begomoviruses are commonly associated with vegetables (Varma & Malathi, 2003) and have also been reported in medicinal and aromatic plants (Saeed & Samad, 2017). The genus Begomovirus comprises 388 species, which are important because of their worldwide distribution and their direct negative impact on a wide range of crops (Zerbini et al., 2017). Begomoviruses can be divided based on their geographic location and genomic organization. In the Old World (OW) they can be mono- or bipartite and are often associated with DNA satellites, while those in the New World (NW) are mostly bipartite (Rojas et al., 2005; Duffy & Holmes, 2007; Melgarejo et al., 2013). Two additional groups associated with a specific host instead of geographical location are the sweepoviruses (monopartite begomoviruses that affect sweet potato) (Trenado et al., 2011) and the legumoviruses (bipartite begomoviruses that affect legumes), constituting two divergent monophyletic groups distinct from OW and NW begomoviruses (Ilyas et al., 2009).
Squash leaf curl virus (SLCuV) is a typical NW, bipartite begomovirus which infects squash (Cucurbita pepo L.) in North America (Flock & Mayhew, 1981) and the Mediterranean basin (Antignus et al., 2003; Lapidot et al., 2014). SLCuV populations from the Middle East show a low degree of genetic variability (Lapidot et al., 2014), and there is little genetic differentiation between populations from North America and the Middle East (Rosario et al., 2015). Although it has been found in mixed infections with other begomoviruses (Kuo et al., 2007; Sufrin-Ringwald & Lapidot, 2011; Ali, Mohammad & Khattab, 2012; Ahmad, Odeh & Anfoka, 2013), recombinants have not been detected (Rosario et al., 2015). However, the recent migration and rapid spread of SLCuV from the Americas into the Middle East could influence the appearance of new virulent strains and the expansion of the host range of the virus in native flora (Abudy et al., 2010). Thus, surveillance is necessary to monitor the appearance of new strains. The objective of this study was to characterize an SLCuV isolate from Mexico to infer its phylogenetic and evolutionary relationships with other isolates.

Sample collection and DNA extraction
Plant samples of Cucurbita pepo L. showing the characteristic symptoms of SLC disease were collected. Sampling was carried out in the most important squash crops in the southern part of Baja California Sur State (BCS) during the spring/summer and autumn/winter cycles from 2016 to 2017. Total nucleic acids were isolated using a CTAB (cetyl trimethylammonium bromide) method (Doyle, 1991).

SLCuV PCR detection
To detect and identify SLCuV, samples were tested by PCR. One microliter of total DNA from each sample (50 ng/µL) was used as template. The reaction mixture consisted of 0.5 µM forward and reverse primers (SqA2F and SqA1R; Table 1) and 10 µL of 2× Phusion High-Fidelity PCR Master Mix (New England Biolabs, Inc., Ipswich, MA, USA) in a final reaction volume of 20 µL. The PCR was carried out as follows: an initial denaturation step (98 °C for 30 s, one cycle), an amplification step of 35 cycles (98 °C for 10 s, 55 °C for 30 s and 72 °C for 30 s per cycle), and a final elongation step (72 °C for 5 min).
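The reaction set-up above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the primer stock concentration (10 µM) is an assumption that does not appear in the text, and the function names are hypothetical. It derives the primer and water volumes for one 20 µL reaction and the programmed hold time of the cycling profile, excluding ramping between temperatures.

```python
def pcr_mix(final_ul=20.0, master_mix_ul=10.0, template_ul=1.0,
            primer_final_um=0.5, primer_stock_um=10.0):
    """Per-component volumes (µL) for one reaction.

    primer_stock_um is an ASSUMPTION: the stock concentration is not
    stated in the methods.
    """
    # C1 * V1 = C2 * V2, solved for the primer stock volume V1
    primer_ul = primer_final_um * final_ul / primer_stock_um
    water_ul = final_ul - master_mix_ul - template_ul - 2 * primer_ul
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return {
        "master_mix": master_mix_ul,
        "template": template_ul,
        "forward_primer": primer_ul,
        "reverse_primer": primer_ul,
        "water": water_ul,
    }


def hold_time_s(initial=30, cycles=35, steps=(10, 30, 30), final=5 * 60):
    """Programmed hold time in seconds (ramping time is not included)."""
    return initial + cycles * sum(steps) + final


mix = pcr_mix()
template_ng = 1.0 * 50.0  # 1 µL of template at 50 ng/µL
total_s = hold_time_s()
```

Under these assumptions each primer contributes 1 µL, water makes up the remaining 7 µL, the reaction contains 50 ng of template, and the profile amounts to 2,780 s (about 46 min) of programmed hold time.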

SLCuV full-length genome amplification
From SLCuV positive samples, total DNA was used as template for rolling circle amplification (RCA) using the TempliPhi Kit (GE Healthcare, Chicago, IL, USA) following the manufacturer's protocol. RCA amplification products were digested with restriction enzymes Cla I and Xba I (New England Biolabs) to linearize DNA-A and DNA-B, respectively. Both DNA-A and DNA-B linearized genomic segments were isolated and ligated into pGEM-T Easy vector (Promega, Madison, WI, USA) according to the manufacturer's protocol, and then used to transform Escherichia coli DH5-α. Recombinant clones were Sanger sequenced bidirectionally using SLCVF-SalI, SLCVR-SalI, SLCVA2295F, XhoSLCVR, XhoSLCVAF, SLCVA2314R primers for DNA-A and SLCVDNAB1F, SLCVDNAB1R, SLCVDNAB2R, SLCVDNAB2F, BgMP-BC1F, BgMP-BC1R primers for DNA-B (Table 1).
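Linearizing a circular RCA product into a single full-length fragment requires that the enzyme cut the component exactly once. The following minimal sketch shows how a circular sequence could be screened for single-cutter sites, including sites that span the arbitrary start of the sequence record. The recognition sequences of ClaI (ATCGAT) and XbaI (TCTAGA) are standard; the toy sequence and the function name are made up for illustration and are not taken from the SLCuV genome.

```python
def circular_sites(seq, site):
    """0-based start positions of `site` in a circular DNA sequence."""
    seq, site = seq.upper(), site.upper()
    # Append the first len(site) - 1 bases so that sites spanning the
    # origin of the circular record are also found.
    extended = seq + seq[: len(site) - 1]
    return [i for i in range(len(seq)) if extended.startswith(site, i)]


ENZYMES = {"ClaI": "ATCGAT", "XbaI": "TCTAGA"}

# Toy circular sequence (NOT viral sequence): the ClaI site spans the origin.
toy_circle = "CGATGGTCTAGACCAAT"

cut_positions = {name: circular_sites(toy_circle, site)
                 for name, site in ENZYMES.items()}
single_cutters = [name for name, hits in cut_positions.items()
                  if len(hits) == 1]
```

An enzyme listed in `single_cutters` would release one full-length linear molecule from the circle, which is the property needed before ligation into a cloning vector.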

Genome assembly and annotation
The resulting Sanger sequencing reads were used to assemble the SLCuV-

Phylogenetic, recombination and iterons analysis
Phylogenetic analysis was performed using complete DNA-A and DNA-B nucleotide sequences as well as replication-associated (Rep) and capsid (CP) protein amino acid sequences (see Table S1 for details of the sequences used). Phylogenetic trees were constructed with MEGA7 (Kumar, Stecher & Tamura, 2016) using the Neighbor-Joining (NJ) algorithm with the Kimura 2-parameter substitution model and 1,500 bootstrap replications. The RDP4 program was used to identify putative recombination events (Martin et al., 2015). The comparative analysis of the conserved elements in the IR (Argüello-Astorga et al., 1994) was performed using Clustal X2 and MEGA7.
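For readers unfamiliar with the Kimura 2-parameter model, the distance between two aligned sequences is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below computes this distance for one pair of aligned sequences. It is not the MEGA7 implementation: the handling of gaps and ambiguous bases (skipped pairwise here) is an assumption, and the example sequences are made up.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # ASSUMPTION: sites with gaps or ambiguity codes are skipped.
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Made-up 10-nt alignment with a single transition (A -> G at site 1).
d_example = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

A Neighbor-Joining tree is then built from the matrix of such pairwise distances, and bootstrap support is obtained by recomputing the matrix on alignment columns resampled with replacement.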

Virus detection
SLC disease was characterized in the field by the following symptoms: thickened leaf vein-banding, mild chlorosis, severe leaf curling, reduced leaf size, leaf distortion and mottled interveinal tissue (Fig. 1). For viral detection, PCR with specific primers (SqA2F and SqA2R) yielded a product of the expected size (∼600 bp).

Phylogenetic analysis
The phylogenetic tree based on full-length DNA-A nucleotide sequences revealed that SLCuV-[MX-BCS-La Paz-16] forms a monophyletic group with other SLCuV isolates, separate from the related NW begomoviruses that infect cucurbits and other hosts. SLCuV-[MX-BCS-La Paz-16] shows the closest relationship with SLCuV isolates from the Middle East, including Egypt, Lebanon, Palestine and Jordan (Fig. 3A). The phylogenetic analysis was well supported by high bootstrap values and is consistent with the pairwise sequence identity analyses. We carried out the same phylogenetic analysis with the DNA-B component (Fig. 3B), confirming the close phylogenetic relationship among SLCuV isolates. Phylogenetic trees based on amino acid sequences of Rep and CP also indicated that SLCuV-[MX-BCS-La Paz-16] formed a single cluster with other SLCuV isolates (Fig. S1).

Recombination and iterons analysis
To search for potential recombination events in the DNA-A, we used the same data set as for the DNA-A phylogenetic analysis, including the NW and OW groups as well as other cucurbit begomoviruses. No putative recombination events were identified between SLCuV and other cucurbit begomoviruses. Using a second data set comprising only SLCuV isolates, two putative recombination events were supported by five of the seven methods of the RDP package, with Middle East isolates as the major parents and the USA isolate US-AZ-04 as the minor parent. In the analysis of the intergenic region, the TAATATTAC sequence at the hairpin structure of geminiviruses was conserved in SLCuV- . The analysis of the iterons located in the promoter region associated with the Rep protein showed four direct repeats and two inverted repeats, with 100% identity in the sequences of iterons with other SLCuV isolates (Fig. S2).
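The kind of motif scan behind this comparison can be sketched as follows. This is an illustration only: the analysis in this study was done with Clustal X2 and MEGA7, the iteron core used here (GGAGT) is a hypothetical placeholder rather than the SLCuV iteron, and the intergenic-region string is a made-up toy sequence. Only the conserved nonanucleotide TAATATTAC comes from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(seq, motif):
    """0-based start positions of every (possibly overlapping) match."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def scan_intergenic_region(ir, iteron_core, nonanucleotide="TAATATTAC"):
    """Locate the nonanucleotide and direct/inverted copies of a core motif.

    A palindromic core would be reported under both 'direct' and 'inverted'.
    """
    ir = ir.upper()
    core = iteron_core.upper()
    return {
        "nonanucleotide": find_all(ir, nonanucleotide),
        "direct": find_all(ir, core),
        "inverted": find_all(ir, reverse_complement(core)),
    }


# Toy intergenic region and HYPOTHETICAL iteron core, for illustration only.
toy_ir = "GGAGTAAACTCCGGAGTTTTTAATATTACCGG"
scan = scan_intergenic_region(toy_ir, "GGAGT")
```

Running the same scan on the intergenic region of each isolate and comparing the number, order and orientation of the hits is one simple way to express the comparison of sequence, number and orientation of iterons reported here.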

DISCUSSION
This study sequenced the full genome (DNA-A and DNA-B) of an SLCuV isolate from Mexico (SLCuV-[MX-BCS-La Paz-16]). It is worth noting that this is the first SLCuV full genome sequenced in Mexico, with all previous SLCuV detections having been limited to PCR-based diagnosis (Ramirez-Arredondo et al., 1995; Lugo et al., 2011). Despite the presence of SLCuV in both North America and the Middle East, the genome seems to be very stable (Lapidot et al., 2014; Rosario et al., 2015), with no substantive changes in the sequence since the first genomic characterization of the virus (Cohen et al., 1983; Antignus et al., 2003). Our isolate is a typical SLCuV isolate with only slight modifications in the nucleotide sequence and no changes in ORF sizes or organization. The absence of genetic variation and the iteron analysis (no changes in sequence, number or orientation) are further evidence of the genomic stability of SLCuV-[MX-BCS-La Paz-16] with respect to other SLCuV isolates. SLCuV-[MX-BCS-La Paz-16] formed a discrete monophyletic group within the SLCuV clade but was closer to the isolates from Middle Eastern countries (Egypt, Lebanon, Jordan and Palestine) than to the isolates from the USA. Despite the selection pressures, the interaction of the virus with its host and vector, and the biological and ecological interactions that confront viral populations, the genomic stability of SLCuV seems to be maintained over time, preserving its genetic and structural functionality (Gibbs et al., 1999; Sánchez-Campos et al., 2002).

CONCLUSION
The complete genome of SLCuV was sequenced for the first time in Mexico, in the southern part of the Baja California peninsula. The molecular characterization indicated a closer relationship with isolates from the Middle East than with isolates from the USA, suggesting that SLCuV might have reached BCS from the Middle East, or vice versa, and not from the USA as had been previously assumed. To confirm this hypothesis, phylogeographic studies should be performed to determine the paths of dispersion.