Genetic diversity of cucumber green mottle mosaic virus (CGMMV) infecting cucurbits

Cucumber green mottle mosaic virus (CGMMV), a well-known tobamovirus, infects cucurbits across the globe. To determine its current status, molecular characteristics, genetic recombination, gene flow and selection pressure, 10 districts of Punjab province, Pakistan, were surveyed and a total of 2561 cucurbit samples were collected during 2019–2020. These samples were tested for CGMMV by virus-specific double antibody sandwich enzyme-linked immunosorbent assay (DAS-ELISA). The disease was prevalent in all surveyed districts of Punjab, with an overall incidence of 25.69%. ELISA-positive samples were further confirmed through RT-PCR and sequencing of the coat protein (CP) cistron. Sequence analysis showed that the CGMMV isolates from the present study share 96–99.5% nucleotide and 94.40–99.50% amino acid identity with those already available in GenBank. Phylogenetic analysis revealed that the isolates under study were closely related to South Korean (AB369274) and Japanese (V01551) isolates and clustered in a separate clade. Sequence polymorphisms were examined in a 663 bp sequence covering the complete CP gene in 31 CGMMV isolates. The total number of sites was 662, of which 610 were monomorphic and 52 were polymorphic (segregating). Of the polymorphic sites, 24 were singleton variable and 28 were parsimony informative. Overall nucleotide diversity (π) across the 31 isolates was 0.00010, one InDel event was observed, and InDel diversity (k) was 0.065. Haplotype diversity analysis revealed a total of 29 haplotypes with a haplotype diversity (Hd) of 0.993458 across the 31 isolates, which was taken as evidence of low diversity among Pakistani isolates. Tajima's D and Fu and Li's F* and D* were 2.568, 5.31304 and 4.86698, respectively, indicating that the CGMMV population was under balancing selection pressure.


Introduction
The family Cucurbitaceae includes several edible and non-edible species that are grown across the globe, particularly in tropical regions with warm temperatures (Ali et al., 2014; Asad et al., 2022). Economically important species of the family include cucumber (Cucumis sativus), bottle gourd (Lagenaria siceraria), melon (Cucumis melo L.), sponge gourd (Luffa acutangula), summer squash (Cucurbita pepo), bitter gourd (Momordica charantia) and serpent gourd (Cucumis melo var. flexuosus) (Bisognin, 2002; Dhillon et al., 2020). In Pakistan, cucurbits are grown throughout the country because of their nutritive and medicinal importance (Adams et al., 2011; Ali et al., 2014; Różyło et al., 2014; Al-Jahani and Cheikhousman, 2017; Ashfaq et al., 2021a). The leading producers of cucurbits are Turkey, China, India, and the United States. In Pakistan, the total area under cucurbit cultivation is 32,848 ha, with an annual production of 357,064 tons (http://www.crs.agripunjab.gov.pk/). In Punjab province, the total area under cucurbit cultivation is 20,433 ha and total annual production is 259,365 tons, with an average of 12 tons/ha (Anonymous, 2018-19). The average yield of cucurbit crops in Pakistan is comparatively low, which is attributed to several biotic and abiotic factors (Asad et al., 2022). Among the biotic factors, plant viruses, especially RNA viruses, are a major threat to vegetable production in general and to cucurbits in particular (Kone et al., 2017; Ashfaq et al., 2021a).
Tobamoviruses have a positive-sense RNA genome and helical virions (Lewandowski, 2000; Adams et al., 2012; Darzi et al., 2020). Tobamovirus RNA encodes a coat protein (CP) of 17.5 kDa, replicase proteins of 130 and 180 kDa, and a movement protein (MP) of 30 kDa (Crespo et al., 2017). According to Ainsworth (1935), the first strain of CGMMV was described as cucumber virus 3 (CV3), followed by cucumber virus 4 (CV4) and, in Japan, two cucumber strains (Inouye et al., 1967), a watermelon strain (Komuro et al., 1968) and the Yodo strain (Kitani et al., 1970). The characteristic symptoms produced by CGMMV in cucurbits range from leaf mottling, mosaic, blistering and stunted growth to severe distortion of fruit (Antignus et al., 2001). However, the symptoms vary with the infected plant species and the virus strain involved (Mandal et al., 2008). The virus is transmitted mostly through infected seeds, on which it persists as a surface contaminant (Lecoq and Desbiez, 2012). Irrigation water and farm equipment are also sources of virus transmission (Li et al., 2016). Mechanical transmission through propagation material makes this virus even more damaging (Sui et al., 2019). Mechanical means of transmission include wounds created by farm equipment during handling of plants and contact between leaves (Li et al., 2016). To date, transmission by insect vectors has not been established; however, transmission of only 10% has been observed in India through the cucumber leaf beetle, Raphidopalpa foveicollis (Mandal et al., 2008), and transmission has also been reported through pollinators, viz. European honeybees (Apis mellifera L.) (Darzi et al., 2018). Some weeds of the families Apiaceae, Amaranthaceae, Lamiaceae, Boraginaceae, Solanaceae, Chenopodiaceae and Portulacaceae also serve as virus reservoirs (Boubourakas et al., 2004; Shargil et al., 2017; Dombrovsky et al., 2017).
A highly stable genome and multiple routes of transmission make this virus a serious challenge wherever cucurbits are cultivated (Smith and Dombrovsky, 2019). The presence of inoculum from the seedling stage onwards results in a very high infection level, and it is very difficult for plants to resist infection under such conditions (Coutts and Jones, 2002). ELISA and polymerase chain reaction (PCR) are the most successful and versatile techniques for routine diagnosis of plant viruses (Boonham et al., 2014). The limited work on molecular characterization of CGMMV in Pakistan is confined to Khyber Pakhtunkhwa (KPK) province (Ali et al., 2014). Punjab province is a main hub for the production of agronomic and horticultural crops, so the present study was planned to determine the incidence of CGMMV and its phylogenetic relationships and genetic diversity in Punjab, Pakistan.

Field survey and sample collection
Diagnostic surveys were conducted in 2019 and 2020 in 10 districts of Punjab province, viz. Rawalpindi, Sialkot, Faisalabad, Sahiwal, Khanewal, Vehari, Lodhran, Bahawalpur, Muzaffargarh and Multan (Fig. 1). A random stratified design (Khan et al., 2015) was adopted to assess the incidence and distribution of CGMMV infecting cucurbits. Approximately 95-100 cucurbit fields were visited each year (approximately 200 fields in total over two years), and 2561 samples from different cucurbit crops were collected in a diagonal survey of fields showing symptoms ranging from leaf mottling, mosaic, blistering and stunted growth to severe distortion of fruit. To facilitate revisits, all locations were marked with the Global Positioning System (GPS). All collected samples were put into plastic zipper bags, kept on ice and brought back to the Plant Virology Laboratory, MNS University of Agriculture, Multan. The samples were washed with tap water to remove superficial material (dirt and other contamination) and divided into two portions. One portion was dried and preserved on silica gel for serological analysis and the second was stored at -20 °C for further study.

Serodiagnosis of CGMMV
All samples were tested for the presence of CGMMV using virus-specific DAS-ELISA as described earlier (Clark and Adams, 1977). CGMMV-specific coating antibody (IgG) was diluted 1:200 in coating buffer and added to the ELISA plate wells at 200 µL/well. The plate was incubated overnight at 4 °C in a moist chamber, followed by three washings with TBS-Tween buffer at intervals of 5 min. CGMMV-symptomatic leaf samples were homogenized (1:20) in extraction buffer and the sample sap was added at 200 µL/well. The plate was incubated overnight at 4 °C, followed by 3-4 washings at 5 min intervals, and blot-dried on a paper towel. Conjugated antibody (1:200) was then added at 200 µL/well and incubated overnight at 4 °C. Substrate buffer containing p-nitrophenyl phosphate (75 µg mL⁻¹) was freshly prepared and added at 200 µL/well, followed by incubation at room temperature for 30 min. The reaction was stopped by adding 3 M NaOH at 50 µL/well. Absorbance at 405 nm was measured with an Automatic ELISA Reader (HER-480 HT Company (Ilford) Ltd, UK). Samples were considered positive for CGMMV infection when the ELISA absorbance value was equal to or higher than two times the average absorbance value of the healthy sample as well as the negative control (Ashfaq et al., 2014). Commercial positive and negative controls provided by Bioreba were used as references. Relative disease incidence was calculated using the following formula (Asad et al., 2022).

$$\text{Disease incidence (\%)} = \frac{\text{Infected samples}}{\text{Total samples}} \times 100$$
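The positivity rule and the incidence formula described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the helper arguments and all absorbance readings are hypothetical and are not data from the study.

```python
from statistics import mean


def elisa_positive(sample_a405, healthy_a405, factor=2.0):
    """Return True when the sample absorbance (405 nm) is at least
    `factor` times the mean absorbance of the healthy controls."""
    threshold = factor * mean(healthy_a405)
    return sample_a405 >= threshold


def disease_incidence(infected, total):
    """Disease incidence (%) = infected samples / total samples x 100."""
    if total <= 0:
        raise ValueError("total must be a positive number of samples")
    return 100.0 * infected / total


# Hypothetical absorbance readings, for illustration only.
healthy = [0.10, 0.12, 0.11]
samples = {"S1": 0.45, "S2": 0.15, "S3": 0.30, "S4": 0.09}

calls = {name: elisa_positive(a405, healthy) for name, a405 in samples.items()}
n_positive = sum(calls.values())
incidence = disease_incidence(n_positive, len(samples))
print(calls, incidence)
```

With these invented readings the healthy mean is 0.11, so the cut-off is 0.22 and two of the four samples are called positive.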

RT-PCR amplification and sequence analysis
Total RNA was extracted from the ELISA-positive samples, along with some healthy cucurbit samples, using TRIzol® Reagent (Life Technologies, Carlsbad, USA). RNA was quantified using a Nanodrop (Thermo Scientific Co., USA) according to the manufacturer's standard guidelines. A working dilution of RNA at 500 ng/µL was used for synthesis of first-strand complementary DNA (cDNA) using the RevertAid First Strand cDNA Synthesis Kit and CGMMVR-53 (5′-TTG CAT GCT GGG CCC CTA CCC GGG GAA AG-3′) (Hongyun et al., 2008) as a virus-specific downstream reverse primer. The resultant cDNA (2 µL) was used for PCR amplification along with the following ingredients: DreamTaq Green PCR master mix (2X) (12.5 µL) (Thermo Fisher Scientific, USA, Cat. No. K1081), the virus-specific upstream primer CGMMVF-52 (5′-CCG AAT TCA TGG CTT ACA ATC CGA TCA C-3′), CGMMVR-53 as the downstream primer, and nuclease-free water up to a final reaction volume of 25 µL (Hongyun et al., 2008).
The amplification scheme was set as follows: initial denaturation for 3 min at 94 °C, followed by 35 cycles of 30 s at 94 °C, 45 s at 53 °C and 60 s at 72 °C, and a final extension at 72 °C for 5 min. PCR products were analysed by electrophoresis on a 1.0% (w/v) pre-stained agarose gel and visualized with an Omega Fluor™ Plus Gel Documentation System (1149C42). The desired amplicons of 700 bp were purified using the GeneJET PCR Purification Kit and cloned into the pJET1.2/blunt cloning vector with chemically competent cells of E. coli strain XL1-Blue. Recombinant plasmid DNA was purified using the GeneJET Plasmid Miniprep Kit as per the manufacturer's instructions. Digestion with the restriction enzyme BglII confirmed the presence of an insert in transformants, and positive clones were sequenced in both orientations at Macrogen Inc. (South Korea). The obtained sequences were analysed using the National Center for Biotechnology Information Basic Local Alignment Search Tool (NCBI BLAST) and compared with isolates of CGMMV reported from elsewhere in the world. The CGMMV nucleotide sequences identified in the present study were submitted to the database.
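The amplification scheme given at the start of this subsection can be written down as a small data structure, which makes the programmed hold time easy to check. The sketch below is an illustration only; the class and variable names are hypothetical, and ramping time between temperatures, which depends on the thermal cycler, is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


# Values taken from the amplification scheme in the text.
INITIAL = Step("initial denaturation", 94, 3 * 60)
CYCLE = [
    Step("denaturation", 94, 30),
    Step("annealing", 53, 45),
    Step("extension", 72, 60),
]
FINAL = Step("final extension", 72, 5 * 60)
N_CYCLES = 35


def programmed_seconds(initial, cycle, n_cycles, final):
    """Sum of all hold times; ramp times are not included."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total_s = programmed_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total_s} s = {total_s / 60:.2f} min of programmed hold time")
```

Each cycle holds for 135 s, so the 35 cycles account for most of the run.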

Phylogenetic study of complete CGMMV CP gene
CGMMV sequences identified in the present study were aligned with related CGMMV sequences retrieved from GenBank using ClustalW embedded in MEGA7 software (Tamura et al., 2013). After alignment, their phylogenetic relationships and ancestral lineage were inferred using the Neighbor-Joining method with 1000 bootstrap replicates (to represent the evolutionary history of the taxa) in MEGA7 (Tamura et al., 2013). Nucleotide and amino acid sequence identities were calculated with the Sequence Identity Matrix option in BioEdit v7.2.6.1 (Hall, 1999). The aligned sequences, consisting of 13 Pakistani CGMMV isolates and 18 isolates reported elsewhere in the world (Table 2), were analysed using RDP4 to detect apparent recombination events in the sequences of CGMMV isolates identified in Pakistan. The default settings of the general tab were applied, which implement all the available methods, viz. RDP, GENECONV, BootScan, MaxChi, SiScan, Chimaera and 3SEQ. Nucleotide diversity (π), the number of polymorphic (segregating) sites (S), insertions and deletions (InDel), haplotype diversity (Hd), synonymous (Ks) and non-synonymous (Ka) rates of mutation, gene flow, genetic differentiation and neutrality within each group and defined region, and statistical tests such as Fst, Z, Ks*, Snn, Tajima's D, and Fu and Li's D* and F* were calculated using DnaSP v6.12.03 (Fu and Li, 1993; Tajima, 1989; Rozas et al., 2017).
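To make the diversity statistics named above concrete, the following minimal sketch computes S, π, Hd and Tajima's D from a toy alignment using the textbook definitions (Tajima, 1989). It is not DnaSP's implementation; the function names and the toy sequences are hypothetical, and columns containing gaps or ambiguous bases are simply dropped, which is an assumption about how missing data are handled.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def ungapped_columns(seqs):
    """Keep only alignment columns where every sequence has A, C, G or T."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    if not cols:
        raise ValueError("no gap-free columns in the alignment")
    return ["".join(row) for row in zip(*cols)]


def diversity_stats(seqs):
    """Segregating sites (S), mean pairwise differences (k),
    nucleotide diversity (pi = k / sites) and haplotype diversity (Hd)."""
    seqs = ungapped_columns([s.upper() for s in seqs])
    n, sites = len(seqs), len(seqs[0])
    seg = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    diffs = [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]
    k = sum(diffs) / len(diffs)
    freqs = [count / n for count in Counter(seqs).values()]
    hd = n / (n - 1) * (1 - sum(f * f for f in freqs))
    return {"n": n, "sites": sites, "S": seg, "k": k, "pi": k / sites, "Hd": hd}


def tajimas_d(n, seg, k):
    """Tajima's D from sample size, segregating sites and mean pairwise differences."""
    if n < 4:
        raise ValueError("at least 4 sequences are needed")
    if seg == 0:
        return float("nan")  # undefined without segregating sites
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1, e2 = c1 / a1, c2 / (a1**2 + a2)
    return (k - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))


# Toy alignment (hypothetical, not CGMMV data).
toy = ["ATGCATGC", "ATGCATGC", "ATGAATGC", "TTGAATGC"]
stats = diversity_stats(toy)
d = tajimas_d(stats["n"], stats["S"], stats["k"])
print(stats, d)
```

A positive D arises when the mean pairwise difference k exceeds S/a1, that is, when intermediate-frequency variants are in excess relative to the neutral expectation.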

Molecular characterization and phylogenetic analysis
PCR results showed that the CGMMV gene-specific primers CGMMVF-52 and CGMMVR-53 amplified a specific fragment of 700 bp in all the selected ELISA-positive samples representing each surveyed district and crop, while no amplification was observed in the negative control. A total of 13 CGMMV isolates were obtained and sequenced at Macrogen, Korea. MegaBLAST analysis confirmed that the amplified products were the coat protein gene of CGMMV along with a portion of the 3′ UTR. After sequence analysis, the 13 isolates, identified from cucumber, watermelon, melon, ridge gourd, sponge gourd, pumpkin, bitter gourd and squash, were submitted to GenBank under accession numbers MW732114-MW732126; they are listed together with previously reported isolates in Table 2.
The sequence identity matrix based on nucleotides and amino acids showed that all 13 isolates reported in the current study shared 96.20%-99.5% similarity with isolates reported from other parts of the world and 98% to 99.5% similarity with each other (Table 3). Coat protein sequence-based phylogenetic analysis of the CGMMV isolates together with reported sequences from China, Australia, USA, Greece, Canada, the Netherlands and Taiwan resulted in two main clades (A and B). Clade B is divided into two sub-clades, IB and IIB. Clade A contained two Canadian isolates (MF510467 and MH426842). Clade IB contained isolates from China, Australia, Greece, Taiwan, the Netherlands and USA. Clade IIB contained one isolate from South Korea (AB369274), one from Japan (V01551) and the 13 isolates reported in this study, while ZYMV was used as an outgroup (Fig. 2).
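The pairwise identities summarized above can be illustrated with a short sketch. It assumes that positions with a gap in either sequence are skipped, which may differ from how BioEdit's Sequence Identity Matrix option treats gaps; the function names, isolate labels and sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of aligned positions, gap-free in both sequences, that match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_matrix(alignment):
    """Map every ordered pair of names to its percent identity."""
    names = list(alignment)
    return {(p, q): percent_identity(alignment[p], alignment[q])
            for p in names for q in names}


# Toy aligned sequences (hypothetical, not CGMMV data).
aln = {
    "iso1": "ATGCATGCAT",
    "iso2": "ATGCATGCAA",
    "iso3": "ATGC-TGCAT",
}
matrix = identity_matrix(aln)
for (p, q), value in matrix.items():
    if p < q:
        print(p, q, f"{value:.1f}%")
```

Here iso1 and iso2 differ at one of ten positions, while iso3 matches iso1 at all nine positions that can be compared.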

Selection pressure and recombination analysis
Sequence polymorphisms were examined in a 663 bp sequence covering the complete CP gene in 31 CGMMV isolates. The total number of sites (excluding sites with missing data or gaps) was 662, of which 610 were monomorphic and 52 were polymorphic (segregating) sites (S). Of the polymorphic sites, 24 were singleton variable and 28 were parsimony informative. Overall nucleotide diversity (π) across the 31 isolates was 0.00010, one InDel event was observed, and InDel diversity (k) was 0.065. Haplotype diversity analysis revealed a total of 29 haplotypes with a haplotype diversity (Hd) of 0.993458 across the 31 isolates. This result was taken as evidence that Pakistani isolates have less diversity than the other reported isolates. The low π value also agreed with the phylogeny, as all 13 Pakistani isolates fell in the same clade. Ks values ranged from 0.0030 to 0.9058 and Ka values from 0.00 to 0.8245, and a total of 358 mutations were observed. No recombination event was detected in the Pakistani CGMMV isolates. Moreover, the values of Tajima's D and Fu and Li's F* and D* (tests commonly used to identify sequences that do not fit the neutral model at mutation-drift equilibrium) were all positive (2.568, 5.31304 and 4.86698, respectively) but not statistically significant (Ramírez-Soriano et al., 2008; Tajima, 1989), suggesting that the CGMMV population was under balancing selection pressure.
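The site categories reported above follow simple counting rules, sketched below. A site is treated as parsimony informative when at least two different bases each occur in at least two sequences, and as a singleton variable site when it is variable but does not meet that rule; this is the usual definition, and the function name and toy alignment are hypothetical. The last lines only check that the counts reported in the text add up.

```python
from collections import Counter


def classify_site(column):
    """Classify one alignment column as monomorphic, singleton or
    parsimony-informative."""
    counts = Counter(column)
    if len(counts) == 1:
        return "monomorphic"
    shared = sum(1 for c in counts.values() if c >= 2)
    return "parsimony-informative" if shared >= 2 else "singleton"


# Toy alignment (hypothetical, not CGMMV data).
toy = ["ATGCATGC", "ATGCATGC", "ATGAATGC", "TTGAATGC"]
site_counts = Counter(classify_site(col) for col in zip(*toy))
print(dict(site_counts))

# Consistency check of the counts reported in the text.
reported_sites, reported_mono, reported_poly = 662, 610, 52
reported_singleton, reported_informative = 24, 28
print(reported_mono + reported_poly == reported_sites)
print(reported_singleton + reported_informative == reported_poly)
```

In the toy alignment the first column is a singleton site (one T against three A) and the fourth is parsimony informative (two C, two A).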

Discussion
Viruses infecting vegetables have always been a serious threat to sustainable vegetable production across the world (Amari et al., 2017; Moriones et al., 2017; Asad et al., 2022). In Pakistan, a high incidence of vegetable viral diseases has already been reported by a number of scientists (Ashfaq et al., 2015; Ashfaq and Ahsan, 2017; Hussain and Atiq, 2017; Tahir et al., 2017). Cucurbitaceae is the largest family among vegetables; it includes cucumber, watermelon, ridge gourd, sponge gourd, melon and bitter gourd, which are vulnerable to a number of plant viruses (Rao et al., 2017). This study demonstrated the ubiquitous occurrence of CGMMV infecting cucurbit crops in Punjab, Pakistan. To our knowledge, it is the first study of the occurrence, molecular characterization, genetic recombination and selection pressure of CGMMV in Punjab, Pakistan. CGMMV significantly affected cucurbit crop yield and produced variable symptoms depending upon the genotype/variety (Mandal et al., 2008). Green mottling symptoms generally appeared on fruits and young leaves of cucumbers and consequently resulted in death of the plants. In young watermelon plants, mosaic and mottling symptoms appeared and brown necrotic lesions developed on peduncles and stems. The foliage became wilted and bleached, resulting in the premature death of runners or even whole plants. However, in mature plants, particularly under open-field conditions, foliage symptoms were faint. Malformation and internal flesh symptoms of sponginess and yellowing or dirty red discoloration were common in their fruits, rendering them unmarketable. In melon, mottling and mosaic symptoms initially appeared on young leaves and later faded as the foliage matured. Fruits showed varying degrees of mottling, malformation and netting on their surface. Infected foliage in zucchini, squash and pumpkin was asymptomatic or showed leaf mottling and mosaic. The symptoms caused by CGMMV on cucurbits were similar to those observed in previous studies (Varveri et al., 2002; Shim et al., 2006; Wu et al., 2011; Reingold et al., 2013; Ali et al., 2014; AUSVEG, 2016).
Table 3. Comparison of Pakistani CGMMV CP gene nucleotide sequences to each other and to sequences retrieved from GenBank.
Z. Asad, M. Ashfaq, N. Iqbal et al. Saudi Journal of Biological Sciences 29 (2022) 3577-3585
It was observed that growers purchased seeds from the open market or used their own saved seeds, which served as a primary source of viruses in general and tobamoviruses in particular. Being a tobamovirus, CGMMV is usually transmitted via contact, seeds and field implements, and poor phytosanitary measures intensify its prevalence and incidence under in vivo conditions. In addition, weed populations in the fields and adjacent areas played a significant role in the higher incidence of viruses. The incidence results are in agreement with Ali et al. (2014), although contrary results have also been reported by other scientists (Ellouze et al., 2020; Yoon et al., 2008).
Disease incidence varied with host species and with the efficiency of vectors (Mandal et al., 2008; Li et al., 2016). Characterization of viruses, including their genetic composition, recombination and mechanisms of variation, helps in the development of transgenic plants resistant to viral diseases. It also aids in studying the relatedness of Pakistani CGMMV isolates to isolates reported from other parts of the world and from the same geographical region, and in predicting the evolution of new strains and their adaptation to new hosts and geographical conditions. The nucleotide (nt) and amino acid (aa) [...] Yoon et al. (2008) and Ali et al. (2014), where 98-99% similarity among CGMMV isolates was reported. Coat protein sequence-based phylogenetic analysis of the CGMMV isolates from the present study with isolates reported from China, Australia, USA, Greece, Canada, the Netherlands and Taiwan resulted in two main clades (A and B). Clade B is further divided into two sub-clades, IB and IIB. Clade A contained two Canadian isolates. Isolates from China, Australia, Taiwan, Greece, the Netherlands and USA were present in clade IB, while the 13 isolates from the present study, along with one isolate each from South Korea and Japan, were present in clade IIB. This study provided evidence that the Pakistani CGMMV isolates infecting cucurbits most probably originate from South Asia. As an evolutionary process, variation arises in the genetic makeup of organisms through the addition of new alleles by gene flow or mutation (Zu et al., 2019). Positive values of the neutrality tests, i.e. Tajima's D and Fu and Li's F* and D*, indicated that the CGMMV population is under balancing selection pressure, and a low frequency of variation was observed (Tajima, 1989; Fu and Li, 1993).
Mutation, reassortment and recombination are the main causes of genetic diversity in RNA viruses (Holmes, 2006; Akinyemi et al., 2016), and may result in the incorporation of discrete sequence components along with the exchange, duplication or deletion of existing viral elements. No genetic recombination event was observed among the CGMMV isolates identified in the current study. Our finding differs from that of Rao et al. (2017), who reported very few recombination events, although the recombination score was below 60%, indicating that the isolate was not a recombinant.

Conclusion
CGMMV is a notorious and devastating pathogen responsible for huge losses in cucurbit crops all over the world. In the present study, CGMMV was detected in almost all cucurbit crops grown in Punjab, Pakistan, with an overall disease incidence of 26.35% during 2019-2020. Evolutionary distance and phylogenetic analysis of 13 CGMMV isolates revealed that all of them are closely related to Japanese and South Korean isolates. A high frequency of gene flow with low nucleotide diversity was detected in the CGMMV population. Positive values of the statistical tests indicated balancing selection pressure in the CGMMV population under study. The presence of this potentially destructive virus in Punjab is an alarming situation for the successful production of cucurbit crops. Based on the findings of the present study, management strategies including resistant genotypes and integrated management approaches are recommended to prevent the widespread occurrence of this virus.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.