High-throughput sequencing of virus-infected Cucurbita pepo samples revealed the presence of Zucchini shoestring virus in Zimbabwe

Objectives Plant-infecting viruses remain a serious challenge to achieving food security worldwide. Cucurbit virus surveys were conducted in Zimbabwe during the 2014 and 2015 growing seasons. Leaf samples displaying virus-like symptoms were collected and stored until analysis. Three baby marrow samples were subjected to next-generation sequencing and the data generated were analysed using genomics technologies. Zucchini shoestring virus (ZSSV), a cucurbit-infecting potyvirus previously described in South Africa, was one of the viruses identified. The genomes of the three ZSSV isolates are described and analysed in this note.
Results The three ZSSV isolates had the same genome size of 10,297 bp excluding the polyA tail, with a 43% GC content. The large open reading frame was found at positions 69 to 10,106 on the genome and encodes a polyprotein of 3345 amino acids, which had the same cleavage site sequences as those described for the South African isolate except for the P1-pro site. Genome sequence comparisons of all the ZSSV isolates showed that the isolates F7-Art and S6-Prime had identical sequences across the entire genome while sharing 99.06% and 99.34% polyprotein nucleotide and amino acid sequence identities, respectively, with the isolate S7-Prime.


Introduction
Cucurbit is a generic term used to denote all species within the family Cucurbitaceae, also known as the gourd family [1]. Numerous cucurbit crops are economically important worldwide. Cucurbits are consumed in different ways as fruits or vegetables, providing essential nutrients and dietary fibre [2].
Viral diseases of cucurbits produce diverse symptoms that result in yield reduction and, in severe instances, compromised fruit quality [3][4]. The negative effects of plant-infecting viruses on crops are most pronounced in countries where research on these viruses is underdeveloped.
High-throughput sequencing (HTS), also called next-generation sequencing (NGS), describes a series of technologies whereby millions or billions of DNA molecules are sequenced simultaneously [5]. The application of these ever-growing sequencing technologies and of bioinformatics data analysis to the study of plant-infecting viruses, which started in 2009 [5], has revolutionized the fields of virus discovery and diagnostics, resulting in unprecedented virus discoveries from any host and environment [6]. Unlike other popular techniques such as the enzyme-linked immunosorbent assay, molecular hybridization and polymerase chain reaction, which mainly work on known pathogens, HTS data analysis has made it possible to identify sequences of known or unknown viruses from any host without any prior knowledge of the disease aetiology [7][8].
Zucchini shoestring virus (ZSSV) was discovered, along with other known cucurbit-infecting viruses, in South Africa in 2015 when RNA from severely distorted baby marrow leaves was subjected to HTS [9][10]. Genomic and taxonomic studies revealed that ZSSV is a new species in the genus Potyvirus [10]. The International Committee on Taxonomy of Viruses (ICTV) subsequently ratified these findings [11]. The genus Potyvirus is one of the 8 genera that compose the family Potyviridae. Members of that family, also known as potyvirids, are differentiated by host range, genomic features and phylogeny, with the species demarcation criterion set at nucleotide and amino acid sequence identities of less than 76% and 82%, respectively, for the large open reading frame (ORF) or its protein product. In instances where the complete ORF sequence is not available, similar criteria can be applied to the coat protein (CP) coding region [12].
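The demarcation criterion described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function names are hypothetical, the identity calculation assumes that only gap-free alignment columns are counted (other conventions give slightly different values), and it assumes that both thresholds must be met for two isolates to be treated as distinct species.

```python
# Species demarcation thresholds for the family Potyviridae as stated in the
# text: identities below 76% (nucleotide) and 82% (amino acid) for the large
# ORF or its polyprotein product.
NT_THRESHOLD = 76.0
AA_THRESHOLD = 82.0


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped,
    so the denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def is_distinct_species(nt_identity: float, aa_identity: float) -> bool:
    """True when both identities fall below the demarcation thresholds.

    Assumption: the two thresholds are applied jointly.
    """
    return nt_identity < NT_THRESHOLD and aa_identity < AA_THRESHOLD
```

With the polyprotein identities reported in the abstract for S7-Prime against the other two isolates, `is_distinct_species(99.06, 99.34)` evaluates to `False`, consistent with all three isolates belonging to the same species.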
Viruses that belong to the genus Potyvirus have non-enveloped, flexuous, filamentous virions of 680-900 nm in length and 11-20 nm in diameter. The genome of potyviruses is a positive-sense ssRNA molecule with its 5' terminus covalently linked to the viral protein genome-linked (VPg) and its 3' end polyadenylated. The roughly 10,000 bp genome harbours two ORFs that encode eleven multifunctional proteins. A large ORF is translated into a single polyprotein that is cleaved at semi-conserved sites by three self-encoded proteases into ten mature proteins, namely the protein 1 protease (P1-Pro), the helper component protease (HC-Pro), protein 3 (P3), the six kilodalton peptide 1 (6K1), the 6K2, the cytoplasmic inclusion (CI), the nuclear inclusion A protease (NIa-Pro), the nuclear inclusion B RNA-dependent RNA polymerase (NIb), the VPg and the CP [12]. A smaller ORF, named the pretty interesting Potyviridae ORF (PIPO), is generated by a polymerase slippage mechanism and is expressed as the trans-frame protein P3N-PIPO [13][14][15].
In this note, we describe and analyse the genome sequences of three ZSSV isolates obtained through HTS of infected baby marrow leaves collected in Zimbabwe.

Sample sources
Virus

High-throughput sequencing and data analysis
Total RNA was extracted from each leaf sample using the Quick-RNA Miniprep Kit (Zymo Research, USA) as per the manufacturer's instructions and was shipped on dry ice to the Agricultural Research Council Biotechnology Platform (ARC-BTP) in Pretoria, South Africa, for sequencing on the HiSeq platform (Illumina Inc., USA). For each sample, the sequencing data were analysed as follows. Read quality was assessed using FastQC version 0.11.5 (Babraham Bioinformatics) and, when necessary, Trimmomatic version 0.36 [16] was used to trim the reads. De novo assembly was then performed using SPAdes version 3.10.1 [17] according to the developer's instructions. A nucleotide BLAST search was performed on all contigs using BLAST+ [18].
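The last step, assigning each assembled contig to its closest database match, might be sketched as follows. This is an illustration only. It assumes the BLAST+ results were saved in the default 12-column tabular format (`-outfmt 6`), which the text does not state; the function name, contig names and numeric values are hypothetical, and only the accession KU355553.1 comes from this note.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text: str) -> dict:
    """Return, for each query contig, the hit with the highest bit score."""
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        hit = dict(zip(COLUMNS, fields))
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        hit["bitscore"] = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Made-up rows for illustration; these are not results from this study.
example = "\n".join([
    "\t".join(["contig_1", "other_hit", "80.0", "500", "100", "0",
               "1", "500", "1", "500", "1e-50", "300"]),
    "\t".join(["contig_1", "KU355553.1", "99.0", "10000", "100", "0",
               "1", "10000", "1", "10000", "0.0", "17000"]),
])
top = best_hits(example)["contig_1"]["sseqid"]
```

Ranking by bit score is one common choice; ranking by e-value or by alignment length would be equally defensible for picking a representative hit per contig.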
ClustalW [19] was used for multiple sequence alignment. Nucleotide and amino acid sequence identities were calculated online with SIAS (http://imed.med.ucm.es/Tools/sias.html). MEGA X software version 10.1.7 [20] was used to find the best-fitting evolutionary model for the phylogenetic analysis and to infer the maximum likelihood tree accordingly. ZSSV being one of the species in the "Papaya ringspot virus (PRSV) cluster" of cucurbit-infecting potyviruses, the phylogenetic

ZSSV genome sequence identified from HTS data analysis
The BLAST results identified one contig from each sample as a perfect match to the full-length genome sequence of the South African (SA) ZSSV isolate (GenBank accession number: KU355553.1).
These sequences are hereafter referred to as ZSSV isolates F7-Art, S6-Prime and S7-Prime. The coverage values were 30x, 66x and 80x for F7-Art, S6-Prime and S7-Prime, respectively. The genome size was the same for the three isolates, at 10,297 bp excluding the polyA tail, with GC contents varying between 42.92% and 42.96%. Each isolate sequence was submitted to GenBank and assigned an accession number, as summarised in Table 1.
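The summary figures above can be cross-checked with a short Python sketch. The function names are hypothetical, and the ORF calculation assumes that the coordinates given in the abstract (positions 69 to 10,106) are 1-based, inclusive, and include the stop codon.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T, U)."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def polyprotein_length(orf_start: int, orf_end: int) -> int:
    """Amino acid count encoded by an ORF.

    Assumption: coordinates are 1-based and inclusive, and the span
    includes the stop codon, which is not translated.
    """
    span = orf_end - orf_start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3")
    return span // 3 - 1


# ORF coordinates reported in the abstract: 10,038 nt = 3346 codons,
# i.e. 3345 amino acids plus one stop codon.
aa_count = polyprotein_length(69, 10106)
```

Under these assumptions the ORF spans 10,038 nucleotides, or 3346 codons, which agrees with the reported polyprotein length of 3345 amino acids once the stop codon is discounted.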

ZSSV genome analysis and phylogeny
The genome features common to the three isolates included the lengths and the positions of both
The phylogenetic analysis involved 33 nucleotide sequences and was inferred using the general time-
Moreover, MWMV, AWMV, ZSSV, SuWMV and WMVBV were identified in Africa, suggesting that the PRSV cluster underwent an important diversification in Africa [25]. Of these viruses present in Africa, MWMV is the most widespread, having been reported in all African regions [3][26][27][28][29][30][31][32]. HTS made it possible to detect ZSSV in the infected leaf samples in this study. The presence of ZSSV in cultivated baby marrow plants from the surveyed farms may indicate either a broader geographical distribution of the virus or its spread across borders. The occurrence of ZSSV in Zimbabwe highlights the need to conduct further studies on its epidemiology and to develop effective management strategies.

The small number of samples analysed in this study was one of its limitations.

Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and materials
The ZSSV genome sequences generated in this study can be freely and openly accessed in NCBI GenBank under the accession numbers MK204479.1, MK204480.1 and MK204481.1. Please see Table 1 for details and links.

Competing interests
The authors declare that they have no competing interests.

Funding
The