A new virus found in garlic virus complex is a member of possible novel genus of the family Betaflexiviridae (order Tymovirales)

Plant vegetative propagation strategies for agricultural crops cause the accumulation of viruses, resulting in the formation of virus complexes or communities. The cultivation of garlic is based on vegetative propagation and more than 13 virus species from the genera Potyvirus, Allexivirus and Carlavirus have been reported. Aiming for an unbiased identification of viruses from a garlic germplasm collection in Brazil, total RNA from eight garlic cultivars was sequenced by high-throughput sequencing (HTS) technology. Although most viruses found in this study were previously reported, one of them did not belong to any known genera. This putative new virus was found in seven out of eight garlic cultivars and phylogenetic data positioned it as representative of an independent evolutionary lineage within family Betaflexiviridae. This virus has been tentatively named garlic yellow mosaic-associated virus (GYMaV), sharing highest nucleotide identities with African oil palm ringspot virus (genus Robigovirus) and potato virus T (genus Tepovirus) for the replicase gene, and with viruses classified within genus Foveavirus for the coat protein gene. Due to its high frequency in garlic cultivars, GYMaV should be considered in upcoming surveys of pathogens in this crop and in the development of virus-free garlic plants.


INTRODUCTION
Garlic (Allium sativum L.) is one of the most consumed vegetables in the world, with a triennial world production (2011-13) of over 23 million tons (Camargo-Filho & Camargo, 2015). Since garlic cultivation is based on vegetative propagation, viruses can accumulate over successive planting cycles and spread to different regions through contaminated bulbs (Conci, Canavelli & Lunello, 2003). To date, many viral diseases have been reported, some of which have devastating effects on garlic development (Conci, Canavelli & Lunello, 2003; Lunello, Di Rienzo & Conci, 2007). Garlic plants infected with the so-called "virus complex" (VC), which includes mainly viruses from the genera Potyvirus, Carlavirus, and Allexivirus, have significantly reduced bulb weight and perimeter (Lunello, Di Rienzo & Conci, 2007).
In this study, we identified viruses present in different garlic cultivars from the germplasm collection of EMBRAPA Hortaliças, Brazil. The majority of the viruses found in these samples were previously reported, except for a new virus putatively classified as a member of a new genus in the family Betaflexiviridae (order Tymovirales).

Garlic samples
The eight garlic cultivars analyzed in this study are part of the germplasm collection of the Brazilian Agricultural Research Corporation on Vegetables (EMBRAPA Hortaliças), Brazil. These cultivars are known as Branco Mineiro, Cateto Roxo, Amarante, Gigante Lavinia, Moz 2014 Africa, Ito, San Valentin, and Chonan. All of them are planted commercially in Brazil and are classified in three main groups (early, medium, and late planting) according to their climate requirements for bulbification. Temperatures around 20 °C, below 15 °C, and below 10 °C are required for proper bulbification of early (Branco Mineiro and Cateto Roxo), medium (Amarante, Gigante Lavinia, and Moz 2014 Africa), and late (Ito, San Valentin, and Chonan) garlic cultivars, respectively. In addition, all these plants displayed yellowish mosaic on their leaves during vegetative development.

RNA extraction, sequencing, and RT-PCR detection
Total RNA was extracted from symptomatic leaves of 10 plants from each garlic cultivar using the RNeasy Mini kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. For high-throughput sequencing, the RNA samples were pooled (RNA pool). cDNA library construction and sequencing (2 × 100 bp read length) on the HiSeq 2000 platform were performed at Macrogen Inc. (Seoul, Republic of Korea). The generated reads were trimmed and de novo assembled using CLC Genome Workbench 6.5.2 (CLC bio, Qiagen). Contigs related to viruses were retrieved using BLASTx against a RefSeq virus database. To determine whether the assembled contigs corresponded to complete virus genomes, they were compared with complete virus genomes deposited in public databases using Geneious 7.1.8 (Kearse et al., 2012). Genome annotation was also performed using the latter program, in which open reading frames (ORFs) were annotated using BLASTx searches against the NCBI non-redundant protein database. The identified viruses were then traced back in each garlic cultivar by reverse transcriptase (RT) reaction followed by polymerase chain reaction (PCR) amplification. Complementary DNA sequences (cDNAs) were synthesized using SuperScript III reverse transcriptase (Thermo Fisher Scientific, Waltham, MA, USA) and random hexanucleotides. Then, PCR reactions were performed using PCR Master Mix (Promega, Madison, USA) and specific primer pairs for each of the detected viruses (Table 1). Nucleotide (nt) sequences of PCR products were confirmed by Sanger sequencing at Macrogen Inc. All procedures followed the manufacturer's instructions.

Table 1 (partial; primer pairs for RT-PCR detection). Garlic yellow mosaic-associated virus: F-GTGTGGCTAGTCTGCTTGGT, R-TTGTGCTTGATCGCGGTTTC; target: Replicase; 1,000. Preceding row (incomplete): P1 protein; 731.
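As a quick sanity check on the GYMaV primer pair listed in Table 1, basic primer properties can be computed directly from the sequences. The following is a minimal sketch, not part of the original workflow: the function names are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is assumed here only as a rough melting temperature estimate for short oligonucleotides.

```python
# Illustrative helper functions (hypothetical names); the primer
# sequences are the GYMaV pair given in Table 1.

GYMAV_PRIMERS = {
    "forward": "GTGTGGCTAGTCTGCTTGGT",
    "reverse": "TTGTGCTTGATCGCGGTTTC",
}


def gc_content(seq: str) -> float:
    """Return the GC content of a DNA sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (°C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    Assumption: this rule is only a coarse estimate for short primers.
    """
    seq = seq.upper()
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primer_report(primers: dict) -> dict:
    """Summarize length, GC percentage, and Wallace Tm for each primer."""
    return {
        name: {
            "length": len(seq),
            "gc_percent": gc_content(seq),
            "wallace_tm": wallace_tm(seq),
        }
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, stats in primer_report(GYMAV_PRIMERS).items():
        print(name, stats)
```

The reverse complement helper is included because the reverse primer anneals to the strand opposite the forward primer, so locating it on the genome sequence requires searching for its reverse complement.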

Phylogenetic analysis
The phylogenetic tree containing ICTV-recognized species of the family Betaflexiviridae was built based on the deduced amino acid (aa) sequences of the replicase and coat protein (CP) genes. For the cophylogeny trees, aa sequences of both replicase and CP were used. Multiple alignments were performed using the MAFFT method (Katoh & Standley, 2013). Then, maximum likelihood (ML) trees were inferred using PhyML (Guindon et al., 2010) under the JTT substitution model (Jones, Taylor & Thornton, 1992). Branch support was estimated by the Shimodaira-Hasegawa-like test (Anisimova et al., 2011). Cophylogeny analysis between the betaflexivirus trees was performed using the R program (R Core Team, 2013) with the phytools (Schliep, 2018) and phangorn (Schliep, 2018) packages. Finally, pairwise identity matrices were obtained using the SDT program (Muhire, Varsani & Martin, 2014) and plotted using Evolview (He et al., 2016).
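To make the notion of a pairwise identity matrix concrete, the sketch below computes percent identity between already aligned sequences. It is an illustration under stated assumptions, not the SDT implementation: the function names are hypothetical, the input sequences are toy examples, and columns in which either sequence has a gap are assumed to be excluded from the comparison.

```python
def pairwise_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_matrix(aligned: dict) -> dict:
    """Percent identity for every unordered pair of named aligned sequences."""
    names = sorted(aligned)
    return {
        (a, b): pairwise_identity(aligned[a], aligned[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    # Toy alignment with made-up sequences, for illustration only.
    toy = {"seq1": "ATGC-A", "seq2": "ATGAGA", "seq3": "ATGCGA"}
    for pair, value in identity_matrix(toy).items():
        print(pair, round(value, 1))
```

In a real analysis the aligned sequences would come from the multiple alignment step rather than being typed in by hand.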

RESULTS
The analysis of HTS data revealed the presence of viruses classified within the genera Allexivirus (GVA, GVB, GVC, GVD, and GVX), Carlavirus (GCLV and GLV) and Potyvirus (OYDV and LYSV). Surprisingly, a new virus genome sequence closely related to viruses of the family Betaflexiviridae was also found (Table 2). Each of these viruses was traced back in each garlic plant (cultivar) by RT-PCR. The betaflexivirus-like virus was detected in seven out of eight garlic cultivars. GVA and GVB isolates were the most frequent viruses, detected in all plants, while GVC and OYDV isolates were only detected in cv. Gigante Lavinia (Table 2). The genome sequence of the putative new betaflexivirus was assembled from 3,881 reads. A reliable consensus sequence was obtained for this virus since a low number of mutations was observed after read mapping. Conversely, we could not obtain reliable complete genome sequences for the other viruses due to their high diversity and interspecific homology. Thus, only the complete genome sequence of the new putative betaflexivirus, tentatively named garlic yellow mosaic-associated virus (GYMaV), was deposited in the GenBank database under the accession number MH120170 (Fig. S1).
GYMaV has a positive-sense, single-stranded RNA genome of 8,209 nt with five ORFs that encode a multi-domain replicase, the triple gene block proteins (TGB1, TGB2 and TGB3), and a CP (Table S1 and Fig. S1). The length and predicted molecular mass of each protein are displayed in Table S1. Since the CP and replicase gene sequences are the criteria for genus demarcation in the family Betaflexiviridae (King et al., 2012), a pairwise identity comparison was performed using all ICTV-recognized species (75 sequences) (Fig. S2). The GYMaV replicase shared 56% and 55% nt identity, respectively, with potato virus T (GenBank accession number EU835937, genus Tepovirus) and African oil palm ringspot virus (AY072921, genus Robigovirus). On the other hand, the GYMaV CP shared 64%, 62%, and 61% nt identity, respectively, with peach chlorotic mottle virus (EF693898), apple stem pitting virus (D21829), and apricot latent virus (HQ339956), all members of the genus Foveavirus. These values are well below the accepted species discrimination level of 72% nt identity for both CP and replicase (Adams et al., 2012). Even though the identity values were above the 45% nt identity threshold for genus demarcation, GYMaV should be considered a representative of a new genus of the family Betaflexiviridae, as further discussed.
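The identity-only reading of these thresholds can be written down as a small check. The sketch below is purely illustrative: the function name is hypothetical, the treatment of values exactly at a threshold is an assumption, and the result reflects only the pairwise identity criterion, not the phylogenetic evidence on which the new genus proposal rests.

```python
# Thresholds quoted in the text (percent nt identity, CP or replicase).
SPECIES_THRESHOLD = 72.0
GENUS_THRESHOLD = 45.0

# Highest identities reported in the text for GYMaV.
GYMAV_BEST_HITS = {
    "replicase": 56.0,  # potato virus T (genus Tepovirus)
    "CP": 64.0,         # peach chlorotic mottle virus (genus Foveavirus)
}


def identity_only_call(best_identity: float) -> str:
    """Classify a best-hit identity using the two thresholds alone.

    Assumption: values equal to a threshold count as above it.
    """
    if best_identity >= SPECIES_THRESHOLD:
        return "same species"
    if best_identity >= GENUS_THRESHOLD:
        return "same genus, different species"
    return "different genus"


if __name__ == "__main__":
    for gene, identity in GYMAV_BEST_HITS.items():
        print(gene, identity, identity_only_call(identity))
```

For both genes the identity-only call is "same genus, different species", yet the best hits fall in different genera depending on the gene considered, which is why identity values alone do not settle the genus assignment.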
To infer the evolutionary relationships of GYMaV, a phylogenetic tree was constructed with replicase proteins (complete sequences) of ICTV recognized species in the family Betaflexiviridae (Fig. 1). Despite clustering with other viruses, GYMaV formed an independent and distant evolutionary lineage within this family. Since both the replicase and CP gene sequences are used for genus demarcation, a cophylogeny analysis was also performed. GYMaV clustered together with members of genus Robigovirus and the unassigned banana mild mosaic virus (AF314662) using replicase proteins. In contrast, GYMaV clustered within genus Foveavirus in CP phylogeny as suggested by pairwise comparisons (Figs. S2 and S3). Moreover, the trees were partially incongruent (Fig. 2), bringing up the question of whether these two viral genes should be considered for genus demarcation.
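Partial incongruence between two trees can be quantified by counting the groupings that appear in one tree but not in the other. The sketch below is a simplified, rooted-clade variant of the Robinson-Foulds distance, offered as an illustration rather than the cophylogeny method used in this study. The function names are hypothetical, and the two toy trees (written as nested tuples) are made-up topologies that only mimic the pattern described above; they are not the trees of Fig. 2.

```python
def rooted_clades(tree):
    """Return (informative clades, all leaves) for a nested-tuple tree.

    Leaves are strings; internal nodes are tuples of subtrees. Clades
    containing a single leaf or all leaves are excluded as uninformative.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    informative = {c for c in found if 1 < len(c) < len(all_leaves)}
    return informative, all_leaves


def clade_distance(tree_a, tree_b) -> int:
    """Number of informative clades present in exactly one of the two trees."""
    clades_a, leaves_a = rooted_clades(tree_a)
    clades_b, leaves_b = rooted_clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must contain the same leaves")
    return len(clades_a ^ clades_b)


# Hypothetical toy topologies: GYMaV groups first with a robigovirus in one
# tree and first with a foveavirus in the other; the carlavirus pair is
# placed identically in both trees.
TOY_REPLICASE_TREE = ((("GYMaV", "robigo"), "fovea"), ("carla_1", "carla_2"))
TOY_CP_TREE = ((("GYMaV", "fovea"), "robigo"), ("carla_1", "carla_2"))

if __name__ == "__main__":
    print(clade_distance(TOY_REPLICASE_TREE, TOY_CP_TREE))
```

A distance of zero would mean that both genes support identical groupings; a value between zero and the maximum corresponds to partial incongruence, with some clades shared and others gene-specific.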

DISCUSSION
To identify garlic-infecting RNA viruses using an unbiased approach, total RNA from eight garlic cultivars was subjected to high-throughput sequencing. Overall, virus isolates taxonomically classified in ten virus species were identified, nine of them having been previously reported in garlic VCs (Bereda, Paduch-Cichal & Dabrowska, 2017; Mituti et al., 2015; Wylie et al., 2014). The biological effects of these virus infections on the different garlic cultivars remain to be investigated, but based on previous studies they might compromise the growth of plants and bulbs (Conci, Canavelli & Lunello, 2003; Lunello, Di Rienzo & Conci, 2007). Although all plants presented yellow mosaic, it is hard to conclude whether all identified viruses are associated with this symptom or whether there is a synergistic effect among the viruses in the community. In future research, this issue could be addressed via biological isolation of these viruses by mechanical or vector-borne inoculation/transmission onto indicator plants or by construction of infectious cDNA clones.
Apart from the viruses previously reported in garlic viromes, a new betaflexivirus, tentatively named garlic yellow mosaic-associated virus (GYMaV), was found in seven out of eight garlic cultivars tested. The presence of GYMaV in most cultivars indicates that it has likely been spread by vegetative propagation. However, its transmission by an insect vector should not be ruled out. Currently, the family Betaflexiviridae encompasses two subfamilies (Trivirinae and Quinvirinae) that together include eleven genera (https://talk.ictvonline.org/taxonomy/). GYMaV shares the highest nt identities with African oil palm ringspot virus (genus Robigovirus) and potato virus T (genus Tepovirus) for the replicase (55% and 56%, respectively) and with viruses classified in the genus Foveavirus for the CP (61-64% nt identity). According to the ICTV, viruses of suggested new genera are supposed to be less than 45% nt identical in those genes to viruses already reported (King et al., 2012). However, GYMaV constitutes a distant evolutionary lineage in the Betaflexiviridae (Fig. 1), and therefore should be classified in a new genus. As seen in our pairwise identity matrices of these genes, the sequence identity cut-off should be revised, since most comparisons for GYMaV are above the 45% threshold (Fig. S2). GYMaV virions may be shaped as flexuous filaments, as observed for other betaflexiviruses (King et al., 2012). With a typical betaflexivirus genomic organization, GYMaV codes for three proteins (TGB1, TGB2 and TGB3) likely associated with cell-to-cell and systemic virus movement in plant hosts (Erhardt et al., 2005; Morozov & Solovyev, 2003). In general, betaflexiviruses have one (30K-like) or three movement proteins (as GYMaV), which is a criterion to assign them to subfamilies Trivirinae or Quinvirinae, respectively.
Although the replicase and CP genes are used for genus demarcation, our analyses reinforce the concept of modular evolution, showing that these genes and their protein products are phylogenetically incongruent. Thus, only one or the other should be used for taxonomic purposes. Our analyses also suggest that either GYMaV underwent recombination or that these genes have different mutation rates due to different selection pressures.
GYMaV, as a component of garlic VCs, should be considered in the development of virus-free garlic varieties. Many previously reported surveys of garlic viruses relied on target-specific detection tools (Chen & Adams, 2001; Chen, Chen & Adams, 2002; Fajardo et al., 2001; Fayad-Andre, Dusi & Resende, 2011; Nam et al., 2015; Taglienti et al., 2018), which can only detect the viruses they were designed for. Although this is the first report of GYMaV, we cannot rule out its presence on a larger geographical and temporal scale.

CONCLUSIONS
GYMaV is a putative new betaflexivirus found in virus complexes of several garlic cultivars. Based on its high frequency in these plants, GYMaV is likely to be vegetatively propagated like other viruses previously reported in such complexes. Although the replicase and CP genes are used as taxonomic criteria for genus demarcation in the family Betaflexiviridae, cophylogeny analysis indicated that these genes sort the betaflexiviruses differently.