Genomic Analysis of a Novel Torradovirus “Rehmannia Torradovirus Virus”: Two Distinct Variants Infecting Rehmannia glutinosa

Rehmannia glutinosa, an important medicinal plant native to China, is extensively cultivated across East Asia. We used high-throughput sequencing to identify viruses infecting R. glutinosa plants showing mosaic, leaf yellowing, and necrotic symptoms. A novel Torradovirus, which we tentatively named “Rehmannia torradovirus virus” (ReTV), was identified. The complete sequences were obtained through reverse-transcription polymerase chain reaction (RT-PCR), 5′ and 3′ rapid amplification of cDNA ends, and Sanger sequencing. The amino acid sequence identities between the ReTV-52 isolate and known Torradovirus species in the Pro-Pol and coat protein regions were 51.3–73.3% and 37.1–68.1%, respectively. The corresponding identities between the ReTV-8 isolate and known Torradovirus species were 52.7–72.8% and 36.8–67.5%, respectively. Sequence analysis classified ten ReTV strains into two variants. The ReTV-52 genome consists of two RNA segments of 6939 and 4569 nucleotides, while that of ReTV-8 consists of two RNA segments of 6889 and 4662 nucleotides. Sequence comparisons and phylogenetic analysis showed that the ReTV strains clustered within the genus Torradovirus and were most closely related to squash chlorotic leaf spot virus. RT-PCR detected ReTV in all 60 R. glutinosa samples (100% detection rate). Therefore, ReTV should be classified as a novel Torradovirus species. ReTV is potentially dangerous to R. glutinosa, and this virus should be monitored in the field.


Introduction
Rehmannia glutinosa (family Scrophulariaceae) is an economically significant herbaceous medicinal plant. It is indigenous to China and widely cultivated in China, Korea, Japan, and northern Vietnam [1]. The fresh or dried root tubers of R. glutinosa are used as medicine and are rich in medicinal compounds, such as phenylethanoid glycosides, iridoids, ionones, polysaccharides, flavonoids, sugars, and other components. These active compounds exhibit diverse pharmacological effects on the immune, cardiovascular, blood, endocrine, and nervous systems [2]. Rehmannia glutinosa is the second most commonly used ingredient among 203 Chinese patent medicine prescriptions and is one of the top 10 species used in Chinese herbal medicines and products exported from China. In China, R. glutinosa is one of the famous "Four Huai Medicines" and is grown on more than 1000 hectares in northern China [3]. The primary production regions include Henan, Shanxi, Shandong, and Hebei Provinces. Jiaozuo City in Henan Province stands out as a major production region. The catalpol, rehmannioside A, and rehmannioside D contents of R. glutinosa from this area are the highest among plants from all regions [4].
Torradoviruses are classified in the family Secoviridae under the order Picornavirales [15]. They were first identified in 2007 as two novel virus species: tomato torrado virus (ToTV) and tomato marchitez virus (ToMarV) [16,17]. Natural Torradovirus infections were initially described in tomatoes (Solanum lycopersicum L.). Subsequently, other Torradoviruses were recognized in non-tomato plants; with the development of new tools, several other host plants have been reported, including carrot [18], lettuce [19], motherwort [20], squash [21], cassava [22], and burdock [23]. According to the latest International Committee on Taxonomy of Viruses (ICTV) report, Master Species List 39 (MSL39), this genus encompasses nine member species: ToTV, ToMarV, carrot torradovirus 1 (CaTV1), lettuce necrotic leaf curl virus (LNLCV), motherwort yellow mottle virus (MYMoV), squash chlorotic leaf spot virus (SCLSV), codonopsis torradovirus A (CoTVA), cassava torrado-like virus (CsTLV), and fleabane yellow mosaic virus (FbYMV) [24]. In addition to these nine viruses, tomato chocolate virus (ToChV), tomato chocolate spot virus (ToChSV) [25], and tomato necrotic dwarf virus (ToNDV) have been suggested as tentative species of the genus Torradovirus. Torradoviruses have small spherical virions measuring approximately 30 nm in diameter. Their genome comprises two linear positive-sense single-stranded RNA molecules of approximately 7 and 5 kb, respectively. RNA1 carries a single large open reading frame (ORF) that encodes a putative polyprotein comprising conserved helicase, protease, and RNA-dependent RNA polymerase (RdRp) domains. RNA2 has two predicted ORFs: RNA2-ORF1 encodes a putative protein of unknown function, while ORF2 encodes a polyprotein that contains a putative movement protein (MP) at its N terminus, followed by three coat proteins (CPs). For species classification, two prevailing criteria for demarcating members of the family Secoviridae are amino acid (aa) sequence identities of <80% in the Pro-Pol region (the region between the "CG" motif of the 3C-like protease and the "GDD" motif of the RNA-dependent RNA polymerase) of RNA1-ORF1 and <75% in the CP regions of RNA2-ORF2. Previous comparisons within this genus have distinguished two groups, tomato-infecting (TI) and non-tomato-infecting (NTI) members, based on the aa sequence identities of the encoded putative proteins and the length of the 3′ UTRs [26].
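The two demarcation thresholds described above can be written as a simple check. The following is a minimal sketch rather than part of the original analysis: the function name and the returned keys are illustrative assumptions, and the example inputs are the highest identities to known Torradovirus species reported in the abstract for each isolate.

```python
# Illustrative sketch of the Secoviridae species demarcation thresholds.
# The cut-off values come from the text; names are hypothetical.
PRO_POL_CUTOFF = 80.0  # % aa identity in the Pro-Pol region of RNA1-ORF1
CP_CUTOFF = 75.0       # % aa identity in the CP regions of RNA2-ORF2


def demarcation_flags(max_pro_pol_identity, max_cp_identity):
    """Report, per criterion, whether the highest identity to any
    known species falls below the corresponding cut-off."""
    return {
        "pro_pol_below_cutoff": max_pro_pol_identity < PRO_POL_CUTOFF,
        "cp_below_cutoff": max_cp_identity < CP_CUTOFF,
    }


# Highest identities reported in the abstract (Pro-Pol, CP).
isolates = {"ReTV-52": (73.3, 68.1), "ReTV-8": (72.8, 67.5)}

for name, (pro_pol, cp) in isolates.items():
    print(name, demarcation_flags(pro_pol, cp))
```

The two criteria are reported separately so that the sketch does not assume whether one or both must be met.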
In this study, we identified a putative novel virus that severely affects the growth of R. glutinosa in the field. The virus was named "rehmannia torradovirus virus (ReTV)". The whole genome sequences of ReTV were determined, and the genome organization, phylogenetic relationships, and molecular variations of the virus were analyzed. We propose that ReTV is a new member of the genus Torradovirus.

Plant Material
Sixty R. glutinosa samples exhibiting virus-like symptoms such as mosaic, yellowing, mottling, and necrosis were collected between June and July 2020. Each sample consisted of two or three leaves from an individual plant. The samples were obtained from three locations in Henan Province, China: Wenxian (23/60), Wuzhi (20/60), and Yuzhou (17/60). To investigate the presence of viral agents associated with these virus-like symptoms, high-throughput sequencing (HTS) was performed on all 60 leaf samples of R. glutinosa. Before HTS, all samples were ground in liquid nitrogen and individually stored at −80 °C.

High-Throughput Sequencing Analysis
To identify potential viruses in the samples, small portions of each collected leaf sample were combined into a mixed sample and sent to Berry Genomics Corporation (Beijing, China) for HTS analysis. Total RNA was extracted from all 60 leaf samples using the RNAprep Pure Plant Plus kit (Tiangen, Beijing, China). RNA quantity and quality were assessed using a Nanodrop 2000 analyzer and agarose gel electrophoresis, respectively. The transcriptome library was constructed using the NEBNext Ultra RNA Library Prep Kit from Illumina (San Diego, CA, USA). Sequencing was performed using the Illumina NovaSeq 6000 sequencing system (Berry Genomics Corporation, Beijing, China). The processing and assembly of the sequencing data were carried out by Wuhan Biowefind Co., Ltd. (Wuhan, China). The raw reads were trimmed of adapter sequences and filtered for low-quality reads using FASTP version 1.5.6c [27]. The trimmed reads were de novo assembled into larger contigs using IDBA-UD version 1.1.1 [28] with k-mer values of 80, 90, and 110. The resulting contigs were aligned to the protein database using a BLASTx search on NCBI (https://www.ncbi.nlm.nih.gov/ (accessed on 5 June 2024)).

The Full Genome Assembly of ReTV and Sequence Analysis
Initial HTS results showed that the contigs represented two ReTV variants. From the 60 R. glutinosa samples, ten strains were randomly selected to obtain the full sequences of the ReTV variants, for which specific primers were designed based on the HTS contigs (Supplementary Table S1). The primers were synthesized by Sangon Biotech Co., Ltd. (Shanghai, China). Total RNA from positive samples was extracted using a Spin Column Plant Total RNA Purification Kit (Sangon Biotech, Shanghai, China). Single-stranded cDNA was synthesized using the PrimeScript™ II 1st Strand cDNA Synthesis kit (Takara, Dalian, China), with RNA serving as the template, and the cDNA was stored at −20 °C.
The overlapping fragments covering the genomic sequence of ReTV were amplified by RT-PCR. The PCR reaction system consisted of the following: 2× Taq Master Mix, 10 µL; forward primer (10 µM), 0.5 µL; reverse primer (10 µM), 0.5 µL; cDNA, 1 µL; and ddH2O added to a final volume of 20 µL. PCR conditions included an initial denaturation at 95 °C for 5 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 1-2 min; and a final extension step at 72 °C for 10 min. The mixed samples were then stored at −20 °C. The sequences of the 5′ and 3′ ends were obtained by rapid amplification of cDNA ends (RACE) using a SMARTer RACE 5′ and 3′ Kit (Sangon Biotech), respectively. All amplicons were recovered, purified, cloned into the pMD19-T vector (Takara), transformed into competent Escherichia coli TG1 cells, and subsequently sequenced.
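The reaction recipe above leaves the water volume implicit. As a small worked example, the sketch below derives it from the stated component volumes and scales the shared components for a batch of reactions. The helper names and the 10% pipetting overage are illustrative assumptions, not details from the protocol.

```python
# Per-reaction volumes (µL) as stated in the text; ddH2O fills to 20 µL.
FINAL_VOLUME_UL = 20.0
COMPONENTS_UL = {
    "2x Taq Master Mix": 10.0,
    "forward primer (10 uM)": 0.5,
    "reverse primer (10 uM)": 0.5,
    "cDNA": 1.0,
}


def water_volume_ul(final_volume=FINAL_VOLUME_UL, components=COMPONENTS_UL):
    """Volume of ddH2O needed to reach the final reaction volume."""
    return final_volume - sum(components.values())


def batch_volumes_ul(n_reactions, overage=0.10):
    """Scale every component except the template cDNA for a shared mix.

    `overage` (extra fraction to cover pipetting loss) is an assumption
    made for this sketch; the source does not specify one.
    """
    shared = {k: v for k, v in COMPONENTS_UL.items() if k != "cDNA"}
    shared["ddH2O"] = water_volume_ul()
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in shared.items()}


print(water_volume_ul())      # ddH2O per reaction
print(batch_volumes_ul(10))   # shared mix for ten reactions
```

With the stated volumes, each reaction needs 8 µL of ddH2O.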
Sequence alignment and homology analyses of the complete ReTV genome and of each encoded functional protein were performed using DNAMAN 7.0.
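For readers unfamiliar with how a percent identity is derived from an alignment, the sketch below shows one common convention: identical residues divided by the number of aligned columns in which neither sequence has a gap. This is an assumption made for illustration; DNAMAN may use a different denominator, and the two short sequences are invented toy data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence
    has a gap. Both inputs must already be aligned (equal length)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned peptides (not real ReTV data): 5 matches in 6 gap-free columns.
print(round(percent_identity("MKT-LLGA", "MKSALL-A"), 2))
```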

Phylogenetic Analysis
To identify species and calculate similarity percentages, the amino acid sequences of the Pro-Pol and CP regions of ten ReTV strains from this study were selected, together with those of members of four genera (Torradovirus, Sadwavirus, Stralarivirus, and Cheravirus) belonging to the family Secoviridae from NCBI. Torradovirus includes MYMoV, CoTVA, LNLCV, CaTV1, ToMarV, ToCSV, ToTV, SCLSV, and CsTLV. Cheravirus includes cherry rasp leaf virus (CRLV), apple latent spherical virus (ALSV), and currant latent virus (CuLV). Sadwavirus includes satsuma dwarf virus (SDV). Finally, Stralarivirus includes Lychnis mottle virus (LycMoV) and strawberry latent ringspot virus (SLRSV). Additional specific information on these species is provided in Supplementary Table S2. Thirty-nine sequences were aligned using MEGA 11.0 software [29]; the phylogenetic tree was estimated using the maximum likelihood method and a Poisson model with 1000 bootstrap replicates.

Recombination Analysis of Ten ReTV Isolates and Other Torradoviruses
The complete genome sequences (RNA1 and RNA2) of the ReTV strains (OR453958-OR453977) and of the other Torradoviruses used for sequence comparisons (listed in Table 1) were aligned using Clustal X1 [30]. Alignment results were analyzed using Recombination Detection Program (RDP) v.4.101 software [31]. The following analytical methods were employed: RDP, GENECONV, BootScan, MaxChi, Chimaera, SiScan, and 3Seq. Default parameters were applied for each program during recombination detection. A recombination event was accepted when three or more methods detected it, with a p-value below 10⁻⁵ indicating a significant recombination event for each method.
Total RNA was extracted, followed by reverse transcription and PCR amplification. The PCR products were separated using agarose gel electrophoresis, and their sizes were confirmed with an ultraviolet lamp and agarose gel imaging system. The PCR products were purified using the column method (E.Z.N.A. Gel Extraction kit, Omega Bio-tek, Norcross, GA, USA) for Sanger sequencing. Molecular variations were assessed after aligning the nucleotide sequences using DNAMAN software 7.0 [32].
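The acceptance rule for recombination events (at least three of the seven methods, each with a p-value below 10⁻⁵) can be sketched as follows. This is an illustration of the stated rule only, not of RDP itself; the function name is hypothetical, and the p-values in the example are invented placeholders rather than results from this study.

```python
# Thresholds taken from the text; everything else is illustrative.
P_VALUE_CUTOFF = 1e-5
MIN_SUPPORTING_METHODS = 3
METHODS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


def evaluate_event(p_values):
    """Return (accepted, supporting_methods) for one candidate event.

    `p_values` maps method name -> p-value, or None when the method
    did not detect the event.
    """
    supporting = [
        method
        for method in METHODS
        if p_values.get(method) is not None and p_values[method] < P_VALUE_CUTOFF
    ]
    return len(supporting) >= MIN_SUPPORTING_METHODS, supporting


# Hypothetical p-values, for demonstration only.
example = {
    "RDP": 2.1e-8,
    "GENECONV": None,
    "BootScan": 4.0e-7,
    "MaxChi": 3.3e-6,
    "Chimaera": 1.2e-4,  # above the cut-off, so not counted
    "SiScan": None,
    "3Seq": 0.02,
}
accepted, supporting = evaluate_event(example)
print(accepted, supporting)
```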

High-Throughput Sequencing Analysis Data
HTS of the RNA-seq library yielded 28,522,540 raw reads, totaling over 8 GB. After trimming, 27,664,949 high-quality clean reads were obtained and used for contig assembly. The assembled contigs were subjected to a local BLASTx search against GenBank. Twelve contigs were similar to Torradovirus-like viruses. Among these, three large contigs of 6716, 4782, and 5145 nt shared 51.05%, 57.40%, and 54.31% aa sequence identity, respectively, with their best match, RNA1 of SCLSV (GenBank accession number: UZN89714) of the genus Torradovirus. Two contigs of 3828 and 3753 nt shared 60.36% and 60.75% aa sequence identity, respectively, with their best match, RNA2 of SCLSV (GenBank accession number: UZN89715) of the genus Torradovirus.
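As a quick worked check of the read counts reported above, the proportion of reads that survived trimming can be computed directly. Only the two read counts come from the text; the variable names are illustrative.

```python
# Read counts reported in the text.
raw_reads = 28_522_540
clean_reads = 27_664_949

removed_reads = raw_reads - clean_reads
retained_pct = 100 * clean_reads / raw_reads

print(f"{removed_reads:,} reads removed; {retained_pct:.2f}% retained")
# prints "857,591 reads removed; 96.99% retained"
```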

Genome Organization of ReTV
To acquire the complete sequences of ReTV, twenty sequences, comprising four complete and sixteen nearly full-length sequences, were obtained through RT-PCR and 5′ and 3′ RACE of RNA1 and RNA2. Comparison of these sequences identified two distinct variants of "rehmannia torradovirus virus (ReTV)", designated ReTV-52 and ReTV-8.
The full-length RNA1 of ReTV-52 is 6939 nt (GenBank accession number: OR453962), with a poly(A) tail at the 3′ terminal. It includes a single ORF from 177-6777 nt, encoding a polyprotein of 2200 aa with a predicted molecular mass of 243.1 kDa (Figure 1a). Three conserved motifs were identified using ScanProsite: a helicase (PS51218) at positions 368-538 aa, a 3C-like protease (PS51874) at positions 892-1113 aa, and RdRp (PS50507) at positions 1403-1538 aa. The full-length RNA2 of ReTV-52 is 4569 nt (GenBank accession number: OR453973), with a poly(A) tail at the 3′ terminal. It contains two ORFs with overlapping regions (nt 701-751). The first ORF from 106-751 nt encodes a protein of 215 aa with a predicted molecular mass of 23.7 kDa. The second ORF from 701-3861 nt encodes a polyprotein of 1053 aa with a predicted molecular mass of 115.6 kDa. The polyprotein is thought to be cleaved into an MP and three CPs. The cleavage sites (Q336/M337, Q592/A593, and Q830/I831) were identified through the alignment of the aa sequences with those of other Torradoviruses (Figure 2).
The full-length RNA1 of ReTV-8 is 6889 nt (GenBank accession number: OR453963), with a poly(A) tail at the 3′ terminal. It includes a single ORF from 131-6728 nt, encoding a polyprotein of 2199 aa with a predicted molecular mass of 244.0 kDa (Figure 1b). Three conserved motifs were identified using ScanProsite: a helicase (PS51218) at positions 368-538 aa, a 3C-like protease (PS51874) at positions 888-1111 aa, and RdRp (PS50507) at positions 1401-1536 aa. The full-length RNA2 of ReTV-8 is 4662 nt (GenBank accession number: OR453968), with a poly(A) tail at the 3′ terminal. It contains two ORFs with overlapping regions (nt 726-776). The first ORF from 122-776 nt encodes a protein of 218 aa with a predicted molecular mass of 23.5 kDa. The second ORF from 726-3888 nt encodes a 1054-aa polyprotein with a predicted molecular mass of 116.3 kDa. The polyprotein is thought to be cleaved into an MP and three CPs. The cleavage sites (Q335/T336, Q591/V592, and Q831/V832) were identified through the alignment of the aa sequences with those of other Torradoviruses (Figure 2).
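The overlapping regions of the two RNA2 ORFs follow directly from the reported coordinates. The sketch below recomputes the overlap for each variant and also reports the reading frame implied by each start position. It uses only the 1-based nucleotide coordinates given in the text; the function names are illustrative assumptions.

```python
def orf_overlap(orf_a, orf_b):
    """Overlap of two ORFs given as 1-based inclusive (start, end)
    coordinates, or None if they do not overlap."""
    start = max(orf_a[0], orf_b[0])
    end = min(orf_a[1], orf_b[1])
    return (start, end) if start <= end else None


def reading_frame(start):
    """Frame (0, 1, or 2) implied by a 1-based start coordinate."""
    return (start - 1) % 3


# RNA2 ORF coordinates as reported in the text: (ORF1, ORF2).
rna2_orfs = {
    "ReTV-52": ((106, 751), (701, 3861)),
    "ReTV-8": ((122, 776), (726, 3888)),
}

for variant, (orf1, orf2) in rna2_orfs.items():
    region = orf_overlap(orf1, orf2)
    length = region[1] - region[0] + 1
    frames = (reading_frame(orf1[0]), reading_frame(orf2[0]))
    print(variant, region, f"{length} nt", "frames", frames)
```

For both variants the computed overlap matches the region stated in the text, and the two ORFs start in different reading frames.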

Phylogenetic Analysis of ReTV Strains
To further assess the taxonomic position of ReTV, the aa sequences of the Pro-Pol and CP regions of ReTV were aligned with those of other torradoviruses using the ClustalW algorithm in MEGA 11.0. A phylogenetic tree was constructed using the maximum likelihood method with 1000 bootstrap replicates. Based on the Pro-Pol regions, the ten ReTV strains were closely related to the SCLSV strain Su12-10 (GenBank accession number: KU052530) and were categorized into two variants (Figure 4a). Based on the CP regions, the ten ReTV strains were likewise closely related to the SCLSV strain Su12-10 (GenBank accession number: KU052531) and were categorized into two variants (Figure 4b). All torradoviruses were clustered into two groups (NTI and TI). In conclusion, phylogenetic tree analysis based on the Pro-Pol regions and CPs demonstrated the highest homology with SCLSV strain Su12-10. Consequently, the ReTV strains in this study were classified within the NTI group and clustered together with SCLSV and relatives.

RT-PCR Detection of ReTV in R. glutinosa Samples
To investigate the occurrence of ReTV in R. glutinosa, leaf samples from sixty plants were collected. Eight primer pairs were employed to detect ReTV1 and ReTV2 in these samples using RT-PCR. Fourteen samples tested positive with primer pair ReTV-variant1-RNA1-1F/1R, yielding a detection rate of 23.3%. Nine PCR products were randomly selected for sequencing, revealing a molecular variation between 99.1 and 100%. Fifteen samples tested positive with primer pair ReTV-variant1-RNA1-2F/2R, showing a detection rate of 25%. Ten PCR

Discussion
In this study, we presented the complete sequences of ReTV obtained through HTS, RT-PCR, and 5′ and 3′ RACE. HTS has become a popular method for rapidly detecting known and novel plant-infecting viruses [33,34]. Alfredo Diaz-Lara et al. [35] discovered grapevine enamovirus 2, a new member of the genus Enamovirus, using HTS. Here, HTS was used to obtain all contigs and compare them with other viruses in the NCBI database. In addition to the six viruses previously mentioned, we discovered 12 contigs that corresponded to viruses within the genus Torradovirus. Using RT-PCR and 5′ and 3′ RACE, the novel virus was identified. We followed the species classification criteria and compared the Pro-Pol regions and CPs of ReTV with those of other torradoviruses. The name "Rehmannia torradovirus virus" is proposed for this novel putative member of the genus Torradovirus.
The genus Torradovirus has been divided into two distinct groups, TI and NTI members, based on the aa sequence identities of the encoded putative proteins and the length of the 3′ UTRs. Torrado disease in Spain and Poland was generally observed in greenhouses or fields that were heavily infested with whiteflies, which led to suspicions that whiteflies might be insect vectors [36]. Three TI torradoviruses (ToTV, ToMarV, and ToChV) are transmitted by three whitefly species: Trialeurodes vaporariorum, Bemisia tabaci, and Trialeurodes abutilonea (Haldeman) [37]. In the NTI group, CaTV is transmitted by the aphids Myzus persicae, M. persicae biotype, and Cavariella aegopodii [38,39]. The transmission vectors of the NTI torradoviruses affecting cassava (CsTLV), lettuce (LNLCV), and motherwort (MYMoV) have not been identified. ReTV has been identified in R. glutinosa and is closely related to SCLSV. Because ReTV is a new virus in a new host, future studies should aim to confirm its natural and experimental host ranges as well as its vector transmission.
Multiple viral infections within the same host plant are common in the field and often lead to synergism or antagonism among different viruses, usually with varied pathological outcomes [40]. Li et al. discovered that tomato chlorosis virus (ToCV) and tomato yellow leaf curl virus (TYLCV) mixed infections induced synergistic tomato disease, leading to a higher disease severity index and decreased stem heights and weights. Additionally, viral accumulation in plants with mixed ToCV and TYLCV infections was higher than that in singly infected plants [41]. Infection with sweet potato feathery mottle virus and sweet potato

Figure 1. Genome organization of rehmannia torradovirus virus (ReTV), showing the relative positions of ORFs and their expression products. (a) Rehmannia torradovirus virus-52; (b) rehmannia torradovirus virus-8. On RNA1, the positions of sequences encoding conserved protein domains (HEL, Pro, and RdRp) are indicated; on RNA2, the putative cleavage sites for the MP and CPs are indicated. The predicted molecular weight of each protein is reported above the boxes. For RNA1 and RNA2, the start and stop positions within each genome segment are indicated.

Figure 3. Recombination analysis of ReTV-41 isolates using the recombination detection program RDP4.1. Dark gray regions represent a 95% breakpoint confidence interval, light gray regions indicate a 99% breakpoint confidence interval, and the pink region highlights a tract of sequence with a recombinant origin. ReTV, rehmannia torradovirus virus.

Table 2. Nucleotide sequence homology (%) in the corresponding regions of RNA1 and RNA2 of the 10 ReTV genomes. Bold numbers indicate the highest or lowest identity between a ReTV sequence and the sequences of other viruses.

Table 3. Amino acid sequence homology (%) in the corresponding Pro-Pol and CP regions of the 10 ReTV genomes. Bold numbers indicate the highest or lowest identity between a ReTV sequence and the sequences of other viruses.