Putative Auxin and Light Responsive Promoter Elements From the Tomato spotted wilt tospovirus Genome, When Expressed as cDNA, Are Functional in Arabidopsis

Members of the virus order Bunyavirales cause serious diseases in animals, humans and plants. The family Tospoviridae in this order contains only one genus, Orthotospovirus, whose members exclusively infect plants. Tomato spotted wilt tospovirus (TSWV) is considered one of the most economically important plant viruses. Little is known about the regulatory elements in the TSWV genome. Here we show that, when in the cDNA form, the 5′-upstream region of the TSWV-coded GN/GC gene (pGN/GC) possesses putative cis-regulatory elements, including an auxin responsive element (AuxRE) for binding of auxin response factors (ARFs), as well as a circadian clock-associated 1 (CCA1) protein binding site (CBS). Due to the lack of a reverse genetics system, we verified the functionality of these elements in Arabidopsis. pGN/GC showed light-suppressive promoter activity in transgenic Arabidopsis, and mutation in the CBS was sufficient to switch the activity to light inducible. Additionally, exogenous auxin treatments repressed the promoter activity of both wild type and CBS-mutated pGN/GC. Mutation in AuxRE in both promoters abolished their sensitivity to auxin. As transcriptional repressors, both CCA1 and ARF2 were able to bind to pGN/GC directly. To our knowledge, this is the first report that a 5′-terminal sequence of an RNA virus has light- and hormone-responsive promoter activities when expressed as cDNA in the host plant's nuclear background. Our findings suggest new clues on the possible origin, evolution and function of the TSWV genomic sequence and its non-coding regions.


INTRODUCTION
Viruses in the order Bunyavirales are characterized by a segmented RNA genome with three RNAs packaged in enveloped virus particles (Briese et al., 2013). Family Tospoviridae contains only one genus, Orthotospovirus (Siddell et al., 2019). Viruses in this genus exclusively infect plants, and tospoviruses are unique in that the large RNA is in negative sense while the medium and small RNAs possess an ambisense genome organization (Prins and Goldbach, 1998; Adkins, 2000). Tomato spotted wilt tospovirus (TSWV), a member of the genus Orthotospovirus, is considered one of the world's most important plant viruses (Scholthof et al., 2011). Transmitted by thrips, TSWV causes significant losses to a wide range of economically important crops (Pappu et al., 2009; Oliver and Whitfield, 2016). The three genomic, single-stranded RNAs encode all the essential proteins for virus infection, replication, and particle assembly (Zhai et al., 2014; Tripathi et al., 2015). In recent years, considerable progress has been made on elucidating the tospovirus genome organization, replication, transcription, and molecular interactions with its host plants (Turina et al., 2016).
RNA viral genomes encode an RNA-dependent RNA polymerase (RdRp) that recognizes specific RNA elements (promoter sequences) in either the positive (+) or negative (−) strands of the viral RNA genomes, and viral cis-regulatory elements (either outside or inside the coding regions) may play an important role in virus replication and RdRp-RNA interactions (Koev and Miller, 2000; Li and Stollar, 2007). During tospoviral transcription, the viral RdRp does not add a cap structure to mRNAs due to its lack of methyltransferase activity. Instead, the tospoviral RdRp snatches capped RNA leader sequences from the cytoplasmic pool of host mRNAs and uses these sequences to prime transcription (Kormelink et al., 1992; Duijsings et al., 2001; van Knippenberg et al., 2005). Both tospovirus replication and transcription take place entirely in the cytoplasm, but little is known about the host factors involved or the source of capped RNA leader sequences (Turina et al., 2016).
A major bottleneck in studying structure-function relationships of tospoviruses is the lack of a reverse genetics system. The ambisense genome organization of tospoviruses makes it very difficult to produce an infectious cDNA clone of the RNA genome. Despite years of effort by several groups, an infectious clone for any tospovirus remains elusive. Additionally, the roles of non-translated terminal regions of the genomic RNAs in tospovirus gene expression are largely unknown.
Here we show that, when in the cDNA form, the 5′-upstream region of the TSWV-coded GN/GC gene (pGN/GC) possesses putative cis-regulatory elements, including an auxin response element (AuxRE) for binding of auxin response factors (ARFs) (Guilfoyle and Hagen, 2007), as well as a circadian clock-associated 1 (CCA1) protein binding site (CBS) (Wang and Tobin, 1998). Since there is no reverse genetics system available for any known tospovirus, we verified the functionality of these elements in Arabidopsis. To our knowledge, this is the first report of a 5′-terminal sequence of an RNA virus with light- and hormone-responsive promoter activities when expressed as cDNA in a plant host.

Plant, Bacteria, Virus, and Plasmid Materials
Arabidopsis thaliana ecotype Col-0 was used for Agrobacterium-mediated transformation. The Agrobacterium tumefaciens strain GV3101 was used for Arabidopsis transformation. The TSWV isolate T (M segment GenBank no. AY870389) was used. pCAMBIA1381Z was used as the plant expression binary vector.

Promoter Analysis of cis-Regulatory Elements
The 5′-upstream sequences of the TSWV-coded genes were scanned for putative plant cis-regulatory elements. The list of known transcription factor (TF)-binding sites in plants was obtained from the Arabidopsis cis-regulatory element database (AtcisDB) in the Arabidopsis Gene Regulatory Information Server (AGRIS). Except for the transcription initiation site, both the original and the reverse-complement sequences of the TF-binding sites were used for scanning.
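The scanning step can be illustrated with a minimal Python sketch. It assumes simple regular-expression matching of each motif against the cDNA and against its reverse complement, which may differ from how the AtcisDB-based scan was actually run; the function and variable names are hypothetical. The motif strings and the 141-bp pGN/GC-CBS test sequence are those given in this paper.

```python
import re

# Hypothetical helper names; motif sequences are those reported in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

MOTIFS = {
    "AuxRE": "TGTCTC",        # ARF binding site (auxin)
    "CBS": "AACAATCT",        # CCA1 binding site
    "SORLIP1": "AGCCAC",      # light
    "T-box": "ACTTTG",        # light
    "GATA": "[AT]GATA[GA]",   # (A/T)GATA(G/A), light
}


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_motifs(seq, motifs=MOTIFS):
    """Find every motif on both strands.

    Returns (name, start, strand) tuples, where start is the 0-based
    position of the leftmost matched base on the forward strand.
    """
    seq = seq.upper()
    rc = revcomp(seq)
    hits = []
    for name, pattern in motifs.items():
        # A lookahead lets overlapping occurrences be reported too.
        regex = re.compile(f"(?=({pattern}))")
        for m in regex.finditer(seq):
            hits.append((name, m.start(), "+"))
        for m in regex.finditer(rc):
            width = len(m.group(1))
            hits.append((name, len(seq) - m.start() - width, "-"))
    return sorted(hits, key=lambda h: h[1])


# The 141-bp pGN/GC-CBS bait sequence listed in the yeast one-hybrid methods.
PGN_GC_CBS = (
    "GACTAATCTGATGCTAGAATCTCAG"
    "ACTCCTGGAACCCGTCAGATACGAGAAGAAGAATCAAC"
    "CATCCCTATTTTTGCTGAGTCAACTACGGAAAAAACAAT"
    "CTTTGTCTCGGATCTTCCTAACAACTGCTTGAATGCTTC"
)

for name, start, strand in scan_motifs(PGN_GC_CBS):
    print(name, start, strand)
```

A palindromic motif would be reported once per strand by this sketch; none of the motifs listed here is palindromic.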

RNA Extraction, Reverse Transcription and DNA Cloning
Total RNA was extracted from TSWV-infected Nicotiana benthamiana leaves for cDNA synthesis. The pGN/GC fragment was amplified from cDNA using primers 5′-CGGAATTCAGAGCAATCAGTGCAAACAAA-3′ and 5′-CGGGATCCTTATTTTCCACTTGATAATAAACATTA-3′, and then cloned into EcoRI/BamHI sites of pCAMBIA1381Z. The pN fragment was amplified using primers 5′-CGGAATTCAGAGCAATCGTGTCAATTTTGTGTT-3′ and 5′-CGGGATCCGTATTGAGATTCTCAGAATTCCC-3′, and then cloned into EcoRI/BamHI sites of pCAMBIA1381Z. The pRdRp fragment was amplified using primers 5′-CGGAATTCAGAGCAATCAGGTAACAACGA-3′ and 5′-CGGGATCCTTATTTATTCTCTCAAACTCATCATC-3′, and then cloned into EcoRI/BamHI sites of pCAMBIA1381Z.
All the PCR amplicons were verified by sequencing.

Plant Transformation and Seedling Growth
The constructs were introduced into Agrobacterium by electroporation and then used to transform Arabidopsis using the floral dip method (Clough and Bent, 1998). T2 seeds were collected from T1 transgenic plants and used for the selection of single-locus insertion transgenic lines (3:1 segregation of hygromycin-resistant versus sensitive seedlings, verified by the chi-square test). Homozygous transgenic seeds were obtained in the T3 generation. Two independent transgenic lines with single-locus T-DNA insertions were used for every promoter activity and GUS staining assay. To prepare material for PCR and GUS assays, seeds were placed on half-strength MS medium plates. After stratification at 4 °C for 4 days, the seeds were treated with red light for 5 h to stimulate germination and then grown at 25 °C for 4 days, either in the dark or under continuous 80 µmol m⁻² s⁻¹ white light (Peng et al., 2015). For indole-3-acetic acid (IAA) treatments, seeds were placed on IAA-containing medium plates throughout the experiments.
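The 3:1 segregation check can be sketched as a goodness-of-fit chi-square test with one degree of freedom. The seedling counts below are hypothetical and serve only to illustrate the calculation; the function names are likewise illustrative. The 3.841 threshold is the standard critical value for one degree of freedom at P = 0.05.

```python
# Critical chi-square value for 1 degree of freedom at P = 0.05.
CRITICAL_DF1_P05 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    observed = (resistant, sensitive)
    expected = (0.75 * total, 0.25 * total)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_single_locus(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_DF1_P05


# Hypothetical T2 counts: 152 hygromycin-resistant, 48 sensitive seedlings.
print(chi_square_3_to_1(152, 48), fits_single_locus(152, 48))
```

A line whose statistic stays below the threshold is consistent with a single-locus insertion; a larger value suggests multiple loci or distorted segregation.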

Quantitative RT-PCR and GUS Staining
Total RNA was extracted from 4-day-old seedlings (grown either in the dark or under light). RNase-free DNase I (Sigma, St. Louis, MO, United States) was added during the RNA extraction to reduce gDNA contamination. Reverse transcription was done using the iScript Reverse Transcription Supermix (Bio-Rad, Hercules, CA, United States). Quantitative RT-PCR (qRT-PCR) was performed using the SsoAdvanced Universal SYBR Green Supermix (Bio-Rad) and the Applied Biosystems 7500 fast real-time PCR system (Grand Island, NY, United States). The qRT-PCR primers for the detection of GUS transcripts were 5′-GGTAGATCTGAGGAACCGACG-3′ and 5′-TCGCGATCCAGACTGAATGCC-3′. The forward primer spans the catalase intron of the GUS gene (13 nt upstream and 8 nt downstream of the intron). The Arabidopsis UBQ10 gene was used as the reference. The qRT-PCR primers for the detection of UBQ10 transcripts were 5′-TCTTCGTGGTGGTTTCTAAATCTCG-3′ and 5′-AAAGAGATAACAGGAACGGAAACATAGT-3′. Each data point had three biological replicates. An unpaired Student's t-test was used to test the significance of differences in gene expression. GUS staining was performed using the protocol previously described (Jefferson et al., 1987).
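The quantification and significance test can be sketched as follows. The text does not state how relative expression was calculated, so this sketch assumes the commonly used 2^-ΔCt normalization of GUS against UBQ10; that choice, the function names, and all Ct values are assumptions made for illustration. The t statistic is the classical unpaired Student's form with pooled variance.

```python
from math import sqrt
from statistics import mean, variance


def relative_expression(ct_target, ct_reference):
    """Assumed 2^-dCt normalization of a target gene to a reference gene."""
    return 2.0 ** -(ct_target - ct_reference)


def student_t(a, b):
    """Unpaired Student's t statistic (pooled variance) and its df."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical (GUS Ct, UBQ10 Ct) pairs, three biological replicates each.
dark = [relative_expression(g, u) for g, u in [(27.1, 20.0), (27.3, 20.1), (26.9, 19.9)]]
light = [relative_expression(g, u) for g, u in [(29.0, 20.0), (29.2, 20.2), (28.8, 19.9)]]

t_stat, df = student_t(dark, light)
print(t_stat, df)
```

The resulting t statistic would then be compared against the t distribution with the returned degrees of freedom to obtain a P value.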

Targeted Yeast One-Hybrid Assay
The Gateway-compatible system was adopted for the Y1H assay (Deplancke et al., 2006). The DNA fragment (the bait; pGN/GC-CBS in this case) was cloned into pDONR P4-P1R by BP reaction. The fragment was then subcloned into pMW#2 by LR reaction and integrated into the genome of yeast strain YM4271. pGN/GC-CBS was fused with the reporter gene HIS3 in the yeast genome. After testing for self-activation, the bait yeast strain was transformed with CCA1 or ARF2 cloned in the Gateway prey vector pACT-GW, with the empty vector as control. The activation of HIS3 was tested by yeast tolerance to 3-aminotriazole (3-AT, a competitive inhibitor of the His3p enzyme). The primers used for the initial amplification of pGN/GC-CBS were 5′-GGGGACAACTTTGTATAGAAAAGTTGGACTAATCTGATGCTAGAATCTC-3′ and 5′-GGGGACTGCTTTTTTGTACAAACTTGGAAGCATTCAAGCAGTTGTTAGG-3′. The sequence of pGN/GC-CBS is GACTAATCTGATGCTAGAATCTCAGACTCCTGGAACCCGTCAGATACGAGAAGAAGAATCAACCATCCCTATTTTTGCTGAGTCAACTACGGAAAAAACAATCTTTGTCTCGGATCTTCCTAACAACTGCTTGAATGCTTC. All the PCR amplicons were verified by sequencing. All the Gateway cloning reagents came from Invitrogen.

Sequence Analysis of the 5′-Upstream Regions of Five TSWV cDNAs
Whether TSWV non-coding regions have regulatory function in virus replication or gene expression is still unknown. Due to the lack of a reverse genetics system for tospoviruses, it is difficult to investigate the potential functions of these non-coding regions in their original RNA form. Hence we started the functional dissection of the TSWV non-coding regions in their respective cDNA forms. All three genomic cDNAs of TSWV isolate T were scanned for potential transcription initiation and transcription factor (TF) binding elements. As it is hard to define the precise boundaries for promoters, we analyzed the 500 bp in the 5′-upstream regions (including both non-translated sequences and the 5′-terminal gene sequences) of the five TSWV genes. No putative transcription initiation sites (TATA box or its variants) were found in the 5′-upstream regions of RdRp, NSm and NSs (named pRdRp, pNSm, and pNSs, respectively; Supplementary Figures S1A-C). However, both pGN/GC (Figure 1A) and the 500-bp fragment upstream of the N gene (named pN; Supplementary Figure S1D) were found to contain a putative TATA box variant (TATAAC) and a Pribnow box (TATAAT). The promoter activity of TATAAC was predicted to be weaker than that of the standard TATA box (TATAAA) (Narang et al., 2005). The Pribnow box is essential for transcription in bacteria but may also have some promoter activity in eukaryotes (Patikoglou et al., 1999). Three putative light-responsive TF-binding elements were found in both pGN/GC (Figure 1A) and pN (Supplementary Figure S1D), including a SORLIP1 element (Hudson and Quail, 2003), a T-box motif (Chan et al., 2001), and a GATA binding motif (Teakle et al., 2002). pGN/GC was also found to contain CBS and AuxRE (Figure 1A), which are not present in pN (Supplementary Figure S1D).

The Conservation of Putative Promoter Elements in pGN/GC
Alignment of 56 TSWV GN/GC sequences available in GenBank showed that at least one CBS or AuxRE is present in pGN/GC of any given TSWV isolate (Figure 1B). Twenty of the 56 isolates contained both elements, 22 contained only CBS and 14 contained only AuxRE (Figure 1B). With the exception of TSWV, neither CBS nor AuxRE was found in the corresponding region of other known tospoviral genomes. The Pribnow box, downstream of CBS and AuxRE, is present in 55 out of 56 isolates examined (Figure 1C), while the upstream TATA and Pribnow boxes in the 5′-untranslated region were found in 36 of the 50 isolates for which complete sequences are available (Figure 1D).

Motif key from the figure: TGTCTC, ARF binding site (auxin); AGCCAC, SORLIP1 element (light); ACTTTG, T-box (light); (A/T)GATA(G/A), GATA motif (light); CAACA, RAV1-A binding site (leaf senescence).

pGN/GC Has Light-Suppressive Promoter Activity
Both pGN/GC and pN contain a putative TATA box variant and therefore may have promoter activity (Bernard et al., 2010). To test this hypothesis, the pGN/GC and pN fragments were cloned into the binary vector pCAMBIA1381Z to make promoter-GUS fusions (pGN/GC:GUS and pN:GUS, respectively). The TATA-absent pRdRp fragment was also cloned into pCAMBIA1381Z (pRdRp:GUS) for use as a negative control. GUS driven by the constitutive CaMV 35S promoter (p35S:GUS) was used as the positive control. As a compatible host for TSWV, Arabidopsis ecotype Col-0 was used for pGN/GC:GUS transformation (German et al., 1995). Homozygous transgenic lines with single-locus T-DNA insertions were selected for subsequent experiments. Four-day-old seedlings were used for assessing the promoter activity. Quantitative RT-PCR (qRT-PCR) was used to detect GUS transcript accumulation levels in seedlings grown under continuous white light (80 µmol m⁻² s⁻¹). Two independent transgenic lines were tested for each construct. The same transgenic plant selection criteria, seedling age and growth conditions were applied to all the assays in this research. The promoter strengths of both pRdRp (negative control) and pN were negligible when compared to that of pGN/GC (Figure 2A), which demonstrated that pGN/GC can drive the transcription of the downstream GUS gene. In contrast, pN may not have meaningful promoter activity, since pN:GUS/Col-0 lines only had basal-level GUS expression similar to that of pRdRp:GUS/Col-0 lines (Figure 2A). Compared to CaMV 35S, pGN/GC showed much weaker promoter activity (Figure 2A). While p35S:GUS lines showed strong GUS signal as expected, no visible GUS staining was detected in pRdRp:GUS, pN:GUS or pGN/GC:GUS transgenic lines (Figure 2B). Furthermore, the activity of pGN/GC was down-regulated by white light treatment, as shown by comparing GUS expression levels in dark- and light-grown pGN/GC:GUS/Col-0 seedlings (Figure 2C).
In contrast, white light versus dark did not significantly affect the expression of p35S:GUS (Supplementary Figure S2). No visible GUS signal was observed in either light- or dark-grown pGN/GC:GUS/Col-0 seedlings (Figure 2D). It could be either that the accumulated GUS protein was not sufficient to generate a visible signal, or that the translated GUS protein was degraded because it was targeted to an inappropriate location in the plant cell. Since part of the GN/GC gene sequence was fused to the GUS gene, it may also have affected the stability of the GUS protein.

Figure 3 legend (partial): qRT-PCR assay using two independent transgenic lines (#1 and #2) revealed that the promoter activity of CBS-mutated pGN/GC-CBSm was up-regulated by light. (C) GUS staining results indicated that more GUS protein accumulated in light-grown pGN/GC-CBSm:GUS/Col-0 seedlings (#1 and #2) than in the dark-grown counterparts. Three independent yeast clones were shown for each sample. All transgenic Arabidopsis plants (pGN/GC-CBSm:GUS/Col-0) have single-locus T-DNA insertions. Two independent transgenic lines (#1 and #2) were used for each qRT-PCR and GUS staining assay. Seedlings were grown for 4 days in the dark or under 80 µmol m⁻² s⁻¹ white light. Leaf samples for GUS staining came from 2-week-old plants. For qRT-PCR, three biological replicates were conducted for each data point. The error bar denotes SEM. Stars indicate significant difference (tested by unpaired Student's t-test) of GUS expression levels between dark- and light-grown pGN/GC-CBSm:GUS/Col-0 seedlings (**P < 0.01, ***P < 0.001). For seedling and leaf images, scale bar = 1 mm.
transcription of downstream genes. The SORLIP1 element, the T-box motif and the GATA binding motif are all light-activating elements (Chan et al., 2001; Teakle et al., 2002; Rus Alvarez-Canterbury et al., 2014), while CCA1-CBS interaction can either activate (Fujiwara et al., 2008) or suppress (Wang and Tobin, 1998; Li et al., 2011) the expression of target genes. Of the four elements above, CBS is the only one that can act as a cis-repressor. Since pGN/GC:GUS expression was suppressed by light (Figure 2A), CBS could be the major and bona fide cis-regulatory element in pGN/GC that mediates light-induced reduction of target gene expression. A targeted yeast one-hybrid (Y1H) assay showed that CCA1 directly bound pGN/GC (Figure 3A). A 141-bp CBS-containing sequence from pGN/GC (named pGN/GC-CBS) was used as the bait in the assay. CBS was the only light-responsive element located in pGN/GC-CBS.

Disruption of the CBS Element in pGN/GC Switches Its Promoter Activity to Light Inducible
pGN/GC:GUS with CBS mutated from AACAATCT to AACGGTCT (named pGN/GC-CBSm:GUS) was used for stable transformation of Arabidopsis. Homozygous lines with single-locus T-DNA insertions were selected for subsequent experiments. qRT-PCR analysis of two independent transgenic lines (pGN/GC-CBSm:GUS/Col-0-1 and pGN/GC-CBSm:GUS/Col-0-2) showed that the promoter activity of pGN/GC-CBSm became light-inducible (Figure 3B). The result suggested that CBS was responsible for the light-suppressive characteristic of pGN/GC, and that its suppression effect was epistatic to other light-inducible elements. The SORLIP1, T-box and GATA elements may contribute to the light-inducible feature of pGN/GC-CBSm after the disruption of CBS. Consistent with the qRT-PCR result, light-grown pGN/GC-CBSm:GUS/Col-0 seedlings and leaves showed much stronger GUS staining than those grown in darkness (Figure 3C).

Auxin Suppresses the Promoter Activity of pGN/GC via ARF-AuxRE Interaction
Since a putative AuxRE is present in pGN/GC, the auxin responsiveness of pGN/GC was tested using the two pGN/GC:GUS/Col-0 lines and two pGN/GC-CBSm:GUS/Col-0 lines mentioned above. The promoter activities of both pGN/GC (Figure 4A) and pGN/GC-CBSm (Figure 4B) could be suppressed by exogenous indole-3-acetic acid (IAA) treatments. In contrast, IAA did not significantly affect the expression of p35S:GUS (Supplementary Figure S2). A similar site-directed mutagenesis approach was used to mutate the AuxRE from TGTCTC to TGGATC in pGN/GC:GUS, and the new construct was named pGN/GC-AuxREm:GUS. In addition, CBS and AuxRE were mutated simultaneously in pGN/GC:GUS to make another construct named pGN/GC-CBSm-AuxREm:GUS. Both constructs were used for Arabidopsis transformation. Two independent homozygous transgenic lines with single-locus T-DNA insertions from each transformation event were used for the IAA treatment assay. Seedlings were grown for 4 days under 80 µmol m⁻² s⁻¹ white light. The results showed that disruption of AuxRE (AuxREm) can abolish the suppression effect of exogenous IAA on both pGN/GC (Figure 4C) and pGN/GC-CBSm (Figure 4D). Additionally, in a targeted Y1H assay, the transcriptional repressor ARF2 bound pGN/GC-CBS, which contains the AuxRE of pGN/GC (Figure 4E).

Figure 4 legend (partial): (E) ARF2 directly binds the AuxRE-containing pGN/GC-CBS fragment in the targeted Y1H assay. Three independent yeast clones were shown for each sample. All transgenic Arabidopsis plants (pGN/GC:GUS/Col-0, pGN/GC-CBSm:GUS/Col-0, pGN/GC-AuxREm:GUS/Col-0 and pGN/GC-CBSm-AuxREm:GUS/Col-0) have single-locus T-DNA insertions. Two independent transgenic lines (#1 and #2) were used for each qRT-PCR assay. Seedlings were grown for 4 days under 80 µmol m⁻² s⁻¹ white light. For qRT-PCR, three biological replicates were conducted for each data point. The error bar denotes SEM. Stars indicate significant difference (tested by unpaired Student's t-test) of GUS expression levels between seedlings treated with 10 µM IAA and non-IAA treatment control groups (*P < 0.05, **P < 0.01, ***P < 0.001).
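The two element mutations can be reproduced in silico as a sanity check on the design. This is a minimal sketch that uses the 141-bp pGN/GC-CBS fragment as a stand-in for the full promoter constructs, which is an assumption made for illustration; the helper name is hypothetical. The wild-type and mutant motif sequences are those stated in the text.

```python
# 141-bp pGN/GC-CBS fragment, used here as a stand-in for the full promoter.
WILD_TYPE = (
    "GACTAATCTGATGCTAGAATCTCAG"
    "ACTCCTGGAACCCGTCAGATACGAGAAGAAGAATCAAC"
    "CATCCCTATTTTTGCTGAGTCAACTACGGAAAAAACAAT"
    "CTTTGTCTCGGATCTTCCTAACAACTGCTTGAATGCTTC"
)


def mutate_site(seq, original, mutant):
    """Replace a motif that must occur exactly once, keeping the length."""
    if len(original) != len(mutant):
        raise ValueError("mutation must preserve sequence length")
    if seq.count(original) != 1:
        raise ValueError(f"expected exactly one copy of {original}")
    return seq.replace(original, mutant)


cbsm = mutate_site(WILD_TYPE, "AACAATCT", "AACGGTCT")      # CBSm
auxrem = mutate_site(WILD_TYPE, "TGTCTC", "TGGATC")        # AuxREm
cbsm_auxrem = mutate_site(cbsm, "TGTCTC", "TGGATC")        # CBSm-AuxREm

# Count substituted bases relative to the wild-type fragment.
changed = sum(a != b for a, b in zip(WILD_TYPE, cbsm_auxrem))
print(len(cbsm_auxrem), changed)
```

Requiring exactly one copy of each motif guards against silently editing an unintended second site.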

DISCUSSION
Our results suggest that TSWV has regulatory elements that are responsive to light and auxin when expressed as cDNA in the plant's nuclear background. Following our finding that certain promoter-like elements are present in the cDNA of TSWV M RNA, we used Arabidopsis to test whether these elements might be functional in a plant host. Our results showed that the putative promoter elements we identified were indeed light and auxin responsive when expressed as cDNA in the host nuclear background. A mutation in CBS was sufficient to switch the pGN/GC activity from light-suppressive to light-inducible, demonstrating the functional validity of pGN/GC in the host plant. Similarly, the promoter activity of pGN/GC was no longer suppressed by auxin when AuxRE was mutated. It will be interesting to investigate which element (SORLIP1, T-box or GATA) is the principal contributor to the light-inducible characteristic of pGN/GC-CBSm. It is also possible that these elements simply have an additive effect on light response.
Since TSWV is an RNA virus that completes its life cycle in the cell cytoplasm, the pGN/GC fragment exists in the native TSWV genome as RNA, and thus cannot act as a traditional promoter for the initiation of transcription. Our unpublished data also indicated that no DNA form of pGN/GC could be amplified by PCR from TSWV-infected plant tissues. The replication process of TSWV is not fully understood. It could be that the putative cis-regulatory elements in the M RNA regulate the replication of M RNA or the expression of GN/GC mediated by TSWV RdRp. Unlike ARFs, which are localized in the nucleus, CCA1 can be detected in both the nucleus and the cytoplasm (Yakir et al., 2009). Therefore, the CCA1-CBS interaction may also be involved in replication. Numerous lines of evidence from research on other virus systems showed that cis-RNA elements are indispensable for RdRp-mediated viral RNA synthesis (You et al., 2004; Filomatori et al., 2006; Sun and Simon, 2006; Hu et al., 2007). Thus, it is possible that the identified cis-elements in pGN/GC have a regulatory function. However, due to the lack of tospoviral infectious clones, it is difficult to test potential protein interactions with the RNA genome of TSWV in vivo. The majority of CCA1 protein molecules enter the plant nucleus rapidly after their biosynthesis (Yakir et al., 2009), which increases the difficulty of investigating the function of CCA1 in the cytoplasm.
Another hypothesis is that TSWV may have obtained these cis-acting elements from the host plants during viral evolution. There is evidence that RNA viruses may acquire host-derived sequences during reverse transcription, illegitimate recombination with retrotransposons, and host genome integration-excision processes (Geuking et al., 2009). It is possible that TSWV might have undergone a similar route to acquire sequences from the host. Moreover, TSWV uses a cap-snatching mechanism for transcription initiation, which may also help the acquisition of host sequences. After being integrated into the virus genome, these elements may or may not have significant impact on the biological activities of TSWV in nature. The observation that CBS, AuxRE, and transcription initiation sites are not present in all the known TSWV isolates may indicate that these elements may not be critical to the virus and may get lost in subsequent selection and mutation events.
To our knowledge, this is the first report that the 5′ region of a viral RNA genome contains motifs suggestive of promoter elements. These elements were found to be light- and hormone-responsive when expressed as cDNA in a plant. It remains to be seen whether these elements have a role in the virus life cycle. Overall, our findings provide new clues on the origin and evolution of a bunyavirid genome sequence, and serve as a starting point to dissect the functions of end-genome or non-translated regions in RNA viruses.

DATA AVAILABILITY
All datasets for this study are included in the manuscript and the Supplementary Files.