Virus‐induced gene silencing database for phenomics and functional genomics in Nicotiana benthamiana

Abstract Virus‐induced gene silencing (VIGS) is an important forward and reverse genetics method for the study of gene function in many plant species, especially Nicotiana benthamiana. However, despite the widespread use of VIGS, a searchable database compiling the phenotypes observed with this method is lacking. Such a database would allow researchers to know the phenotype associated with the silencing of a large number of individual genes without experimentation. We have developed a VIGS phenomics and functional genomics database (VPGD) that has DNA sequence information derived from over 4,000 N. benthamiana VIGS clones along with the associated silencing phenotype for approximately 1,300 genes. The VPGD has a built‐in BLAST search feature that provides silencing phenotype information of specific genes. In addition, a keyword‐based search function could be used to find a specific phenotype of interest with the corresponding gene, including its Gene Ontology descriptions. Query gene sequences from other plant species that have not been used for VIGS can also be searched for their homologs and silencing phenotype in N. benthamiana. VPGD is useful for identifying gene function not only in N. benthamiana but also in related Solanaceae plants such as tomato and potato. The database is accessible at http://vigs.noble.org.

Tobacco rattle virus (TRV)-based vectors are widely used for VIGS in solanaceous plants. TRV has a bipartite genome, TRV1 and TRV2, and both components are required for viral replication and movement (Liu, Schiff, Marathe, & Dinesh-Kumar, 2002; Senthil-Kumar & Mysore, 2014). The TRV-VIGS-based fast-forward genetics approach has been widely used in N. benthamiana to identify plant genes involved in disease resistance, Agrobacterium-mediated transformation, flower development, and coronatine/victorin-induced cell death (Chakravarthy et al., 2010; del Pozo, Pedley, & Martin, 2004; Gilbert & Wolpert, 2013; Kaundal et al., 2017; Lee et al., 2017; Lu et al., 2003; Rojas et al., 2012; Senthil-Kumar et al., 2013; Wangdi et al., 2010). These studies have generated phenotypic data for a large number of gene-silenced plants. However, these data are not available to researchers on a single platform. As a first step toward integrating these data, we developed a "VIGS phenomics and functional genomics database" (VPGD) that compiles data from the silencing of 4,117 N. benthamiana genes. Approximately 1,000 of these genes produced a visible phenotype when silenced, and these phenotypes are described in our database. These data will enable researchers to determine phenotypes associated with individual gene knockdowns without performing an experiment. We expect that the VPGD will be a useful

| Recording phenotype information
Phenotype information was recorded between 2 and 4 weeks after TRV inoculation. During this period, all visible phenotypic symptoms were systematically recorded at intervals of several days and compared with those of vector-only inoculated plants. Composite information from observations that showed consistent phenotypes throughout development was finalized, and photographs were taken. VIGS was repeated for the selected clones showing a phenotype of interest to confirm the response. This second-level screening was carried out to eliminate false positives from the first screen.

| VIGS database
Nicotiana benthamiana mixed elicitor (NbME) (del Pozo et al., 2004)

To determine the identity of the cDNA sequence in each TRV2 clone, NbME and NbTI cDNA libraries were sequenced by the Sanger method. cDNA inserts from the TRV2 clones were PCR amplified from each well using vector-specific primers and electrophoresed on an agarose gel to ensure a single band would represent a single colony. Only the colonies showing a single insert were selected for plasmid purification and sequenced using vector-specific primers. Resulting sequences were processed to remove vector sequences and submitted as EST sequences to NCBI and incorporated into the VPGD. In total, we added 2,779 and 1,332 ESTs from the NbME and NbTI libraries, respectively.
Further, these sequences were annotated and classified according to predicted gene function (Figure 2). Each gene sequence, its annotation, and phenotype were matched and were incorporated into VPGD.
To annotate the EST sequences, we used BLASTX to compare them against three databases (i.e., Arabidopsis protein sequences, tomato protein sequences, and NCBI protein sequences from all plant species). The top hits with E-values lower than 1e-10 were kept, and the related Gene Ontology (GO) term and function description (Jain et al., 2013) were used for annotation.
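The filtering step described above can be illustrated with a minimal sketch. It assumes the BLASTX results were saved in the standard 12-column tabular format (`-outfmt 6`); the source does not state which output format was used, and the query and subject identifiers below are made-up placeholders. Only the 1e-10 cutoff comes from the text.

```python
import io

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text


def best_hits(handle, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: best hit} keeping only hits with E-value below cutoff."""
    best = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, line.split("\t")))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] >= cutoff:
            continue  # hit is not significant enough
        current = best.get(hit["qseqid"])
        # Prefer the lower E-value; break ties with the higher bit score.
        if current is None or (
            (hit["evalue"], -hit["bitscore"])
            < (current["evalue"], -current["bitscore"])
        ):
            best[hit["qseqid"]] = hit
    return best


# Hypothetical example rows (IDs and scores are invented for illustration).
example = io.StringIO(
    "est_001\tprot_A\t85.0\t120\t18\t0\t1\t360\t10\t129\t3e-40\t150\n"
    "est_001\tprot_B\t60.0\t110\t44\t0\t1\t330\t5\t114\t2e-12\t70.1\n"
    "est_002\tprot_C\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.001\t30.2\n"
)
hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["evalue"])
```

In this toy input, `est_002` is dropped because its only hit is above the cutoff, and `est_001` keeps the hit with the lowest E-value.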
For each GO ID, the VPGD provides the related GO term and associated annotation information. GO terms are widely used to understand the biological significance of genes. We used Arabidopsis and tomato annotations to categorize ESTs based on GO categories, namely molecular function, biological process, and cellular component. The GO term categories associated with N. benthamiana ESTs, derived from homology to tomato, are shown in Figure 3. In biological processes, "cell organization and biogenesis," "other cellular processes," and "protein metabolism" were the most dominant terms with 47%, 26%, and 16% of ESTs, respectively. In the cellular component category, "other membranes" and "other intracellular components" contributed to 70% of the annotations. In the molecular function category, "protein binding" (29%), "other binding" (15%), and "other enzyme activity" (13%) were the most represented classes.
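As a rough illustration of how such category shares can be tallied, the sketch below counts GO terms over a list of (EST, term) pairs and reports each term's percentage of all annotations. The denominator is an assumption: the source does not say whether its percentages are relative to annotations or to ESTs. The EST identifiers are placeholders.

```python
from collections import Counter


def term_shares(annotations):
    """Return {term: percent of all annotations}, most frequent term first."""
    counts = Counter(term for _est, term in annotations)
    total = sum(counts.values())
    return {term: round(100 * n / total, 1) for term, n in counts.most_common()}


# Hypothetical annotations; only the term names are taken from the text.
example = [
    ("est_001", "protein binding"),
    ("est_002", "protein binding"),
    ("est_003", "other binding"),
    ("est_004", "other enzyme activity"),
]
shares = term_shares(example)
for term, percent in shares.items():
    print(f"{term}: {percent}%")
```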

| Site usage
The home page tab provides information about the utility of the database and brief information about the contents of the web site.

Phenotype descriptions: NbTI06E09 silencing shows severely stunted plants, crinkled leaves, reduced apical growth, and severe cell death on top leaves; NbME14A8 silencing shows stunted, bushy plants and albino green leaves; NbME12B6 silencing shows stunted, bushy plants, green-white mottled, crinkled leaves, and spotted cell death on leaves; NbME12B7 silencing shows moderately stunted plants and yellow leaves; NbME12B10 silencing shows severely stunted plants with thick, mosaic leaves; NbTI02D02 silencing shows cell death. Four replicates were carried out for each experiment, and two independent experiments were performed.

3.2 | Mode of data collection, deposition, and database construction
The VPGD web site was constructed using PHP scripts and an Apache server combined with a MySQL database on a Linux system. In this web site, the sequences and related silencing phenotype information of ESTs or genes derived from two N. benthamiana cDNA libraries (NbTI and NbME) were collected, annotated, and finally imported into the MySQL database (Figure 4). To provide more details for the GO terms, we also downloaded the full GO annotation database from http://geneontology.org and integrated it into the VPGD. After compiling all the data, the detailed annotation information of each EST, including phenotype descriptions, gene function annotations, GO IDs, and their related GO annotation information, was added to the database. We also set up a BLAST server and used NbME and NbTI EST sequences as the target database. Users can search the ESTs of interest using this BLAST server.
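A keyword search over phenotype descriptions in a relational store can be sketched as follows. This is an illustration only: it uses Python's built-in SQLite module in place of the PHP/MySQL stack named above, and the table name, column names, and `search_phenotype` helper are hypothetical rather than the VPGD schema. The three example rows reuse clone names and phenotype descriptions quoted in the text.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL backend (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clone (clone_id TEXT PRIMARY KEY, library TEXT, phenotype TEXT)"
)
conn.executemany(
    "INSERT INTO clone VALUES (?, ?, ?)",
    [
        ("NbME12B6", "NbME",
         "stunted, bushy plants, green-white mottled, crinkled leaves, "
         "and spotted cell death on leaves"),
        ("NbME12B7", "NbME", "moderately stunted plants and yellow leaves"),
        ("NbTI02D02", "NbTI", "cell death"),
    ],
)


def search_phenotype(connection, keyword):
    """Return clone IDs whose phenotype description contains the keyword."""
    # Parameter binding avoids SQL injection; LIKE wildcards (% and _)
    # inside the keyword itself are not escaped in this sketch.
    rows = connection.execute(
        "SELECT clone_id FROM clone WHERE phenotype LIKE ? ORDER BY clone_id",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]


print(search_phenotype(conn, "crinkled"))
print(search_phenotype(conn, "cell death"))
```

A substring match is the simplest option; a production database would more likely use a controlled keyword list or a full-text index so that related terms map to the same phenotype.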

| Accessing VPGD
The VIGS phenomics and functional genomics database is accessible

| DISCUSSION
The VPGD was specifically designed to facilitate the interaction between the user and the software. For example, the scheme of information presented in all pages is consistent and there is simple navigation to each category within the database. The results of gene silencing in N. benthamiana for a large number of genes can be searched using different methods, including nucleotide sequence, phenotype description, keywords, gene names, NCBI IDs, and in-house assigned IDs (Figure 5). In particular, we have generated a list of keywords describing phenotype features, such as "crinkled," "mot-

FIGURE 4 Pipeline of the VIGS database construction. The framework for the VIGS database along with its content organization is presented. The database was constructed based on three inputs, namely two EST library sequences, photographs of visual phenotypes, and phenotype descriptions. These inputs were processed as indicated in the middle panel. Specifically, the nucleotide sequences were annotated and a BLAST search was constructed. Phenotype information was organized and presented in a searchable format. Users are able to perform the functions as depicted in the right-most panel.

FIGURE 5 Screenshot of a few tabs from the database and description of contents. This screenshot is taken from the VPGD web site. It has four tabs, namely Home, VIGS Database, Material Request, and BLAST. The Home tab contains the basic description of the database including the background and relevant literature. The VIGS Database page displays all information related to the clone upon clicking the clone name or NCBI ID or after a keyword search. The display includes silencing phenotype description, gene name annotation, GO terms, EST sequence, and photograph of silencing phenotype. The Material Request tab provides details of construct availability and biosafety information. The BLAST tab has a built-in BLAST page. A query sequence can be used to search for homologs in the database, and this will show all the information available for the desired clone. The red arrows indicate key information provided in the database and guide readers to the descriptions in the legend.

N. benthamiana under drought using VIGS (Senthil-Kumar, Govind, Kang, Mysore, & Udayakumar, 2007). Considering the large number of genes and their silenced phenotypes available in the database, the potential biological function of a bioinformatically predicted gene can be identified. Many plants did not show a visible phenotype upon silencing, and this could be due to inefficient gene silencing. Further, redundancy in the function of many genes and plasticity in metabolic pathways are other reasons for the absence of a visible phenotype in some of the silenced plants. Conversely, some of the gene silencing phenotypes reported in the database could be due to off-target silencing. Therefore, researchers should further validate the gene-silenced phenotype. This database only provides a starting point for gene functional analyses. Researchers working with organisms other than plants can also use the database to assess the functional relevance of orthologous genes in their species of interest. For example, a nucleotide BLAST search of the VIGS database with the heat-shock protein 70 (HSP70, AJ001365) sequence from Drosophila auraria returned five hits matching HSP70 proteins. One of the top hits among the N. benthamiana genes listed in the database was NbHSP70 (NbTI07E09). Sequence annotation information with GO terms clearly indicated the putative function of this protein in the plant. Further, the gene-silenced plants were stunted and showed pale yellow leaves along with spotted cell death, and a similar phenotype was previously reported in NbHSP70-silenced plants (Kanzaki et al., 2003; Senthil-Kumar, Govind, et al., 2007).
This suggests that NbHSP70 could be involved in a basal metabolic process that is required to maintain growth and normal cellular activities of a plant. Indeed, HSP70 has been shown to be an important protein needed for cellular function in many plants and animals (Mayer & Bukau, 2005).
Unique features of the VPGD are as follows: (i) it provides ready access to gene-to-phenotype information; (ii) it has the potential to provide functional relevance of genes from over 20 species of the Solanaceae family; (iii) it provides multiple uses for gene sequence information, namely functional annotation (GO terms), phenotype description upon gene silencing, and a VIGS construct for a particular gene sequence; (iv) it has three input options, namely gene ID/NCBI ID, sequence, and phenotype keywords, to search for user-desired information; (v) it has a built-in BLAST search along with a detailed display of results; (vi) it provides access to additional tools such as off-target prediction, a VIGS tool, and links to sequences/GO terms; and (vii) it indicates the availability of gene silencing constructs.

| CONCLUSIONS
The VIGS phenomics and functional genomics database (VPGD) is a unique resource that hosts large-scale phenotypic information.
Specifically, the database provides one-stop access to genotype-to-phenotype information for over one thousand genes in N. benthamiana and closely related plant species. VIGS is a robust method for generating phenomic data for a large number of genes in a short time span. VPGD can serve as a model for developing phenomics databases for other plant species. The aim of VPGD was to provide information on putative gene function and silencing phenotypes, without performing an experiment, for a wide range of plant species within the Solanaceae family. VPGD provides putative gene function information for a large number of genes for plant species that have limited or no genetic resources (e.g., mutant collections).

ACKNOWLEDGMENTS
Authors would like to acknowledge Ms.

CONFLICT OF INTERESTS
The authors declare that they have no competing interests.

AUTHOR CONTRIBUTIONS
KSM and MS-K framed the concept and coordinated the project.