Data on small cardamom transcriptome associated with capsule rot disease

Small cardamom (Elettaria cardamomum (L.) Maton), also known as the ‘Queen of Spices’, is a rhizomatous herbaceous monocot of the family Zingiberaceae. In the present study, transcriptome sequencing was performed on leaf tissues of control and disease-stressed small cardamom plants using HiSeq™ 2000 RNA sequencing technology. RNA-seq generated 46,931,637 and 31,682,496 raw reads (101 bases each), totalling 9.93 GB and 6.63 GB of sequence data, for the control and stressed samples, respectively. The raw data were submitted to the NCBI SRA database under the accession numbers SRX2512359 and SRX2512358 for the control and diseased samples, respectively. The raw reads were quality filtered and assembled using the TRINITY de novo assembler, which produced 111,495 (control) and 91,096 (diseased) contigs with N50 values of 3013 (control) and 2729 (stressed). The data were further used to identify significantly differentially expressed unigenes between the control and stressed samples. Assembled unigenes were further annotated and evaluated in silico to predict their function using publicly available databases and gene annotation tools.



Data
The data shared in this article comprise paired-end, strand-specific RNA-seq raw reads: 46,931,637 and 31,682,496 reads (101 bases each), totalling 9.93 GB and 6.63 GB of sequence data, for the cardamom control and stressed samples, respectively.

Plant material
Leaf tissues were collected from both sets of plants, i.e., plants naturally infected with capsule rot and non-infected control plants, and were immediately frozen in liquid nitrogen. For each of the two conditions, leaf tissues from ten biological replicates were pooled [3].

Total RNA isolation and transcriptome sequencing
Total RNA was extracted using a modified protocol combining the RNeasy Plant Mini Kit (Qiagen) and the CTAB method [4]. RNA integrity and quality were assessed using a 2100 BioAnalyzer (Agilent Technologies). Illumina sequencing was performed on the HiSeq™ 2000 platform as per the manufacturer's instructions (Illumina, San Diego, CA). RNA-seq generated paired-end, strand-specific raw reads: 46,931,637 and 31,682,496 reads (101 bases each), totalling 9.93 GB and 6.63 GB of sequence data, for the cardamom control and stressed samples, respectively.
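Read counts and total sequenced bases such as those quoted here can be tallied directly from FASTQ files. The following is a minimal sketch, not the pipeline used in this study; it assumes standard four-line FASTQ records with unwrapped sequence lines, and the function name fastq_stats and the toy records are hypothetical.

```python
from io import StringIO
from typing import Iterable, Tuple


def fastq_stats(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of reads, total bases) for four-line FASTQ records.

    Assumption: every record occupies exactly four lines
    (header, sequence, '+', qualities) and sequences are not wrapped.
    """
    n_reads = 0
    n_bases = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            n_reads += 1
            n_bases += len(line.strip())
    return n_reads, n_bases


# Toy input invented for illustration (not data from this study).
example = StringIO("@r1\nACGTAC\n+\nIIIIII\n@r2\nGGCA\n+\nIIII\n")
reads, bases = fastq_stats(example)
print(reads, bases)
```

For compressed files, the same function could be given a handle opened with gzip.open(path, "rt") from the Python standard library, where path is whatever local file name the downloaded reads were saved under.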

De novo transcriptome assembly and functional annotation
The raw reads were pre-processed to remove adapter sequences, low-quality bases, tRNAs and rRNAs. De novo transcriptome assembly was performed with the TRINITY program [5] to generate the assembled contigs. The assembler created 111,495 and 91,096 contigs for the control and stressed cardamom samples, respectively (Table 1). The assembled unigenes were used for downstream analyses such as annotation against publicly available databases, Gene Ontology (GO) enrichment and, finally, validation of differentially expressed genes using qPCR. Additionally, the reads from both samples were combined and assembled together to generate a reference transcriptome (162,589 contigs, 310.7 MB). The information provided by the current study might be useful for developing molecular markers and SNPs, screening R genes, and marker-assisted selection to develop superior cardamom cultivars.

Specifications Table
Subject: Agricultural and Biological Sciences
Specific subject area: Plant Science
Type of data: Text (FASTQ sequence files)

Value of the Data
Capsule rot disease, commonly known as Azhukal disease, is reported to be one of the most serious fungal diseases of small cardamom. It is caused by Phytophthora meadii [1] and often leads to annual losses of 30-40% [2].
Under fungal infection, R genes and many other defense-related genes that confer disease tolerance on plants may be overexpressed.
Transcriptome data generated from leaves of plants grown under specific conditions could provide information on the molecular mechanisms underlying disease tolerance.
Differential expression analysis of control and diseased cardamom allows the expression of particular genes to be compared between normal and diseased plants grown under similar conditions.
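The N50 values reported for the control and stressed assemblies summarise contig length: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases. The sketch below illustrates that standard definition only; it is not the script used in this study, the function name n50 is hypothetical, and the contig lengths are invented toy values.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running sum first reaches half of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached at least 50% of all bases
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy contig lengths invented for illustration (not data from this study).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # 10 + 9 + 8 = 27, half of the total 54, so N50 is 8
```

In practice the lengths would be read from the assembled FASTA file, one value per contig, before being passed to the function.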