De novo transcriptome sequencing and comprehensive analysis of the drought-responsive genes in the desert plant Cynanchum komarovii

Background Cynanchum komarovii Al Iljinski is a xerophytic plant species widely distributing in the severely adverse environment of the deserts in northwest China. At present, the detailed transcriptomic and genomic data for C. komarovii are still insufficient in public databases. Results To investigate changes of drought-responsive genes and explore the mechanisms of drought tolerance in C. komarovii, approximately 27.5 GB sequencing data were obtained using Illumina sequencing technology. After de novo assembly 148,715 unigenes were generated with an average length of 604 bp. Among these unigenes, 85,106 were annotated with gene descriptions, conserved domains, gene ontology terms, and metabolic pathways. The results showed that a great number of unigenes were significantly affected by drought stress. We identified 3134 unigenes as reliable differentially expressed genes (DEGs). During drought stress, the regulatory genes were involved in signaling transduction pathways and in controlling the expression of functional genes. Moreover, C. komarovii activated many functional genes that directly protected against stress and improved tolerance to adapt drought condition. Importantly, the DEGs were involved in biosynthesis, export, and regulation of plant cuticle, suggesting that plant cuticle may play a vital role in response to drought stress and the accumulation of cuticle may allow C. komarovii to improve the tolerance to drought stress. Conclusion This is the first large-scale reference sequence data of C. komarovii, which enlarge the genomic resources of this species. Our comprehensive transcriptome analysis will provide a valuable resource for further investigation into the molecular adaptation of desert plants under drought condition and facilitate the exploration of drought-tolerant candidate genes. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1873-x) contains supplementary material, which is available to authorized users.


Background
Drought is one of the most severe and wide-ranging environmental abiotic stresses that significantly threats to agriculture and influences germination, growth, development of plants [1]. In many plant species, seed germination and subsequent seedling growth are the stages most sensitive to environmental stresses [2]. Furthermore, seedling establishment could be particularly affected by the drought stress. The response to drought stress is a sophisticated process in plant, which involves thousands of protein-coding genes and biochemical-molecular mechanisms [3,4]. Exposed to drought stress, plants appear to activate a diverse set of physiological, metabolic and defense systems by altering genes expression patterns for responding and adapting to stress. Drought resistance in plants is based on the expression of several stress-inducible genes, which are generally classified into two groups: those encoding function proteins directly protect plants against environmental stresses; and those encoding regulatory proteins [5,6]. Many studies on the responses to drought from genes to the whole plant have been performed [3]. However, the knowledge about drought response in desert plants was only studied in some species [7][8][9].
Cynanchum komarovii Al Iljinski is a perennial, erect, caespitose herbaceous plant belonging to the family of Asclepiadaceae, mainly distributes in the semi-fixed sand dunes of secondary desert and in the quicksand areas of harsh deserts of Inner Mongolia and Ningxia Autonomous Regions, as well as Gansu Province. C. komarovii, being the indicator species of the last stage of grassland retrogressive succession, adapts to the dry and barren environment in the desertification process and plays a vital role in maintaining desert ecosystems [10]. As a wild desert plant, C. komarovii possesses strong resistance to abiotic stresses, including drought, UV, high light and high temperature, which makes it a valuable non-model species to investigate the stress tolerance mechanisms and to discover the stress-tolerance candidate genes [11]. According to previous studies, seedlings of C. komarovii treated with high concentrations of polyethylene glycol-6000 (PEG6000), showed a relative water content (RWC) decrease and malondialdehyde (MDA) increase, and showed the increases in activities of superoxide dismutase (SOD), peroxidase (POD), and catalase (CAT) [12]. Meanwhile, the contents of betaine, starch, proline, and soluble sugar (i.e., fructose, sucrose, trehalose) increased obviously, suggesting that the accumulation of some osmoprotectant may play an important role in osmotic adjustment during drought treatment and the introduction of osmoprotectant synthesis pathways may be a potential strategy for improving the stress tolerance of desert plant [13][14][15]. Therefore, it would be beneficial and imperative to perform studies whether the xerophytes have unique adaptive strategies in the morphological, physiological and molecular traits. However, the knowledge about the molecular mechanism of drought tolerance in C. komarovii is still poorly understood, due to lack of detailed genetic and sequence information for this non-model plant.
During the last few years, the next generation sequencing (NGS) technology has provided opportunities for the efficient, economical, and comprehensive analysis of plant with and without a reference genome in greater depth than ever before [16]. This powerful highthroughput sequencing technology has been widely applied to transcriptome sequencing, which is an approach to generate functional genomic data, and to discover DEGs in different cultivars, organs, or different treatment conditions [17][18][19]. To better understand the drought tolerance molecular mechanisms of C. komarovii, we performed a transcriptome sequencing using Illumina HiSeq ™ 2000 sequencing platform and compared the transcriptome of drought-treated and control plants to explore the global transcriptional changes in root tissues, and to identify candidate genes involved in drought tolerance. This is the first large-scale reference sequence data of C. komarovii, which enlarge the genomic resources of this species and will facilitate the further novel genes discovery research in C. komarovii.

Illumina sequencing and reads assembly
To investigate the transcriptomic responses to droughtstress in C. komarovii, four cDNA libraries were generated from mRNA of drought-treated (DT) and control (CK) samples, which were sequenced using Illumina deep-sequencing HiSeq ™ 2000. All reads were deposited in the National Center for Biotechnology Information (NCBI) and can be accessed in the Short Read Archive (SRA) database under accession number SRP057292. Among all the raw reads, 97 % had Phred-like quality scores at the Q20 level (an error probability of 1 %). After removing adapters, low-quality sequences and ambiguous reads, we obtained approximately 70 million, 68 million, 66 million and 70 million clean reads from the drought-treated samples (DT_1 and DT_2), and control samples (CK_1 and CK_2), respectively (Table 1). Totally, 275,425,782 clean reads (including CK and DT samples) were used to assemble the transcriptome data using the Trinity method. Using overlapping information in high-quality reads, 215,238 transcripts and 148,715 assembled unigenes were generated (Table 2). Over 83 % reads could be mapped back to the assembled transcripts, indicating a high quality of assembly. All unigenes were longer than 200 bp and 102,013 of them were 200 to 500 bp. Also, 20,368 of unigenes were longer than 1000 bp. The length distribution of the transcripts and unigenes is shown in Fig. 1.

Functional annotation and classification of the unigenes
All assembled unigenes were aligned against the public databases including non-redundant protein (Nr) database, non-redundant nucleotide (Nt) database, Pfam (Protein family), Swiss-Prot, Gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (KOG/COG). The number and percentage of unigenes annotated by each database is summarized in Table 3. A total of 20,288 unigenes were annotated in all four databases (Fig. 2). Notably, we found that 42.78 % of the unigenes were un-identified, which was common among previous studies performed using the de novo sequencing strategy (40.70 %, 62.48 %, and 45.73 %) [7,19,20]. Firstly, there is insufficient genome and EST information for C. komarovii. Secondly, the bioinformatics technical limitations including sequencing depth and read length are also the possible reason. Also, we found that the short sequences (200-500 bp) had a large percentage among the all unigenes (68.60 %).
The GO terms for functional categorization was performed according to the Nr annotation. The main GO terms included biological process (BP), cellular component (CC), and molecular function (MF). Based on sequence homology, 66,895 unigenes were mainly categorized into 44 functional groups (Fig. 3, Additional file 1). The GO analysis indicated that a great number of identified genes were associated with various biological processes and molecular functions under drought. In the category of BP, the largest groups were cellular process and metabolic process. About 39,481 genes were annotated as the metabolic process category, which may allow the identification of novel genes involved in secondary metabolism pathways in drought acclimation. As for the MF category, unigenes with binding and catalytic activity formed the largest groups. For CC, the top three largest categories were cell, cell part, and organelle.
To further evaluate the reliability of our transcriptome results and the effectiveness of our annotation process, we searched the annotated sequences for genes with COG classifications. Out of 71,243 Nr hits, 38,709 sequences showed a COG classification (Fig. 4). Among the 26 COG categories, the cluster for "Translation" (5,939) represented the largest group, followed by "Post-translational modification, protein turnover, chaperon" (5,713), "General Functional Predication only" (4,921), and "Signal Transduction" (4,364). The categories "Cell motility" (35) was the smallest group.
To obtain a better understanding of the biological functions of the unigenes, we used the annotated sequences to search against the KEGG database. Among the 85,106 unigenes, 29,737 had significant matches and were assigned to 263 KEGG pathways (Fig. 5, Additional file 2). Of the 29,737 unigenes, the most strongly represented pathways were metabolism pathways. These annotations provide a resource for investigating the processes and pathways involved in responding to drought stress.

Protein coding sequence prediction
All-unigenes were aligned against the protein databases and we obtained 72,537 coding sequences (CDS) from unigenes sequences and translated them into peptide sequences. Using the EST Scan, we assigned 54,145 unigenes CDS that could not be aligned to the protein databases and translated them into peptide sequences. Of these unigenes with CDS sequences, the majority (26,965) were over 500 bp and 11,868 were over 1000 bp. The transcriptome CDS predicted by BLASTx and ESTScan is shown in Additional file 3.

DEGs among the drought stress
The analysis showed that a great number of unigenes were significantly affected upon treatment of C. komarovii seedlings with PEG solution. The Venn diagram showed the number of specific genes between the two treatments ( Fig. 6). Also, we identified 3,134 unigenes  as DEGs including 601 unigenes up-regulated and 2,533 unigenes down-regulated ( Fig. 7, Additional file 4). The cluster analysis of DEGs between CK and DT samples were identified using a heat-map (Additional file 5).

GO and KEGG enrichment analyses of DEGs
Enrichment analysis was performed to illustrate the biological functions of the identified DEGs. In total, 2,424 DEGs were enriched in 27 GO terms. We used upregulated and down-regulated DEGs to perform GO enrichment analysis, respectively. Among the up regulated genes, three GO terms including "oxidation-reduction process" (120) and "single-organism metabolic process" (185) in the category of BP; "oxidoreductase activity" (113) in the category of MF were significantly enriched after drought treatments, indicating that genes in these processes may play pivotal roles in drought sensing (Additional file 6-A). Among the down regulated genes, GO terms including "protein metabolic process" (542), "cellular protein metabolic process" (437), "cellular component organization or biogenesis" (328), and "cellular response to stimulus" (265) in the category of BP; "macromolecular complex" (545) and "cytoplasm" (508) in the category of cellular component; "protein binding" (501) and "hydrolase activity" (484) in the category of MF were significantly enriched between DT and CK samples, suggesting that genes related to these processes may be suppressed in drought perception (Additional file 6-B). Furthermore, in  the category of BP, DEGs with GO terms "cellular response to stimulus" suggesting drought treatment may have caused an efficient abiotic stress.
In total, 2,800 DEGs were enriched in KEGG pathways. The most enriched pathways were "metabolic pathways" (319), "Ribosome" (208), and "Biosynthesis of secondary metabolites" (154). We depicted the scatterplot comparing the KEGG pathway enrichment of the up regulated DEGs and down regulated DEGs, respectively (Additional files 7 and 8). In the KEGG pathway analyses, the genes involved in "cutin, suberine and wax biosynthesis" changed significantly, suggesting this pathway may play a vital role in protecting plants under drought treatment.

Frequency and distribution of Simple Sequence Repeats
In genetics, a molecular marker is a fragment of DNA that is associated with a certain location within the genome, so it is important for gene mapping and molecular breeding. To develop new molecular markers, the unigenes generated in the C. komarovii transcriptome were used to mine potential microsatellites that were defined as di-to hexa-nucleotide motifs. Among the 148,715 examined sequences (89,843,452 bp total length) 36,246 SSRs were identified and the number of SSRs containing sequences reached 25,027. Additionally, 7855 sequences contained more than 1 SSRs. In the C. komarovii transcriptome, the SSRs frequency was 24.37 % and the distribution density was 2.48 per kb. In this study composed, all SSRs loci based on the repeat motifs were divided into mono-nucleotide (20,304), di-nucleotide (9,505), tri-nucleotide (6,025), tetra-nucleotide (374), penta-nucleotide (15) and hexa-nucleotide (23), respectively (Table 4, Fig. 8).
We counted the frequency of SSRs with different numbers of tandem repeats and the results were shown in Table 5. The most abundant type was SSRs with six tandem repeats, followed by five tandem repeats, seven tandem repeats, eight tandem repeats, nine tandem repeats, ten tandem repeats, and more than 10 tandem repeats. The dinucleotide repeat motif AT/AT was the most abundant, followed by AG/CT, AAG/CTT, AAT/ ATT, AC/GT, and AGC/CTG. The least repeat motif in SSRs was CG/CG (Table 5).
For each SSR the repeat unit, length and location were shown in Additional file 9. The results showed that 3'UTR SSRs (7,046) was the most abundant, followed by 5' UTR SSRs (5,511) and CDS SSRs (1,412). The level of polymorphism detected by SSRs from different EST regions (CDS, 5' UTR and 3' UTR) varied across the different taxonomic levels. It suggested that 3' UTR SSRs were more variable at the lower taxonomic levels (more related), the 5' UTR SSRs had intermediate variability, and the CDS SSRs tend to differentiate at the higher taxonomic level [21]. To promote wide usage of the SSRs markers as a resource for gene mapping and molecular breeding, we designed primers for each of the SSRs (Additional file 9).

Validation and expression pattern analysis
To further validate the reliability of the Illumina sequencing reads analysis, 13 candidate DEGs were selected and their expression profiles were compared between CK and DT samples using quantitative real-time (qRT) PCR. Among the chosen DEGs, CCAAT-binding nuclear transcription factor Y subunit B-3 (comp99898_c0), late embryo zinc-finger protein (comp115030_c0), heatshock proteins 70 (HSPs) (comp117600_c5), ethyleneresponsive transcription factor (comp114730_c4), WRKY1b transcription factor (comp107172_c1), nitrate transporter (comp111651_c0), homeobox leucine zipper protein (comp106817_c0), WAX2-like protein (comp117767_c0) have been known to be related to responding to abiotic stresses. The expression pattern of all these DEGs obtained through qRT-PCR data were consistent with High-throughput RNA-sequencing (RNA-Seq) data, confirming that the Illumina results were reliable (Table 6). Regulatory genes identified in transcriptome are mainly involved in signaling transduction pathways and in controlling the expression of functional genes in stress responses, which can be divided into three categories: ionic and osmotic homeostasis signaling, detoxification response, and growth regulation pathways [22]. DEGs involved in signaling transduction and regulation are listed in Table 7.
In the ionic signaling pathway, calcium signaling has been implicated in plant responses to a range of abiotic stresses including chilling, drought, salinity, heat shock, oxidative stress, anoxia, mechanical perturbation [23]. In this study, 21 DEGs encoding Ca 2+ binding proteins were detected in the C. komarovii transcriptome. Stressinduced changes in the cytosolic concentration of calcium ion is thought to be the primary stimulus-sensing signal that is transduced via calmodulins (CaMs), the calcium/calmodulin-dependent protein kinases (CaM kinase/CDPKs), the calcineurin B-like proteins (CBLs) and other calcium binding proteins to effect the downstream responses involved in the protection of the plant and adaptation to the new environmental stimuli [24,25].
Osmotic stresses activate several protein kinases and phosphatases mediating osmotic homeostasis or detoxification responses [22]. Here we detected 44 DEGs involved in mitogen-activated protein kinase (MAPK) signaling pathway and 40 DEGs encoding phosphatase in the C. komarovii transcriptome. MAPK cascades, as a major plant transduction components, regulate numerous processes relaying downstream of receptors or sensors and transduce environmental and developmental signals  into adaptive and programmed responses [26,27]. Meanwhile, the protein phosphatases also have function in stress signaling. In C. komarovii, we identified a saltsensitive 3'-phosphoadenosine-5'-phosphatase HAL2/ SAL1 (comp115181), which had the same transcript profile with phosphatases SAL1 in Arabidopsis [28]. This may play a role in cellular retrograde signaling pathways during the drought stress.
Upon drought stress, changes in phospholipid composition are detected in plants and membrane phospholipids can activate groups of phospholipases that catalyze the formation of multiple second-messenger molecules [29]. In this pathway, 6 diacylglycerol (DAG) and 4 phospholipase C (PLC) are detected, which further play important roles during stress responses [22].
Transcription factors (TFs) are key regulators of plants, which play essential roles in regulating responses to diverse biotic/abiotic stresses and activating the downstream targets to improve the stress tolerance of plants [6]. The most abundant TFs families detected among DEGs in the C. komarovii transcriptome were C3HC4type RING finger, MYB, bZIP, and NAC. Meanwhile, TFs families such as CBF/NF-Y, C2H2-type RING finger, PHD-finger, WRKY, bHLH, GRAS, MYC, and ERF/AP2 were also observed ( Table 7). All these families have been known to be previously acting in enhancing drought tolerance and improving water-use efficiency. Thus, differentially expression of these TFs may affect the expression of functional genes in response to drought stress [30,31].
DEGs associated with the major functional groups in C. komarovii during drought stress During the environmental stresses, plants activate many functional genes that directly protect plants and improve tolerance to adapt adverse environment. Functional genes group that can be comprised of several categories: protection factors of macromolecules (HSPs, LEA proteins, flavonoids, osmotin, and universal stress protein, etc.), membrane related protein (water channels and transporters, etc.), osmolyte biosynthesis (proline, betaine and  sugars, etc.), and detoxification enzymes (glutathione Stransferase, superoxide dismutase, soluble epoxide hydrolase, catalase, and ascorbic peroxidase, etc.) [30]. DEGs associated with the major functional groups are listed in Table 7.
HSPs are environmentally induced proteins that have functions in acquired stress tolerance and enable plants to cope with heat and other environmental stresses, also play a role via operating together with other stress-response mechanisms and functional components, collaborating in decreasing cellular damage [32]. In this study, 24 DEGs detected in the C. komarovii transcriptome were members of the HSPs family and these genes showed significant changes at DT group under drought treatment. Flavonoids are ubiquitous plant secondary metabolites with diverse functions in growth, development, reproduction, and adaptation to environmental stresses [33]. Exposed to the abiotic stresses, genes   Comp115458 and comp110208 were the members of the LEA proteins family with high expression under drought conditions (nearly 6-fold up-regulation). In different plant species, many LEA genes have been identified and demonstrated to be related to tolerance against water deficiency, osmotic, salt, and cold stresses [34]. In addition, several functional proteins such as universal stress protein (USP) and osmotin were also detected. Up-regulation of all these protection factors may protect C. komarovii from the stress and lead to enhance tolerance to drought stress. Drought stress may influence the expression levels of genes encoding water channels and transport related proteins. A total of 20 DEGs were annotated as transporter including ABC transporter (12 DEGs), zinc transporter (1 DEGs), sugar transporter (3 DEGs), and nitrate transporter (4 DEGs), which were significantly affected by abiotic stresses. In Arabidopsis, the function of ABCG25, ZTP29, and NRT1.1 in stomatal opening, regulation of ion channels, and response to stresses have been well studied [35][36][37]. Aquaporins act as water channels, which play a vital role in plant water relations and response to drought. In this study, all of the 12 DEGs encoding aquaporins were down-regulated, which may be a mechanism to reduce membrane water permeability and to allow cellular water conservation during drought stress [38].
To prevent water loss from the cell and protect proteins, plants as accumulate organic osmolytes, including proline, trehalose, sucrose, raffinose, betaine, and mannitol [5]. In previous studies, the contents of betaine, starch, proline, and soluble sugar (i.e., fructose, sucrose, and trehalose) shown significant increases in C. komarovii under drought stress. In this study, 28 DEGs involved in biosynthetic pathways for these osmolytes, suggesting that the increased concentration of these osmolytes play an important role in maintaining cell turgor and conserving water in acute stress.
Detoxification enzymes such as glutathione S-transferase, superoxide dismutase, soluble epoxide hydrolase, catalase, glutathione peroxidase, and ascorbate peroxidase are involved in protection of cells against reactive oxygen species [39]. In this study, there were 30 DEGs detected belonging to detoxification enzymes group. Interestingly, the results indicated that most of these genes were down-regulated during drought stress. Several studies have reported that the activity and expression level of the detoxification enzymes can be induced by the drought stress. However, this induction would be damaged inevitably under heavy stresses conditions, suggesting that drought treatment might lead a severe impact on the detoxification enzymes systems.

DEGs related to formation of plant cuticle under drought stress
Plant cuticle, which can be divided into the inner cutin and outer wax, is the hydrophobic protection layer against water loss and protects plants from the deleterious effects of light, temperature, osmotic stress, high salinity, and physical damage [40,41]. When Arabidopsis is subjected to abiotic stresses (i.e., water deficit, NaCl, or ABA treatments), shown a significant accumulation in leaf cuticle lipids and these stress treatments led to higher amount in wax and alkanes. However, the increases in total cutin monomer amount (by 65 %) and leaf cuticle thickness (49 %) were only observed under water deficit condition [42]. Furthermore, the views of the scanning electron microscopy (SEM) showed the increase in accumulation of wax under drought stress (Fig. 9). It suggested that plant cuticle played a vital role in response to abiotic stresses, especially drought stress and increased amount of cuticle had been associated with improved drought tolerance [42]. The cuticleassociated genes involved in the biosynthesis, export, and regulation were highly induced by environmental stresses [43]. In this study, DEGs related to plant cuticle had been identified, which can be classified into: biosynthesis, export, and regulation (Table 8, Fig. 10). The first category encoding proteins involved in cuticle biosynthesis pathway, which are composed of three stages. In the first stage, the major precursors of all cuticle aliphatic components C16 and C18 fatty acids are generated by de novo synthesis in plastids [43]. During this process, the multifunctional acetyl-CoA carboxylase (ACCase) catalyze biosynthesis of malonyl-CoA from acetyl-CoA. Then de novo synthesis of acyl chains in plastids catalyzed by fatty acid synthases (FAS) complex, which includes acyl carrier protein (ACP) as a cofactor [41]. In the second stage, C16 and C18 fatty acids are further elongated into very long chain fatty acids (VLCFAs) composed of 20 to 36 carbons [43]. The biosynthesis of VLCFAs is catalyzed by the microsomal elongase, a multi-enzyme complex, called fatty acid elongase (FAE), which involves four serial enzyme reactions: condensation by a β-ketoacyl-CoA synthase (KCS), reduction of β-ketoacyl-CoA by a β-ketoacyl-CoA reductase (KCR), dehydration of β-hydroxyacyl-CoA by a β-hydroxyacyl-CoA dehydratase (HCD), and additional reduction of enoyl-CoA to a C2 unit extended acyl-CoA by enoyl-CoA reductase (ECR) [44]. In the present study, all enzymes except KCR were detected among DEGs and the expression rates of these DEGs increased nearly 1.0-7.0 fold under drought condition. However, nine ECRs (not all shown in Table 8) were only detected in the CK groups. The KCS has substrate specificities, nevertheless, the KCR1, HCD/PAS2, and ECR/CER10 could participate in all elongation cycles. It suggests that down-regulation of ECRs may not affect the formation of VLCFAs during this process [44]. In the third stage, VLCFAs are modified into the major wax products, including alcohols, esters, aldehydes, alkanes, and ketones [43]. In the alcohol-forming pathway, two putative fatty acyl-CoA reductase 3 (FAR3/CER4) were highly induced (nearly 1.6 and 3.7 fold up-regulation) by drought stress. However, wax synthase/diacylglycerol acyltransferase (WS/DGAT) encoding gene WSD1 was only detected in CK group. In the alkane-forming pathway, a DEG annotated as ECRIFERUM 1 (CER1) and WAX2-like protein (WAX2) was detected. In Arabidopsis, a multiprotein enzyme complex including CER1, CER3/WAX2/YRE are core components of a very long chain alkane synthesis complex, which plays an important role in wax alkane synthesis [45]. Furthermore, DEGs encoding glycerol-3-phosphate acyltransferase (GPAT), long chain acyl-CoA synthetase 4 (LACS4), cytochrome P450 CYP86A1 (HORST), CYP8 6A2 (ATT1), and GDSL-like lipase were also detected and responsive to drought stress, which have been involved in suberin, cutin and wax synthesis in previous studies [46][47][48][49][50].
The second category encoding proteins involved in export of wax and cutin cuticle from endoplasmic reticulum (ER) through the plasma membrane (PM) to the outside of the cell [43]. ATP BINDING CASSET-TEG32 (ABCG32), an ABC transporter from the pleiotropic drug resistance family is required for exporting wax compounds in Arabidopsis [51]. In this study, a DEG annotated as ABCG32 was detected, suggesting that the exporting process of wax and cutin molecules might be carried by ABCG32. During the transport of cuticular lipids through the cell wall, the lipid transfer proteins (LTPs) were involved in this process. The experimental evidence indicates that non-specific LTPs (nsLTPs) have function in binding a wide variety of acyl chains, including cutin monomers. In addition, the expression of LTPs is significantly up-regulated under drought and ABA mediated stresses [52]. In this study, 5 DEGs encoding LTPs were annotated and upregulated (nearly 1.0 and 4.3 fold), which suggested these wax-associated LTPs might play an important role for wax extracellular transport and accumulation in C. komarovii.
The third category encoding proteins have function in regulation of cuticle biosynthesis and transport. In Arabidopsis, ethylene responsive transcription factors (AP2/ ERFs) have been widely shown involvement in various developmental processes, especially regulation of cuticle biosynthesis, accumulation and transport in responses to abiotic stresses [43]. The WAX INDUCER 1 (WIN1) and SHINE 1 (SHN1) belonging to ERF subfamily have been identified and shown preferentially expressed in roots and floral tissues. Overexpression of SHINE1 had an increasing cuticular wax accumulation in transgenic plants than wild-type plants which improved drought stress tolerance in Arabidopsis [53]. The DEG identified

Conclusions
C. komarovii is a xerophytic plant species, which possesses strong resistance to drought stresses. Here, we generated a transcriptome sequencing of C. komarovii root tissues using a NGS technology and performed comprehensive analysis of the drought-responsive genes and explored the mechanisms of drought-tolerance. The results showed that DEGs encoding regulatory and functional proteins had different expressions under drought stress. Importantly, the DEGs were involved in biosynthesis, export, and regulation of plant cuticle, suggesting that plant cuticle may play a vital role in response to drought stresses and accumulation of cuticle may allow C. komarovii to improve the tolerance to drought stress.
In conclusion, our comprehensive transcriptome analysis will provide valuable resource for further investigation into the molecular mechanisms of desert plants under drought condition and facilitate the exploration of drought-tolerant candidate genes.

Ethics
This research did not involve any human subjects, human material, or human data. C. komarovii in current research did not belong to the endangered or protected species.

Plant materials and drought treatment
The seeds of C. komarovii were collected from Yinchuan City, Ningxia Autonomous Regions, and China. After surface sterilization, the seeds of C. komarovii were planted in pots containing soil and vermiculite (with 2:1

RNA extraction and quality determination
Total RNA were obtained from two different biological replicates for each group sample using TRIzol Reagent (Invitrogen) following the manufacture's protocol. The RNA quality was evaluated using Bioanalyzer 2100 (Agilent Technologies, CA, USA); the A260/A280 ratios of Fig. 10 Schematic representation of cuticular wax biosynthesis and export. Cuticle biosynthesis pathway, which are composed of three stages: In the first stage, C16 and C18 fatty acids are generated by de novo synthesis. The second stage involves the elongation of C16 and C18 fatty acids into VLCFAs composed of C20 to C36. In the third stage, VLCFAs are modified into the major wax products according to two distinct biosynthetic pathways: the alcohol-forming pathway and the alkane-forming pathway. Wax products are mobilized from endoplasmic reticulum (ER) through the plasma membrane (PM) to the outside of the cell wall (CW) by ABC transporters and lipid transfer proteins (LTPs

Quality control, transcriptome assembly and functional annotation
Clean reads were obtained by removing adaptor sequences, unknown sequences 'N' , low quality reads from raw data. At the same time, Q20, Q30, GC-content and sequence duplication level of the clean data were calculated. All the analyses were based on clean reads with high quality. Transcriptome de novo assembly was carried out using Trinity [54] with the default settings, except that min_kmer_cov set to 2 by default. To assign predicted gene descriptions for the assembled unigenes, they were aligned against the plant protein dataset of Nr, Nt, Pfam, Swiss-Prot/Uniprot protein database, and KOG/COG, respectively, using BLASTX with a significance threshold of E-value <10 -5 . For sequences aside from those obtained from BLASTX searches, we used the EST Scan program to determine the sequence orientation. The GO terms for functional categorization was analyzed using Blast2go software [55]. The pathway assignments were carried out by sequence searches against the KEGG database, also using the BLASTX algorithm with an E-value threshold of 10 -5 .

Differential expression analysis of unigenes
Differential expression analysis of two groups was performed using the DEGseq R package (1.10.1) [56]. DEGseq provide statistical routines for determining differential expression in digital gene expression data using a model based on the negative binomial distribution. The resulting P values were adjusted using the Benjamini and Hochberg's approach [57] for controlling the FDR (false discovery rate). In this analysis, the genes with a threshold Pvalue <0.001 and the absolute value of log 2 Ratio (DT/CK) >1 screened by DEGseq were assigned as differentially expressed. Functional enrichment analysis including GO and KEGG enrichment analysis of the DEGs was implemented by the GOseq R packages [58] based on Wallenius non-central hyper-geometric distribution [59]. We calculated the numbers of all DEGs, up regulated and down regulated genes to each GO term, respectively. We used KOBAS software [60] to test the statistical enrichment of DEGs in KEGG pathways.

Validation and expression pattern analysis
To experimentally validate the transcriptional abundance results from sequencing and computational analysis, 13 unigenes were selected for qRT-PCR analysis. 2 μg total RNA was reverse transcribed to first-strand cDNA using the high-capacity RNA-to-cDNA kit (Applied Biosystems, Foster City, CA) according to manufacturer's specifications. Diluted cDNA was used as template in each well for the qRT-PCR analysis. Primers were designed using Primer Premier 5 and listed in Additional file 10. The figure of an agarose gel with PCR products was shown in Additional file 11. The qRT-PCR was performed by 20 μL reaction mixture in 96-well plate with ABI7500 Fast Real-Time PCR System, using SYBR Premix Ex Taq™ II kit (Takara). Reactions were performed at 95°C for 30s, 40 cycles of 95°C for 3 s, and 60°C for 34 s. Each sample was performed with two independent biological replicates and each biological replicate had three technical replicates. Statistical analysis of the relative expression levels of each gene normalized to EF-1-α [61] was calculated using comparative Ct value method.

SEM Analysis
To view cuticular waxes, sections of roots were treated in 1 % (w/v) osmium tetroxide vapor for 24 h, dried using LEICA EM CPD, and then sputter coated with gold using an EIKO IB-3 sputter coater. The images