De Novo Assembly, Annotation, and Characterization of Root Transcriptomes of Three Caladium Cultivars with a Focus on Necrotrophic Pathogen Resistance/Defense-Related Genes

Roots are vital to plant survival and crop yield, yet few efforts have been made to characterize the expressed genes in the roots of non-model plants (root transcriptomes). This study was conducted to sequence, assemble, annotate, and characterize the root transcriptomes of three caladium cultivars (Caladium × hortulanum) using RNA-Seq. The caladium cultivars used in this study have different levels of resistance to Pythium myriotylum, the most damaging necrotrophic pathogen to caladium roots. Forty-six to 61 million clean reads were obtained for each caladium root transcriptome. De novo assembly of the reads resulted in approximately 130,000 unigenes. Based on bioinformatic analysis, 71,825 (52.3%) caladium unigenes were annotated for putative functions, 48,417 (67.4%) and 31,417 (72.7%) were assigned to Gene Ontology (GO) and Clusters of Orthologous Groups (COG), respectively, and 46,406 (64.6%) unigenes were assigned to 128 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. A total of 4518 distinct unigenes were observed only in Pythium-resistant “Candidum” roots, of which 98 seemed to be involved in disease resistance and defense responses. In addition, 28,837 simple sequence repeat sites and 44,628 single nucleotide polymorphism sites were identified among the three caladium cultivars. These root transcriptome data will be valuable for further genetic improvement of caladium and related aroids.


Introduction
Caladium (Caladium × hortulanum) is an important ornamental aroid which is commonly used in containers, hanging baskets, and landscapes for their variably-shaped and colored leaves. The genus Caladium is native to the tropical regions of South and Central America, including Brazil, Colombia, Ecuador, and Peru. Modern cultivated caladiums are diploids with 2n = 2x = 30 chromosomes [1]. Recent molecular and cytological analyses have shown that cultivated caladiums are most likely to have originated from interspecific hybridizations between Caladium bicolor (Aiton) Vent. and Caladium schomburgkii Schott, which share the same chromosome number (30), similar nuclear DNA contents, and similar DNA fingerprints [1][2][3]. The great majority of commercial caladium plants are forced from tubers. More than 95% of the caladium tubers used in the world are produced in Florida [4].
A major issue affecting commercial caladium tuber and plant production is root rot caused by Pythium myriotylum Drech. Several other necrotrophic fungi including Sclerotium rolfsii Sacc. and Rhizoctonia solani Kuehn could also incite rot on caladium roots. Pythium-infected caladiums exhibit stunted growth, substantial losses of roots, and much-reduced tuber yield [4,5]. For example, Riding and Hartman [6] grew caladium tubers in P. myriotylum-infected substrates and observed that Pythium infection led up to 40% reduction in tuber yield. Caladium growers used to rely on methyl bromide fumigation for control of soil-borne pathogens [5], but methyl bromide has been phased out due to its severe side effect of depleting ozone in the atmosphere, and few alternative fumigants are available now. As a result, growers are facing much more severe losses from Pythium root rot.
Developing and using disease-resistant cultivars is an effective and environmentally-friendly approach to managing diseases in many crop production systems. For many soil-borne diseases caused by necrotrophic fungi/oomycetes that exhibit facultative pathogenicity and broad parasitic spectrums, the most common type of host resistance has been partial resistance [7]. Deng et al. [4,5] screened 42 caladium cultivars for Pythium root rot resistance and identified several commercial cultivars with moderate resistance, of which "Candidum" is particularly valuable, because it is also resistant to another necrotrophic pathogen, Fusarium solani (Mart.) Saa [8]. The continuous distribution of disease resistance level among the caladium cultivars evaluated in the reported studies indicates that the observed Pythium resistance is likely controlled by quantitative trait loci [4,5,9].
To date, our understanding of host resistance to necrotrophic pathogens has mainly come from studies in Arabidopsis, tomato, and rice [10][11][12]. These studies have shown that plant resistance toward necrotrophs is primarily mediated by recognition of pathogen-associated molecular patterns (PAMPs) via pattern recognition receptors (PRRs), signaling networks regulated by ethylene and jasmonic acid, and activation of biosynthesis of antimicrobial metabolites [13]. In Arabidopsis, dozens of genes encoding receptor-like kinases (RLKs) are essential for quantitative resistance to Fursarium oxysporum [14]. In rice, the gene OsACS2, which encodes 1-aminocyclopropane-1-carboxylic acid (ACC) synthase, plays an important role in conferring resistance to Rhizoctonia solani [15]. In tomato, expression of Arabidopsis thionin (an antimicrobial compound) gene Thi2.1 significantly enhanced tomato resistance against Fusarium wilt [16]. Necrotrophic pathogens have caused severe losses to many crops in the world, and a better understanding of the type and expression of disease resistance and defense-related genes in plant roots could greatly facilitate further genetic improvement of crop resistance to necrotrophic pathogens.
Understanding the genetic basis for caladium resistance to necrotrophic pathogens, particularly Pythium, has been a very important objective in caladium genetic studies. Such knowledge would enable plant breeders to select the appropriate breeding parents and breeding crosses, and design effective and efficient selection schemes for the development of new caladium cultivars with enhanced resistance to necrotrophic pathogens. So far, little genomic or transcriptomic data are available for caladiums. The only publicly available DNA data in caladium are 99 simple sequence repeats (SSRs) reported by Gong and Deng [2], and 49 nucleotide sequences and 16 genes in the National Center for Biotechnology Information (NCBI) databases. None of the deposited DNA sequences were involved in disease resistance or defense. To discover putative disease resistance/defense genes in caladium, genomic and/or transcriptomic data are urgently needed.
In recent years, advances in next-generation sequencing (NGS) and assembly algorithms have resulted in a fast and deep transcriptome sequencing technique termed RNA-Seq. Recent development of de novo transcriptome assemblers has made RNA-Seq technology readily available for the study of non-model plants whose genomic sequences are largely unavailable [17]. Data from transcriptome sequencing has greatly facilitated genome-wide gene discovery, identification of genes involved in specific biosynthesis pathways, and development of molecular markers [18]. In recent years, de novo transcriptome sequencing has been reported in numerous non-model plant species for identifying putative genes involved in disease or stress responses. For example, Zhang et al. [19] identified 250 nucleotide-binding site and leucine-rich repeat (NB-LRR) transcripts from shoots, leaves and flower buds in two Camellia species. In Oryza officinalis, He et al. [20] discovered disease resistance genes involved in signaling and plant hormone regulation pathways using leaves, stems and roots. In gerbera, Fu et al. [21] sequenced leaf transcriptomes and found 137 transcripts that were related to the phenylpropanoid biosynthesis pathway that impacts disease resistance. Additionally, transcriptome sequencing data have been used for the development of SSR and single nucleotide polymorphism (SNP) molecular markers in many plant species [22,23].
Herein, the root transcriptomes of three caladium cultivars ("Candidum", "Gingerland", and "Miss Muffet") were sequenced using the paired-end sequencing protocol and the Illumina HiSeq 2000 platform and were assembled de novo. The three cultivars were selected because they are very important breeding parents, and most importantly, they displayed significant differences in Pythium root rot resistance. "Candidum" is moderately resistant to Pythium root rot, while "Gingerland" and "Miss Muffet" are highly susceptible to this disease [4,5,9]. The objectives of this study were to (1) assemble, annotate and characterize the root transcriptomes of these caladium cultivars; (2) identify transcripts that are likely involved in disease resistance/defense pathways; and (3) identify SSR and SNP sites in caladium transcriptomes. This is the first transcriptome study in caladium; the data generated here will be of significant value for development and use of molecular markers in genetic improvement of caladium.
So far, there has been no report of transcriptome assembly in caladium. In Amorphophallus, another aroid species, Zheng et al. [24] sequenced and generated 135,822 unigenes with an average length of 523 bp and an N50 length of 635 bp. Overall, this set of assembled caladium unigenes were longer than those of Amorphophallus.

Gene Ontology (GO) Classification
To functionally classify the assembled unigenes, sequences that were annotated in the NR database were then subjected to Gene Ontology (GO) analysis. Results showed that 68,827 caladium unigenes fell into three main categories, including Biological Process, Cellular Component, and Molecular Function (Figure 2 and

Gene Ontology (GO) Classification
To functionally classify the assembled unigenes, sequences that were annotated in the NR database were then subjected to Gene Ontology (GO) analysis. Results showed that 68,827 caladium unigenes fell into three main categories, including Biological Process, Cellular Component, and Molecular Function ( Figure 2 and Table S3). In the Biological Process category, the Cellular Process (  Out of the 68,827 unigenes annotated for GO classifications, 1284 unigenes seem to be rootspecific (Table S4). A great majority of them (1284, 98.75%) could be grouped into seven functional categories, including root morphogenesis (62 unigenes, 4.83%), root meristem growth (65 unigenes, 5.07%), root hair elongation (261 unigenes, 20.33%), root epidermal cell differentiation (47 unigenes, 3.67%), root development (535 unigenes, 41.67%), and lateral root formation (85 unigenes, 6.62%) ( Figure 3). A small number of the root-specific unigenes (16, 1.25%) could not be assigned to specific functional categories, thus grouped into "Others". A total of 103 root-specific unigenes are related to disease resistance or plant defense (Table S5).

CDS Prediction
The protein coding sequence (CDS) and the amino acid sequence of all assembled unigenes were predicted using NCBI BLAST 2.2.28+ and ESTScan (version 3.0.2). Unigenes were firstly aligned in the NR, Swiss-prot, KEGG, and COG databases for CDS prediction. The non-hit unigenes were then subjected to ESTScan (version 3.0.2). In total, the CDS of 68,197 unigenes were successfully predicted in the databases described above. The majority of predicted CDS were in the ranges of 200-500 bp (49.4%), followed by CDS of 600-1000 bp (25.6%), CDS of 1100-2000 bp (18.8%), and CDS of >2000 bp (6.0%) ( Figure 5A). The majority of CDS of unigenes (99.2%) predicted by the ESTScan software were less than 500 bp ( Figure 5B).

CDS Prediction
The protein coding sequence (CDS) and the amino acid sequence of all assembled unigenes were predicted using NCBI BLAST 2.2.28+ and ESTScan (version 3.0.2). Unigenes were firstly aligned in the NR, Swiss-prot, KEGG, and COG databases for CDS prediction. The non-hit unigenes were then subjected to ESTScan (version 3.0.2). In total, the CDS of 68,197 unigenes were successfully predicted in the databases described above. The majority of predicted CDS were in the ranges of 200-500 bp (49.4%), followed by CDS of 600-1000 bp (25.6%), CDS of 1100-2000 bp (18.8%), and CDS of >2000 bp (6.0%) ( Figure 5A). The majority of CDS of unigenes (99.2%) predicted by the ESTScan software were less than 500 bp ( Figure 5B).

Comparison with Reported Transcriptome Assemblies
The caladium root transcriptome as assembled above consists of 137,354 transcripts and 71,825 unigenes. This number of unigenes largely exceeds the number of gene models in the majority of sequenced plant genomes (20,000 to 40,000) [26]. This phenomenon seems to be not uncommon in transcriptome assemblies and analyses of other aroids [25] and other non-model plants [26]. In Anthurium, a high proportion of short unigenes (69.8%) and a large number of unigenes (111,268) were reported [25]. Several factors might have contributed to this phenomenon. The lack of reference genome sequences in non-model plants makes it difficult to reconstruct full-length gene transcripts from short sequence reads and read alignments. Consequently, multiple unigenes representing the same genes might have failed to be assembled and were treated as independent unigenes, resulting

Comparison with Reported Transcriptome Assemblies
The caladium root transcriptome as assembled above consists of 137,354 transcripts and 71,825 unigenes. This number of unigenes largely exceeds the number of gene models in the majority of sequenced plant genomes (20,000 to 40,000) [26]. This phenomenon seems to be not uncommon in transcriptome assemblies and analyses of other aroids [25] and other non-model plants [26]. In Anthurium, a high proportion of short unigenes (69.8%) and a large number of unigenes (111,268) were reported [25]. Several factors might have contributed to this phenomenon. The lack of reference genome sequences in non-model plants makes it difficult to reconstruct full-length gene transcripts from short sequence reads and read alignments. Consequently, multiple unigenes representing the same genes might have failed to be assembled and were treated as independent unigenes, resulting in large numbers of short unigenes and excessive unigene numbers [26]. Alternative splicing of protein-coding genes and expression of transposable elements and pseudogenes will certainly result in larger numbers of transcripts than the number of gene models annotated based on genome sequences. A third potential cause of the large number of unigenes in the current caladium root transcriptome assembly might be the relative large size of the caladium genome (≈4.5 Gb) [1]. When a species possesses a large sized and highly duplicated genome, the number of unigenes tended to be larger [26]. In Amorphophallus bulbifer, an aroid with a genome of ≈9.0 Gb, 119,678 unigenes was reported for its leaf transcriptome. In Allium cepa, a plant with a nuclear genome of 16 Gb, its bulb transcriptome consisted of 115,251 unigenes [27].
Nearly 50% of the caladium unigenes were short (200-500 bp). This was a major and frequent issue in de novo assembly of transcriptomes of non-model plants [28][29][30]. Several factors might have contributed to the presence of large numbers of short unigenes, including under-estimated sequencing depth needs, the use of stringent parameters in the assembly process, lack of reference genomes for read mapping, and highly complex transcriptomes. Deep sequencing using the single molecule real-time sequencing (SMRT) technology may be a very effective approach to extending unigenes to full-length transcripts.
De novo assembly of a soybean root transcriptome resulted in 37,287 contigs and 35,081 unigenes [31]. In Callerya speciosa, a total of 161,926 contigs and 111,706 unigenes were reported for its de novo assembled root transcriptome [32]. A de novo root transcriptome of Louisiana iris consisted of 525,498 transcripts and 313,958 unigenes [33]. These studies, as well as ours, on caladium suggest that the number of assembled contigs and the number of annotated unigenes could vary remarkably among root transcriptomes of different plant species. This variation may indicate a high level of diversity among plant species in root transcriptome (and metabolic) activities.

Identification of Putative Unigenes Involved in Resistance and Defense against Necrotrophic Pathogens
In response to attacks by necrotrophic pathogens, the innate immune system of plants perceives PAMPs with PRRs and activates PAMP-triggered immunity (PTI) against the pathogens [13]. PTI involves a number of cellular processes, including accumulation of reactive oxygen species and nitric oxide, modulation of hormone level, secretion of antimicrobial compounds, and reinforcing of the cell wall [34]. In this study, we compared unigenes in the root transcriptome of "Candidum" (moderately resistant to P. myriotylum) with those in the root transcriptomes of "Gingerland" and "Miss Muffet" (both highly susceptible to the pathogen) and identified 4518 unigenes that were present only in the root transcriptome of "Candidum" (Figure 6 and Table S7). These unique unigenes were examined for potential involvement in disease resistance and plant defense, in particular, (1) early signal perception and transduction in plant-pathogen interactions; (2) hormone regulation during plant defense responses to necrotrophic pathogens; and (3) secondary metabolites with antimicrobial activities in plant defense (Table 5).
"Miss Muffet" (both highly susceptible to the pathogen) and identified 4518 unigenes that were present only in the root transcriptome of "Candidum" (Figure 6 and Table S7). These unique unigenes were examined for potential involvement in disease resistance and plant defense, in particular, (1) early signal perception and transduction in plant-pathogen interactions; (2) hormone regulation during plant defense responses to necrotrophic pathogens; and (3) secondary metabolites with antimicrobial activities in plant defense (Table 5).

Early Signal Perception and Transduction in Plant-Pathogen Interactions
Recognition of PAMPs through membrane-localized PRRs during plant-pathogen interactions can induce an immune response syndrome [35]. Previous studies have indicated that a series of plant RLKs could be PRRs for plant immune responses [36]. Lack of RLKs could attenuate plant responses to pathogens, resulting in disease susceptibility [37]. All RLKs are transmembrane proteins, each containing a signal sequence, a putative extracellular domain at the amino-terminal, and an intracellular kinase domain at carboxyl-terminal [38]. Previous studies have shown that RLKs participate in many biologically important processes that confer disease resistance [39].
Of the 4518 unigenes found only in "Candidum", 50 unigenes can be translated into RLKs, including one L-type lection receptor kinase (LecRK), one cysteine-rich receptor-like kinase (CRK2), 12 receptor-like serine/threonine-protein kinases (STKs), 31 proline-rich receptor-like protein kinases (PERKs), and five possible RLKs (Table 5 and Table S8). LecRKs are important immune receptors that bind to the effectors produced by Phytophthora infestans [40]. Previous studies have shown that LecRK genes confer resistance in Arabidopsis to several necrotrophic oomycetes including Phytophthora brassicae and Botrytis cinerea [41,42]. The one LecRK homolog in "Candidum" may play a positive role in its disease resistance. CRKs are one of the largest classes of RLKs in plants [43]. Studies in Arabidopsis have shown that CRKs regulate plant development, stress adaptation, and disease resistance [44]. Yang et al. [45] found that a novel CRK homolog (TaCRK1) is involved in wheat resistance to Rhizocotonia cerealis and the expression levels of TaCRK1 were significantly higher in moderately resistant lines than in highly susceptible lines. In "Candidum" root transcriptome, only one CRK homolog (CRK2) was observed (Table 5). Another important type of plant receptor-like kinases are STKs, which can phosphorylate the hydroxyl group (OH) of serine or threonine residues, resulting in functional alterations of target proteins [46]. The phosphorylated proteins play an important role in disease resistance pathway by triggering a series of downstream activities including protein modulation, protein-protein interaction, transcription factor activation, and secretion of antimicrobial metabolites [47]. It has been shown that two wheat STK homologs, Tsn1 and Stpk-V, are the key genes conferring resistance against biotrophic Blumeria graminis (wheat powdery mildew) and the necrotrophic pathogens Stagonospora nodorum (Stagonospora nodorum blotch) and Pyrenophora tritici-repentis (tan spot) [46,48]. In this study, 12 STK homologs were observed exclusively in the "Candidum" root transcriptome (Table 5). In addition, we found that a majority of RLKs in the "Candidum" (60%) root transcriptome belong to PERKs. In Arabidopsis, PERKs represent a small family of RLKs, all sharing an extracellular domain, a transmembrane domain, and a kinase domain [38]. PERKs in Arabidopsis roots are involved in callose and cellulose deposition; PERK1 mutants have lost their ability to accumulate callose and cellulose in cell walls [49]. Silva and Goring [50] found that PERK1 could response to both mechanical stresses and infection by fungal pathogens and may have a function in the general perception of wounds or recognition of pathogen stimuli. In the "Candidum" root transcriptome, 31 PERK homologs were observed (Table 5), and these PERKs may be good candidates for further gene expression studies. There were five possible RLK-like unigenes ( Table 5) that could not be further annotated, suggesting that these RLK-like unigenes may represent unique transcripts in caladium and their functions remain to be studied.
After plant perception of PAMPs, signals from the PRR complexes are transferred to downstream targets through several secondary messengers, including influx of calcium (Ca 2+ ), reactive oxygen species (ROS) burst, and activation of mitogen-activated protein kinases (MAPKs) [51]. The calcium influx is an early step in the signal transduction processes [52]. Ca 2+ influx can be sensed by several stimulus-specific Ca 2+ -binding sensors such as calcium-dependent protein kinases (CPKs) and Calmodulins (CaMs). Previous studies have shown that transgenic potato plants containing StCDPK4 and StCDPK5 (CPK homologs) gain resistance to Phytophthora infestans [53]. A wheat CPK homolog (TaCPK7-D) is involved in determining resistance to necrotrophic pathogen Rhizoctonia cerealis. Here, we identified eight CPKs homologs only in "Candidum" (Table 5 and Table S9) and suspect that they may be important signal sensors which modulate resistance to necrotrophic pathogens. CaMs and CaM-like (CML) genes could trigger plant defense responses to a broad range of pathogens. In tobacco, a CML gene, NtCaM13, is a key gene for resistance to the necrotrophic pathogens Pythium aphanidermatum and Rhizoctonia solani [54]. Four CaM gene homologs were identified among "Candidum" unigenes (Table 5). ROS are important signals to modulate the expression of defense-related genes to fungal pathogen attacks [55]. The production of ROS involves a number of oxidative enzymes such as the NADPH (nicotinamide adenine dinucleotide phosphate) oxidase, the superoxide dismutase, the cell wall peroxidase, the oxalate oxidase, the lipoxygenase, and the polyamine oxidase [56]. These enzymes work collaboratively or separately to initiate a series of responses including cell wall remodeling, signal transduction, programming cell death, and post-translation regulation to resist pathogens [56]. So far, our understanding of the roles ROS play in plant roots and resistance against necrotrophic pathogens is still rudimentary [56]. In banana, studies have shown that a NADPH oxidase homolog (Rboh) confers resistance against the necrotrophic pathogen Fusarium oxysporum.

Hormone Regulation during Plant-Pathogen Interactions
Increased biosynthesis of hormones is an important plant defense response to invading pathogens [37]. In plants, ethylene (ET) and jasmonic acid (JA) are two important hormones involved in resistance against necrotrophic pathogens. S-adenosylmethionine (S-AdoMet) and 1-aminocyclopropane-1-carboxylic acid (ACC) are two precursors of ethylene [62]. Enzymes catalyzing ET biosynthesis reactions include ACC synthase (ACS) and ACC oxidase (ACO) [62]. Among the annotated unigenes in "Candidum", both key enzymes participating in ET synthesis were identified ( Table 5 and Table S10). Synthesized ethylene can be then perceived and transduced by ethylene receptors to trigger downstream biological responses. There are several types of ethylene receptors in Arabidopsis, such as ethylene receptors (ETRs), ethylene response sensors (ERSs), and ethylene insensitive (EINs) [63]. However, the roles ethylene receptors play in necrotrophic pathogen resistance seem to be different depending on plant species and the types of pathogens. In Arabidopsis, the ein2 (ethylene insensitive) mutant lines displayed increased susceptibility to the necrotrophic pathogen Botrytis cinerea [64], while in soybean, the etr1 and etr2 (ethylene-resistant) mutant lines showed increased resistance against Phytophthora sojae [65]. In this study, we observed 14 ETR homologs that were expressed in the three caladium cultivars. One ETR-like unigene was expressed exclusively in the "Candidum" root transcriptome; this unigene could be a good candidate for future studies.
JA is another important hormone in plant defense against many necrotrophic pathogens [66]. It has been reported that exogenous applications of methyl jasmonate to Arabidopsis enhance plant resistance to root rot disease caused by the necrotrophic pathogen Pythium mastophorum [67]. Cohen et al. [68] found that methyl jasmonate-treated potatoes showed increased resistance to Phytophthora infestans. Studies in tomato have shown that the CORONATINE-INSENSITIVE1 gene (COI1), the acyl-CoA oxidase gene (ACX1A), the suppressor of prosystemin-mediated responses2 (SPR2), and the defenseless1 gene (def1) are involved in JA-induced defense responses against necrotrophic pathogens such as Botrytis cinerea [37]. Tomato plants with either one of these genes mutated could not correctly process JA synthesis, and JA signal perception or transduction, resulting in disrupted immune responses and susceptible plants [37]. We found that 55 unigenes were potentially involved in JA synthesis and expressed in all three caladium cultivars. One ACX1A homolog was expressed only in "Candidum" (Table 5), which suggests that this unigene may be a good candidate for further gene expression studies.

Secondary Metabolites with Antimicrobial Activities
Plants can secrete secondary metabolites around their rhizosphere to inhibit or repel soil-borne pathogens. In general, phenolics and terpenoids are two primary classes of root-secreted antimicrobial metabolites [69]. Common root-secreted phenolic compounds include phenylpropanoids and flavonoids [70]. Many phenylpropanoids from a broad range of plant species have shown antimicrobial activities [71]. Phenylpropanoids in chickpea roots affect resistance against the necrotrophic pathogen Fusarium oxysporum; resistant chickpea varieties produce significantly more phenylpropanoids than susceptible ones [72]. Lanoue et al. [71] found that phenylpropanoids were constitutively secreted by the barley root system and inhibited Fusarium graminearum spore germination. In "Candidum", we found two unigenes that are likely involved in phenylpropanoid synthesis. Flavonoids are often one of the largest groups of phenolics in root exudates [69]. Studies in maize have indicated that high levels of flavonoids are associated with an increased defensive state in its roots [73]. We identified four unigenes in "Candidum" that are likely involved in flavonoid synthesis (Table 5 and Table S11).
Terpenoids are another large group of compounds in root exudates that have antimicrobial activities [74]. Some volatile terpenoids emitted from roots could act as a direct barrier against pathogens [69]. In Arabidopsis, Steeghs et al. [75] found a below-ground volatile compound (monoterpene 1,8-cineole) which had direct effects on Pseudomonas syringae. Kapulnik et al. [76] observed that strigolactones released from roots inhibited the growth of several soil-borne pathogens, including Fusarium oxysporum, Fusarium solani, and Macrophomina phaseolina. We identified four unigenes in the "Candidum" root transcriptome that are likely involved in the terpenoid synthesis pathway. Derivatives of tryptophan are another important antimicrobial root exudates. Typical examples of tryptophan derivatives include camalexin, which is produced by Arabidopsis and exhibits antimicrobial activities to a wide range of soil-borne pathogens including Pythium sylvaticum, Rhizoctonia solani, and Sclerotinia sclerotiorum [77,78]. In the "Candidum" root transcriptome, we found nine tryptophan synthesis-related unigenes ( Table 5 and Table S11), which may be involved in resistance against soil-borne necrotrophic pathogens. One of these unigenes is root-specific.

Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) Discovery
Transcriptome sequencing data have become a very important resource for discovering nucleotide sequence polymorphisms and developing molecular makers. SSRs and SNPs are two very important types of nucleotide sequence polymorphisms, and each has its own characteristics: SSRs are multi-allelic and hyper-variable, while SNPs are the most abundant type [79]. Molecular markers based on SSRs and SNPs have been widely used in plants for a wide range of research and practical purposes, including estimation of genetic diversity, determination of phylogenetic relationship, development of genetic linkage maps, identification of quantitative trait loci (QTL), association studies, marker-assisted breeding, and cultivar identification and protection [80]. In many uses, SSR and SNP-based molecular markers can complement each other.
To detect SNPs, the unigenes of "Candidum", "Gingerland", and "Miss Muffet" were merged to generate a set of reference sequences. SNP sites were called for "Candidum", "Gingerland", and "Miss Muffet" by mapping their clean reads to the set of reference sequences. As a result, 281,728, 244,685, and 270,217 SNP sites were identified for "Candidum", "Gingerland", and "Miss Muffet", respectively (Table S13). The frequency of SNPs in these unigenes ranged from 1 per 487 bp in "Candidum" to 1 per 560 bp in "Gingerland". The most abundant SNPs were of the transition-type (about 60.1%), cytosine to thymine (C ↔ T) or adenine to guanine (A ↔ G), followed by the transversion-type (A ↔ C, A ↔ T, C ↔ G, or G ↔ T) (39.9%) ( Table 7 and Table S13). Table 6.
Summary of simple sequence repeats (SSRs) identified in caladium root transcriptome sequences.  Molecular markers have become a highly valuable tool in genetic improvement of plant resistance to various pathogens [81]. The main bottleneck that constrains breeding efforts to improve Pythium root rot resistance in caladium has been the low efficiency in phenotyping large breeding populations and the difficulty in obtaining reliable data for resistance to Pythium root rot. The availability of molecular DNA markers closely linked to or associated with caladium root rot resistance will empower breeders to implement highly desired marker-assisted selection in caladium. The putative disease resistance/defense genes and the large numbers of SNPs and SSRs identified in caladium root transcriptomes will serve as a very important genomic resource for developing molecular markers, linkage map construction, association studies, identification and validation of caladium QTL linked to Pythium root rot resistance. Such a marker-assisted selection strategy has been successfully implemented in many agronomic and horticultural plant species and has resulted in enhanced resistance to a wide spectrum of pathogens [82]. It is expected that application of this strategy will speed up the development and deployment of new Pythium-resistant caladium cultivars.

Plant Material
Tubers (approximately 6 to 9 cm in diameter) of caladium cultivars "Candidum", "Gingerland", and "Miss Muffet" were planted individually in plastic containers (15 cm diameter) filled with coarse vermiculite (Palmetto Vermiculite Company, Woodruff, SC, USA) and irrigated with an overhead misting system (once a day). Plants sprouted from the tubers were grown in a greenhouse at the University of Florida's Gulf Coast Research and Education Center, Wimauma, FL, USA. Temperatures inside the greenhouse were maintained between 21 • C (night) and 30 • C (day). Five grams of a controlled-release fertilizer (Osmocote ® ; The Scotts Miracle-Gro Company, Marysville, OH, USA) was applied to the top of the potting mix in each container to provide nutrients for the plants. Two months after the tubers were planted, roots were collected from four plants per cultivar, combined into a 50-mL centrifuge tube, and frozen with liquid nitrogen. The frozen root tissues were stored at -86 • C in a deep freezer (NuAire Corporate, Plymouth, MN, USA) until use.

RNA Extraction and Quality Control
Total RNA was isolated from the frozen root tissues using the pBiozol Total RNA Extraction Reagent (BioFlux, Hangzhou, China) following the manufacturer's instructions. RNA concentration, RNA integrity number (RIN), and the 28S/18S ratio were determined on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) at the BGI (Beijing Genomics Institute, Shenzhen, China).

Illumina cDNA Library Preparation and Transcriptome Sequencing
Three cDNA libraries were prepared at BGI using the TruSeq TM RNA Sample Prep Kit v2 (Illumina, Inc., San Diego, CA, USA). Approximately 200 ng of total RNA for each sample were purified with oligo-dT (25) beads. The purified poly(A)-containing mRNAs were fragmented into small pieces of 120-200 bp using the Elute, Prime, and Fragment Mix reagents from Illumina TruSeq™ RNA Sample Prep Kit v2 at 94 • C for 8 min. First-strand cDNA was synthesized using the First Strand Master Mix and SuperScript II from Invitrogen (Carlsbad, CA, USA) and the following PCR program: 25 • C for 10 min, 42 • C for 50 min, and 70 • C for 15 min. Second-strand cDNA was synthesized using the Second Strand Master Mix from Invitrogen at 16 • C for 1 h. End repair was performed using the End Repair Mix from Invitrogen at 30 • C for 30 min and the Ampure XP beads (Beckman Coulter, Brea, CA, USA). The resulting cDNA fragments were adenylated using the A-Tailing Mix from Illumina at 37 • C for 30 min. Subsequently, the cDNA fragments were ligated to the RNA Index Adapter at 30 • C for 10 min using the Ligation Mix from Illumina and then purified with the Ampure XP beads (Beckman Coulter). Purified cDNA fragments were enriched with 10 cycles of PCR: 98 • C for 10 s, 60 • C for 30 s, and 72 • C for 30 s. The resulting PCR products were purified with Ampure XP beads (Beckman Coulter). The final libraries were quantitated using an Agilent 2100 Bioanalyzer (Agilent Technologies). Prepared cDNA libraries were amplified on a cBot TM system (Illumina) to generate clusters on the flow cell using the TruSeq PE Cluster Kit V4-cBot-HS (Illumina) and then subjected to the 125 bp paired-end sequencing on a HiSeq 2000 system (TruSeq SBS KIT-HS V4, Illumina) at BGI.

Raw Sequencing Data Trimming and De Novo Transcriptome Assembly
Raw sequence reads were cleaned using the filter_fq software (an internal version of BGI) to remove adaptors, reads with more than 5% unknown (N) nucleotides, and low quality reads which had more than 20% of the nucleotides scored ≤10. De novo assembly of clean reads was performed using the short reads assembling software, Trinity (version 20130225) [83], and its default parameters, with the following changes: min contig length = 100, reinforcement distance = 85, min glue = 3, group pair distance = 250, and min kmer cov = 3. The decrease of min contig length from 200 to 100 was to allow more sequences to be clustered by the TGI Clustering Tool (TGICL) (version 2.1) [84] and to be available for further analyses. Changes in reinforcement distance (from 75 to 85), min glue (from 2 to 3), and min ker cov (from 1 to 3) were made to increase the quality of resulting assemblies.

Unigenes Annotation and Coding DNA Sequence (CDS) Prediction
The assembled unigenes were annotated against the NR (release 2013_04) and NT (released 2013_04) databases, the Swiss-Prot database, the KEGG (release 2013_03) database, and the COG (release 2009_03) database using NCBI BLAST 2.2.28+ with an E-value < 10 −5 (version 2.2.26). The software BLAST2GO (version 2.5.0) was used to retrieve GO annotations for unigenes annotated with the NR database. Subsequently, all unigenes annotated by GO were subjected to the software WEGO for functional classification. The software Path_finder (an internal version of BGI) was used to perform KEGG pathway annotation against the KEGG database. Unigenes failing to hit any in these databases were then subjected to the software ESTScan (version 3.0.2) for CDS prediction.

Detection of SSR and SNP Sites
MicroSatellite (MISA, http://pgrc.ipk-gatersleben.de/misa/) was used to identify SSR motifs in assembled unigenes with the following standards: mono repeats-12 units, dimer repeats-6 units, trimer repeats-5 units, tetramer repeats-5 units, pentamer repeats-5 units, and hexamer repeats-4 units. The software Primer3 (http://primer3.sourceforge.net/releases.php) was used to design primers for amplifying identified SSR sites in PCR products. To identify putative SNP sites, unigenes from all three caladium cultivars were combined to create a set of reference sequences. The SOAPsnp software (version 1.03) was employed to align unigenes of each cultivar against the combined set of reference sequences to call SNPs.
Supplementary Materials: Supplementary materials can be found at www.mdpi.com/1422-0067/18/4/712/s1. Acknowledgments: This study was supported by the USDA (United States Department of Agriculture) hatch project FLA-GCC-005507. Authors thank BGI staff for technical assistance in sequencing, assembling and annotating caladium transcriptomes described in this paper.
Author Contributions: Zhe Cao performed the experiments, analyzed the data, and drafted the manuscript. Zhanao Deng conceived the study, reviewed data and results, and revised the manuscript. All authors read and consented to the final version of the manuscript.

Conflicts of Interest:
The authors declare no conflicts of interest.