Transcriptomics in the tropics: Total RNA-based profiling of Costa Rican bromeliad-associated communities

RNA-Seq was used to examine the microbial, eukaryotic, and viral communities in water catchments (‘tanks’) formed by tropical bromeliads from Costa Rica. In total, transcripts with taxonomic affiliation to a wide array of bacteria, archaea, and eukaryotes, were observed, as well as RNA-viruses that appeared related to the specific presence of eukaryotes. Bacteria from 25 phyla appeared to comprise the majority of transcripts in one tank (Wg24), compared to only 14 phyla in the other (Wg25). Conversely, eukaryotes from only 16 classes comprised the majority of transcripts in Wg24, compared to 24 classes in the Wg25, revealing a greater eukaryote diversity in the latter. Given that these bromeliads had tanks of similar size (i.e. vertical oxygen gradient), and were neighboring with presumed similar light regime and acquisition of leaf litter through-fall, it is possible that pH was the factor governing these differences in bacterial and eukaryotic communities (Wg24 had a tank pH of 3.6 and Wg25 had a tank pH of 6.2). Archaeal diversity was similar in both tanks, represented by 7 orders, with the exception of Methanocellales transcripts uniquely recovered from Wg25. Based on measures of FPKG (fragments mapped per kilobase of gene length), genes involved in methanogenesis, in addition to a spirochaete flagellin gene, were among those most highly expressed in Wg25. Conversely, aldehyde dehydrogenase and monosaccharide-binding protein were among genes most highly expressed in Wg24. The ability to observe specific presence of insect, plant, and fungi-associated RNA-viruses was unexpected. As with other techniques, there are inherent biases in the use of RNA-Seq, however, these data suggest the possibility of understanding the entire community, including ecological interactions, via simultaneous analysis of microbial, eukaryotic, and viral transcripts.


Introduction
In the Neotropics, plants within the family Bromeliaceae are important epiphytes due to their relative abundance and to their compact foliar-based catchments (or 'tanks') capable of retaining water. Because epiphytic bromeliads virtually lack absorptive roots, these tanks provide the plants with water and nutrients released by heterotrophic decomposition of impounded debris [2,46]. Tank bromeliads can contain as much as 50,000 l of suspended water ha − 1 of canopy [8], a habitat that has potential significance with regard to local biodiversity and community structure. It has been shown that increased environmental complexity results in increased species diversity for both bacteria and animals [21,43]. This may be particularly true for bromeliads, which not only contribute to the heterogeneity of the forest, but themselves are complex structures with compartments of different sizes and unique physical-chemical characteristics [1,24]. Faunal abundance and diversity are enhanced in bromeliad tanks, which house up to 100× greater animal densities than surrounding forest floor habitats [34,39,40]. A vast array of aquatic taxa take advantage of these catchments, including numerous insect families, ostracods, copepods, rotifers, and amphibians, as well as protozoans, algae, fungi, bacteria, and archaea [3,9,11,12,14,28,38].
Due to their small size, defined boundaries, and relatively simple faunal communities, bromeliad catchments are ideal for the investigation of whole ecosystem complexity, a notoriously difficult task for lakes and other large, more complex freshwater habitats [24]. As a consequence, bromeliad habitats have been used for the study of carbon and nitrogen cycling, food web dynamics, and even the effects of warming on trophic cascades [20,27,30,32,33]. Similarly, bromeliad tanks have been investigated for their unique bacterial and archaeal communities, as well as the influence of various 'micro-limnological' parameters, such as pH, oxygen, dissolved organic carbon, and light, on the microbial communities [11,12,15,30]. pH conditions, for example, can have a major impact on the structure of freshwater microbes [17], and in a study of Costa Rican bromeliads it was discovered that Alphaproteobacteria, Acidobacteria, Planctomycetes, and Bacteroidetes dominated tanks of low pH (b 5.1), Computational and Structural Biotechnology Journal 13 (2015) [18][19][20][21][22][23] while Betaproteobacteria, Firmicutes, and Bacteroidetes dominated tanks of high pH (N5.3), based on 16S rRNA gene surveys [11]. The influence of pH was not observed for archaea [12] and is generally less resolved for animal communities [4,7]. Petrin and colleagues determined that the ability of stream invertebrates to adapt to naturally low pH conditions was variable among different taxonomic groups and that total species richness and abundance did not change with pH [35].
With a new appreciation for the intertwined worlds of metazoans and microbes, it is an exciting prospect to explore the possibility of examining the entire community (from viruses to eukaryotes) with a single technique. The current state of knowledge with regard to plantassociated aquatic habitats is based on the counting of animals or protozoa or, in the case of bacteria and archaea, based on 16S rRNA ribosomal surveys (ex. [11,12,14,28,38,42]). There have been no papers published on viral communities within these abundant and important ecosystems. Each of these studies, like most, was conducted with a specific organismal domain in mind. Despite decades of investigation, total community dynamics has not been measured for any given bromeliad tank community. Additionally, DNA-based analysis (ex. 16S rRNA gene surveys), although a technique that has yielded much new information about the living world, does little to inform our understanding of functional microbial status and activity in situ. Metatranscriptomic analysis (the study of the RNA from the entire community of organisms), however, allows for detection of mRNA transcripts from all domains, rather than specific taxa. Advances in high-throughput sequencing and bulk extraction of mRNA allow for RNA-based studies of organismal functioning in complex environments. Despite certain technological challenges, microbial metatranscriptomics has already helped elucidate microbial responses to oil spills and the resulting deep-sea hydrocarbon plumes, differences between anoxic and oxic paddy soils, and metabolic activity of methane-producing microbes, to name a few [31,41,47]. Similarly, metatranscriptomic approaches have been used to investigate the functional diversity of the eukaryotic microorganisms within the rumen of muskoxen [37] and in forest soils [6,44]. Notably, Urich et al. obtained information on the structure and function of organisms from all three domains of life, in a single RNA-centered metatranscriptomic experiment with forest soil [44]. While no single 'omics' approach is able to elucidate the truly complex nature of entire communities, the current results suggest the possibility of using similar RNA-based analyses to profile total community dynamics, including eukaryotes, archaea, bacteria, and possibly viruses. We provide here preliminary data on the diversity of total communities associated with two bromeliad tanks from La Selva Biological Station in a lowland tropical rainforest in Costa Rica.

Sample collection
Bromeliad tank debris and water were collected from two Werauhia gladioliflora specimens in La Selva Biological station, a wet (4 m annual rainfall) lowland neotropical reserve located in northeastern Costa Rica (10°25′52″ N, 84°00′12″ W). Samples were collected in June 2012, from two plants, neither of which was in bloom, growing at heights of~2.0 m from the ground. The bromeliads had tanks of~10 cm depth (i.e. similar vertical oxygen gradients), and were neighboring with presumed similar light regime and acquisition of leaf litter through-fall. Since previous studies revealed strong correlations between bacterial diversity and bromeliad tank pH, which can range from pH 3 to 7 [11], two tanks at the pH extremes were chosen for comparison of gene expression from all 3 domains of life, in addition to viruses. At the time of sample collection, 'Wg24' had a tank pH of 3.56, whereas 'Wg25' had a pH of 6.21. The entire tank contents were collected from the deepest horizons within the central axil of each plant, and put into clean, RNase-free 15 ml conical tubes for transport to the lab. Tank samples were centrifuged at 15,000 rpm for 10 min, water was removed, and the remaining debris was resuspended in 3 × LifeGuard Soil Preservation Solution (MoBio Laboratories), and stored at −80°C for further analysis. Samples were at room temperature in the LifeGuard solution for~15 h during air transport back to Occidental College.

Total nucleic acid extraction
Nucleic acid extraction was performed at Occidental College, using RNase-free materials (tubes and pipette tips) on workstations treated with RNaseZap Wipes (Ambion). The RNA PowerSoil Total RNA Isolation Kit (MoBio Laboratories) was used for bulk extraction of RNA, with slight protocol modifications to extract sufficient quantities of nucleic acid for metatranscriptomic analysis (~90-120 ng μl − 1 ). Modifications included the use of freshly prepared low-pH phenol:chloroform:isoamyl alcohol (25:24:1 ratio, respectively) for nucleic acid extraction (PowerSoil Step 5), a room temperature incubation for secondary debris precipitation (PowerSoil Step 9), gentle encouragement of fluid flow through the RNA capture column using a syringe (PowerSoil Steps [16][17][18][19], and an overnight nucleic acid precipitation step (PowerSoil Step 20). Successful nucleic acid extraction was verified by NanoDrop UV spectrophotometry, ensuring sufficient RNA concentrations and purity. RNA extraction was immediately followed by removal of contaminant DNA from the extracted RNA, using TURBO DNA-free DNase Treatment and Removal Reagents (Ambion). A treatment of 4 Units TURBO DNase per 100 μl RNA was applied, with a 20 min 37°C incubation for sufficient DNA contaminant removal.

RNA preparation and sequencing
Total extracted RNA was submitted to Otogenetics Corporation (Norcross, GA USA) for RNA-Seq analysis. Briefly, 9-12 μg of total RNA was subjected to rRNA depletion using the RiboZero Meta-Bacteria kit (Epicentre Biotechnologies) and cDNA was generated from the depleted RNA using the NEBNext mRNA Sample Prep kit (New England Biolabs). Illumina libraries were created from the cDNA using NEBNext reagents (New England Biolabs). The quality, quantity and size distribution of the Illumina library products were determined using an Agilent Bioanalyzer 2100. The libraries were then submitted for Illumina HiSeq2000 sequencing according to standard operations.

Bioinformatic analysis
Paired-end 100 nucleotide (nt) reads were generated and checked for data quality using FASTQC (Babraham Institute). For removal of ribosomal RNA sequences (5S, 16S, 23S and the eukaryote rRNA equivalents) that were not subtracted via RiboZero depletion, SortMeRNA software [25] was used to filter by comparison against the Silva rRNA archaea, bacteria and eukaryotic rRNA databases [36]. For Wg24, 13,562,864 remained after rRNA removal (78%), out of 17,258,552 initial reads. For Wg25, 10,761,162 remained after rRNA removal (67%), out of 15,998,012 initial reads. These~10-14 million 100 nt reads were assembled into larger contigs (500-5000 nt in length) using CLC Genomics Workbench v.6.0.1 (2533 and 1938 contigs were generated for Wg24 and Wg25, respectively). Sequencing reads were quality filtered and Illumina adapters were trimmed using the CLC Genomics Workbench with a quality threshold of 0.01. Trimmed reads were assembled de novo using the CLC Genomics Workbench using an automatic kmer size of 22 and a bubble size of 50. To assign function to the assembled metatranscriptomic contigs, Prodigal v.2.60 was used to call open reading frames (ORFs; [22]), which were annotated functionally and taxonomically by blastp, using an E-value of b10 −5 against the Integrated Microbial Genomes (IMG) system v.4.0 [29] which includes archaeal, bacterial and eukaryotic full and draft genomes, as well as viral and environmental sourced genes. Of the 2533 contigs within library Wg24, 434 had functional gene annotations without a taxonomic affiliation, while an additional 166 had no homologs in public databases. Similarly, of the 1938 contigs within library Wg25, 128 had functional gene annotations without a taxonomic affiliation, while an additional 88 had no homologs in public databases. For quantification of gene expression, rRNAsubtracted mRNA sequences were mapped against the assembled metatranscriptome using BWA v.0.6.2 [26]. The number of reads that mapped against a particular gene was normalized by the length of the gene in order to generate an FPKG value (number of fragments mapped per kilobase of gene length), using the sam2fpkg.pl script v.0.3 (http:// github.com/minillinim/sam2fpkg; [16]). To allow analysis of differential gene expression between samples, the FPKG values were then normalized by the total read length and multiplied by a million, for ease of comparison. The transcript reads were deposited in the NCBI Sequence Read Archive under accession numbers SRR1697355 (Wg24) and SRR1697356 (Wg25).

Results and discussion
Metatranscriptomic analysis (the study of RNA from the entire community of organisms) was performed on the water catchments formed by tropical bromeliads from a Costa Rican rainforest. The community associated with Costa Rican tropical bromeliads was deduced through affiliation of non-ribosomal transcript sequences with particular taxa in public databases. Overall results from two bromeliad tanks indicate the presence of genes from 25 bacterial phyla (including 4 classes within the Proteobacteria), 7 archaeal orders, 21 metazoan classes (including 6 orders of insect), 8 fungal classes, as well as viruses and transcripts related to those recovered from environments, including plant biomass reactors, freshwater lakes, and the rhizosphere (Tables 1-3). There was a notable difference in the diversity of each taxonomic domain between the two tanks. For example, bacteria from 25 phyla appeared to comprise the majority of transcripts in Wg24, compared to only 14 phyla in Wg25 (Tables 1, 2). Even after accounting for a difference in total reads analyzed (17.2 versus 16.0 million reads, respectively) or contigs recovered (2531 versus 1938, respectively), this represented a greater bacterial diversity within tank Wg24 (Fig. S1). Conversely, eukaryotes from only 16 classes comprised the majority of transcripts in Wg24, compared to 24 eukaryote classes in Wg25 (Table 3), revealing a greater overall eukaryote diversity at less acidic conditions. Archaeal diversity was similar in both tanks, represented by 7 orders, with the exception of Methanocellales transcripts uniquely recovered from Wg25 (Tables 1,  2). Additionally, some transcripts were annotated as environmentallyderived, from uncultured microbes inhabiting aquatic sediments (e.g. Sakinaw Lake, Lake Washington), bioreactors (e.g. Poplar biomass reactors, enrichment cultures), and associated with eukaryotes (e.g. Costa Rican termite hindgut microbiome, Arabidopsis rhizosphere communities; Table 1). Viral transcripts were recovered from both tanks and included insect, plant, and fungi-associated RNA-viruses. Specific results from each of these organismal levels are discussed below.
Bromeliad eukaryote communities are abundant and diverse, typically dominated by annelids and larvae of aquatic insects, including many species of Coleoptera and Diptera [23,24,28,38]. Our RNA-based results are consistent with these previous studies and suggest that certain biota are possibly adapted to more or less acidic conditions in these unique, plant-based aquatic habitats. For example, the higher pH Wg25 tank was host to a large and varied number of eukaryotic RNA transcripts, including many closely related to those of annelids (24% of the total eukaryote transcripts) and insects (37% of total eukaryote transcripts; e.g., Diptera, Hymenoptera, Coleoptera, Hemiptera, Phthiraptera, and Lepidoptera). Additional eukaryote transcripts from Wg25 included members of the Branchiopoda, Cnidaria, Mollusca, Nematoda, and Placozoa, to name a few ( Table 2). Whether these transcripts reflect the actual presence of similar organisms, or whether they are affiliated to these groups, in silico, as limitations of existing databases, remains to be seen. Additionally, chordate-sourced transcripts were present in Wg25 and likely reflected the common transient use of bromeliad tanks for food, water and shelter by monkeys, birds, and amphibians ( [3]; SKG, pers. obs). Interestingly, transcripts affiliated with a chytrid fungus were only recovered from Wg25, from where we also recovered amphibian-affiliated transcripts, most likely the blue-jeans frog Oophaga pumilio, which commonly uses bromeliad tanks as nurseries [45]. On the other hand, eukaryote transcripts were lower in diversity within Wg24, possibly reflecting the putatively more stressful conditions of the acidic catchment (Table 3). Far fewer eukaryote groups were represented and of the insect-sourced transcripts in Wg24 sample, 61% were associated with Coleoptera, while annelid transcripts were comparatively absent (Table 1). Fungal transcripts made up the remaining minority in Wg24, including those from 4 classes (Table 3), likely involved in the hydrolysis of plant material trapped in the tanks. Many eukaryotic viral transcripts were detected in both bromeliad tanks, and mirrored the diverse eukaryotic community, including many likely derived from plant, insect, and fungal (yeast, water mold) dsRNA and ssRNA viruses.
Transcripts from numerous bacterial phyla (including 4 classes within the Proteobacteria), were recovered from the bromeliad tanks, comprising 25 phyla in Wg24 and a subset of these (~14 phyla) from Wg25 (Table 1). Both tanks contained numerous transcripts belonging to the Firmicutes, in relatively similar numbers (13-18% of total bacterial transcripts recovered), but much higher diversity from Wg24 (102 species represented versus 48 species, respectively; Table 2). The greater diversity in Wg24 was generally consistent among all bacterial groups ( Table 2; Fig. S1), even after normalization of sequencing effort and contig yield, which was initially~30% higher in Wg24. Notable differences included the dominance of Acidobacteria in Wg24 (which comprised 26% of the total bacterial transcripts recovered), versus the dominance of Gammaproteobacteria in Wg25 (60% of the total bacterial transcripts recovered). Additional transcripts recovered mainly from Wg24 included those affiliated with Spirochaetes and Alphaproteobacteria, the latter of which were exceptionally diverse in Wg24, compared to Wg25 (transcripts from 48 species compared to  Table 2). Both of these bacterial groups are well known to tolerate more acidic conditions (consistent with Wg24), and a previous 16S rRNA-based study on Costa Rican bromeliad tanks demonstrated pH-dependent structure within the bacterial community, including the dominance of Acidobacteria and Alphaproteobacteria in bromeliad tanks of pH b 5.0 [12]. Additionally, consistent with different pH regimes, transcripts annotated as 'Environmental' from the low pH tank Wg24 resembled those recovered from a Poplar biomass reactor, while the transcripts from the higher pH Wg25 tank resembled those recovered from typical freshwater lakes (Lake Sakinaw and Lake Washington; [10]). The diverse nature of these related habitats is consistent with the duality of the highly stratified bromeliad tank environment, which has many lake-like qualities in the upper less acidic layers, and the decompositional anoxic nature in the deeper, more acidic layers. Although not well constrained, factors that affect the distribution and abundance of the bromeliads themselves, such as rainfall, light, and temperature, also likely have both direct and indirect effects on communities within the tanks [13]. The metatranscriptome data for archaeal presence and diversity also followed the same pattern as recovered from previous DNA-based studies. Transcripts with archaeal taxonomic assignments were present at similar levels in both bromeliad tanks, comprising~7% of the transcript reads. Archaeal diversity was similar in both tanks, represented by 7 orders, with the exception of Methanocellales transcripts uniquely recovered from Wg25 (Tables 1, 2). Both of the archaeal populations were dominated by the order Methanomicrobiales (86-95% of the recovered archaeal transcripts; Table 1), involved in hydrogenotrophic methanogenesis. Previous 16S rRNA gene analysis of Costa Rican bromeliad catchments also revealed that archaea were dominated (up to 90% of 16S rRNA genes) by the hydrogenotrophic methanogens (Methanomicrobiales and Methanocellales; [11]).

Functional nature of the community
Bacterial communities play essential roles in energy and matter flow in aquatic microcosms, through detrital mineralization and nutrient release. Of particular interest is the microbial role in nitrogen and carboncycling in bromeliad tanks, due to the abundance of carbon-based leaf litter, animal carcasses, and feces that constitute the organic detritus. Despite their status as one of the most abundant aquatic ecosystems in the world, and their classification as true microcosms (with high  variability among physical, biological, and chemical parameters), remarkably little is known about the function of bromeliad-associated microbial communities [15,18]. Genes relevant to carbon turnover, including chitinase and methyl coenzyme M reductase (involved in archaeal methanogenesis) have been recovered from bromeliad communities [11]. Claims have even been made that bromeliad tanks also serve as significant sites for global methane production by archaeal microbes [30]. In the current study, measures of fragments mapped per kilobase of gene length (FPKG) revealed genes involved in methanogenesis to be among those most highly expressed only in Wg25 (Table 4), suggesting the differential presence versus activity of methanogens in this unique habitat. A dehydrogenase associated with Acidobacteria, and numerous Spirochaete-related genes ranked among the most expressed in Wg24 (Table 4). The challenge posed by acidic environs might necessitate cellular defense mechanisms in microorganisms that inhabit such niches [5]. Increased expression of genes involved in proton pumping, repair of degraded macromolecules, and transcriptional and metabolic alterations was observed in the Wg24 library (data not shown). For example, gene transcripts for superoxide dismutase and peroxidase, enzymes necessary for the neutralization of toxic superoxide (O 2 − ) and hydrogen peroxide metabolites, and small heat shock proteins, were almost exclusively detected in Wg24. Together, such differences in genes relevant to cellular defense and response mechanisms perhaps reflect the high-stress, low-pH environment in this particular bromeliad catchment.

Summary and outlook
Freshwater tropical ecosystems, including bromeliad tanks, support disproportionate levels of biodiversity, especially microorganisms and invertebrates, relative to their spatial coverage, and are expected to be highly vulnerable to habitat perturbations resulting from climate change [19]. Here we present the results of a RNA-centered metatranscriptomic comparison of tank-associated community structure and function between two neighboring bromeliads. While no single 'omics' approach is able to elucidate the truly complex nature of entire communities, these results suggest the possibility of using similar RNA-based analyses to profile total community dynamics, including eukaryotes, archaea, bacteria, and RNA viruses. The comparison of bromeliad tanks of similar size, but different pH conditions, revealed different communities encompassing all domains of life, as well as viruses. Interestingly, the diversity of animal mRNA transcripts was lower in the low pH tank, with insects primarily dominating. Despite other potential variables shaping these communities, it is possible that the specific and permanent reduction in pH that is expected to accompany future increases in CO 2 on the planet could result in the loss of biodiversity within bromeliad tanks over time. In order to confirm this hypothesis, it will be necessary to conduct a more thorough analysis of the structure and function of organisms in these habitats.