Archaeal and bacterial diversity and community composition from 18 phylogenetically divergent sponge species in Vietnam

Sponge-associated prokaryotic diversity has been studied from a wide range of marine environments across the globe. However, for certain regions, e.g., Vietnam, Thailand, Cambodia, and Singapore, an overview of the sponge-associated prokaryotic communities is still pending. In this study we characterized the prokaryotic communities from 27 specimens, comprising 18 marine sponge species, sampled from the central coastal region of Vietnam. Illumina MiSeq sequencing of 16S ribosomal RNA (rRNA) gene fragments was used to investigate sponge-associated bacterial and archaeal diversity. Overall, 14 bacterial phyla and one archaeal phylum were identified among all 27 samples. The phylum Proteobacteria was present in all sponges and the most prevalent phylum in 15 out of 18 sponge species, albeit with pronounced differences at the class level. In contrast, Chloroflexi was the most abundant phylum in Halichondria sp., whereas Spirastrella sp. and Dactylospongia sp. were dominated by Actinobacteria. Several bacterial phyla such as Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, Deferribacteres, Gemmatimonadetes, and Nitrospirae were found in two-thirds of the sponge species. Moreover, the phylum Thaumarchaeota (Archaea), which is known to comprise nitrifying archaea, was highly abundant among the majority of the 18 investigated sponge species. Altogether, this study demonstrates that the diversity of prokaryotic communities associated with Vietnamese sponges is comparable to sponge-prokaryotic assemblages from well-documented regions. Furthermore, the phylogenetically divergent sponges hosted species-specific prokaryotic communities, thus demonstrating the influence of host identity on the composition and diversity of the associated communities. Therefore, this high-throughput 16S rRNA gene amplicon analysis of Vietnamese sponge-prokaryotic communities provides a foundation for future studies on sponge symbiont function and sponge-derived bioactive compounds from this region.

Bacterial and archaeal diversity in sponges has been characterized from a wide range of marine regions across the globe including the Atlantic, Pacific, and Indian Oceans as well as the Mediterranean, Red Sea, the Caribbean, the Yellow Sea, and the South China Sea (Giles et al., 2013;Li et al., 2007Li et al., , 2011;;Thomas et al., 2016).Vietnam exhibits a high diversity of marine sponges with at least 229 species, belonging to 124 genera, 65 families, 18 orders, and four classes (Quang, 2013), however, sponge-associated prokaryotic communities from Vietnam still remain unexplored.Many sponges in Vietnam have been identified as potential sources for new bioactive compounds: new muurolane-type sesquiterpenes were isolated from the sponge Dysidea cinerea (Kiem et al., 2014), new anticancer sterols from an Ianthella sp.(Nguyen et al., 2009), antifouling 26,27cyclosterols from Xestospongia testudinaria (Nguyen et al., 2013), and new sesquiterpenes and bis-sesquiterpene from D. fragilis (Cuc et al., 2015;Kiem et al., 2016;Nguyen et al., 2015).However, the true producers of many bioactive compounds from sponges are sponge-associated microorganisms rather than the sponges themselves (Fisch et al., 2009;Indraningrat, Smidt & Sipkema, 2016;Unson & Faulkner, 1993).
Therefore, we aimed to characterize the yet uninvestigated prokaryotic diversity and composition of Vietnamese sponges, and to provide a foundation for future studies focusing on sponge-associated bioactive compounds.We collected 27 sponge specimens, comprising 18 sponge species, from the central coastal region of Vietnam.Bacterial and archaeal diversity and sponge-specific community composition was characterized using Illumina MiSeq sequencing of PCR-amplified 16S rRNA gene fragments.To the best of our knowledge, among these 18 sponge species, six species (i.e., Clathria reinwardti, Haliclona amboinensis, Cinachyrella schulzei, Haliclona fascigera, Tespios aploos, and Axos cliftoni) are investigated for the first time regarding their prokaryotic communities.

Sampling
Sponge specimens (n = 27) were collected by scuba diving from May to September 2015 at three locations from the central coastal region of Vietnam at 5-10 m depth (Table S1).Lang Co Bay is located in the Phu Loc district within the Thua Thien Hue province.Con Co is a small island in the Quang Tri province with an area of 2.3 km 2 .Hon Mun Island is a marine conservation area in the Nha Trang Bay, Khanh Hoa province.Lang Co Bay was sampled in May 2015, Con Co Island in August 2015 and Hon Mun Island in September 2015.The distance between sampling sites ranges from 150 to 540 km.
The specimens were transferred directly to zip-lock bags containing seawater to prevent contact of sponge tissue with air.All samples were immediately transported to the laboratory, rinsed three times with sterile artificial seawater (Instant Ocean Aquarium Sea Salt Mixture: Instant Ocean), and stored at -80 C until DNA extraction.

DNA extraction and PCR amplification of 16S rRNA genes
Sponge samples prior to extraction were processed using a modified protocol from (Abe et al., 2012).In brief, the specimens were rinsed three times with sterile artificial seawater to remove any debris attached to the sponge.Then the specimens were further cleaned with a sterile scalpel in order to remove any sediment and other organisms strictly attached to the sponge.Finally, a piece of sponge tissue was ground in TEN buffer (3.5% sodium chloride, 10 mM tris-hydroxymethyl-aminomethane, 50 mM ethylenediaminetetraacetic acid, pH 8.5) with a sterilized mortar and pestle.Cell suspensions were filtered through a large nylon mesh (20 mm) to remove potential contaminants.The filtrates were then centrifuged at 8,000 g for 15 min at 4 C.The pellets were used to extract total genomic DNA using the ZymoBead TM Genomic DNA Kit (Zymo Research, Irvine, CA, USA) according to the manufacturer's protocol.The concentration of the extracted DNA was determined with a Nanodrop 1000 spectrophotometer (Nanodrop Technologies, Wilmington, DE, USA), and its integrity was examined by gel electrophoresis on a 1% (w/v) agarose gel.The extracted DNA was dissolved in TE buffer and stored at -20 C until further analysis.
The prokaryotic communities were characterized by Illumina MiSeq sequencing of 16S rRNA gene fragments using a two-step amplification procedure.The V4 region of the 16S rRNA genes was amplified by PCR using the 2nd version of the bacterial primers 515F/860R (Apprill et al., 2015), which were added to the 3′ end of Unitag1 and Unitag2, respectively (Tables S2 and S3).The PCR amplification was performed in a volume of 40 mL containing 8 mL of 5Â Phusion HF green buffer (ThermoFisher Scientific, Waltham, MA, USA), 1 mL of a 10 mM dNTP mixture (Promega Benelux B.V., Leiden, The Netherlands), 0.4 mL of Phusion Hot Start II High-Fidelity DNA polymerase (2 U/mL; ThermoFisher Scientific, Waltham, MA, USA), 1 mL of a 10 mM solution of each primer, 1 mL template (20 ng/mL), and 27.6 mL nuclease free water.The PCR was performed using the following conditions: an initial denaturation at 98 C for 30 s, followed by 25 cycles of denaturation at 98 C for 5 s, annealing at 56 C for 20 s, elongation at 72 C for 20 s, and a final elongation at 72 C for 5 min.Subsequently, the first PCR product was used as template in a second PCR in order to add sample specific barcodes (eight nucleotides).The second PCR consisted of 10 mL Phusion HF green buffer (ThermoFisher Scientific, Waltham, MA, USA), 1 mL dNTP mixture (Promega, Madison, WI, USA), 0.5 mL Phusion Hot Start II DNA polymerase (ThermoFisher Scientific, Waltham, MA, USA), 31 mL nuclease free water, 2.5 mL 10 mM forward primer (barcodelinker-Unitag1), 2.5 mL 10 mM reverse primer (barcode-linker-Unitag2) (Table S3) and 2.5 mL of the first PCR product.The PCR conditions were 98 C for 30 s, followed by 5 cycles at 98 C for 10 s, 52 C for 20 s and 72 C for 20 s, and final elongation at 72 C for 10 min.All PCR products were analyzed on a 1.8% (w/v) agarose gel to verify the products.The PCR products were purified using the HighPrep TM PCR clean-up protocol-MagBio kit (Magbio, London, UK), and quantified using the Quant-iTdsDNA highsensitivity assay kit and the Qubit fluorometer 2.0 (Invitrogen, Grand Island, NY, USA).Finally, samples were pooled in equimolar concentrations to ensure equal representation of each sample (including two mock communities as an internal standard to compare expected with observed 16S rRNA gene composition).The pooled library was purified, concentrated and quantified again with the HighPrep TM PCR clean-up kit (Magbio, London, UK) and the Quant-iTdsDNA high-sensitivity assay kit (Invitrogen, Grand Island, NY, USA).The samples were sequenced on the Genome Sequencer Illumina MiSeq at GATC Biotech, Germany.Sequencing data was deposited in the NCBI Sequence Read Archive under BioProject ID: PRJNA354731 with accession numbers: SRS1815731-SRS1815757 (Table S3).

Sequence data analyses
Illumina sequencing data was processed and analyzed using the NG-Tax pipeline (Ramiro- Garcia et al., 2016).Briefly, paired-end libraries were combined, and only read pairs with perfectly matching primers and barcodes were retained.To this end, both primers were barcoded to facilitate identification of chimeras produced during library generation after pooling of individual PCR products.Both forward and reverse reads were trimmed to 100 bp to avoid overlap in forward and reverse reads, which would affect the quality filtering.Paired-end trimmed forward and reverse reads were concatenated to yield sequences of 200 bp that were used for subsequent sequence data processing.Demultiplexing, OTU picking, chimera removal and taxonomic assignment were performed within one single step using the OTU_picking_pair_end_read script in NG-Tax.Reads were ranked per sample by abundance and OTUs (at a 100% identity level) were added to an initial OTU table for that sample starting from the most abundant sequence until the abundance was lower than 0.1%.The final OTU table was created by clustering the reads that were initially discarded as they represented OTUs <0.1% of the relative abundance with the OTUs from the initial OTU table with a threshold of (98.5% similarity) (Ramiro- Garcia et al., 2016).Taxonomic assignment using the most abundant sequence of each OTU was done utilizing the UCLUST algorithm (Edgar, 2010) and the Silva 111 SSU Ref database (Yilmaz et al., 2014).

Prokaryotic community analysis
Community composition summaries at phylum and class levels were created using the summarize_taxa_through_plots.pyscript from QIIME version 1.9.1 (Caporaso et al., 2010).Good's coverage index, rarefaction curves, and alpha diversity metrics (e.g., Shannon, inverse Simpson, and evenness) were calculated using the QIIME script alpha_rarefaction.py.Alpha diversity indices calculated from prokaryotic communities of sponge-species with replicates were tested by the Kruskal-Wallis test using function kruskal.testwithin the FSA package (Ogle, 2017) of R v.3.3.1 (R Core Team, 2016).Heatmaps were generated for (a) prokaryotic composition at phylum and class level, (b) the most abundant OTUs containing at least 2.5% of the reads in at least one of the samples, (c) archaeal OTUs, and (d) OTUs related to known nitrifying taxa using the heatmap.2function of the gplots package in R (Warnes et al., 2016).For distance-based multivariate analyses, the 16S rRNA gene OTU table was standardized using the decostand function (method = "hellinger") of the vegan package in R (Oksanen et al., 2016).Hierarchical cluster analysis was performed using the functions vegdist (method = "bray") and hclust (method = "average").The non-metric multidimensional scaling plot was created via the function metaMDS (Bray-Curtis distances) of the vegan package (Oksanen et al., 2016).Multivariate analysis based on Bray-Curtis dissimilarities of sponge-associated prokaryotic communities for sponge-species with replicates were performed using the functions betadisper, permutest, and the permutational multivariate analysis of variance (adonis) functions of the vegan package.

Sponge-enriched OTUs
The 91 most abundant OTUs of our study (i.e., at least 2.5% of all reads in at least one of the samples) were used to identify OTUs which are significantly enriched in the 3,569 sponge specimens (comprising 269 sponge species) from the sponge microbiome project (Moitinho-Silva et al., 2017a).In brief, the representative sequences (n = 91) of the most abundant OTUs from our study were subjected to a BLAST search (Altschul et al., 1990) against a curated sponge microbiome database, containing 64,424 high quality deblurred subOTU sequences that were extracted from the sponge Earth Microbiome Project (EMP) database (https://github.com/amnona/SpongeEMP).The curated spongeEMP BLAST database and additional information describing the database creation can be accessed here: https://github.com/marinemoleco/spongeEMP_BLASTdb.The sponge microbiome project subOTU sequences with 100% similarity to the present 91 query sequences were uploaded to the spongeEMP online server (www.spongeemp.com) in order to identify OTUs that are significantly enriched in sponge EMP specimens (p < 0.05 for the category "host-associated" in the field env_package).

Molecular sponge identification
For the taxonomic identification of the sponge specimens, near full length 18S rRNA gene and cytochrome c oxidase subunit I (COI) gene fragments from all samples were PCR amplified using the primer sets EukF/EukR (Medlin et al., 1988) for the 18S rRNA gene, and jgLCO1490/jgHCO2198 (Geller et al., 2013) for COI (Table S2).PCR products were cloned into pGEM-T Easy vector systems (Promega, Madison, WI, USA) according to the manufacturer's protocol.Positive clones were selected and sequenced using the T7/SP6 primer pair.Sequences were trimmed (error probability limit 0.1) using Geneious 4.8.3 (Kearse et al., 2012).The cloning vector was identified using VecScreen (http://www.ncbi.nlm.nih.gov/tools/vecscreen/) and removed.For the 18S rRNA gene, forward and reverse sequences were assembled to obtain near full length fragments.The 18S rRNA sequences from this study and the three most similar sequences obtained through blasting against the NCBI nr/nt database were added to a set of existing sponge 18S rRNA sequences (Sipkema et al., 2009), and subsequently aligned using the MAFFT (v.7.222) program with the FFT-NS-i strategy (Katoh & Standley, 2013).The same strategy was followed for COI sequences obtained in this study.The additional three COI gene sequences were added based on the highest BLAST similarity to our sequences as obtained from the NCBI database (nr/nt database), and then aligned as described above.Phylogenetic trees for the 18S rRNA and COI genes were created using RAxML version 8.0.0 (Stamatakis, 2014) with the GTRGAMMA model and 1,000 bootstrap replicates.The final sponge taxonomy was determined based on the best position in both phylogenetic trees and their BLAST identity to sequences of sponge species deposited in the NCBI database.Specimens were identified to species level if their identities were at least 99% with sequences in the NCBI database.However, three species showed high identity (!99%) with sequences from sponge species in the NCBI database, which known distribution typically does not include the Pacific Ocean according to the World Porifera Database (http://www.marinespecies.org/porifera/).For those samples, it was decided not to classify down to species level and maintain the generic status instead.All generated sequences were deposited in GenBank under the accession numbers: KX894454-KX894480 (18S rRNA genes) and KX894481-KX894504 (COI genes).

Sponge-associated prokaryotic communities
In total 5,326,187 high-quality reads were retained after quality filtering and were clustered into 926 OTUs.For all samples, rarefaction curves indicated near saturation, and coverage >99% (Table 1; Fig. S3).In addition, analysis of the bacterial profiles at the genus level showed a good correlation between observed and expected profiles (Pearson correlation coefficients were 0.7094 and 0.8308 for the two different mock communities).OTUs were classified into 14 bacterial phyla and one archaeal phylum (Fig. 1).Of all classified reads 88.2% belonged to the domain Bacteria, and 11.8% reads to the domain Archaea.Alpha diversity measures (i.e., Shannon, inverse Simpson, evenness) of the prokaryotic community associated with each specimen showed a wide range of diverse metric values.Evenness values ranged from 0.79 to 0.97, Shannon from 2.08 to 5.95, and inverse Simpson from 2.16 to 40 (Table 1).In our study, T. aploos, Amphimedon sp. 1, Amphimedon sp. 2, and Niphatidae sp.exhibited the lowest alpha diversity indices of the associated prokaryotic communities of all sponge species, whereas Tedania sp., Dactylospongia sp., and C. reinwardti exhibited the highest alpha diversity indices.The Kruskal-Wallis test for sponge species with duplicates showed significant differences among all prokaryotic alpha diversity indices (i.e., Shannon, inverse Simpson, OTU richness, evenness) (Table S4).
In this study, all archaeal OTUs (n = 73) belonged to the phylum Thaumarchaeota (Fig. S4).The Thaumarchaeota were found in a wide range of species (14 out of 18 species as well as all specimens of the same species if replicates were available) with relative abundances ranging from 0.4% (C.reinwardti) to almost 40% (A.cliftoni), with the  majority of samples exhibiting relative abundances of archaea greater than 10%.Most archaeal OTUs were assigned to Marine Group I (n = 68), and only five OTUs were assigned to the Soil Crenarchaeotic Group (SCG).The OTUs belonging to Marine Group I were found in all the 14 species that contained archaea, whereas the OTUs belonging to SCG were found in only two species (Fig. S4).Non-metric multidimensional analysis of the sponge-associated prokaryotic community based on Bray-Curtis dissimilarity showed that replicate specimens of a species (i.e., Axinyssa sp., C. reinwardti, Dactylospongia sp., H. amboinensis, Spirastrella sp., and X. testudinaria) clustered together (Fig. 2).The analysis using adonis based on Bray-Curtis dissimilarity showed strong support for the effect of host-identity on their prokaryotic communities for these species (R 2 = 0.94, p < 0.001) (Table S5).This was further supported by hierarchical clustering with sponge samples belonging to the same species clustering together in spite of different sampling locations (Fig. S5).

Microbial communities in Vietnamese marine sponges
To the best of our knowledge, of the 18 sponge species in the present study, the associated prokaryotic communities were examined for the first time for the following six species: C. reinwardti, C. schulzei, H. amboinensis, H. fascigera, A. cliftoni, and T. aploos.Moreover, this is the first time that the associated prokaryotic community of a sponge   from the genus Axos has been described.All 15 prokaryotic phyla detected belong to the 41 phyla that have so far been detected in marine sponges (Thomas et al., 2016).The prokaryotic taxa with a high relative abundance in the present study (e.g., Nitrospinaceae, PAUC34f, Caldilineaceae, Nitrosomonadales, Rhodobacteraceae, Endozoicomonas, Rhodospirillaceae) are also abundant in other marine sponges, which can be found in vastly different marine regions (Rodrı ´guez-Marconi et al., 2015;Steinert et al., 2016;Thomas et al., 2016).Regarding the six newly studied sponges, in four of them (i.e., C. reinwardti, H. amboinensis, H. fascigera, C. schulzei) all prokaryotic phyla were also found in other sponge species belonging to the same sponge genera (Alex & Antunes, 2015;Cleary et al., 2013;Erwin, Olson & Thacker, 2011;Jasmin, Anas & Nair, 2015;Khan et al., 2013;Naim et al., 2014;Sipkema et al., 2009;Thomas et al., 2016).In contrast, three additional bacterial phyla (i.e., Actinobacteria, Firmicutes, Planctomycetes) have not been found to be associated with the genus Terpios in a previous study (Tang et al., 2011).Furthermore, we observed that members of the Thaumarchaeota were associated with a wide range of Vietnamese sponges and that they accounted for a high relative abundance (>10%) in over half of the prokaryotic communities studied here (15 out of 27 samples).This phylum has also been detected in many other sponge taxa from the Pacific, Atlantic, Antarctic ocean, Mediterranean Sea, Caribbean, and Floridian reefs as well as in cold-water sponges (Alex & Antunes, 2015;Cuvelier et al., 2014;Dupont et al., 2013;Lee et al., 2011;Margot et al., 2002;Polo ´nia et al., 2014;Radax et al., 2012;Rodrı ´guez-Marconi et al., 2015;Webster, Watts & Hill, 2001) with particular high abundances in deep-sea sponges (Jackson et al., 2013;Kennedy et al., 2014).Members of the Thaumarchaeotaformerly Marine Group I Crenarchaeota-are capable of oxidizing ammonia and play an important role in the nitrogen cycle in marine environments (Capone et al., 2008;Prosser & Nicol, 2008;Ward, 2011).These AOA are the primary ammonia-oxidizing components in marine systems, displaying amoA abundances of up to 10 8 copies L -1 (Tolar, King & Hollibaugh, 2013;Wuchter et al., 2006).It has been reported that nitrogen input such as ammonia is one of the factors influencing the abundance of Thaumarchaeota (Hatzenpichler, 2012;Herfort et al., 2007;Hong & Cho, 2015;Kirchman et al., 2007;Oton et al., 2015).The environmental monitoring data at the sampling locations from our study revealed varying concentrations of inorganic nitrogen.For example, the ammonia concentration varied from 0.05 to 0.3 mg/L in Quang Tri, to 0.03-0.1 mg/L in Lang Co bay and was lowest (0.003-0.008 mg/L) in Hon Mun island as this island is in a protected area (Linh et al., 2015;MONRE, 2016;QCEEM, 2015).However, besides nitrogen input, other environmental factors may influence the abundance of Thaumarchaeota lineages, such as pH, depth, oxygen levels, as well as other organic substrates (Di et al., 2010;Hong & Cho, 2015;Isobe et al., 2012;Tolar, King & Hollibaugh, 2013;Verhamme, Prosser & Nicol, 2011;Yao et al., 2013).

CONCLUSION
In our present study we investigated the sponge-associated prokaryotic composition of 18 sponge species collected from the central coastal region of Vietnam.Our study highlights the prokaryotic diversity associated with Vietnamese sponges as well as the pattern of host-specificity among samples with replicates.The presence of significantly sponge-enriched OTUs in sponge microbiome database supports the general consensus that sponges host certain prokaryotic taxa that are not found or found only in a few samples of other environments.In addition, our study reveals the presence of prokaryotic taxa particularly known for nitrification, which indicates nitrification might be an important microbial process in sponge hosts.

Figure 1
Figure1Heatmap of the prokaryotic composition and relative abundance of sponge-associated prokaryotes at phylum level (at class level for the phylum Proteobacteria).Samples were grouped using hierarchical clustering based on the Bray-Curtis distance matrix calculated from relative OTU abundances.Full-size  DOI: 10.7717/peerj.4970/fig-1

Figure 2
Figure 2 Non-metric multidimensional scaling (NMDS) plot derived from Bray-Curtis distances of sponge prokaryotic communities at OTUs level, NMDS stress value = 0.116.The samples of the same species were grouped with ordination ellipse using function ordiellipse of vegan package.Full-size  DOI: 10.7717/peerj.4970/fig-2

Figure 3
Figure3Heatmap of the most abundant OTUs (>=2.5% of the total reads in at least one of the samples).Samples were grouped using hierarchical clustering based on the Bray-Curtis distance matrix calculated from the relative abundances of these OTUs.If applicable, OTU taxonomy was assigned to the phylum (p), class (c), order (o), family (f), or genus (g) by NG-tax.The blue colors of OTU names indicate that these significantly sponge-enriched OTUs in the sponge microbiome project database (http://www.spongeemp.com).Full-size  DOI: 10.7717/peerj.4970/fig-3

Table 1
Sequence statistics and alpha diversity of Vietnamese sponge-associated prokaryotic communities, including the total number of OTUs, coverage, and alpha diversity metrics.
Note:Alpha diversity values (evenness, Shannon, inverse Simpson) were calculated based on subsampling size of the sample with the fewest sequences (n = 1,595 reads).