A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms

Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms become available, it becomes imperative to compare data obtained from different platforms and to analyze their effect on the inferred microbial community structure. In the present study, we compared sequencing data from the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial differences in the number of reads, error rate and read length between the two platforms, we identified similar community composition in the two data sets. Procrustes analysis revealed significant agreement (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms identified the same dominant bacterial phyla, Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level between the platforms revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed.


Introduction
Microbes are integral components of a diverse group of ecosystems and have co-evolved with their host/habitat in a mutually symbiotic relationship (Hobson & Stewart 1997). The foregut (rumen) of ruminants is comprised of a complex microbial genetic web (rumen microbiome) that plays a pivotal role in the host nutrition and ultimate wellbeing of the animals (Hobson & Stewart 1997). Bacteria are predominant in the rumen microbiome and are responsible for the conversion of indigestible plant biomass to energy and also aid in the formation of microbial protein; both processes drive the production efficiency of ruminants (Firkins 2010). Interactions between different microbial domains in the rumen play a significant role in determining the ruminal microbial ecology and their functional contribution to host metabolism (Kumar et al., 2015).
Rumen microbial dynamics are influenced by a number of factors including host specificity, diet, age and the environment (Edwards et al., 2004). Elucidation of the interactions among microbial domains, particularly in dairy cows, has the potential to improve production traits such as feed efficiency and milk fat synthesis (Weimer 2015). The transition period in dairy cows refers to a critical phase in the lactation cycle, lasting from three weeks before calving to three weeks postcalving, during which the dairy cow experiences stress due to changes in diet, metabolism and physiological status. Although the dynamics of rumen bacteria during the transition period have received attention in the recent past (Lima et al., 2014; Pitta et al., 2014; Wang et al., 2012), further studies are required to understand the rumen microbial dynamics during the different phases of lactation across a large group of dairy cows.
Cultivation-independent approaches have greatly enhanced our knowledge of microbial diversity and also enabled us to assess their functional contribution to the host metabolism (Stahl et al., 1988). In particular, next-generation sequencing (NGS) technology has enabled the sequencing of human and microbial genomes in a relatively short period of time (Caporaso et al., 2011). The most widely used high-throughput sequencing platforms available on the market include Roche 454 pyrosequencing, Ion Torrent Personal Genome Machine (PGM), and Illumina HiSeq (Liu et al., 2012). Although these platforms were originally tailored for large-scale operations such as whole genome sequencing, their bench-top versions (454 Jr, PGM, and MiSeq, respectively) have evolved since 2011 and have been extensively applied to bacterial genome sequencing (Loman et al., 2012). Since 2011, both MiSeq and PGM platforms have undergone improvements, including longer read lengths, more reads per unit cost, faster turn-around time, and a reduction in error rates (Salipante et al., 2014). Although general comparisons between these platforms have been reported (Lam et al., 2012; Quail et al., 2012), studies comparing the efficacy of these platforms on the same samples are limited (Salipante et al., 2014; Scott & Ely 2015). As the Roche 454 platform is phased out (Fordyce et al., 2015), there is a need for comparative studies that can aid in the transition from Roche 454 to other platforms.
The use of next generation platforms has greatly enhanced our knowledge of rumen microbes, their genes and enzymes (Brulc et al., 2009; Hess et al., 2011; Jami et al., 2014). To date, there are nearly 55 research articles (based on PubMed, 28 Jan, 2015) on bacterial diversity in the rumen environment that used Roche 454, whereas only 2 used MiSeq and 3 used Ion Torrent. In an attempt to find a suitable alternative to the Roche 454 platform for our microbial genomic work, we evaluated Ion Torrent for the study of rumen microbial composition via 16S tag sequencing.

Sample collection
Dairy cows that were donors of rumen fluid were housed at Marshak farm and were maintained according to the ethics committee and IACUC standards for the University of Pennsylvania (approval #804302). Four primiparous and four multiparous cows were sampled at four weeks prior to the anticipated calving date (S1), and again at 1 to 3 days post-calving (S2).
Details of the animal experiment design, sampling protocol, and type of diet are described in a previous study (Pitta et al., 2014). Two samples were removed from the analysis due to a low number of reads in the Roche 454 sequencing run: one each from the primiparous and multiparous group in the pre-calving period. Thus, we analyzed a total of 14 samples, with six samples from pre-calving period and eight samples from post-calving period (Supplementary Table S1).

DNA extraction, PCR amplification, and 16S rRNA sequencing
The genomic DNA was extracted from all the rumen samples with the PSP Spin Stool DNA Plus Kit (Invitek, Berlin, Germany) using the protocol of McKenna et al. (2008). The genomic DNA was amplified using the specific primers 27F and BSR357, targeting the V1-V2 region of the bacterial 16S rRNA gene. The primer sequences and PCR conditions for Roche 454 are described in Pitta et al. (2014). Though the primer sequences for Ion Torrent were similar to Roche 454, the forward primer carried the Ion Torrent trP1 (5'-CCTCTCTATGGGCAGTCGGTGAT-3') and the reverse primer carried the A adapter (5'-CCATCTCATCCCTGCGTGTCTCCGACTCAG-3'), followed by a 10 to 12 nucleotide sample-specific barcode sequence and a GAT barcode adapter. The PCR mix was prepared using the Platinum PCR SuperMix High Fidelity kit (Invitrogen, Carlsbad, CA). PCR conditions were the same for Roche 454 and Ion Torrent, as given by Pitta et al. (2014). Amplicons of 16S rDNA were purified using a 1:1 volume of Agencourt AMPure XP beads (Beckman-Coulter, Brea, CA).
The purified PCR products from the rumen samples were pooled in equal concentration prior to sequencing in Roche 454 (Roche 454 Life Sciences, Branford, CT) and Ion Torrent platforms.
To evaluate the similarities and dissimilarities between Roche 454 and Ion Torrent, we analyzed the 16S sequence reads using the QIIME pipeline (version 1.8.0) (Caporaso et al., 2010) and a small number of custom Python scripts, followed by statistical analysis in R (R Core Team 2013).
Reads from both platforms were discarded if they did not match the expected sample-specific barcode and 16S primer sequences (forward and reverse primers), if they were shorter than 50 bp or longer than 480 bp, or if they contained one or more ambiguous base calls. Reads were also discarded if a long homopolymer sequence was present; the threshold used was 5 bp for both platforms. Operational taxonomic units (OTUs) were formed at 97% similarity using UCLUST (Edgar, 2010) [...] OTUs. We ran a Procrustes analysis of weighted UniFrac distance, comparing the principal coordinate matrices from Roche 454 and Ion Torrent. The goodness of fit (M² value) was measured by summing the squared residuals, and significance was assessed by the Monte Carlo label permutation method (Gower 1975).
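The read-level quality rules described above can be illustrated with a short sketch. This is not the QIIME implementation or the authors' custom script; the function names are hypothetical, barcode and primer matching are omitted, and the homopolymer rule is assumed to mean that runs longer than 5 bp are rejected.

```python
import re

MIN_LEN = 50          # reads shorter than this are discarded
MAX_LEN = 480         # reads longer than this are discarded
MAX_HOMOPOLYMER = 5   # assumption: runs longer than 5 bp fail the filter


def longest_homopolymer(seq):
    """Return the length of the longest run of one repeated character."""
    longest = 0
    for match in re.finditer(r"(.)\1*", seq):
        longest = max(longest, len(match.group(0)))
    return longest


def passes_filter(seq):
    """Apply the length, ambiguity, and homopolymer rules to one read."""
    seq = seq.upper()
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        return False
    if re.search(r"[^ACGT]", seq):  # any non-ACGT symbol counts as ambiguous
        return False
    return longest_homopolymer(seq) <= MAX_HOMOPOLYMER
```

Applying the same function to reads from both platforms mirrors the intent stated in the text, which is to keep the quality control as similar as possible across the two data sets.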
Representative sequences from each OTU were chosen and taxonomy was assigned using the default methods in QIIME. To test for differences in taxon abundance, we normalized the abundances to the total number of reads in each sample (relative abundance). We considered the phyla appearing in at least 75% of samples. A generalized linear mixed-effects model was constructed with the lme4 package for R (Bates et al., 2013). The model used a binomial link function and included a random effect term for each animal. Study day was modeled as a continuous longitudinal variable with the following values: S1= -3 weeks, S2=0.285 weeks.
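As a minimal sketch of the normalization and prevalence steps described above, the following functions convert read counts to relative abundance and keep taxa present in at least 75% of samples. The function names and the dictionary-based data layout are illustrative assumptions, not the authors' scripts, and "appearing" is assumed to mean a nonzero read count.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def prevalent_taxa(samples, min_fraction=0.75):
    """Return taxa with a nonzero count in at least min_fraction of samples.

    `samples` maps a sample name to a {taxon: read count} dictionary.
    """
    taxa = set()
    for counts in samples.values():
        taxa.update(counts)
    keep = []
    for taxon in sorted(taxa):
        present = sum(1 for counts in samples.values() if counts.get(taxon, 0) > 0)
        if present / len(samples) >= min_fraction:
            keep.append(taxon)
    return keep
```

The mixed-effects model itself is fitted in R with lme4 and is not reproduced here.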

Analysis of 16S sequence clusters
A total of 39,592 and 280,284 raw sequences were obtained from Roche 454 and Ion Torrent respectively, across a total of 14 samples on each platform. To minimize differences between the two platforms, we employed similar quality control protocols, including quality filtering, primer detection, and read demultiplexing.

Taxonomic comparisons
Taxonomic assignment of the OTUs identified 15 (Roche 454) and 18 (Ion Torrent) phyla in the rumen of cows used in this study. The three bacterial phyla that were recovered by the Ion Torrent platform alone were Acidobacteria, GN02, and Verrucomicrobia (Supplementary Table S2), which accounted for a very low abundance (0.01%) and were detected in only a few samples. The most abundant phyla in both platforms were Bacteroidetes followed by Firmicutes, which together constituted over 90% of each sample in Roche 454 and over 86% in Ion Torrent (Figure 1, Supplementary Table S2). The influence of study day (before and after calving) and study group (primiparous and multiparous cows) on the abundance of Bacteroidetes and Firmicutes appears to be similar for both Roche 454 and Ion Torrent data sets (Table 1). However, the influences of study group and study day on less abundant phyla differed between platforms; for example, the influence of study group and study day on Fibrobacteres is significant in the Roche 454 data set but not significant in the Ion Torrent data set.
We further compared the Roche 454 and Ion Torrent datasets at the genus level for the two major phyla observed (Bacteroidetes and Firmicutes). Genera with a proportion exceeding 1% in at least one sample were included in the analysis. In Roche 454, three genera from the Bacteroidetes group and one genus among Firmicutes showed differences by study group, whereas three genera from Bacteroidetes and four genera from Firmicutes showed differences by study day. In contrast, a similar analysis of Ion Torrent data revealed no differences by study group within the Bacteroidetes group and revealed differences among three genera in the Firmicutes. With regard to study day, two genera in the Bacteroidetes group and one genus in Firmicutes showed differences (Table 2). Notably, we found that BF311 (an uncultured genus of Bacteroidetes), Mogibacteriaceae (Firmicutes), and Selenomonas (Firmicutes) were detected only in the Ion Torrent platform. [...] and Ion Torrent data sets, whereas the influence of study group was observed only in Roche 454.
In both platforms, most of the remaining sequences in the phylum Bacteroidetes were assigned only at the family or order level, to unnamed members of the Prevotellaceae family or Bacteroidales order (Table 2).

Comparison between Roche 454 and Ion Torrent platforms
Alpha diversity
The number of observed species per sample was higher for the Ion Torrent method, compared to Roche 454 (Figure 2). The Shannon index value was similar between the two platforms at various sequencing depths, indicating a similar number of highly abundant species in both platforms.
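To make the two alpha diversity measures concrete, the sketch below computes observed species and the Shannon index from a vector of OTU counts, and subsamples reads to a fixed depth. It is an illustration only, not the QIIME code used in the study; the function names are hypothetical and the logarithm base of 2 is an assumption.

```python
import math
import random


def observed_species(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)


def shannon_index(counts, base=2):
    """Shannon diversity H = -sum(p * log(p)) over OTUs with nonzero counts."""
    total = sum(counts)
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p, base)
    return h


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement; return counts per OTU."""
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    picked = random.Random(seed).sample(pool, depth)
    rarefied = [0] * len(counts)
    for otu in picked:
        rarefied[otu] += 1
    return rarefied
```

Because the Shannon index weights each OTU by its proportion, rare OTUs that add to the observed species count contribute little to it, which is consistent with the pattern reported above.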

Beta diversity
We then quantified the resemblance between bacterial communities measured by Roche 454 and Ion Torrent. Weighted UniFrac distances for both communities were plotted using principal coordinate analysis (PCoA), and then the two PCoA plots were aligned using generalized Procrustes analysis (Gower 1975). The aligned PCoA plot is visualized in Figure 3. [...] statistically significant agreement in the microbial community composition between the two platforms (M² = 0.36; P = 0.001).
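The idea behind the Procrustes statistic and its permutation test can be sketched for the simplified case of two-dimensional coordinates. This is an assumption-laden illustration rather than the QIIME procedure: it uses only the first two axes, allows translation, scaling and rotation but not reflection, and the function names and the number of permutations are hypothetical.

```python
import math
import random


def _standardize(points):
    """Center a list of (x, y) points and scale to unit sum of squares."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    if norm == 0:
        raise ValueError("all points are identical")
    return [(x / norm, y / norm) for x, y in centered]


def procrustes_m2(ref, other):
    """Residual sum of squares (M^2) after the best rotation and scaling."""
    a = _standardize(ref)
    b = _standardize(other)
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    # For unit-norm configurations the minimized residual is 1 - (dot^2 + cross^2).
    return 1.0 - (dot * dot + cross * cross)


def permutation_p_value(ref, other, n_perm=999, seed=0):
    """Shuffle sample labels of `other`; count fits at least as good as observed."""
    rng = random.Random(seed)
    observed = procrustes_m2(ref, other)
    shuffled = list(other)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if procrustes_m2(ref, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

An M² near 0 indicates that the two ordinations place samples in nearly the same relative positions, while a value near 1 indicates little correspondence.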

Taxonomic comparison at genus level
A sample-to-sample comparison at the genus level between the two platforms was performed using the chi square test (Supplementary Table S3). For this analysis, twenty-nine genera were compared by sample for both platforms. Of these, only Prevotella from the Bacteroidetes phylum and one unclassified bacterial lineage were found to be different between the two platforms in more than 50% of samples (p < 0.05; chi square test). A majority of the remaining genera were found to be different in only one or two samples. Thus, the composition of bacterial communities was generally consistent between platforms, with two notable exceptions.
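One way such a per-sample comparison could be set up is a 2x2 chi square test on read counts (reads assigned to the genus versus all other reads, for each platform). The text does not specify the exact table used, so the layout, the function names, and the absence of a continuity correction are assumptions in this sketch.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi square (1 df, no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def genus_differs(genus_454, total_454, genus_ion, total_ion, alpha=0.05):
    """True if the genus proportion differs between platforms in one sample."""
    _, p_value = chi_square_2x2(
        genus_454, total_454 - genus_454,
        genus_ion, total_ion - genus_ion,
    )
    return p_value < alpha
```

Running `genus_differs` for each genus in each sample and counting the samples in which it returns true gives the kind of tally summarized above.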

Discussion
The introduction of NGS technology has had a dramatic effect on researchers' ability to study bacterial communities via DNA sequencing. The throughput has increased by 500,000-fold and the number of reads per genome is increasing 100-fold every year, while the cost of sequencing is halving every 5 months (Baker, 2010). The rumen microbiome of herbivores is a classic example that has been explored in detail using different NGS approaches (de la Fuente et al., 2014; Jami et al., 2014; Peng et al., 2015; Pitta et al., 2014; Pitta et al., 2010), but several inconsistencies among these reports prevail, some of which may be attributed to differences in the approach employed. In this study, we attempted to account for biases that could be introduced by different NGS platforms while making comparisons between 16S rDNA bacterial profiles in the rumen of dairy cows. We concluded that, although the percentage of common sequences was low between the platforms, the microbial fingerprints and phylogenetic composition retrieved by both Roche 454 and Ion Torrent platforms are comparable, with minor exceptions. [...], we adopted normalization through subsampling of reads at the minimum sequencing depth for both platforms. This additional step was performed to avoid the influence of this differential distribution of reads on the bacterial composition.

Ruminal Bacterial diversity dynamics
The effect of diet and age on the community composition, evident in our previous study using [...] both platforms. Though the overall level of similarity was not great enough to combine data from the two platforms in a single analysis of UniFrac distance, we determined that independent distance analyses were reproducible between platforms. Further, we were able to infer that increasing the sequence depth did not unduly influence community profiles when analyzed by UniFrac distance [...] (Patel et al., 2014; Pitta, unpublished data; Singh et al., 2014).
Differences between datasets from the two platforms were evident in phyla that contributed less than 5% of the populations. A sample-to-sample comparison revealed differences in phylogenetic composition at the family level and below. For example, at the genus level, although Prevotella was abundant in both platforms, its lower abundance in Ion Torrent compared to Roche 454 may be due to the additional genera (14) that were detected in the Ion Torrent data but not in the Roche 454 data. Despite these differences at the genus level, the effect of age and diet on different bacterial genera in the rumen of dairy cows was consistent between the two platforms.

Conclusions
It has become evident in the recent past that the rumen microbiome plays a significant role in improving the production efficiencies of dairy cows. As next-generation sequencing platforms and chemistries continue to expand and improve, we expect that major advancements in sequencing may contribute to significant improvements in dairy production through nutrition and management practices that support a production-efficient microbiome. As different sequencing platforms are applied to explore the rumen microbiome, it is imperative that studies also compare and contrast findings from different platforms to avoid discrepancies across rumen microbiome studies. The results presented here show that major conclusions based on Roche 454 data are also reproduced on the Ion Torrent platform. Thus, we found that the Ion Torrent platform is a suitable option for future rumen microbiology studies, provided that researchers are consistent in DNA extraction methods, PCR protocols, and bioinformatics pipelines.

Competing interests
The authors declare that they have no competing interests.

Figure caption: Similarity to the nearest common OTU from each platform-specific OTU. Percent identity was calculated using blastn. X-axis shows the percentage identity between representative sequences from common OTUs and representative sequences from platform-specific OTUs.