Metagenomic and near full-length 16S rRNA sequence data in support of the phylogenetic analysis of the rumen bacterial community in steers

Amplicon sequencing utilizing next-generation platforms has significantly transformed how research is conducted, specifically microbial ecology. However, primer and sequencing platform biases can confound or change the way scientists interpret these data. The Pacific Biosciences RSII instrument may also preferentially load smaller fragments, which may also be a function of PCR product exhaustion during sequencing. To further examine theses biases, data is provided from 16S rRNA rumen community analyses. Specifically, data from the relative phylum-level abundances for the ruminal bacterial community are provided to determine between-sample variability. Direct sequencing of metagenomic DNA was conducted to circumvent primer-associated biases in 16S rRNA reads and rarefaction curves were generated to demonstrate adequate coverage of each amplicon. PCR products were also subjected to reduced amplification and pooling to reduce the likelihood of PCR product exhaustion during sequencing on the Pacific Biosciences platform. The taxonomic profiles for the relative phylum-level and genus-level abundance of rumen microbiota as a function of PCR pooling for sequencing on the Pacific Biosciences RSII platform were provided. For more information, see “Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers” P.R. Myer, M. Kim, H.C. Freetly, T.P.L. Smith (2016) [1].

platform were provided. For more information, see "Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers" P.R. Myer

Value of the data
Additional consideration of primer and platform-specific biases associated with amplicon next-generation sequencing that may confound data and its interpretation.
Further evaluation of sequencing depth of ruminal metagenomic DNA from steers. Greater understanding of potential PCR product exhaustion during sequencing on resultant taxonomic analyses.

Data
Three figures are presented. Fig. 1 contains individual animal, relative ruminal microbial abundance data from rumen samples selected from feed efficient steers (ADG Greater À ADFI Less ), and the 3 with least variability among the 8 samples (animals) in the group [1,3]; Fig. 2 depicts the data from the calculated rarefaction curves from metagenomic DNA mapped to consensus 16S rRNA V1-V3 and V1-V8 regions; Fig. 3 contains relative ruminal microbial abundance data from rumen samples regarding the reduced PCR amplification of the 16S rRNA V1-V8 hypervariable regions and pooling of the amplification products in order to determine any effects on taxonomic classification and analysis. A complete description of the data and methods is presented elsewhere [1].

Experimental design and rumen sampling
This experiment was approved by the U.S. Meat Animal Research Center Animal Care and Use Committee. Feed efficiency was determined as referenced by Myer et al. [3], and utilized. Three steers displaying an equivalent feed efficiency phenotype (ADG Greater À ADFI Less ) and with the least deviation among each other were selected and sampled for the study.

DNA extraction, amplification and sequencing
DNA was extracted from rumen samples using a repeated bead beating plus column (RBB þC) method [2]. Rumen content samples were analyzed similar to Myer et al. [3]. Additionally, isolated metagenomic DNA was sheared to 350 bp (Covaris, Woburn, MA) and used to create TruSeq s PCR Free libraries for sequencing using the 2 Â 150 NextSeq 500 high output kit and the Illumina NextSeq 500 s sequencing platform (Illumina, San Diego, CA). Polymerase chain reaction (PCR) amplification and DNA library preparation of the V1-V8 region was performed using universal primers 27F (5 0 -AGAGTTTGATCCTGGCTCAG) and 1392R (5 0 -GACGGGCGGTGTGTAC) for the Pacific Biosciences instrument. Reduced amplification consisted of 15 cycles, with an annealing temperature of 58°C. PCR products were pooled for sequencing on the Pacific Biosciences RSII platform.

Sequence read processing and analysis
All sequences were processed using the QIIME-1.9.1 software package [4] and Mothur version 1.36.1 [5], as well as the Ribosomal RNA database project 0 s pyrosequencing pipeline [6] for rarefaction analysis. Pacific Biosciences reads were parsed so that quality scores of zero were interpreted as corresponding to an ambiguous base call, and then filtered for quality (ZQ30) using Mothur. Sequences that contained read lengths shorter than 1200 bp were removed. Read directionality was checked and corrected where necessary. For all reads, homopolymers 47 were discarded and chimeric sequences were checked using ChimeraSlayer [7].
Shotgun metagenomic reads were cleaned and mapped against consensus 16S rRNA V1-V3 and V1-V8 sequences, yielding 23,379 and 103,064 reads, respectively. Reads mapping to the respective variable regions were used for analysis and classification using the Greengenes 16S rRNA Gene Database, 13_8 release.   3. The taxonomic profiles for the relative phylum-level (A) and genus-level (B) abundance of rumen microbiota classified by representation at Z 1% of total sequences as a function of PCR pooling. Taxonomic composition of the ruminal microbiota between the two treatments was compared based on the relative abundance (reads of a taxon/total reads in a sample).

Transparency document. Supporting information
Supplementary data associated with this article can be found in the online version at http://dx.doi. org/10.1016/j.dib.2016.07.027.