Metagenomic data of the bacterial community in coastal Gulf of Mexico sediment microcosms following exposure to Macondo oil (MC252)

The data in this article comprise bacterial 16S rRNA gene sequences from the metagenomes of Macondo oil (MC252)-treated and non-oil-treated sediment microcosms, prepared with sediments collected from the coastal Gulf of Mexico and Bayou La Batre, USA. Metacommunity DNA was PCR amplified with the 341F and 907R oligonucleotide primers, targeting the V3–V5 regions of the 16S rRNA gene. Data were generated by bacterial tag-encoded FLX-amplicon pyrosequencing (bTEFAP) and then processed with bioinformatics tools such as QIIME. The project and sample information is deposited in NCBI's BioProject and BioSample databases, and the raw sequence files are available via NCBI's Sequence Read Archive (SRA).


The accession numbers are listed in Table 1, and the data are publicly available at NCBI.

Value of the data
These data document changes in microbial community structure and species composition following treatment with MC252.
The data are applicable to comparative studies of oil spill events at similar or different locations in the Gulf of Mexico or in other ecosystems.
Access to the raw sequence data allows researchers to perform new analyses for their own research purposes with new bioinformatics tools.

Data
All raw sequence data described in this paper are available through NCBI's BioProject, BioSample, and SRA databases. The BioProject, BioSample, and SRA accession numbers are listed in Table 1. The two sets of next-generation sequence data represent the microbial communities of the MC252-treated and non-oil-treated sediment samples in a laboratory microcosm experiment. The sediments were collected from 1) Bayou La Batre, AL, USA; and 2) Dauphin Island, Petit Bois Island, and Perdido Pass, AL, USA.

Sample collection and preparation
The sediment and seawater samples were collected from Bayou La Batre, AL in March 2011 using acid-washed plastic containers. The samples were collected from a single location and placed in a 5-gallon acid-washed plastic bucket. The top 15–30 cm of sediment was sampled, thoroughly mixed by stirring, and then used for the microcosm setup. To confirm the initial level of oil present in the sediment samples, total petroleum hydrocarbons and total organic carbon (TOC) were determined by GC-MS using standard methods [1][2][3]. For each microcosm, 200 g (dry weight) of sediment and 173.06 g of autoclaved (121 °C for 15 min at 15 lb/sq inch pressure) seawater were mixed and placed in a 500 mL glass jar. Duplicate sediment samples were then subjected to MC252 treatment (500 ppm) for 14 days and 21 days at room temperature (20 ± 1 °C) (Table 1) [4]. Non-oil-treated control samples (0 h) were maintained throughout the experiment.
The sediment samples mixed with seawater along the coast of Dauphin Island, Petit Bois Island, and Perdido Pass were collected in June 2011 in sterile-cap tubes using a multicorer from the upper were mixed with 200 mL autoclaved seawater in 500 mL glass jars. The non-oil-treated control (0 h) and oil-treated samples were then incubated at room temperature (20 ± 1 °C) for 7 days and 30 days (Table 1) [4]. Since the aim of the study was to monitor the short-term effect of MC252 oil on sediment microbial communities, we used the treatment time and scheme described previously by Cappello et al. [5].

DNA extraction
All sediment samples (1 g each, in triplicate) from Bayou La Batre and the Gulf of Mexico were subjected to metacommunity DNA extraction using the MoBio PowerSoil® DNA Isolation Kit (MoBio Laboratories Inc., CA; www.mobio.com; cat. no. 12888-100). The quality and concentration of the extracted DNA were measured with a Lambda 2 spectrophotometer (Perkin Elmer, Norwalk, Conn.), followed by agarose gel electrophoresis in Tris-Acetate-EDTA (TAE, pH 7.5) buffer [6].
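The spectrophotometric check described above can be illustrated with a minimal Python sketch. It assumes the common convention that one A260 unit corresponds to about 50 ng/µL of double-stranded DNA at a 1 cm path length, and it reports the A260/A280 ratio as a purity indicator. The article does not give absorbance readings, dilution factors, or acceptance thresholds, so the function names and the example values in the usage note are hypothetical.

```python
# Assumed conversion factor: 1 A260 unit ~ 50 ng/uL dsDNA at 1 cm path length.
DSDNA_NG_PER_UL_PER_A260 = 50.0


def dsdna_concentration(a260, dilution_factor=1.0):
    """Estimate the dsDNA concentration (ng/uL) of the undiluted extract."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the A260/A280 ratio, a common indicator of protein carry-over."""
    return a260 / a280


def volume_for_mass(target_ng, conc_ng_per_ul):
    """Return the volume (uL) of extract that contains target_ng of DNA."""
    return target_ng / conc_ng_per_ul
```

For example, a hypothetical reading of A260 = 0.5 on a 10-fold dilution gives `dsdna_concentration(0.5, 10)` = 250 ng/µL, and `volume_for_mass(100, 250.0)` then gives the volume holding 100 ng of DNA.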

Sequencing
After the purity and concentration of the DNA were confirmed, the triplicate extracts from each oil-treated and non-oil-treated sediment sample were pooled, and 100 ng of DNA was used by the Research and Testing Laboratories (RTL) (Lubbock, Texas) for bacterial tag-encoded FLX-amplicon pyrosequencing (bTEFAP) [7]. The pyrosequencing used the 341F (5′-CCT ACG GGA GGC AGC AG-3′) [8] and 907R (5′-CCG TCA ATT CMT TTG AGT TT-3′) [9] oligonucleotide primers, targeting the V3-V5 regions of the bacterial 16S rRNA gene [10]. The sequencing library was first generated by one-step PCR using the HotStarTaq™ Plus Master Mix Kit (Qiagen, Valencia, CA) and the 341F and 907R primers. The pyrosequencing was conducted on a Roche 454® FLX instrument using the Titanium reagents and procedures developed at RTL (Lubbock, TX).
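The 907R primer contains the IUPAC degenerate base M (A or C), so it represents two concrete oligonucleotides. The following Python sketch, which is an illustration and not part of the authors' workflow, expands a degenerate primer and computes its reverse complement, which is useful when searching for the reverse primer at the 3′ end of forward-oriented reads. The function names are hypothetical; the primer sequences are those given in the text with spaces removed.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (for example, M = A/C pairs with K = T/G).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Primer sequences as given in the text, with spaces removed.
PRIMER_341F = "CCTACGGGAGGCAGCAG"
PRIMER_907R = "CCGTCAATTCMTTTGAGTTT"


def expand(primer):
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(seq):
    """Reverse-complement a sequence, keeping IUPAC degeneracy codes valid."""
    return seq.translate(COMPLEMENT)[::-1]
```

Calling `expand(PRIMER_907R)` yields the two variants that differ at the M position, while `expand(PRIMER_341F)` returns the primer unchanged because it has no degenerate bases.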

Data analysis
A total of 12 sff files were generated by pyrosequencing and submitted to NCBI's BioProject, BioSample, and SRA databases under the accession numbers listed in Table 1. These sff files can be converted to FASTA- and QUAL-formatted files using the "process_sff.py" command in QIIME (ver. 1.8.0). After the sff files were converted, a mapping file containing the SampleID, BarcodeSequence, and LinkerPrimerSequence information was created for the analyses. The format of the mapping file was then checked using "validate_mapping_file.py" in QIIME. These three files (FASTA, QUAL, and mapping) were used for the different analyses in QIIME as described by Koo et al. [11,12]. All sff files used in this study can be downloaded from NCBI's SRA.
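The mapping file described above is a tab-separated table. As a minimal Python sketch, the code below writes one with the QIIME 1 column layout (#SampleID, BarcodeSequence, LinkerPrimerSequence, then Description as the last column) and runs a few basic format checks. It is an illustration only and does not replace "validate_mapping_file.py". The function names, and the sample IDs and barcodes in the usage note, are hypothetical because the article does not list them. Using the 341F primer as the LinkerPrimerSequence is also an assumption about read orientation.

```python
import csv
import io
import re

REQUIRED_FIRST = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence"]
REQUIRED_LAST = "Description"
# Assumption: reads begin with the 341F primer given in the text.
LINKER_PRIMER_341F = "CCTACGGGAGGCAGCAG"


def write_mapping(samples, handle):
    """Write (sample_id, barcode, description) tuples as a tab-separated file."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(REQUIRED_FIRST + [REQUIRED_LAST])
    for sample_id, barcode, description in samples:
        writer.writerow([sample_id, barcode, LINKER_PRIMER_341F, description])


def check_mapping(text):
    """Return a list of problems found by a few basic format checks."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    if not rows:
        return ["mapping file is empty"]
    header, body = rows[0], rows[1:]
    problems = []
    if header[:3] != REQUIRED_FIRST:
        problems.append("first three columns must be " + ", ".join(REQUIRED_FIRST))
    if header[-1] != REQUIRED_LAST:
        problems.append("last column must be " + REQUIRED_LAST)
    seen_ids, seen_barcodes = set(), set()
    for line_no, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields")
            continue
        sample_id, barcode = row[0], row[1]
        # Sample IDs may contain only letters, digits, and periods.
        if not re.fullmatch(r"[A-Za-z0-9.]+", sample_id):
            problems.append(f"line {line_no}: invalid SampleID {sample_id!r}")
        if not re.fullmatch(r"[ACGT]+", barcode):
            problems.append(f"line {line_no}: invalid barcode {barcode!r}")
        if sample_id in seen_ids:
            problems.append(f"line {line_no}: duplicate SampleID {sample_id!r}")
        if barcode in seen_barcodes:
            problems.append(f"line {line_no}: duplicate barcode {barcode!r}")
        seen_ids.add(sample_id)
        seen_barcodes.add(barcode)
    return problems
```

A usage example with made-up entries: write two rows such as `("BLB.control", "ACGAGTGCGT", "hypothetical control")` and `("BLB.oil.14d", "ACGCTCGACA", "hypothetical treatment")` to an `io.StringIO` buffer with `write_mapping`, then pass the buffer contents to `check_mapping`, which returns an empty list when no problems are found.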

Conflict of interest
The authors declare no conflict of interest associated with this manuscript.
Hughes and Matthew Pace of UAB CAS IT for the necessary computer support for all bioinformatics analyses of the pyrosequencing data.