Bacterial distribution in the Equatorial Indian Ocean using Amplicon sequencing of V3-V4 rDNA hypervariable region data

The Equatorial Indian Ocean (EIO) is a complex system strongly influenced by the Indian monsoon. During a RAMA (Research Moored Array for African-Asian-Australian Monsoon Analysis and Prediction) mooring maintenance expedition in the Southwest monsoon (August-September 2016) onboard ORV Sagar Kanya, seawater samples were collected from the surface, the deep chlorophyll maximum (DCM) and 200 m to study bacterioplankton community structure. Herein we document amplicon data on the bacterial community at four stations (4.01°S, 1.60°S, 0.36°N and 1.78°N) along the 67°00’ E transect. The samples were subjected to next-generation sequencing (NGS), the reads were processed with Mothur v1.48.0, and taxonomic classification was performed against the Silva 138.1 nr reference database. Our data indicate that Alphaproteobacteria (48 %) and Cyanobacteria (33 %) dominate the surface and DCM samples.


Specifications
Subject: Environmental Science
Specific subject area: Microbiology
Type of data: Table, Graph, Figure
How the data were acquired: Samples were collected using Niskin bottles (10 L sampler) (Seabird Inc., USA) attached to a CTD rosette equipped with a Sea-Bird CTD system (SBE911 plus, Sea-Bird Electronics, USA). Five litres (5 L) of each water sample were filtered through 0.22-μm pore size, 47 mm diameter polycarbonate filters (Merck Millipore, USA). DNA was isolated using the Power Water DNA kit (MoBio; USA), and sequencing was carried out using the primers 341 F: 5' CCTACGGGAGGCAGCAG 3' and 806 R: 5' GGACTACHVGGGTTCTAAT 3'

Value of the Data
• Amplicon sequencing data from the Equatorial Indian Ocean provide additional information on bacterial community structure in a relatively unexplored oceanic region.
• The reported data can be used in the construction of marine microbial diversity libraries.
• The availability of data from oligotrophic areas will facilitate further metabolic prediction and applied studies.
• Samples were collected at RAMA buoy sites, which monitor Indian monsoon activity, and will therefore be useful in evaluating the microbial response to changing Indian Ocean conditions.

Objective
The Equatorial Indian Ocean plays an important role in regional climate and global climate change: it influences precipitation patterns over the surrounding land masses and shelters a large microbial diversity. To understand the monsoon pattern, the Ministry of Earth Sciences, India, and NOAA collaborate on deploying surface moorings at different sites in the Indian Ocean. These moorings are an important tool for understanding the dynamic nature of the Indian Monsoon System. Along with physical and chemical parameters, microbial samples were collected to characterise bacterial diversity at specific locations. The microbial data collected will aid in understanding microbial distribution in an oligotrophic ocean such as the Equatorial Indian Ocean under a changing global climate.

Experimental Design, Materials and Methods
Twelve water samples were collected from three depths at each of four stations (Table 1) during an expedition onboard ORV Sagar Kanya, cruise number 333 (August-September 2016). Samples were collected using Niskin bottles (10 L sampler) (Seabird Inc., USA) attached to a CTD rosette equipped with a Sea-Bird CTD system (SBE911 plus, Sea-Bird Electronics, USA) [3]. Five litres (5 L) of each water sample were filtered through 0.22-μm pore size, 47 mm diameter polycarbonate filters (Merck Millipore, USA) for bacterial diversity analysis. The filters were sealed in sterile tubes, stored at -80 °C and transported to the laboratory for DNA extraction [4].

DNA Isolation and Sequencing
Total DNA was extracted from the 0.22 μm pore-size polycarbonate membrane filters using the Power Water DNA kit (MoBio; USA). The hypervariable region (V3-V4) of the bacterial 16S rDNA was amplified using the primers 341 F: 5' CCTACGGGAGGCAGCAG 3' and 806 R: 5' GGACTACHVGGGTTCTAAT 3' [5] and sequenced with the HiSeq Rapid V2 Kit to generate 2 × 250 base pair (bp) reads. DNA sequencing was outsourced to M/s Agrigenome Laboratory Ltd. (India). The raw V3-V4 sequence data were deposited in the NCBI database. Quality scores and GC content were checked before the reads were processed for downstream bioinformatics analysis (Table 1).
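The 806 R primer contains the IUPAC degenerate codes H (A, C or T) and V (A, C or G), so it represents a small pool of oligonucleotides rather than a single sequence. The following Python sketch is illustrative only and was not part of the processing reported here. It expands the two primer sequences given above into regular expressions and counts the variants they encode. The function names are hypothetical helpers introduced for this example.

```python
import re

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

# Primer sequences as given in the text (5' to 3').
FORWARD_341F = "CCTACGGGAGGCAGCAG"
REVERSE_806R = "GGACTACHVGGGTTCTAAT"


def primer_to_regex(primer):
    """Turn a degenerate primer into a regular expression (hypothetical helper)."""
    parts = []
    for base in primer.upper():
        choices = IUPAC[base]
        parts.append(choices if len(choices) == 1 else "[" + choices + "]")
    return "".join(parts)


def degeneracy(primer):
    """Number of distinct non-degenerate sequences the primer encodes."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def reverse_complement(seq):
    """Reverse complement, keeping degenerate codes consistent."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, primer in (("341F", FORWARD_341F), ("806R", REVERSE_806R)):
        print(name, primer_to_regex(primer), degeneracy(primer))
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is the pattern expected at the 3' end of a forward-oriented read.
    pattern = re.compile(primer_to_regex(reverse_complement(REVERSE_806R)))
    print(pattern.pattern)
```

A pattern of this kind could be used to locate or trim primer sequences in reads. Whether primers were trimmed in this way for the present dataset is not stated in the text.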

Bioinformatics and Statistics Analysis
Downstream processing of the DNA sequences was carried out using Mothur v1.48.0 (log file attached as Supplementary 1). Contigs were assembled from a total of 6844662 reads. The contigs were trimmed, and 3852596 sequences were removed using the screen.seqs command. Of the 2992066 sequences retained, 1624092 were identified as unique sequences. These unique sequences were aligned against the Silva.nr_v138.1 database. After further sequential processing, a total of 1924212 sequences were obtained and subsampled to the size of the smallest group, 32518 sequences. Chloroplast, mitochondria, Eukaryota, Archaea and unknown lineages were removed using the remove.lineage command, and shared and taxonomy files were then created with a cutoff value of 0.03. Grapher 10, MicrobiomeAnalyst and R software were used for downstream analysis and final data preparation.
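The read bookkeeping and the subsampling step can be illustrated with a short Python sketch. It is not the Mothur implementation. It only checks the arithmetic of the screening step (6844662 contigs minus 3852596 removed sequences) and shows one common way of subsampling every sample to the size of the smallest group, by random draws without replacement. The OTU tables in the usage part are made-up toy values, and the function names are hypothetical.

```python
import random
from collections import Counter

# Read counts reported in the text.
TOTAL_CONTIGS = 6_844_662
REMOVED_BY_SCREEN = 3_852_596
RETAINED_AFTER_SCREEN = TOTAL_CONTIGS - REMOVED_BY_SCREEN


def rarefy(otu_counts, depth, seed=0):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    # Expand the counts into one label per read, then sample from that pool.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    return Counter(rng.sample(pool, depth))


def rarefy_to_smallest(samples, seed=0):
    """Subsample every sample to the read count of the smallest one."""
    depth = min(sum(counts.values()) for counts in samples.values())
    rarefied = {name: rarefy(counts, depth, seed) for name, counts in samples.items()}
    return depth, rarefied


if __name__ == "__main__":
    print("Sequences retained after screening:", RETAINED_AFTER_SCREEN)

    # Toy OTU tables (assumed values, NOT data from this study).
    toy_samples = {
        "station1_surface": {"Otu1": 60, "Otu2": 30, "Otu3": 10},
        "station1_DCM": {"Otu1": 20, "Otu2": 25, "Otu3": 5},
    }
    depth, rarefied = rarefy_to_smallest(toy_samples, seed=42)
    print("Subsampling depth:", depth)
    for name, counts in rarefied.items():
        print(name, dict(counts))
```

Subsampling to a common depth makes richness and relative-abundance comparisons between samples less sensitive to differences in sequencing effort, at the cost of discarding reads from the deeper samples.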

Ethics Statements
This article does not contain any studies with human participants or animals performed by any of the authors.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability
Southern Ocean Carbon Processes (Original data) (NCBI).