Coastal Ocean Metagenomes and Curated Metagenome-Assembled Genomes from Marsh Landing, Sapelo Island (Georgia, USA)

Microbes play a dominant role in the biogeochemistry of coastal waters, which receive organic matter from diverse sources. We present metagenomes and 45 metagenome-assembled genomes (MAGs) from Sapelo Island, Georgia, to further understand coastal microbial populations.

Coastal oceans receive carbon and nutrients from rivers and marshes, driving high productivity. The metabolism of coastal microbes largely determines how much of the resulting organic matter (OM) is exported (1). Metagenomic data can provide insights into how microbial diversity relates to metabolic potential and drives OM processing (2). Coastal microbial biogeochemistry has been well studied at Sapelo Island, Georgia (3-5). Furthermore, these waters host a summer "bloom" of Thaumarchaeota and have been studied extensively to understand thaumarchaeal ecology (e.g., references 6-9). The metagenomic data presented here will guide an understanding of the microbial taxa in these waters and complement existing data for the same communities.
Seawater was collected at Marsh Landing (31°25′4.08″N, 81°17′34.26″W) as part of the Sapelo Island Microbial Carbon Observatory (http://www.simco.uga.edu/) by filtering through a 3.0-µm-pore-size prefilter and a 0.2-µm-pore-size Supor filter (Pall), which was frozen in liquid nitrogen (10). Duplicate filters were collected in August 2008 and 2009, 1 h before both day and night high tide on consecutive days (11). DNA extraction was done using the PowerSoil kit (Mo Bio), as described previously (7). DNA was sheared to ~225 bp, and libraries were constructed with the TruSeq DNA kit (Illumina) at the Georgia Genomics and Bioinformatics Core. Replicates from day and night samples on consecutive days were pooled to make 4 libraries (08N, 08D, 09N, and 09D; see Table 1), which were sequenced on 25% of an Illumina HiSeq 2500 platform rapid lane (paired-end, 150-bp reads) at the HudsonAlpha Institute for Biotechnology.
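For readers who want to place the sampling site on a map, the Marsh Landing coordinates (31°25′4.08″N, 81°17′34.26″W) can be converted from degrees-minutes-seconds to signed decimal degrees. The following is a minimal sketch; the function name `dms_to_decimal` is illustrative and not part of any pipeline used in this work.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Southern and western hemispheres are returned as negative values.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Marsh Landing coordinates as given in the text.
latitude = dms_to_decimal(31, 25, 4.08, "N")
longitude = dms_to_decimal(81, 17, 34.26, "W")
print(f"{latitude:.5f}, {longitude:.5f}")
```

The result is approximately 31.4178 (latitude) and -81.2929 (longitude).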
Data availability. The reads, coassembly, and MAGs were deposited under GenBank BioProject number PRJNA552566. The reads are under SRA accession numbers SRX6421373 to SRX6421376. The coassembly and MAGs are under whole-genome sequencing (WGS) project numbers VMBT00000000 to VMDM00000000.
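The accession ranges above can be expanded programmatically, for example to build a download list. The sketch below assumes that the SRA experiment accessions are numbered contiguously and that the WGS project prefixes (the four leading letters) were assigned contiguously in alphabetical order; neither assumption is stated in the text, and the helper names are illustrative. Under those assumptions, the WGS range spans 46 projects, which matches one coassembly plus 45 MAGs.

```python
import re


def expand_numeric_range(first, last):
    """List accessions from first to last, assuming a shared letter
    prefix and contiguous numbering (an assumption, not a guarantee)."""
    m_first = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m_last = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same letter prefix")
    prefix = m_first.group(1)
    width = len(m_first.group(2))
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


def wgs_prefix_index(prefix):
    """Interpret a four-letter WGS prefix as a base-26 number (A=0)."""
    value = 0
    for letter in prefix:
        value = value * 26 + (ord(letter) - ord("A"))
    return value


def count_wgs_projects(first, last):
    """Count prefixes between two WGS accessions, inclusive, assuming
    prefixes were assigned contiguously."""
    return wgs_prefix_index(last[:4]) - wgs_prefix_index(first[:4]) + 1


sra_experiments = expand_numeric_range("SRX6421373", "SRX6421376")
n_wgs_projects = count_wgs_projects("VMBT00000000", "VMDM00000000")
print(sra_experiments)
print(n_wgs_projects)
```

The mapping of individual SRA experiments to the four libraries is not given here, so the expanded list should be matched against the BioProject record rather than assumed to follow library order.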

ACKNOWLEDGMENTS
Logistical support in the field was provided by the staff of the University of Georgia Marine Institute (UGAMI) and the Georgia Coastal Ecosystems Long Term Ecological Research (GCE-LTER) program. Shalabh Sharma kindly provided advice on bioinformatics.
This work was funded by National Science Foundation (NSF) grants OCE1538677 and OPP1643466 to J.T.H. and OCE1356010 to M.A.M. and was supported in part by resources and technical expertise from the Georgia Advanced Computing Resource Center, a partnership between the University of Georgia's Office of the Vice President for Research and Office of the Vice President for Information Technology.