Complete Genome Sequence of Sulfitobacter sp. Strain D7, a Virulent Bacterium Isolated from an Emiliania huxleyi Algal Bloom in the North Atlantic

A Rhodobacterales bacterium, Sulfitobacter sp. strain D7, was isolated from an Emiliania huxleyi bloom in the North Atlantic and has been shown to act as a pathogen and induce cell death of E. huxleyi during lab coculturing.

The coccolithophore Emiliania huxleyi (Haptophyta) is a cosmopolitan marine microalga that plays an important role in global carbon and sulfur cycling by forming massive blooms and producing copious amounts of dimethylsulfoniopropionate (DMSP) and the atmospherically active compound dimethyl sulfide (1). Here, we report the complete genome sequence of Sulfitobacter sp. strain D7, a Rhodobacterales bacterium that, when cocultured with E. huxleyi, causes death of the alga in a DMSP-dependent manner (2). It was isolated from the microbiome of copepods collected during a natural E. huxleyi bloom in the North Atlantic, and its cooccurrence with the alga in the water column was confirmed by quantitative PCR (qPCR) (2).
Genomic DNA was extracted from a culture grown overnight in 1/2 yeast extract-tryptone-Sigma sea salts (YTSS) medium at 28°C under agitation (150 rpm) using the DNeasy blood and tissue kit (Qiagen, Hilden, Germany). A library with an insert size of ~500 bp was prepared using the Nextera XT kit (Illumina, San Diego, CA) and sequenced using the Illumina NextSeq 500 platform to generate 150-bp paired-end reads. A library of fragments longer than 10 kb was prepared following the PacBio (Menlo Park, CA) protocol, loaded with MagBeads, and sequenced on the RS II platform. Hybrid assembly of both PacBio (N50, 10,282 bp) and Illumina reads was performed using the software package SPAdes v. 3.11 (3) with "31,51,71,91" for k-mers. After filtering for contig size (>500 bp) and Illumina read coverage (at least one-third of the coverage level at half the total assembly length) based on Bowtie2 v. 2.3.0 (4) mapping, six contigs longer than 80 kb were retained from the initial draft. Sequence errors, misassemblies, and gaps were corrected by mapping Illumina reads to the contigs with the Burrows-Wheeler Aligner (BWA) v. 0.7.12 (5) and visual inspection in Integrative Genomics Viewer (IGV) v. 2.3.90 (6), as in a previous study (7). The assembly was also checked by mapping the PacBio reads using Basic Local Alignment with Successive Refinement (BLASR) v. 5.3 (8) with the default settings for RS II reads. Completion and circularization of each contig were confirmed by Illumina read pairs and PacBio reads that joined the two ends. The NCBI Prokaryotic Genome Annotation Pipeline v. 4.1 (9) was used for gene predictions and annotations.
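The contig filter described above can be illustrated with a minimal Python sketch. It is not the script used in this study. The phrase "coverage level at half the total assembly length" is read here as the coverage of the contig at which the cumulative length, with contigs ordered by decreasing coverage, first reaches half the assembly; that reading, the choice to compute the reference over all contigs before size filtering, and the names Contig, coverage_at_half_length, and filter_contigs are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Contig(NamedTuple):
    name: str
    length: int      # contig length in bp
    coverage: float  # mean Illumina read depth from the read mapping


def coverage_at_half_length(contigs: List[Contig]) -> float:
    """Coverage of the contig at which the cumulative length, taken in
    order of decreasing coverage, first reaches half the assembly length.
    (One possible reading of the criterion; an assumption.)"""
    total = sum(c.length for c in contigs)
    running = 0
    for c in sorted(contigs, key=lambda c: c.coverage, reverse=True):
        running += c.length
        if 2 * running >= total:
            return c.coverage
    raise ValueError("no contigs given")


def filter_contigs(contigs: List[Contig],
                   min_length: int = 500,
                   fraction: float = 1 / 3) -> List[Contig]:
    """Keep contigs longer than min_length whose coverage is at least
    `fraction` of the reference coverage."""
    reference = coverage_at_half_length(contigs)
    return [c for c in contigs
            if c.length > min_length and c.coverage >= fraction * reference]


if __name__ == "__main__":
    # Made-up values for illustration only, not data from this study.
    example = [
        Contig("a", 3000, 90.0),
        Contig("b", 1000, 60.0),
        Contig("c", 400, 100.0),
        Contig("d", 800, 10.0),
    ]
    print([c.name for c in filter_contigs(example)])
```

In this toy input, contig c is dropped for length and contig d for low coverage. Per-contig coverage values would in practice be derived from the Bowtie2 alignments.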
The complete genome sequence consists of six circular replicons (Table 1), including one 3,371,091-bp chromosome with an average GC content of 61.4% and five plasmids of between 81 and 193 kb that show low or single copy numbers characteristic of Rhodobacterales and other alphaproteobacteria (10,11). In total, there are 4 complete rRNA operons, 47 tRNAs, 3,744 protein-coding genes, and 52 pseudogenes, with all RNA genes located on the chromosome. Consistent with the observed production of methanethiol, a metabolic by-product of DMSP, during coculturing with E. huxleyi (2), the chromosome encodes dmdA for demethylation of DMSP. In addition, we identified type I and type II secretion system genes on the chromosome and type IV secretion system genes on the plasmids p2SUD7 and p3SUD7, which may be involved in the bacterium's interactions with the alga.
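For readers who wish to recompute the GC content of a replicon from the deposited sequence, a minimal Python sketch follows. The function name gc_content is hypothetical, and the decision to exclude ambiguous bases (such as N) from the denominator is an assumption; other conventions exist.

```python
def gc_content(sequence: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of a sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / acgt
```

Applied to the sequence string of a single replicon, the function returns a percentage that can be compared with the value reported above.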
Data availability. The complete genome sequence of Sulfitobacter sp. strain D7 has been deposited in GenBank under the accession numbers CP020694 to CP020699. Raw Illumina and PacBio RS II sequence reads have been deposited in the NCBI Sequence Read Archive under the accession numbers SRR7948486 and SRR7948487, respectively.

ACKNOWLEDGMENTS
This work was supported by the European Research Council (ERC) CoG (VIROCELL-SPHERE grant number 681715) to Assaf Vardi. Chuan Ku was supported by an EMBO long-term fellowship (ALTF 1172-2016), and Noa Barak-Gavish was supported by a JNF fellowship for environmental studies from the Rieger Foundation.
Illumina library preparation was performed at the University of Illinois at Chicago Sequencing Core (UICSQC). PacBio sequencing was performed at the Great Lakes Genomics Center at the University of Wisconsin-Milwaukee. Genome assembly was performed at the Research Informatics Core (RIC) at the University of Illinois at Chicago.