Complete Genome Sequence of Serotype III Streptococcus agalactiae Sequence Type 17 Strain 874391

ABSTRACT Here we report the complete genome sequence of Streptococcus agalactiae strain 874391. This serotype III isolate is a member of the hypervirulent sequence type 17 (ST-17) lineage that causes a disproportionate number of cases of invasive disease in humans and other mammals. A brief historical context of the strain is discussed.

Streptococcus agalactiae, originally isolated as a causative agent of bovine mastitis (1), is a commensal of the human gastrointestinal and urogenital tracts of up to 30% of healthy adults (2). S. agalactiae is an opportunistic pathogen that causes sepsis, meningitis, pneumonia, and soft tissue infections, including urinary tract infection. The changing epidemiology of invasive disease due to S. agalactiae has highlighted an increasing incidence of infection in immunocompromised and elderly individuals (3). S. agalactiae strain 874391 (former strain number 24) (4, 5) is a human vaginal isolate previously studied in the context of pathogenomics (6-8); surface antigen structure (9); adhesion to, invasion, and killing of macrophages and epithelial cells (10-14); and urogenital tract colonization (15-17). S. agalactiae 874391 is of the hypervirulent sequence type 17 (ST-17) lineage that comprises homogenous serotype III clones that are associated with a disproportionately high number of cases of invasive neonatal disease, particularly meningitis (18-20). It is likely that the ST-17 S. agalactiae lineage originated from a bovine source (21).
DNA extraction and whole-genome sequencing were performed as follows. For Illumina sequencing, S. agalactiae 874391 genomic DNA was isolated using methods previously described (22). The DNA was used to generate 100-bp paired-end reads using the Illumina HiSeq 2000 platform at the Wellcome Trust Sanger Institute, United Kingdom. For Pacific Biosciences (PacBio) sequencing, DNA was isolated using the UltraClean microbial DNA isolation kit (Mo Bio Laboratories). Single-molecule real-time (SMRT) sequencing was performed on an RS-II machine (Pacific Biosciences, CA, USA) using P6-C4 chemistry at The University of Melbourne, Australia. The sequencing provided 477× coverage (1.17-Gb sequence, 68,825 reads, 17,033-bp mean read length).
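As a quick consistency check on the PacBio run statistics quoted above, the total sequence yield can be approximated as the number of reads multiplied by the mean read length. The short Python sketch below uses only the figures given in the text; the variable names are illustrative, and because the mean read length is a rounded value the product is an approximation of the true yield.

```python
# Figures as reported in the text for the PacBio RS-II run.
read_count = 68_825
mean_read_length_bp = 17_033  # rounded mean, so the product is approximate

# Approximate yield = number of reads x mean read length.
total_bases = read_count * mean_read_length_bp
total_gb = total_bases / 1e9

print(f"Approximate yield: {total_bases:,} bp ({total_gb:.2f} Gb)")
```

The result rounds to 1.17 Gb, in agreement with the reported sequence yield.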
For sequence analysis, PacBio sequence read data were assembled de novo using Canu version 1.3 (23). Following assembly, the genome was polished using Illumina sequencing data to resolve single nucleotide insertion and deletion errors associated with homopolymer tracts, generating a complete circular genome of 2,153,937 bp with a GC content of 35.5%. Detection of methylation signatures was carried out using the SMRT analysis package version 2.3.0. Annotation of the 2.15-Mb genome was performed using Prokka version 1.12 (24) and the NCBI Prokaryotic Genome Annotation Pipeline. Annotated features include 2,157 genes with 2,023 coding sequences (CDS), 21 ribosomal RNAs (rRNAs), 80 transfer RNAs (tRNAs), 3 noncoding RNAs (ncRNAs), 32 pseudogenes, and 1 clustered regularly interspaced short palindromic repeat (CRISPR) array. Annotation using the Rapid Annotation Subsystem Technology (RAST) server (25) showed that of the 2,023 CDS, 53% of the genes covered subsystem features. Of these, 68 were associated with virulence and 10 were associated with phages and prophages. Additionally, gene networks were linked to carbohydrate metabolism (n = 231), protein metabolism (n = 261), cell wall and capsule (n = 139), and resistance to antibiotics and toxic compounds (n = 34), including β-lactams (n = 1), fluoroquinolones (n = 4), tetracyclines (n = 2), vancomycin (n = 5), and multidrug efflux (n = 3).
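Summary statistics such as genome length and GC content can be recomputed directly from an assembled sequence in FASTA format. The following is a minimal Python sketch of that calculation, not the procedure used in this study (the text does not state which tool produced these values). The function name gc_summary and the toy sequence are hypothetical and serve only as illustration.

```python
def gc_summary(fasta_text):
    """Return (length_bp, gc_percent) for all sequence lines in a FASTA string.

    Hypothetical helper for illustration; header lines (starting with '>')
    are skipped and every remaining line is treated as sequence.
    """
    seq = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line and not line.startswith(">")
    ).upper()
    length = len(seq)
    gc_count = seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc_count / length if length else 0.0
    return length, gc_percent


# Toy example only; this is not sequence from strain 874391.
toy_fasta = ">toy_contig\nATGC\nGGAT\n"
length_bp, gc_percent = gc_summary(toy_fasta)
print(f"{length_bp} bp, GC content {gc_percent:.1f}%")
```

Applied to the complete genome sequence deposited for the strain, a calculation of this kind would be expected to return values close to the 2,153,937 bp and 35.5% GC reported above.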

ACKNOWLEDGMENTS
This study was supported in part by Griffith University and the Wellcome Trust, United Kingdom.
We do not have a commercial or other association that might pose a conflict of interest.