Complete Genome Sequence of an Extremely Halophilic Archaeon from Great Salt Lake, Halobacterium sp. GSL-19

ABSTRACT An extremely halophilic archaeon, Halobacterium sp. GSL-19, was isolated from the north arm of Great Salt Lake in Utah. Single-molecule real-time (SMRT) sequencing was used to establish a GC-rich 2.3-Mbp genome composed of a circular chromosome and 2 plasmids, with 2,367 predicted genes, including 1 encoding a CTAG-methylase widely distributed among Haloarchaea.

Brine was sampled from 10 cm below the surface of the lake at 28°C, inoculated into CM+ medium (complete medium plus trace elements), and grown with shaking at 220 rpm at 37°C, as previously described (10,11). The enrichment cultures were plated on CM+ agar plates, and the isolate, an extremely halophilic, pigmented, phase-bright haloarchaeon, was purified by 3 rounds of streaking.
Nucleic acids were extracted using standard methods for haloarchaea involving hypotonic lysis, phenol extraction, and ethanol precipitation, as previously described (10–12), and sequencing was performed using the PacBio Sequel platform (Pacific Biosciences, Menlo Park, CA). SMRTbell libraries were prepared from 2 μg genomic DNA sheared to 40 kbp with a Megaruptor instrument (Diagenode, Inc., Denville, NJ) using New England BioLabs (NEB) reagents equivalent to the PacBio library prep kit (13), and the library was sequenced on a single-molecule real-time (SMRT) cell with Sequel binding kit 3.0 with 10-h collection and 2-h pre-extension times. A total of 613,574 reads were obtained (subread N50, 4,078 bp), which were filtered and assembled de novo using Hierarchical Genome Assembly Process version 4 (HGAP4) with default parameters. The final assembly comprised 3 contigs, all of which were circularized automatically by HGAP4, with a mean coverage of 3,924×.
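For readers unfamiliar with the two read statistics quoted above, the following minimal Python sketch shows how a subread N50 and a mean fold coverage are conventionally defined. It is an illustration only and is not the HGAP4 implementation; the function names and the toy read lengths are hypothetical and are not taken from the GSL-19 data set.

```python
def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest read downward until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def mean_coverage(total_bases, genome_size):
    """Mean fold coverage = total sequenced bases / genome size (in bases)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Toy subread lengths (hypothetical values chosen for easy arithmetic).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))                       # 8
print(mean_coverage(sum(toy_lengths), 16))    # 2.0
```

Assemblers may report coverage from reads aligned to the finished contigs rather than from raw base totals, so values computed this way can differ somewhat from the figure an assembly pipeline reports.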
The GSL-19 genome contained 2,367 genes, including a single rRNA operon and 44 tRNA genes, all carried on the chromosome. The proteome was highly acidic (19–21), with a calculated mean pI value of 4.91 (22). All 799 core haloarchaeal orthologous groups (cHOGs) were encoded in the GSL-19 genome (23). The genome contained 8 genes encoding origin recognition complexes, 1 gene encoding a TATA-binding protein, and 5 genes encoding transcription factor B (24–26). A gvp gene cluster was also present, consistent with the production of gas vesicles observed as phase-bright inclusions (27,28). Taxonomy was assigned using the 16S rRNA sequence and average nucleotide identity according to NCBI taxonomy, and the isolate has been designated Halobacterium sp. GSL-19 (29).
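As an illustration of how a mean proteome pI can be estimated, the sketch below computes the pI of each protein as the pH at which its net charge is zero, using the Henderson-Hasselbalch relation, and then averages across proteins. The pKa values are one commonly used illustrative set and are an assumption here; the tool cited in the text (22) may use different constants and procedures, so this is not the authors' method. The function names and the two toy peptides are hypothetical.

```python
# Illustrative side-chain and terminal pKa values (an assumption; other
# published pKa sets give slightly different pI estimates).
POSITIVE_PKA = {"N_term": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"C_term": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(seq, ph):
    """Net charge of a protein sequence at a given pH."""
    seq = seq.upper()
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        count = 1 if group == "N_term" else seq.count(group)
        charge += count / (1 + 10 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        count = 1 if group == "C_term" else seq.count(group)
        charge -= count / (1 + 10 ** (pka - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """Find the pH of zero net charge by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Net charge decreases monotonically as pH rises.
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mean_pi(proteins):
    """Unweighted mean pI over a list of protein sequences."""
    if not proteins:
        raise ValueError("proteins must not be empty")
    return sum(isoelectric_point(p) for p in proteins) / len(proteins)


# Toy peptides (hypothetical): one acidic, one basic.
print(isoelectric_point("DDDEEE") < isoelectric_point("KKKRRR"))  # True
```

A proteome dominated by aspartate- and glutamate-rich proteins, as is typical of haloarchaea, gives a low mean pI under this kind of calculation.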
Data availability. The Halobacterium sp. GSL-19 genome sequence has been deposited in GenBank with the accession numbers CP070375.1 to CP070377.1, and raw data are available in the NCBI Sequence Read Archive with the accession number SRX10230949.

ACKNOWLEDGMENTS
The DasSarma laboratory was supported by NASA grant 80NSSC17K0263 and NIH grant AI139808.
B.P.A. and R.J.R. work for New England BioLabs, a company that sells research reagents, including restriction enzymes and DNA methyltransferases, to the scientific community.