Genome Sequence and Methylation Patterns of Halorubrum sp. Strain BOL3-1, the First Haloarchaeon Isolated and Cultured from Salar de Uyuni, Bolivia

Halorubrum sp. strain BOL3-1 was isolated from Salar de Uyuni, Bolivia, and sequenced using single-molecule real-time sequencing. Its 3.7-Mbp genome was analyzed for gene content and methylation patterns and incorporated into the Haloarchaeal Genomes Database (http://halo.umbc.edu).

Halorubrum sp. strain BOL3-1 is the first haloarchaeon from Bolivia to be cultured and sequenced. It was isolated from salt samples from Salar de Uyuni, Department of Potosí, Bolivia, the largest salt flat in the world and an environment remarkable for its high elevation, high albedo, and high UV radiation exposure (1). The environment is unique and of significant interest to the astrobiology community due to its multiple extremes (2).
Stratified salt crust was sampled from the Salar in March 2015 at a remote site (20°33′28.58″S and 67°12′29.56″W [-20.5579389°, -067.2082111°]), 3,647 m above sea level. Typical conditions are pH 7.3 to 7.6, ≥28% NaCl (wt/vol) concentration, and temperatures of -15 to 22°C. Salt samples were dissolved in CM+ medium (3), and growth was stimulated under illumination at 37°C with shaking at 220 rpm (Innova 4230 refrigerated incubator shaker; New Brunswick, NJ, USA). The enrichment culture was plated on CM+ agar plates and purified by 3 rounds of streaking. The isolated strain, BOL3-1, formed biofilms in liquid culture, and colonies were bright red and translucent.
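The decimal coordinates given for the sampling site follow from the degrees-minutes-seconds values by a standard conversion. The following is a minimal Python sketch of that arithmetic; the function name `dms_to_decimal` is illustrative and not part of the original work.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    Southern and western hemispheres are returned as negative values.
    (Illustrative helper; the name and signature are assumptions.)
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Sampling site coordinates as reported in the text.
latitude = dms_to_decimal(20, 33, 28.58, "S")
longitude = dms_to_decimal(67, 12, 29.56, "W")
print(f"{latitude:.7f}, {longitude:.7f}")
```

Rounded to seven decimal places, the result agrees with the decimal coordinates reported for the site.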
Nucleic acids were extracted using standard methods (3), and sequencing was performed using the PacBio RS II platform. A SMRTbell sequencing library was prepared from 3 μg genomic DNA randomly sheared to 20 kb with a Megaruptor instrument (Diagenode, Denville, NJ). The library was sequenced using a single-molecule real-time (SMRT) cell with C4-P6 chemistry and a 360-min collection time. Sequencing reads were filtered (quality, ≥0.80; length, ≥100 bp) and assembled de novo (98,158 reads with a mean subread length of 3,908 bp) using RS_HGAP_Assembly.3 (4) in the SMRT Analysis 2.3.0 environment (minimum seed read length, 5,000 bp; minimum coverage for correction, 8×). Error correction and closure were performed using RS_BridgeMapper.1, and methylation patterns were determined using RS_Modification_and_Motif_Analysis.1 within SMRT Analysis using default settings (minimum modification quality value [QV], 30).
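As a rough plausibility check on the read statistics above, raw sequencing depth can be approximated as total sequenced bases divided by genome size. The Python sketch below makes two simplifying assumptions that are not stated in the text: the reported read count is treated as the number of subreads, and the genome size is taken as the rounded 3.7 Mbp. The function name `estimate_coverage` is illustrative.

```python
def estimate_coverage(n_reads, mean_length_bp, genome_size_bp):
    """Approximate mean depth as (reads x mean length) / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * mean_length_bp / genome_size_bp


# Values from the text; the genome size is the rounded 3.7 Mbp (assumption).
coverage = estimate_coverage(98_158, 3_908, 3_700_000)
print(f"approximate raw coverage: {coverage:.0f}x")
```

Under these assumptions the estimate is on the order of 100×, well above the 8× minimum coverage used for correction. It is not a coverage value reported by the authors.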
Over 100 transposase genes are present, suggesting a large number of insertion sequences in the genome. There are 2 clustered regularly interspaced short palindromic repeat (CRISPR) arrays (a type I-B CRISPR-associated protein, Cas5, on p163 and a type I-B CRISPR-associated protein, Cas7/Csh2, on p117). The methylated DNA motifs and the methyltransferases (MTases) predicted to be responsible for some of these modifications are shown in Table 1.
Data availability. The Halorubrum sp. strain BOL3-1 genome sequence has been deposited in GenBank with the accession numbers CP034692, CP034691, CP034690, and CP034693 and is also available on HaloWeb (https://halo.umbc.edu/cgi-bin/haloweb/haloweb.pl). The raw data are available in the NCBI Sequence Read Archive with the accession number SRP175004.

ACKNOWLEDGMENTS
Work in the DasSarma laboratory is supported by NASA Exobiology grants NNX15AM07G and NNH18ZDA001N. B.P.A. and R.J.R. work for New England Biolabs, a company that sells research reagents, including restriction enzymes and DNA methyltransferases, to the scientific community.
D.G. thanks the Swedish International Development Cooperation Agency (ASDI) for supporting his work.