Partial genome sequence of the haloalkaliphilic soda lake bacterium Thioalkalivibrio thiocyanoxidans ARh 2T

Thioalkalivibrio thiocyanoxidans strain ARh 2T is a sulfur-oxidizing bacterium isolated from haloalkaline soda lakes. It is a motile, Gram-negative member of the Gammaproteobacteria. Remarkable properties include the ability to grow on thiocyanate as the sole energy, sulfur and nitrogen source, and the capability of growth at salinities of up to 4.3 M total Na+. This draft genome sequence consists of 61 scaffolds comprising 2,765,337 bp, and contains 2616 protein-coding and 61 RNA-coding genes. This organism was sequenced as part of the Community Science Program of the DOE Joint Genome Institute.


Introduction
Soda lakes are found in many arid zones across the world, such as the Kulunda Steppe in Russia, North-Eastern China, the Rift Valley in Africa, and in arid parts of North America, i.e. California and Nevada. The defining characteristics of these lakes are the abundance of carbonate/bicarbonate anions rather than chloride and their moderate to high salinities. This makes soda lakes a unique habitat with stable, alkaline pH values between 9 and 11 [1]. Despite the high salinity and alkalinity, soda lakes harbor a rich microbial diversity that is responsible for highly active elemental cycles. Aside from the carbon cycle, the sulfur cycle is of great importance in these lakes [2], yet little is known about their precise biogeochemistry and dynamics [3]. A better understanding of these processes will lead to improved insights into the ecology and biogeochemical cycling in soda lakes. Additionally, sulfur-cycling extremophilic prokaryotes have important applications in bioremediation [4], and more detailed knowledge of their physiology may improve industrial waste processing. For these reasons, we have sequenced more than 70 strains belonging to the genus Thioalkalivibrio, a dominant cultivated group of chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria in soda lakes worldwide. Here we present the partial genome sequence of Thioalkalivibrio thiocyanoxidans ARh 2T.

Organism information
Classification and features
T. thiocyanoxidans ARh 2T forms motile vibrio-like cells of approximately 0.5-0.6 by 0.8-1.4 μm (basic properties are summarized in Table 1). The cells grown with thiocyanate as electron source have a remarkably extended periplasm (Fig. 1). It is a Gram-negative bacterium belonging to the Gammaproteobacteria (Fig. 2). The species description is based on four strains (ARh 2, ARh 3, ARh 4 and ARh 5) that were isolated from sediment samples of South-Western Siberian, Kenyan and Egyptian soda lakes. Strain ARh 2 is the type strain of the species T. thiocyanoxidans. As a chemolithoautotroph, ARh 2T derives energy from the oxidation of inorganic sulfur compounds, such as sulfide, thiosulfate, thiocyanate, elemental sulfur and polysulfides. Its most interesting properties are the ability to grow on thiocyanate as the sole source of energy, sulfur and nitrogen, and the ability to grow in saturated soda brines with thiosulfate as energy source [5].

Genome sequencing information
Genome project history
Thioalkalivibrio thiocyanoxidans ARh 2T was sequenced as part of a project aimed at sequencing a large number of Thioalkalivibrio isolates. The goal of this project is to enable the study of the genomic diversity of the dominant genus of sulfur-oxidizing bacteria in soda lakes. T. thiocyanoxidans ARh 2T was selected for its ability to grow in salt-saturated brines (4.3 M Na+) and for its ability to grow on thiocyanate as the sole energy, sulfur and nitrogen source. The permanent draft genome we present here consists of approximately 2.8 million base pairs divided over 61 scaffolds. Sequencing was performed at the Joint Genome Institute under project 1008667. The genome sequence was released in GenBank on December 25, 2014. An overview of the project is given in Table 2.

Growth conditions and genomic DNA extraction
T. thiocyanoxidans ARh 2T (DSM 13532) was cultured in a standard buffer containing sodium carbonate and bicarbonate at pH 10. The total salt concentration was 0.6 M Na+ [6]. The energy source was thiosulfate, at a concentration of 40 mM. After harvesting, the cells were stored at −80 °C for further processing. Genomic DNA was extracted using a chloroform-phenol-isoamyl alcohol mixture and precipitated with ethanol. After vacuum drying, the pellet was dissolved in water, and the quantity and quality of the DNA were determined using the JGI-provided Mass Standard Kit.

Genome sequencing and assembly
This strain was sequenced as part of the Community Science Program of the US Department of Energy Joint Genome Institute. The Illumina HiSeq 2000 platform was used for sequencing, with a depth of 1819X. More details regarding the library construction and sequencing are available at the JGI website. Reads were filtered using DUK and assembled using Velvet 1.1.04 [7]. Pseudoreads (1-3 Kb) were generated from the Velvet output using wgsim and reassembled using ALLPATHS-LG r42328 [8]. The final assembly consists of 61 scaffolds.
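To give a feel for what a depth of 1819X implies, the following minimal sketch back-calculates the approximate amount of sequence data from the reported depth and the assembly size of 2,765,337 bp. It assumes that depth is defined as the total number of sequenced bases divided by the genome length; how the reported depth was actually computed is not stated here, and the function name is purely illustrative.

```python
def total_bases_from_depth(depth, genome_length_bp):
    """Approximate number of sequenced bases.

    Assumption (not from the text): depth = total bases / genome length.
    """
    return depth * genome_length_bp


GENOME_LENGTH_BP = 2_765_337  # assembly size reported for ARh 2T
REPORTED_DEPTH = 1819         # sequencing depth reported as 1819X

total_bp = total_bases_from_depth(REPORTED_DEPTH, GENOME_LENGTH_BP)
print(f"Approximate sequence data: {total_bp / 1e9:.2f} Gbp")  # prints 5.03 Gbp
```

Under this assumption the run corresponds to roughly 5 Gbp of sequence, far more than is needed to cover a genome of this size.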

Genome annotation
Genes were predicted using Prodigal [9], followed by a round of manual curation using GenePRIMP [10] to detect pseudogenes. The resulting predicted genes were translated and annotated using the NCBI NR database in combination with the UniProt, TIGRFam, Pfam, KEGG, COG and InterPro databases, and tRNAscan-SE [11] was used for tRNA prediction. Ribosomal RNAs were detected using models built from SILVA. Further annotation was performed using the Integrated Microbial Genomes platform. All annotation data is freely available there, with IMG submission ID 12214.

Genome properties
The final draft of the genome comprises 2.8 million base pairs in 61 scaffolds, with a G + C percentage of 66.18 %. The gene calling and annotation pipeline detected 2677 genes, of which 2616 code for proteins. Basic statistics concerning the genome sequence are shown in Table 3.
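The kind of summary statistics reported here can be sketched in a few lines. The example below is a hedged illustration only: the helper `assembly_stats` and the toy scaffolds are hypothetical and are not the pipeline or the data used for this genome. It computes the scaffold count, total length and G+C percentage for a list of sequences, and then checks the gene counts quoted in the text.

```python
def assembly_stats(scaffolds):
    """Return (number of scaffolds, total length in bp, G+C percentage)."""
    total = sum(len(s) for s in scaffolds)
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in scaffolds)
    gc_percent = 100.0 * gc / total if total else 0.0
    return len(scaffolds), total, gc_percent


# Toy scaffolds, purely illustrative (not real ARh 2T sequence).
toy_scaffolds = ["GGCCAT", "GCGC", "ATGCGCGGCC"]
n_scaffolds, total_bp, gc_percent = assembly_stats(toy_scaffolds)
print(n_scaffolds, total_bp, gc_percent)  # prints 3 20 80.0

# Gene counts as reported in the text.
TOTAL_GENES = 2677
PROTEIN_CODING_GENES = 2616
rna_genes = TOTAL_GENES - PROTEIN_CODING_GENES
protein_coding_percent = 100.0 * PROTEIN_CODING_GENES / TOTAL_GENES
print(rna_genes, f"{protein_coding_percent:.1f} %")  # prints 61 97.7 %
```

The difference of 61 matches the number of RNA-coding genes given in the abstract.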
In total, 70 % of the genes could be assigned functional categories based on COGs (see Table 4).

Fig. 2 Phylogenetic tree based on 16S rRNA sequences comprising the Thioalkalivibrio type strains and several other members of the Ectothiorhodospiraceae family. Black dots mark nodes with a bootstrap value between 90 and 100 %. 16S rRNA sequences of members of the Alphaproteobacteria were used as the outgroup, but pruned from the tree. The tree was constructed using ARB [21] and bootstrap values calculated using MEGA6 [22].

The total is based on the total number of protein-coding genes in the genome.