Draft genome sequence of Dethiosulfovibrio salsuginis DSM 21565T, an anaerobic, slightly halophilic bacterium isolated from a Colombian saline spring

A bacterium belonging to the phylum Synergistetes, genus Dethiosulfovibrio, was isolated in 2007 from a saline spring in Colombia. Dethiosulfovibrio salsuginis USBA 82T (DSM 21565T = KCTC 5659T) is a mesophilic, strictly anaerobic, slightly halophilic, Gram-negative bacterium with a diderm cell envelope. The strain ferments peptides, amino acids and a few organic acids. Here we describe the draft genome sequence and annotation of the type strain Dethiosulfovibrio salsuginis USBA 82T. The genome comprises 2.68 Mbp with a G + C content of 53.7%. A total of 2609 genes were predicted, of which 2543 were protein-coding genes and 66 were RNA genes. In the USBA 82T genome we detected six Synergistetes conserved signature indels (CSIs) specific for Jonquetella, Pyramidobacter and Dethiosulfovibrio. As expected, the genome of D. salsuginis contained genes related to amino acid transport, amino acid metabolism and thiosulfate reduction. These genes represent the major gene groups of Synergistetes and are related to their phenotypic traits; notably, 11.8% of the genes in the genome belonged to the amino acid transport and metabolism COG category. In addition, we identified ammonification genes in the genome, such as nitrate reductase genes. The presence of proline operon genes could be related to de novo synthesis of proline to protect the cell in response to high osmolarity. Our bioinformatics workflow included antiSMASH and BAGEL3, which allowed us to identify bacteriocin genes in the genome. Electronic supplementary material: The online version of this article (doi: 10.1186/s40793-017-0303-x) contains supplementary material, which is available to authorized users.


Introduction
Bacteria belonging to the phylum Synergistetes, a robust monophyletic branch of the phylogenetic tree based on rRNA data, occur in a wide range of anoxic ecosystems. Jumas-Bilak & Marchandin [1] have delineated several habitats in which the members of this phylum live. These include sludge and wastewater from anaerobic digesters [2][3][4], natural springs [5], natural seawater and sulfur mats [6], water related to petroleum and gas production facilities [7,8] and host-associated microbiota [9][10][11].
Bhandari and Gupta [24] identified molecular markers consisting of conserved signature indels (CSIs; insertions/deletions) present in protein sequences which are specific for Synergistetes. Of these, seven are specifically present in Jonquetella, Pyramidobacter and Dethiosulfovibrio. In this study, we verified whether these CSIs are also present in D. salsuginis USBA 82 T.

Organism information
Classification and features
D. salsuginis USBA 82 T was isolated in 2007 from the saline spring named Salpa, in the Colombian Andes. The spring has a temperature of ~21°C and a pH of ~6.5 throughout the year. The predominant dissolved ion is sulfate (20 g l−1) and the conductivity is approximately 50 mS cm−1 [25]. Samples were collected in sterile containers, which were capped, stored over ice, transported to the laboratory and maintained at 4°C until use [5]. Enrichments were done as described in Díaz-Cárdenas et al. [5]. Briefly, they were initiated in a medium prepared by filtering saline spring water through polycarbonate membranes (Durapore) with a pore size of 0.22 μm. The medium was supplemented with peptone (0.2%, w/v), yeast extract (0.02%, w/v) and the trace element solution (1 ml l−1) described by Imhoff-Stuckle & Pfennig [26]. The medium was then boiled and cooled to room temperature under a stream of oxygen-free nitrogen. Aliquots of 8 ml were dispensed into Hungate tubes under oxygen-free nitrogen gas and sterilized by autoclaving at 121°C for 20 min at a pressure of 1-1.5 kg cm−2. The enrichment medium was inoculated with 2 ml water samples and incubated at 36°C for up to 2 weeks. To isolate pure cultures, serial dilutions of the enrichment cultures were made in an artificial basal medium (BM) fortified with 2% (w/v) Noble agar (pH = 7.1) using the roll-tube technique [5].
Cells of strain USBA 82 T are slightly curved rods with pointed or rounded ends (5-7 × 1.5 μm) and occur singly or in pairs. Cells are motile by laterally inserted flagella (Fig. 1). This non-spore-forming, strictly anaerobic, slightly halophilic, Gram-negative bacterium with a diderm cell envelope presents some particular metabolic features. It ferments arginine, casamino acids, glutamate, histidine, peptone, serine, threonine, tryptone, pyruvate and citrate, but growth is not observed on carbohydrates, alcohols or fatty acids. The main end products of fermentation are acetate and succinate [5]. Like other members of the genus, strain USBA 82 T reduces thiosulfate and sulfur to sulfide, but sulfate, sulfite, nitrate and nitrite are not used as electron acceptors [5]. The reduction of sulfur or thiosulfate is not required for growth on the amino acids arginine, glutamate and valine. Strain USBA 82 T ferments these amino acids, in contrast to what is observed in D. peptidovorans.
Based on comparison of 16S rRNA gene sequences, the isolate was assigned to the phylum Synergistetes, close to D. peptidovorans (94.2% sequence similarity) [5,7]. Comparison of the phylogenetic, chemotaxonomic and physiological features of strain USBA 82 T with all other members of Dethiosulfovibrio suggested that it represents a novel species, for which the name D. salsuginis was proposed [5].
D. salsuginis has been stored since the collection date at the Collection of Microorganisms of Pontificia Universidad Javeriana (CMPUJ, WDCM857) under the ID USBA 82 T (CMPUJ U82 T = DSM 21565 T = KCTC 5659 T), growing anaerobically on the BM medium described by Díaz-Cárdenas et al. [5]. Cells are preserved at −20°C in BM supplemented with 20% (v/v) glycerol [5]. The general features of the strain are reported in Table 1.

Genome project history
Jumas-Bilak & Marchandin [1] pointed out that bacteria belonging to the phylum Synergistetes remain poorly characterized by molecular approaches, particularly by typing methods, and that the only gene sequences currently available for most organisms of the phylum are 16S rDNA sequences. Currently, there are twenty-eight isolates that are fully sequenced and annotated or in the phase of final sequencing. The type strain USBA 82 T was selected for sequencing on the basis of its novelty, and this genome contributes to the Genomic Encyclopedia of Bacteria and Archaea [27]. In addition, this work is part of a larger study exploring the microbial diversity of extreme environments in Colombia. More information can be found in the Genomes OnLine Database under the study Gs0118134. The JGI sequencing project ID is 1094809, and the assembly consists of 68 scaffolds. The annotated genome is publicly available in IMG under Genome ID FXBB01000001-FXBB01000068. Table 2 depicts the project information and its association with MIGS version 2.0 compliance [28].
Growth conditions and genomic DNA preparation
D. salsuginis strain USBA 82 T was grown anaerobically in 100 mL of BM supplemented with 1.0 g yeast extract and 0.5% (w/v) peptone [5] at 30°C for 24 h. Growth was monitored by optical density at 580 nm (OD580). Cells were harvested by centrifugation at 4000 rpm when the mid-exponential phase (OD580 = 0.2) was reached, pelleted and immediately used for DNA extraction. We extracted the genomic DNA using the Wizard® Genomic DNA Purification Kit (Promega) according to the manufacturer's instructions.

Genome sequencing and assembly
The draft genome of D. salsuginis was generated at the DOE Joint Genome Institute (JGI) using the Illumina technology [29]. An Illumina 300 bp insert standard shotgun library was constructed and sequenced using the   [30], which removes known Illumina artifacts and PhiX. Reads with more than one "N" or with quality scores (before trimming) averaging less than 8 or reads shorter than 51 bp (after trimming) were discarded. Remaining reads were mapped to masked versions of human, cat and dog references using BBMAP [30] and discarded if identity exceeded 93%. Sequence masking was performed with BBMask [30]. The following steps were then performed for assembly: (1) artifact filtered Illumina reads were assembled using Velvet (version) [31]; (2) 1-3 kbp simulated paired end reads were created from Velvet contigs using wgsim (version 0.3.0) [32]; (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r46652) [33]. Parameters for assembly steps were:
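Returning to the read-filtering step rather than the assembly parameters: the three discard criteria stated above (more than one "N", a mean quality below 8 before trimming, and a length below 51 bp after trimming) can be illustrated with a minimal Python sketch. This is not the JGI filtering code; the function name `keep_read` and its arguments are hypothetical, and quality values are assumed to be already decoded to integer Phred scores.

```python
from statistics import mean

# Thresholds taken from the text; the names are illustrative only.
MAX_N = 1             # reads with more than one "N" are discarded
MIN_MEAN_QUALITY = 8  # mean quality score before trimming
MIN_LENGTH = 51       # minimum read length (bp) after trimming


def keep_read(sequence, qualities, trimmed_length):
    """Return True if a read passes all three stated filters.

    sequence       -- untrimmed read bases
    qualities      -- integer Phred scores of the untrimmed read
    trimmed_length -- read length in bp after trimming
    """
    if not qualities:
        return False  # nothing to evaluate, so treat as failing
    if sequence.upper().count("N") > MAX_N:
        return False
    if mean(qualities) < MIN_MEAN_QUALITY:
        return False
    if trimmed_length < MIN_LENGTH:
        return False
    return True
```

For example, `keep_read("N" + "A" * 79, [8] * 80, 51)` sits exactly on all three thresholds and is retained, whereas a second "N", a mean quality of 7 or a trimmed length of 50 bp would each cause the read to be discarded.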

Genome annotation
Annotation was done using the DOE-JGI annotation pipeline [34]. Genes were identified using Prodigal [35]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, KOG, MetaCyc (version 19.5) and Gene Ontology databases. The first category of noncoding RNAs, tRNAs, was predicted using tRNAscan-SE 1.3.1 [36]. Ribosomal RNA genes (5S, 16S, 23S) were predicted using the hmmsearch tool from the HMMER 3.1b2 package [37]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [38]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes Expert Review (IMG-ER) platform [39] developed by the Joint Genome Institute, Walnut Creek, CA, USA. The annotated genome of strain USBA 82 T is available in IMG (genome ID = 2671180116).
We used IMG data-mining tools to explore the potential for secondary metabolite production encoded in the D. salsuginis genome. In addition, we developed a bioinformatics workflow that included platforms such as antiSMASH [40], BAGEL3 [41] and NaPDoS [42].

Genome properties
The genome of D. salsuginis is 2.68 Mbp with a GC content of 53.7%. A total of 2609 genes were predicted, of which 2543 were protein-coding genes and 66 were RNA genes. The properties and statistics of the genome are summarized in Table 3. The distribution of genes into COG functional categories is presented in Table 4. The largest share of genes was assigned to the category of amino acid transport and metabolism (11.8%), followed by general function (8.3%) and inorganic ion transport and metabolism (6.6%).
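The summary figures above (assembly size, GC content and gene tallies) are the kind of values that can be recomputed directly from the scaffold sequences and the gene counts. The sketch below is only an illustration in Python, not the IMG statistics code: `parse_fasta` and `assembly_summary` are hypothetical helper names, GC content is assumed to be computed over unambiguous A/C/G/T bases only, and the example sequences are toy data rather than D. salsuginis scaffolds.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def assembly_summary(records):
    """Return (scaffold count, total length in bp, GC percentage).

    GC percentage is computed over A/C/G/T only (assumption: ambiguous
    bases such as N are excluded from the denominator).
    """
    n_scaffolds = total_bp = gc = at = 0
    for _, seq in records:
        seq = seq.upper()
        n_scaffolds += 1
        total_bp += len(seq)
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return n_scaffolds, total_bp, 100.0 * gc / (gc + at)


# Gene tallies as reported in the text: the two classes add up to the total.
PROTEIN_CODING, RNA_GENES, TOTAL_GENES = 2543, 66, 2609
assert PROTEIN_CODING + RNA_GENES == TOTAL_GENES
coding_share = round(100 * PROTEIN_CODING / TOTAL_GENES, 1)  # percent of all genes

# Toy input only; real scaffolds would be read from a FASTA file.
toy = [">s1", "GGCC", "AATT", ">s2", "GCGCNN"]
n_scaffolds, total_bp, gc_percent = assembly_summary(parse_fasta(toy))
```

With the reported counts, protein-coding genes account for roughly 97.5% of all predicted genes and RNA genes for the remainder.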

Insights from the genome sequence
The draft genome provides phylogenetic and metabolic information. Phylogenetic relationships were evaluated using the 16S rRNA gene sequence and the seven conserved signature indels identified as specific for a clade consisting of Jonquetella anthropi, Pyramidobacter piscolens and D. peptidovorans [24].
Sequences of the 16S rRNA gene of strain USBA 82 T and related type strains currently characterized in the phylum Synergistetes were aligned using MEGA 7 version 7.0.25 [43]. Evolutionary distances were inferred by Neighbour-Joining (NJ) [44] using the Jukes-Cantor method [45] (Fig. 2) and by Maximum-Likelihood (ML) using the General Time Reversible (GTR) model plus gamma distribution and invariant sites (see Additional file 1: Figure S1) [46]. Bootstrap support was computed from 1000 replicates for the NJ and ML analyses. Thermodesulfatator indicus DSM 15286 T (GenBank accession number AF393376) was used as the outgroup in all phylogenetic analyses. The topology of the trees confirmed that the strain belongs to subdivision B of the phylum Synergistetes together with members of the genera Dethiosulfovibrio, Jonquetella, Pyramidobacter and Rarimicrobium. We compared seven conserved signature indels that are present in the following protein sequences: penicillin binding protein, 1A family; ribonucleoside diphosphate reductase (nrdA); putative DEAD/DEAH box helicase (indel position 398-457); putative DEAD/DEAH box helicase (indel position 437-496); DNA directed RNA polymerase, β subunit (rpoB); 1-acyl-sn-glycerol-3-phosphate acyltransferase (plsC) and tRNA modification GTPase TrmE (trmE). We used an identification pipeline consisting of BlastP [43] searches of the reported CSIs against the genomes of D. salsuginis USBA 82 T, J. anthropi DSM 22815/1-750, P. piscolens W 5455/1738 and D. peptidovorans DSM 11002/1-743, followed by multiple alignments using Mafft [44]. The indels that we detected correspond in size to those previously reported by Bhandari and Gupta [24].
We found a 4-amino-acid (aa) deletion in the penicillin binding protein, 1A family (see Additional file 2: Figure S2), a 1-aa insertion in the nrdA gene (see Additional file 3: Figure S3), a 13-aa insertion in the rpoB gene (see Additional file 4: Figure S4), a 1-aa insertion in the plsC gene (see Additional file 5: Figure S5) and a 1-aa insertion in the trmE gene (see Additional file 6: Figure S6). The DEAD/DEAH box CSIs were detected neither in our genome nor in the previously analyzed species (see Additional file 7: Figure S7).
We also evaluated ultrastructural characters including the cell-wall structure, which currently supports the separation of the Synergistetes clade from other members of the family Syntrophomonadaceae. We detected the presence of a particular deletion in the Hsp60 protein in USBA 82 T (see Additional file 8: Figure S8). This deletion differentiates the traditional Gram-negative diderm bacterial phyla from atypical taxa of diderm bacteria such as Negativicutes, 'Fusobacteria', 'Elusimicrobia' and Synergistetes [47]. It has been reported that Synergistetes species contain an outer membrane and also have genes that are used for lipopolysaccharide biosynthesis in other microorganisms. However, they lack the genes for the TolAQR-Pal complex that are required for assembly and maintenance of a typical outer membrane [48], suggesting that the nature and the role of the outer membrane in Synergistetes could be different from those of other bacteria. This observation was also confirmed in the genome of D. salsuginis strain USBA 82 T.
We used MAUVE [49] for whole-genome alignment of D. salsuginis strain USBA 82 T with the D. peptidovorans type strain (SEBR 4207 T). The alignment showed conserved clusters and synteny for the majority of the genes (Fig. 3). However, some rearrangements are dispersed throughout the genome of D. salsuginis: there is a clear inversion of two regions at the end of the genome, as well as small translocations. These differences are consistent with the phylogenetic distance between the two species.
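As background to the NJ analysis described above, the Jukes-Cantor correction converts the observed proportion of differing sites p between two aligned sequences into an estimated number of substitutions per site, d = −(3/4) ln(1 − 4p/3). The Python sketch below illustrates this standard formula only; it is not the MEGA implementation, the function names are hypothetical, and skipping every column that contains a gap is an assumption about gap treatment.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped
    (assumption: pairwise deletion of gapped positions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


# Illustrative input: p = 0.058 is the complement of the 94.2% similarity
# reported earlier, which may have been calculated differently.
d = jukes_cantor(0.058)
```

For small p the corrected distance is only slightly larger than p itself (about 0.060 for p = 0.058), and the correction grows as sequences become more divergent.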
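Once homologous proteins have been aligned, insertions and deletions of the kind listed above appear as runs of gap characters in one sequence relative to the other. The following Python sketch shows one simple way to report such runs from a pairwise alignment. It is a hypothetical illustration rather than the pipeline used in this study: the function name `find_indels` is invented, columns are numbered from 0, and the sequences in the usage note are toy strings, not the actual CSI regions.

```python
def find_indels(reference, query):
    """List gap runs between two aligned sequences of equal length.

    Returns (kind, start_column, length) tuples, where kind is
    'insertion' if the query has residues opposite gaps in the reference
    and 'deletion' if the query has gaps opposite reference residues.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    events = []
    current = None  # [kind, start_column, length] of the run being extended
    for column, (ref_char, query_char) in enumerate(zip(reference, query)):
        if ref_char == "-" and query_char != "-":
            kind = "insertion"
        elif query_char == "-" and ref_char != "-":
            kind = "deletion"
        else:
            kind = None
        if kind is not None and current is not None and current[0] == kind:
            current[2] += 1  # extend the current run
        else:
            if current is not None:
                events.append(tuple(current))
            current = [kind, column, 1] if kind is not None else None
    if current is not None:
        events.append(tuple(current))
    return events
```

For instance, `find_indels("MKT----LLA", "MKTGGSALLA")` reports a single 4-residue insertion starting at column 3, and `find_indels("MKTAV", "MK-AV")` reports a 1-residue deletion at column 2. Whether a run qualifies as a conserved signature indel additionally depends on conserved flanking regions and on its distribution across taxa, which this sketch does not assess.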
Metabolic information contained in the genome of D. salsuginis includes genes related to amino acid transport and metabolism, thiosulfate reduction, and heat shock proteins (hsps). Ammonification genes, mainly nitrate reductase genes (narG, narH, narI and narJ), were also observed throughout the genome. In addition, the presence of the proline operon genes proHJ and the proA gene could be related to the response to high osmolarity through de novo synthesis of proline to protect the cell from stress [50].
The fermentation of amino acids observed in this species is more commonly found in the phylum Synergistetes, which has a higher proportion of amino acid transport and metabolism genes (COG E) than any other bacterial phylum to date [48]. D. salsuginis contained a total of 229 genes related to this COG category. This represents 11.8% of the genes of this genome (the percentage is based on the total number of protein-coding genes in the genome; COG assignments were obtained from the JGI IMG pipeline [34]). In contrast, carbohydrate fermentation has only been exhibited by a few cultured species in the phylum Synergistetes, such as Thermanaerovibrio velox [51] and Acetomicrobium spp. [14,52,53]. These observations, based on cultured members of the phylum Synergistetes, suggest that members of this phylum are specialists with relatively shallow ecophysiological niches [3]. As expected, only 5.5% of the genes in the genome of D. salsuginis were categorized as carbohydrate transport and metabolism genes.
IMG tools were used to identify nine biosynthetic gene clusters that are associated with secondary metabolites. With the exception of a cluster reported as a bacteriocin, the clusters were identified as putative. antiSMASH 3.0.5 was used to detect 11 clusters of biosynthetic genes related to bacteriocins (18.2%), fatty acids (18.2%), lipopolysaccharide (9.1%) and putative biosynthetic clusters (11%). We found that one of the putative biosynthetic clusters is related to exopolysaccharide (EPS) production. This cluster includes an EPS biosynthesis domain protein, a polysaccharide export protein, a sugar transferase, a nucleotide sugar dehydrogenase and a NAD-dependent epimerase/dehydratase. It has been reported that the EPS of benthic bacteria is involved in motility [54], in absorbing nutrient elements [55] and in assisting attachment of bacteria to organic particles and other surfaces [56]. The presence of this biosynthetic cluster related to EPS production could be an adaptive advantage for growth of this strain in its natural habitat.
Using BAGEL3, we identified two biosynthetic clusters. The bacteriocin Linocin M18-like structural protein (>10 kDa) (BAGEL3 bacteriocin III database PF04454.7 [1.8e-80] -BlastP 3e-143) belongs to the peptidase U56 family (see Additional file 9: Figure S9a). It shows 73% similarity with the Linocin-M18 protein identified in D. peptidovorans. The other cluster was a sactipeptide (see Additional file 9: Figure S9b), but there were no significant BlastP hits for the putative structural gene product. We also identified a gene related to a transposase (BlastP 2e-33) in this cluster; genes of this type are frequently found in association with bacteriocins. In addition, we found a putative ABG transporter (PF03806.8 [5.8e-148] -BlastP 0.0) and genes predicted to encode radical SAM (S-adenosylmethionine) proteins, which are involved in bacteriocin maturation (PF14319.1 [9.2e-05] -BlastP 3e-147).

Conclusions
The genome of D. salsuginis USBA 82 T provides insights into many aspects of its physiology and evolution. Sequence analysis and comparative genomics corroborated the taxonomic affiliation of D. salsuginis to the phylum Synergistetes. We detected six of the seven conserved signature indels (CSIs) identified by Bhandari and Gupta [24] as useful for distinguishing the species of the phylum. Our results grouped Jonquetella, Pyramidobacter and Dethiosulfovibrio species together and confirmed the specificity of these CSIs in highly conserved regions of proteins as targets for evolutionary studies in Synergistetes.
The genome of D. salsuginis USBA 82 T contains genes related to amino acid transport and metabolism, thiosulfate reduction and ammonification. This agrees with experimental data and physiological observations. The presence of proline operon genes suggests a possible cellular response to high osmolarity through de novo synthesis of proline to protect the cell from stress. Using our bioinformatics workflow, we identified bacteriocin genes associated with secondary metabolites in the genome. Future research will address whether these biosynthetic gene clusters are expressed and yield the predicted secondary metabolites.