Thermus oshimai JL-2 and T. thermophilus JL-18 genome analysis illuminates pathways for carbon, nitrogen, and sulfur cycling

The complete genomes of Thermus oshimai JL-2 and T. thermophilus JL-18 each consist of a circular chromosome, 2.07 Mb and 1.9 Mb, respectively, and two plasmids ranging from 0.27 Mb to 57.2 kb. Comparison of the T. thermophilus JL-18 chromosome with those from other strains of T. thermophilus revealed a high degree of synteny, whereas the megaplasmids from the same strains were highly plastic. The T. oshimai JL-2 chromosome and megaplasmids shared little or no synteny with other sequenced Thermus strains. Phylogenomic analyses using a concatenated set of conserved proteins confirmed the phylogenetic and taxonomic assignments based on 16S rRNA phylogenetics. Both chromosomes encode a complete glycolysis, tricarboxylic acid (TCA) cycle, and pentose phosphate pathway plus glucosidases, glycosidases, proteases, and peptidases, highlighting highly versatile heterotrophic capabilities. Megaplasmids of both strains contained a gene cluster encoding enzymes predicted to catalyze the sequential reduction of nitrate to nitrous oxide; however, the nitrous oxide reductase required for the terminal step in denitrification was absent, consistent with their incomplete denitrification phenotypes. A sox gene cluster was identified in both chromosomes, suggesting a mode of chemolithotrophy. In addition, nrf and psr gene clusters in T. oshimai JL-2 suggest respiratory nitrite ammonification and polysulfide reduction as possible modes of anaerobic respiration.


Introduction
The Great Boiling Spring (GBS) geothermal system is located in the northwestern Great Basin near the town of Gerlach, Nevada. Geothermal activity is driven by deep circulation of meteoric water, which rises along range-front faults at temperatures up to 96 ºC. A considerable volume of geomicrobiology research has been conducted in the GBS system, including coordinated cultivation-independent microbiology and geochemistry studies [1][2][3][4], habitat niche modeling [3], thermodynamic modeling [1,5], microbial cultivation and physiology [6,7], and integrated studies of the nitrogen biogeochemical cycle (N-cycle) [5,6,8]. The latter group of studies is arguably the most detailed body of work on the N-cycle in any geothermal system. Those studies revealed a dissimilatory N-cycle based on oxidation and subsequent denitrification of ammonia supplied in the geothermal source water. In high temperature sources such as GBS and Sandy's Spring West (SSW), ammonia oxidation occurs at temperatures up to at least 82 ºC at rates comparable to those in nonthermal aquatic sediments [5]. Several lines of evidence, including deep 16S rRNA gene pyrosequencing datasets and quantitative PCR, suggest ammonia oxidation is carried out by a single species of ammonia-oxidizing archaea closely related to "Candidatus Nitrosocaldus yellowstonii", which comprises a substantial proportion of the sediment microbial community in some parts of the springs [5,9]. Nitrite oxidation appears to be sluggish or nonexistent in the high temperature source pools, since nitrite accumulates in these systems and 16S rRNA gene sequences for nitrite-oxidizing bacteria have not been detected in clone library and pyrotag censuses [1,5].
Finally, the nitrite and nitrate that are produced are denitrified in the sediments to both nitrous oxide and dinitrogen; however, a high flux of nitrous oxide, particularly in the ~80 ºC source pool of GBS, suggested the importance of incomplete denitrifiers [6], and electron donor stimulation experiments suggested a key role for heterotrophic denitrifiers [5]. A subsequent cultivation study of heterotrophic denitrifiers in GBS and SSW resulted in the isolation of a large number of denitrifiers belonging to Thermus thermophilus and T. oshimai, including strains T. oshimai JL-2 and T. thermophilus JL-18 [6]. Strikingly, although Thermus strains were isolated using four different isolation strategies, nine different electron donor/acceptor combinations, and four different sampling dates, all isolates of these two species were able to convert nitrate-N stoichiometrically to nitrous oxide-N, but appeared unable to reduce nitrous oxide to dinitrogen. This physiology, combined with high nitrous oxide fluxes in situ, suggested a significant role of T. oshimai and T. thermophilus in the unusual N-cycle in these hot springs. However, the genetic basis of this phenotype remained unknown. Here we present the complete genome sequences of T. oshimai JL-2 and T. thermophilus JL-18, compare them to genomes of other sequenced Thermus spp., and discuss them within the context of their potential impacts on biogeochemical cycling of carbon, nitrogen, sulfur, and iron.

Classification and features
The genus Thermus currently comprises 16 species and includes the well-known T. aquaticus and the genetically tractable T. thermophilus. The genome of T. oshimai JL-2 is the first finished genome to be reported from that species, while T. thermophilus JL-18 is the fourth genome to be sequenced from that species, the others being T. thermophilus HB27, HB8, and SG0.5JP17-16. Figure 1 shows the relationship of T. oshimai JL-2 and T. thermophilus JL-18 to other Thermus species, as determined by phylogenomic analysis of highly conserved genes, which supports the taxonomic identities previously determined by 16S rRNA gene phylogenetic analysis [6]. Table 1 shows general features of T. oshimai JL-2 and T. thermophilus JL-18.

Figure 1. Phylogenomic tree highlighting the position of Thermus oshimai JL-2 and Thermus thermophilus JL-18. Thirty-one bacterial phylogenetic markers were identified using Amphora [10]. Maximum-likelihood analysis was carried out with a concatenated alignment of all 31 proteins using RAxML Version 7.2.6 [11] and the tree was visualized using iTOL [12]. Red circles indicate bootstrap support >80% (100 replicates). Scale bar indicates 0.1 substitutions per position. The protein FASTA files for all the species are from NCBI, except for the following species, which are from IMG: Thermus igniterrae ATCC 700962 (Taxon OID: 2515935625), Thermus oshimai DSM 12092 (Taxon OID: 2515463139), Thermus oshimai JL-2 (Taxon OID: 2508706991), Thermus sp. RLM (Taxon OID: 2514335427).

Genome sequencing information
Genome project history
T. oshimai JL-2 and T. thermophilus JL-18 were selected based on their important roles in denitrification and also for their biotechnological potential. The genome projects for both organisms are deposited in the Genomes OnLine Database [29] and the complete sequences are deposited in GenBank. Sequencing, finishing, and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project and information associated with MIGS version 2.0 compliance [13] are shown (T. oshimai JL-2, Table 2(a); T. thermophilus JL-18, Table 2(b)).

Growth conditions and DNA isolation
Axenic cultures of T. oshimai JL-2 and T. thermophilus JL-18 were grown aerobically on Thermus medium as described [6] and DNA was isolated from 0.5-1.0 g of cells using the Joint Genome Institute's (JGI) cetyltrimethyl ammonium bromide protocol [30]. All general aspects of library construction and sequencing performed at the JGI can be found at [30].
The initial draft assemblies of T. oshimai JL-2 and T. thermophilus JL-18 contained 39 contigs in 2 scaffolds and 75 contigs in 3 scaffolds, respectively. The 454 Titanium standard data and the 454 paired end data were assembled together with Newbler, version 2.3-PreRelease-6/30/2009. The Newbler consensus sequences were computationally shredded into 2 kb overlapping fake reads (shreds). Illumina sequencing data were assembled with VELVET, version 1.0.13 [33], and the consensus sequences were computationally shredded into 1.5 kb overlapping fake reads (shreds). We integrated the 454 Newbler consensus shreds, the Illumina VELVET consensus shreds, and the read pairs in the 454 paired end library using parallel phrap, version SPS - 4.24 (High Performance Software, LLC). The software Consed [34] was used in the subsequent finishing process. Illumina data were used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI (Alla Lapidus, unpublished). Possible mis-assemblies were corrected using gapResolution (Cliff Han, unpublished), Dupfinisher [35]
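The shredding step described above can be illustrated with a short sketch. The text gives only the shred lengths (2 kb for Newbler consensus, 1.5 kb for the Illumina consensus); the overlap used here (half the shred length) and the function name `shred` are assumptions made for illustration, not the parameters or code used at JGI.

```python
def shred(consensus, shred_len=2000, overlap=1000):
    """Cut a consensus sequence into overlapping pseudo-reads ("shreds").

    shred_len follows the 2 kb value given in the text; overlap is an
    assumed value, since the text does not state it.
    Returns a list of (start_position, fragment) tuples.
    """
    if not 0 <= overlap < shred_len:
        raise ValueError("overlap must be >= 0 and smaller than shred_len")
    if not consensus:
        return []
    step = shred_len - overlap
    shreds = []
    start = 0
    while True:
        shreds.append((start, consensus[start:start + shred_len]))
        # Stop once the current shred reaches the end of the consensus.
        if start + shred_len >= len(consensus):
            break
        start += step
    return shreds


if __name__ == "__main__":
    toy_consensus = "ACGT" * 1250  # 5,000 bp toy sequence, not real data
    for start, fragment in shred(toy_consensus):
        print(start, len(fragment))
```

Every base of the consensus is covered by at least one shred, and the last shred may be shorter than `shred_len` when the consensus length is not a multiple of the step.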

Genome annotation
Initial identification of genes was done using Prodigal [36], a part of the DOE-JGI Annotation pipeline, followed by manual curation using GenePRIMP [37]. The predicted ORFs were translated into putative protein sequences and searched against databases including: NCBI nr, Uniprot, TIGRFam, Pfam, PRIAM, KEGG, COG, and Interpro. Additional annotations and curations were performed using the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [33].

Genome properties
The T. oshimai JL-2 genome includes one circular chromosome of 2,072,393 bp (2,205 predicted genes), a circular megaplasmid, pTHEOS01 (0.27 Mb, 268 predicted genes), and a smaller circular plasmid, pTHEOS02 (57.2 kb, 75 predicted genes), for a total size of 2,401,329 bp. Of the total 2,548 predicted genes, 2,488 were protein-coding genes. A total of 2,015 protein-coding genes (79% of all predicted genes) were assigned to a putative function, with the remainder annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 3a, Table 3b, Table 3c and Figure 2. The T. thermophilus JL-18 genome includes one circular chromosome of 1,902,595 bp (2,057 predicted genes), a circular megaplasmid, pTTJL1801 (0.26 Mb, 279 predicted genes), and a smaller circular plasmid, pTTJL1802 (0.14 Mb, 172 predicted genes), for a total size of 2,311,212 bp. Of the total 2,508 predicted genes, 2,452 were protein-coding genes. A total of 1,979 protein-coding genes (79% of all predicted genes) were assigned to a putative function, with the remainder annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 4a, Table 4b, Table 4c and Figure 3.
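The gene counts above can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the dictionary layout and the function name `summarize` are illustrative choices. It shows that the per-replicon gene counts sum to the reported totals and that the 79% figure corresponds to the fraction of all predicted genes (the fraction of protein-coding genes is slightly higher).

```python
# Predicted genes per replicon, as stated in the text.
replicon_genes = {
    "T. oshimai JL-2": {"chromosome": 2205, "pTHEOS01": 268, "pTHEOS02": 75},
    "T. thermophilus JL-18": {"chromosome": 2057, "pTTJL1801": 279, "pTTJL1802": 172},
}

# Genome-wide totals, as stated in the text.
reported = {
    "T. oshimai JL-2": {"total_genes": 2548, "protein_coding": 2488, "with_function": 2015},
    "T. thermophilus JL-18": {"total_genes": 2508, "protein_coding": 2452, "with_function": 1979},
}


def summarize(strain):
    """Sum per-replicon gene counts and compute both candidate percentages."""
    gene_sum = sum(replicon_genes[strain].values())
    r = reported[strain]
    return {
        "gene_sum": gene_sum,
        "matches_reported_total": gene_sum == r["total_genes"],
        "pct_of_all_genes": round(100 * r["with_function"] / r["total_genes"], 1),
        "pct_of_protein_coding": round(100 * r["with_function"] / r["protein_coding"], 1),
    }


if __name__ == "__main__":
    for strain in replicon_genes:
        print(strain, summarize(strain))
```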

Comparison with other sequenced genomes
The chromosome of T. thermophilus JL-18 was compared with the chromosomes of T. thermophilus strains HB8 and HB27 [38] using nucmer [39]. The megaplasmid pTTJL1801 was also compared with the megaplasmid sequences of HB8 and HB27. Dot plot results from this analysis (Figure 4(a)) demonstrate a high degree of synteny between the chromosomes of JL-18, HB8, and HB27; however, little synteny exists between the megaplasmids. T. oshimai JL-2 chromosome and megaplasmid sequences were also compared with those of T. thermophilus JL-18; however, very little synteny was apparent (Figure 4(b)).
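To make the idea behind a synteny dot plot concrete, the following is a minimal sketch that finds exact shared k-mers between two sequences and measures how many of them fall on a single diagonal. It is not nucmer's algorithm (nucmer works from maximal matches, clusters them, and handles both strands); the function names, the k-mer size, and the forward-strand-only simplification are all assumptions for illustration.

```python
from collections import defaultdict


def shared_kmer_points(ref, qry, k=12):
    """Return (ref_pos, qry_pos) pairs where ref and qry share an exact k-mer.

    Forward strand only. Each pair is one dot in a dot plot.
    """
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    points = []
    for j in range(len(qry) - k + 1):
        for i in index.get(qry[j:j + k], ()):
            points.append((i, j))
    return points


def collinear_fraction(points):
    """Fraction of dots on the most populated diagonal (ref_pos - qry_pos constant).

    Values near 1 indicate one long collinear (syntenic) block; low values
    indicate rearranged or unrelated sequences.
    """
    if not points:
        return 0.0
    per_diagonal = defaultdict(int)
    for i, j in points:
        per_diagonal[i - j] += 1
    return max(per_diagonal.values()) / len(points)


if __name__ == "__main__":
    # Toy sequences, not real genome data.
    ref = "ACGTACGGTTCAGGCATTAGC"
    qry = "TT" + ref  # same sequence shifted by two bases
    pts = shared_kmer_points(ref, qry, k=8)
    print(len(pts), collinear_fraction(pts))
```

For whole chromosomes a dedicated aligner is the practical choice; the sketch only shows why collinear sequences appear as a diagonal line and why plastic megaplasmids appear as scattered or absent dots.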

Profiles of metabolic networks and pathways
The T. oshimai JL-2 and T. thermophilus JL-18 genomes encode a complete glycolysis pathway, tricarboxylic acid (TCA) cycle, and pentose phosphate pathway (Figure 5). The genomes also encode glucosidases, glycosidases, proteases, and peptidases, highlighting the ability of these species to use various carbohydrate and peptide substrates. Thus, central carbon metabolic pathways are very similar to those of T. thermophilus HB27 [38] and T. scotoductus SA-01 [41].

Genes involved in denitrification
Denitrification involves the conversion of nitrate to dinitrogen through the intermediates nitrite, nitric oxide, and nitrous oxide and is mediated by nar, nir, nor, and nos genes [4]. Incomplete denitrification phenotypes terminating in the production of nitrous oxide have recently been reported for a large number of Thermus isolates, including T. oshimai JL-2 and T. thermophilus JL-18 [6]. Figure 6 shows the organization of the nar operon and neighboring genes involved in denitrification in T. oshimai JL-2, T. thermophilus JL-18, and T. scotoductus SA-01. These gene clusters are located on the megaplasmids of T. oshimai JL-2 and T. thermophilus JL-18, as in other T. thermophilus strains [44,45]. They are located on the chromosome in T. scotoductus SA-01 [41]. The nar operons show a high degree of synteny and all include genes encoding the membrane-bound nitrate reductase (NarGHI), the associated periplasmic cytochrome NarC, and the dedicated chaperone NarJ. All three strains contain homologs of NarK1, which is a member of the major facilitator superfamily that likely functions as a nitrate/proton symporter [46,47]. However, some experiments in T. thermophilus HB8 suggest NarK1 might also function in nitrite extrusion [39]. T. oshimai JL-2 and T. scotoductus SA-01 also contain homologs of NarK2 (annotated as nep in T. scotoductus SA-01 [41]), which likely encodes a nitrate/nitrite antiporter [44,48]. No significant BLASTP hits for periplasmic nitrate reductase subunits NapB and NapC were found in T. oshimai JL-2 and T. thermophilus JL-18, consistent with the use of the Nar system in the Thermales. All three strains contain a dnrST operon adjacent to, but divergently transcribed from, the narGHJIK operon. dnrST encodes transcriptional activators responsible for upregulation of the nitrate respiration pathway in the absence of O2 and the presence of nitrogen oxides or oxyanions [42] (Figure 6). Both species contain a putative nirK, which encodes the NO-forming, Cu-containing nitrite reductase. In addition, T. oshimai JL-2 and T. scotoductus SA-01 both harbor nirS [41], which encodes the isofunctional tetraheme cytochrome cd1-containing nitrite reductase. Previous studies have suggested that bacteria use either NirK or NirS, but not both, for the reduction of nitrite [49].
The unique presence of both NirK and NirS in T. oshimai JL-2 and T. scotoductus SA-01 likely enhances their denitrification abilities, since isoenzymes are typically kinetically distinct and/or regulated differently. This idea is consistent with the distinct denitrification phenotypes of T. oshimai strains as compared to T. thermophilus strains reported previously, including strains T. oshimai JL-2 and T. thermophilus JL-18 [6]. In those studies, nitrite accumulated in the medium at concentrations of <150 µM in T. thermophilus strains, whereas in T. oshimai strains it was rapidly produced to concentrations >200 µM but then rapidly consumed to below method detection limits. NirK functions as a homo-trimer [50] and contains type 1 (blue) and type 2 (non-blue) copper-binding residues [49]. Comparison of the NirK from T. oshimai JL-2 and T. scotoductus SA-01 with previously studied NirK amino acid sequences revealed that six of the seven copper-binding residues are conserved; the exception is a single methionine (M) to glutamine (Q) substitution in both Thermus proteins (Figure 7; indicated by an asterisk (*)). Glutamine, not methionine, is the copper-binding ligand in the case of stellacyanin, a blue (type 1) copper-containing protein [52,53]. An M121Q recombinant protein of Alcaligenes denitrificans azurin showed similar electron paramagnetic resonance (EPR), but exhibited a 100-fold lower redox activity when compared to wild-type azurin [54]. Therefore, although the methionine is replaced with a glutamine in the T. oshimai JL-2 NirK, it is possible that this glutamine residue can function as a copper-binding ligand, as in stellacyanin and the M121Q azurin. The large and small subunits of nitric oxide reductase (NorB and NorC) are predicted to be co-transcribed along with nitrite reductases in T. oshimai JL-2, T. thermophilus JL-18 and T. scotoductus SA-01 (Figure 6).
Genes encoding the 15-subunit NADH-quinone oxidoreductase [55] were identified in both genomes (Theos_0703 to 0716, 1811 in T. oshimai JL-2; TTJL18_1786 to 1799, 1580 in T. thermophilus JL-18). nrcDEFN, a four-gene operon encoding a novel NADH dehydrogenase, is adjacent to the nar operon in the megaplasmid of T. thermophilus HB8 and has been previously implicated in nitrate reduction [43]. In T. thermophilus JL-18, the operon is present (Figure 6), although one gene (TTJL18_2313) is truncated (NarE in HB8: 232 AA; in JL-18: 78 AA). In T. oshimai JL-2, only nrcN is present. Theos_0161 and Theos_0162, orthologs of Wolinella succinogenes NrfA and NrfH [56], respectively, were identified in T. oshimai JL-2, suggesting that T. oshimai JL-2 may be capable of respiratory nitrite ammonification, although this phenotype has not yet been observed in Thermus [6].
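The link between gene content and the incomplete denitrification phenotype can be expressed as a small sketch. It walks the pathway nitrate, nitrite, nitric oxide, nitrous oxide, dinitrogen and stops at the first step for which no gene is present. The gene set for T. oshimai JL-2 follows the text (nar, nirK, nirS, and nor present; no nitrous oxide reductase); the "complete denitrifier" is a hypothetical comparison, and the function name and data layout are assumptions for illustration only. Gene presence predicts potential, not measured activity.

```python
# Each step: (substrate, product, gene families that can catalyze the step).
DENITRIFICATION_STEPS = [
    ("nitrate", "nitrite", {"nar"}),
    ("nitrite", "nitric oxide", {"nirK", "nirS"}),  # isofunctional enzymes
    ("nitric oxide", "nitrous oxide", {"nor"}),
    ("nitrous oxide", "dinitrogen", {"nos"}),
]


def terminal_product(genes, substrate="nitrate"):
    """Return the last compound reachable from `substrate` given a gene set."""
    current = substrate
    for step_substrate, product, catalysts in DENITRIFICATION_STEPS:
        if step_substrate != current:
            continue  # step lies upstream of the current compound
        if genes & catalysts:
            current = product
        else:
            break  # pathway is blocked at this step
    return current


if __name__ == "__main__":
    gene_sets = {
        # From the text: nar, nirK, nirS and nor present; nos absent.
        "T. oshimai JL-2": {"nar", "nirK", "nirS", "nor"},
        # Hypothetical organism, included only for contrast.
        "hypothetical complete denitrifier": {"nar", "nirS", "nor", "nos"},
    }
    for name, genes in gene_sets.items():
        print(name, "->", terminal_product(genes))
```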

Genes involved in iron reduction
T. scotoductus SA-01 has been reported to be capable of dissimilatory Fe3+ reduction; however, the biochemical basis of iron reduction has not been elucidated in Thermus [41,59]. Sequences of proteins involved in iron reduction [60] in Shewanella oneidensis MR-1 (MtrA, MtrF, OmcA) and Geobacter sulfurreducens KN400 (OmcB, OmcE, OmcS, OmcT, OmcZ) were used as BLASTP queries against Thermus genomes. No hits were found in T. oshimai JL-2, T. thermophilus JL-18, or T. scotoductus SA-01. This suggests that the biochemical basis of iron reduction in Thermus is distinct from that in Shewanella and Geobacter, and it offers no predictive information on whether T. oshimai JL-2 and T. thermophilus JL-18 may be able to respire iron.

Genes involved in sulfur oxidation
A complete sox cluster comprising 15 genes, including soxCD, is present in the T. oshimai JL-2 and T. thermophilus JL-18 genomes. SoxCD is essential for chemotrophic growth of Paracoccus pantotrophus [61]. Taken together, this suggests that T. oshimai JL-2 and T. thermophilus JL-18 may use thiosulfate as an electron donor and are similar to other sulfur-oxidizing Thermus strains including T. scotoductus IT-7254 [62] and T. scotoductus SA-01 [41]. Other T. thermophilus genomes also harbor this gene cluster, suggesting thiosulfate oxidation may be widely distributed in Thermus [38].
A variety of chemotrophs and anoxygenic phototrophs can oxidize hydrogen sulfide, organic sulfur compounds, sulfite, and thiosulfate as electron donors for respiration [63]. Reconstituted proteins of SoxXA, SoxYZ, SoxB and SoxCD together, but not alone, mediate the oxidation of thiosulfate, sulfite, sulfur, and hydrogen sulfide in P. pantotrophus [61]. The absence of free intermediates of sulfur oxidation and the occurrence of sulfite oxidation without SoxCD in P. pantotrophus exclude SoxCD as a sulfite dehydrogenase and provide evidence for its role as a sulfur dehydrogenase acting on a protein-bound sulfur atom [61].

Genes involved in DNA uptake
A significant number of genes in hyperthermophilic bacteria are of archaeal origin, and appear to have been acquired through inter-domain gene transfer [67], which is mediated by both transformation and conjugation systems [68]. T. thermophilus HB27 is naturally competent for both linear and circular DNA, and DNA transport mechanisms in this species have been well studied [69,70]. The genomes of T. oshimai JL-2 and T. thermophilus JL-18 both contain homologs of DNA transport genes (Table 5), suggesting that both T. oshimai JL-2 and T. thermophilus JL-18 are naturally competent.

Conclusions
We report the finished genomes of T. oshimai JL-2 and T. thermophilus JL-18. T. oshimai JL-2 is the first complete genome to be reported for this species, while T. thermophilus JL-18 is the fourth genome to be reported for T. thermophilus. Analysis of the genomes revealed that they encode enzymes for the reduction of nitrate to nitrous oxide, which is consistent with the high flux of nitrous oxide reported in GBS [6], and explains the truncated denitrification phenotype reported for many Thermus isolates obtained from that system [6]. It is intriguing that Thermus scotoductus SA-01 also has genes encoding the sequential reduction of nitrate to nitrous oxide but lacks genes encoding the nitrous oxide reductase. The high degree of synteny in the respiratory gene cluster combined with the conserved absence of the nitrous oxide reductase suggests incomplete denitrification might be a previously unrecognized but conserved feature of denitrification pathways in the genus Thermus, although T. thermophilus NAR1 appears to be capable of complete denitrification to N2 [73]. Another unusual feature of the T. oshimai JL-2 and T. scotoductus SA-01 denitrification systems is the apparent presence of both the NO-forming, Cu-containing nitrite reductase, NirK, and the isofunctional tetraheme cytochrome cd1-containing nitrite reductase, NirS. T. oshimai JL-2 and T. thermophilus JL-18 also may be capable of sulfur oxidation since they both encode a complete, chromosomal sox cluster. However, experiments with GBS sediments failed to demonstrate a stimulation of denitrification when thiosulfate was added in excess [74], suggesting thiosulfate oxidation may not be coupled to denitrification in these organisms. The presence of psrA, psrB and psrC genes encoding polysulfide reductase in T. oshimai JL-2 suggests the ability to reduce polysulfide. The function of these putative pathways could be tested with pure cultures in the laboratory.
The presence of complete macromolecular machinery for natural competence, together with megaplasmids harboring genes for nitrate/nitrite reduction and thermophily, indicates that T. oshimai JL-2 and T. thermophilus JL-18 could have acquired many genes through intra- and inter-domain gene transfer, and suggests considerable plasticity in denitrification pathways. Considering the importance of these organisms in the nitrogen biogeochemical cycle, and their potential as sources of enzymes for biotechnology applications, the complete genome sequences of T. oshimai JL-2 and T. thermophilus JL-18 are valuable resources for both basic and applied research.

Table 5 (fragment; columns: protein, T. oshimai JL-2 locus tag, T. thermophilus JL-18 locus tag, predicted function)†
ComZ  Theos_1239  TtJL18_0832  IM protein, function unknown
PilM  Theos_0439  TtJL18_0669  ATPase, function unknown
PilN  Theos_0438  TtJL18_0668  IM protein, function unknown
PilO  Theos_0437  TtJL18_0667  IM protein, function unknown
PilW  Theos_0436  TtJL18_0666  OM protein, stabilization of PilQ
† BLASTP analysis using sequences of known competence proteins from T. thermophilus HB27 as queries. Table modified from [72].