Complete Genome Sequences of Thermus thermophilus Strains AA2-20 and AA2-29, Isolated from Arima Onsen in Japan

We isolated halophilic and thermophilic Thermus thermophilus strains AA2-20 and AA2-29 from nonvolcanic, oceanic Arima Onsen (hot spring) in Japan. Here, we report the complete genome sequences of these organisms to gain insights into halophilicity.

To prepare genomic DNA, the two strains were grown in TYSS broth at 65°C until saturation. Genomic DNA was extracted using the Nexttec 1-step DNA isolation kit for bacteria (Nexttec Biotechnologie) according to the manufacturer's instructions.
For long-read sequencing, the extracted genomic DNA was passed through a Circulomics short-read eliminator kit to remove short fragments. Sequencing was performed using a GridION X5 system (Oxford Nanopore Technologies [ONT]); the library was constructed from unfragmented genomic DNA (1.0 μg) using a ligation sequencing kit (ONT) and loaded onto a FLO-MIN106 R9.4.1 flow cell (ONT). The long reads were base called using Guppy v.3.0.3, yielding 278,671 reads (1,550 Mb) with an average length of 5,563 bases during a 24-h run; these figures refer to quality-filtered reads with average Phred quality values of >8.0, obtained using NanoFilt v.2.3.0 (11). The longest read had 91,228 bases.
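The average-quality filter described above can be illustrated with a minimal Python sketch. It is not NanoFilt's own code; it assumes Phred+33 FASTQ encoding and assumes that the per-read average is taken over error probabilities rather than over raw Phred scores, which is a common convention for nanopore reads. The function names are hypothetical.

```python
import math


def mean_read_quality(quality_string, offset=33):
    """Average Phred quality of one read, computed via error probabilities.

    Assumes Phred+33 encoding; each character maps to Q = ord(c) - offset.
    """
    error_probs = [10 ** (-(ord(c) - offset) / 10) for c in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def passes_filter(quality_string, threshold=8.0):
    """Keep a read only if its average quality exceeds the threshold (>8.0 here)."""
    return mean_read_quality(quality_string) > threshold


if __name__ == "__main__":
    # '+' encodes Q10 and '#' encodes Q2 under Phred+33.
    print(passes_filter("++++"))  # Q10 read
    print(passes_filter("####"))  # Q2 read
```

Averaging error probabilities penalizes reads with a few very poor bases more strongly than a plain arithmetic mean of the Phred scores would.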
For short-read sequencing, the extracted genomic DNA was used for library preparation with the Nextera DNA library preparation kit (Illumina). Prepared libraries were subjected to 100-bp paired-end sequencing on the Illumina HiSeq 2500 platform. Adapters and low-quality sequencing data were trimmed using fastp v.0.14.1 (12), and 6.9 million paired-end reads (660 Mb) with an average length of 96 bases were obtained.
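As a quick consistency check on the reported yields, the read counts and average lengths can be multiplied out. The sketch below assumes that 1 Mb means 10^6 bases and that "6.9 million paired-end reads" counts individual reads rather than pairs; the function name is hypothetical.

```python
def total_megabases(n_reads, mean_length_bases):
    """Total sequenced bases in Mb (assuming 1 Mb = 1e6 bases)."""
    return n_reads * mean_length_bases / 1e6


# Long reads: 278,671 reads with an average length of 5,563 bases.
long_read_mb = total_megabases(278_671, 5_563)

# Short reads: about 6.9 million reads with an average length of 96 bases.
short_read_mb = total_megabases(6_900_000, 96)

if __name__ == "__main__":
    print(f"long reads:  {long_read_mb:.1f} Mb")
    print(f"short reads: {short_read_mb:.1f} Mb")
```

Under these assumptions the long-read total agrees with the reported 1,550 Mb, and the short-read total lands within rounding of the reported 660 Mb, since the read count is given to only two significant figures.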
Hybrid assembly of the long-read and short-read data was conducted using Unicycler v.0.4.7 (13), followed by final polishing with Pilon v.1.23 (14), yielding a single circular chromosome and a single circular plasmid. Automated annotation was performed using DFAST v.1.1.0 (15). Default parameters were used for all software unless otherwise noted.
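The workflow above might be laid out as a sequence of command lines, sketched here in Python. This is an illustration under assumptions, not the authors' script: every file name is a placeholder, only the basic input/output options of each tool are shown, and the BAM file passed to Pilon presumes a short-read alignment step that the text does not describe. The commands are only assembled and printed, not executed.

```python
import shlex

# Placeholder file names (hypothetical; not taken from the study).
SHORT_R1, SHORT_R2 = "short_R1.fastq.gz", "short_R2.fastq.gz"
TRIM_R1, TRIM_R2 = "trimmed_R1.fastq.gz", "trimmed_R2.fastq.gz"
LONG_FILTERED = "long_filtered.fastq.gz"

commands = {
    # NanoFilt reads FASTQ on stdin; -q sets the minimum average read quality.
    "nanofilt": ["NanoFilt", "-q", "8"],
    # fastp trims adapters and low-quality data from paired-end reads.
    "fastp": ["fastp", "-i", SHORT_R1, "-I", SHORT_R2,
              "-o", TRIM_R1, "-O", TRIM_R2],
    # Unicycler hybrid assembly: -1/-2 short reads, -l long reads, -o output dir.
    "unicycler": ["unicycler", "-1", TRIM_R1, "-2", TRIM_R2,
                  "-l", LONG_FILTERED, "-o", "unicycler_out"],
    # Pilon polishing; short_reads.bam is an assumed alignment of the
    # trimmed short reads to the assembly.
    "pilon": ["java", "-jar", "pilon.jar",
              "--genome", "unicycler_out/assembly.fasta",
              "--frags", "short_reads.bam",
              "--output", "polished"],
}

if __name__ == "__main__":
    for step, argv in commands.items():
        print(f"{step}: {shlex.join(argv)}")
```

Keeping each step as an argument list makes it straightforward to pass to `subprocess.run` later without shell quoting issues.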
The genome statistics and genomic features are listed in Table 1. A JSpecies analysis (16) revealed that the genomic sequences of AA2-20 and AA2-29 were nearly identical (97.2% average nucleotide identity), with no large gaps or rearrangements. Additionally, these sequences showed high average nucleotide identity (89.6%) with the genomic sequence of T. thermophilus JL-18 (GenBank accession number NC_017587), which was isolated from freshwater hot springs in Great Basin National Park, USA. Strains AA2-20 and AA2-29 differed from JL-18 by large sequence rearrangements, including inversions and indels.
Data availability. The accession numbers for the complete genome sequences of T. thermophilus strains AA2-20 and AA2-29 are listed in Table 1.