Data on draft genome sequence of Caldanaerobacter sp. strain 1523vc, a thermophilic bacterium isolated from a hot spring of Uzon Caldera (Kamchatka, Russia)

The draft genome sequence of Caldanaerobacter sp. strain 1523vc, a thermophilic bacterium isolated from a hot spring of Uzon Caldera (Kamchatka, Russia), is presented. The total assembly length was 2 713 207 bp, with a predicted completeness of 99.38%. Structural annotation of the genome revealed 2674 protein-coding genes, 127 pseudogenes and 77 RNA genes. Pangenome analysis of the 7 currently available high-quality Caldanaerobacter spp. genomes, including 1523vc, revealed 4673 gene clusters. Of these, 1130 clusters formed the core genome of the genus Caldanaerobacter. Of the remaining 3543 Caldanaerobacter pangenome genes, 385 were found exclusively in the 1523vc genome. Of 2801 Caldanaerobacter CDS, 101 were found to encode carbohydrate-active enzymes (CAZymes). The majority of CAZymes were predicted to be involved in the degradation of beta-linked polysaccharides such as chitin, cellulose and hemicelluloses, reflecting the metabolism of strain 1523vc, which was isolated on cellulose. Five of the 101 CAZyme genes were unique to strain 1523vc: one each from families GH23, GT56 and GH15, and two from family CE9. The draft genome of strain 1523vc was deposited at DDBJ/EMBL/GenBank under the accessions JABEQB000000000, PRJNA629090 and SAMN14766777 for the Genome, BioProject and BioSample, respectively.


© 2020 The Author(s)

Value of the Data
• Genome data for Caldanaerobacter sp. 1523vc can be used for genome-based phylogenetic and evolutionary analysis of the genus Caldanaerobacter.
• Of the 3543 non-core Caldanaerobacter pangenome genes, 385 were found exclusively in the genome of strain 1523vc. Among them are several carbohydrate-active enzymes (CAZymes, http://www.cazy.org) attributed to families GH23, GT56 and GH15, as well as two CE9 family proteins, which can be further explored by biotechnologists using heterologous expression and activity analysis.
• The genome encodes a high number of CAZymes participating in the degradation of various beta-glucans, which could be relevant to applications including second-generation bioethanol production and the pulp and food industries. The genomic data presented in this article unlock the coding potential of strain 1523vc for further biochemical analysis of its enzymes for biotechnological applications.
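The pangenome terms used above (core genome, strain-exclusive genes) can be illustrated with a minimal Python sketch. It assumes a simple mapping from gene cluster to the set of genomes that contain it; the function name `partition_pangenome`, the toy cluster names, and the genome labels `genomeA` and `genomeB` are hypothetical and are not taken from the actual analysis. The only real numbers are the reported cluster counts used in the final consistency check.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    clusters: Dict[str, Set[str]], genomes: Set[str], target: str
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene clusters into core, non-core and target-exclusive sets."""
    # Core clusters are present in every genome of the comparison.
    core = {name for name, members in clusters.items() if members >= genomes}
    # Everything else is non-core (accessory plus strain-specific).
    non_core = set(clusters) - core
    # Exclusive clusters occur in the target genome and nowhere else.
    exclusive = {name for name in non_core if clusters[name] == {target}}
    return core, non_core, exclusive


# Toy presence data (hypothetical, for illustration only).
genomes = {"1523vc", "genomeA", "genomeB"}
clusters = {
    "cl1": {"1523vc", "genomeA", "genomeB"},
    "cl2": {"1523vc"},
    "cl3": {"genomeA", "genomeB"},
    "cl4": {"1523vc", "genomeA"},
}
core, non_core, exclusive = partition_pangenome(clusters, genomes, "1523vc")

# Consistency check on the counts reported in this article.
TOTAL_CLUSTERS = 4673
CORE_CLUSTERS = 1130
NON_CORE_CLUSTERS = TOTAL_CLUSTERS - CORE_CLUSTERS  # matches the reported 3543
```

In the toy data, only `cl1` is core and only `cl2` is exclusive to 1523vc; with the reported counts, the non-core remainder is the 3543 clusters from which the 385 strain-exclusive genes are drawn.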

Data Description
Caldanaerobacter is a genus of the phylum Firmicutes, proposed by Fardeau et al. in 2004 upon isolation of two thermophilic bacterial strains and reclassification of three species formerly assigned to the genus Thermoanaerobacter, as well as Carboxydibrachium pacificum [1]. A second species of the genus was proposed by Kozina and co-authors in 2010 [2]. Members of the genus are Gram-positive, thermophilic, strictly anaerobic chemoorganoheterotrophic bacteria that grow on carbohydrates and proteinaceous substrates. Biopolymers known to be hydrolyzed by members of the genus include xylan, starch and agarose [1,2], as well as keratins [3,4].
Strain 1523vc was isolated from an in situ enrichment culture proliferating on a linen rope in a 70 °C hot spring, and it is the first Caldanaerobacter representative capable of growing on microcrystalline and carboxymethyl cellulose [4].
The genome of strain 1523vc was sequenced using the Illumina MiSeq™ platform. The total assembly length was 2 713 207 bp, with a GC content of 37.2 mol%. Completeness of the assembly was estimated at 99.38%. Analysis of average nucleotide identity between 1523vc and the genomes of Caldanaerobacter spp. (Fig. 1, Supplementary Table 2) showed that strain 1523vc is closely related to C. subterraneus subsp. yonseiensis, which was also isolated from a geothermal hot spring [1,5].
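Assembly length and GC content of the kind reported above can be recomputed from a FASTA file of contigs. The following is a minimal Python sketch, not the tool used for this dataset; the function name `assembly_stats` and the toy contigs are hypothetical. As a simplification, it counts every character on a sequence line as a base, so ambiguous bases such as N are included in the total length.

```python
import io
from typing import Iterable, Tuple


def assembly_stats(fasta_lines: Iterable[str]) -> Tuple[int, float]:
    """Return (total length in bp, GC content in percent) for FASTA lines."""
    total = 0
    gc = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        seq = line.upper()
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / total if total else 0.0
    return total, gc_percent


# Toy two-contig assembly (hypothetical sequences, for illustration only).
toy_fasta = io.StringIO(">contig_1\nATGCGC\nATAT\n>contig_2\nGGCCAATT\n")
length, gc_percent = assembly_stats(toy_fasta)
print(f"{length} bp, GC = {gc_percent:.1f}%")
```

For a real assembly, an open file handle can be passed in place of the `io.StringIO` object, since the function only iterates over lines.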

Strain isolation and deposition into collection
The isolation procedure for strain 1523vc was described previously [4]. The strain is maintained in the collection of the extremophiles metabolism laboratory (Winogradsky Institute of Microbiology, now a part of FRC "Biotechnology", RAS) by annual transfer on the medium described previously [4]. For genomic sequencing, one liter of the same medium was prepared, and strain 1523vc was cultivated under its optimal growth conditions. The grown cells were harvested by centrifugation at 12 000 g.

DNA extraction, library preparation and sequencing
Genomic DNA was isolated using the ISOLATE II Genome DNA kit (Bioline, UK). Genomic DNA was fragmented with a Bioruptor™ sonicator (Diagenode, Belgium) to achieve an average fragment length of 400 bp. Further steps of library preparation were performed with the NEBNext® Ultra™ fragment library kit (New England BioLabs) according to the manufacturer's instructions. Bead-based size selection was performed to obtain fragment sizes in the range of 300-500 bp. Sequencing was done on the Illumina MiSeq™ platform (Illumina, USA) using 300-cycle paired-end sequencing reagents. A total of 1,600,832 read pairs were obtained from the sequencing run.
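The read count above allows a rough estimate of raw sequencing depth. The short Python sketch below assumes that the 300-cycle paired-end reagents correspond to 2 × 150 bp reads and that no bases are lost to trimming or filtering; neither assumption is stated in this article, so the result is an upper-bound illustration rather than a reported value. The genome size is the assembly length reported in the Data Description.

```python
READ_PAIRS = 1_600_832          # reported read pairs from the MiSeq run
GENOME_SIZE_BP = 2_713_207      # reported assembly length
ASSUMED_READ_LENGTH_BP = 150    # assumption: 300 cycles = 2 x 150 bp


def raw_coverage(read_pairs: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Estimate raw depth as total sequenced bases divided by genome size."""
    total_bases = read_pairs * 2 * read_length_bp  # two reads per pair
    return total_bases / genome_size_bp


coverage = raw_coverage(READ_PAIRS, ASSUMED_READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"Estimated raw coverage: {coverage:.0f}x")
```

Under these assumptions the estimate is roughly 177-fold; the effective depth after quality trimming would be lower.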

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.