Genome sequence of Anoxybacillus ayderensis AB04T isolated from the Ayder hot spring in Turkey

Species of Anoxybacillus are thermophiles and, therefore, their enzymes are suitable for many biotechnological applications. Anoxybacillus ayderensis AB04T (= NCIMB 13972T = NCCB 100050T) was isolated from the Ayder hot spring in Rize, Turkey, and is one of the earliest described Anoxybacillus type strains. The present work reports the cellular features of A. ayderensis AB04T, together with a high-quality draft genome sequence and its annotation. The genome is 2,832,347 bp long (74 contigs) and contains 2,895 protein-coding sequences and 103 RNA genes including 14 rRNAs, 88 tRNAs, and 1 tmRNA. Based on the genome annotation of strain AB04T, we identified genes encoding various glycoside hydrolases that are important for carbohydrate-related industries and compared them with those of other sequenced Anoxybacillus spp. Insights into under-explored industrially applicable enzymes and the possible applications of strain AB04T are also described.

In the present report, we describe the cellular features of A. ayderensis AB04T and present a high-quality annotated draft genome of the strain. We also provide a comparative analysis of the glycoside hydrolases (GHs) of strain AB04T and other sequenced Anoxybacillus spp., and we discuss the presence of other under-explored industrial enzymes and the potential applications of the bacterium.

Classification and features
A. ayderensis AB04T (= NCIMB 13972T = NCCB 100050T) was isolated from mud and water samples from the Ayder hot spring located in the province of Rize in Turkey [30]. Microscopic examination revealed that colonies of strain AB04T were cream-colored, regular in shape with round edges, and 1-2 mm in diameter.

Genome project history
Genomic studies on the genus Anoxybacillus are relatively limited [45]. The genomic findings on A. ayderensis AB04T presented here are therefore important because they contribute to the body of knowledge on Anoxybacillus genomes. This whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number JXTG00000000. The NCBI BioProject accession number is PRJNA258494. The GOLD Project ID for strain AB04T is Gp0026071. Table 2 presents the project information and its association with MIGS version 2.0 compliance.

Growth conditions and genomic DNA preparation
A. ayderensis AB04T was plated on Nutrient Agar (pH 7.5) and incubated at 50°C for 18 h. A single colony was transferred into Nutrient Broth (pH 7.5) and incubated at 50°C with rotary shaking at 200 rpm for 18 h. The cells were harvested by centrifugation at 10,000 × g for 5 min using a Microfuge® 16 centrifuge (Beckman Coulter, Brea, CA, USA). Genomic DNA was extracted using a Qiagen DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. The purity, quality, and concentration of the genomic DNA were determined using a 6 % (w/v) agarose gel, NanoDrop 1000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA), and Qubit 2.0 fluorometer (Invitrogen, Merelbeke, Belgium).

Genome annotation
Genes, tRNAs and tmRNAs, and rRNAs were predicted with Prodigal [54], ARAGORN [55], and RNAmmer [56], respectively. For functional annotation, the predicted coding sequences were translated and used to search for the closest matches in the NCBI non-redundant database and the UniProt [57], TIGRFAM [58], Pfam [59], CRISPRfinder [60], PRIAM [61], KEGG [62], COG [63], and InterProScan 5 [64] databases. The GHs were identified and verified using dbCAN CAZy [65], NCBI BLASTp, and InterProScan 5 [64]. Genome comparison was done by the ANI function in the EzTaxon-e database [66].

Classification table (excerpt; property, term, evidence code):
Class: Bacilli, TAS [8,9]
Order: Bacillales, TAS [1,10]
Family: Bacillaceae, TAS [1,2]
Genus: Anoxybacillus, TAS [5,6]
Species: Anoxybacillus ayderensis, TAS [30]
Type strain: AB04T (NCIMB 13972T, NCCB 100050T), TAS [30]
Gram stain: Positive, TAS [30]
Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [76].

Fig. 2 Phylogenetic tree based on 16S rRNA gene sequences showing the relationship between A. ayderensis AB04T and representative Anoxybacillus spp. The 16S rRNA accession number for each strain is shown in brackets. The 16S rRNA sequences were aligned using ClustalW and the tree was constructed using the ML method with 1000 bootstrap replicates embedded in the MEGA6.0 package [49]. The scale bar represents 0.01 nucleotide substitutions per position. Brevibacillus brevis NCIMB 9372T [77] was used as an out-group. Type strains are indicated with a superscript T. Published genomes are indicated in blue.

Genome properties
The overall genome coverage was approximately 239-fold. The draft genome was assembled into 74 contigs with a total length of 2,832,347 bp and a G + C content of 41.8 % (Fig. 3 and Table 3). The longest and shortest contigs were 448,584 bp and 606 bp, respectively. The mean length of the contigs was 38,275 bp and the N50 contig length was 112,260 bp. We did not detect any additional DNA elements. The genome consisted of 2,998 predicted genes, of which 2,895 were protein-coding sequences and 103 were RNA genes including 14 rRNAs, 88 tRNAs, and 1 tmRNA. A total of 1,872 (64.7 %) of the protein-coding genes were assigned a putative function. The remaining annotated genes (1,023; 35.3 %) were hypothetical proteins. The properties and the statistics of the genome are summarized in Table 3. The distribution of genes into COGs and KEGG functional categories is presented in Table 4 and Fig. 3.
The genome sizes of the currently sequenced Anoxybacillus spp. are shown in Fig. 2. Most of the reported Anoxybacillus draft genome sizes are between 2.60 and 2.86 Mb [31, 33, 38-40, 43-45, 47], and the completely sequenced A. flavithermus WK1 genome has a size of 2.85 Mb [29]. The incomplete genome sequence of A. tepidamans PS2 has a size of 3.36 Mb (Fig. 2), which is the largest Anoxybacillus genome sequenced to date [37]. However, cumulative information on the Anoxybacillus genomes (Fig. 2) indicates that Anoxybacillus has a smaller genome size than the closest genus, Geobacillus (~3.50 Mb) [27,45]. The genomes of other genera within Bacillaceae such as Bacillus [1,28] and Lysinibacillus [67] are at least 40 % larger than that of Anoxybacillus [5,6,45]. The average G + C content of the Geobacillus spp. genomes (~50.0 %) [27,45] is slightly higher than that of the A. ayderensis [30] genome (Fig. 2), while most Bacillus genomes have less than 40 % G + C content [1,28,45].
Table 5 summarizes the pairwise ANI values of Anoxybacillus spp. [66]. A. ayderensis AB04T showed the highest ANI of 97.6 % with Anoxybacillus sp. SK3-4 [46]. As this ANI value is greater than 95 % [68], Anoxybacillus sp. SK3-4 [45,46] is likely to be a subspecies of A. ayderensis [30].
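The assembly statistics quoted above (mean contig length, N50, G + C content) follow standard definitions, which can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used for this genome: the function names are hypothetical, and because the individual contig lengths are not listed in the text, the N50 example uses a toy list rather than the real assembly.

```python
def n50(contig_lengths):
    """Return the N50: the length of the contig at which the running sum of
    contigs, sorted from longest to shortest, first reaches half of the
    total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


def mean_contig_length(total_bp, n_contigs):
    """Mean contig length in bp, from total assembly size and contig count."""
    return total_bp / n_contigs


def gc_percent(sequence):
    """G + C content of a nucleotide sequence, as a percentage."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Toy contig lengths (illustrative only, not the AB04T contigs).
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)

# Values taken from the text: 2,832,347 bp in 74 contigs.
reported_mean = round(mean_contig_length(2_832_347, 74))
```

With the total length and contig count reported in the text, `reported_mean` evaluates to 38,275 bp, which agrees with the stated mean contig length.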

Analysis of the GHs in A. ayderensis AB04T and other Anoxybacillus genomes
We detected 14 genes in the AB04T genome encoding GH enzymes belonging to GH families 1, 10, 13, 31, 32, 51, 52, and 67 (Table 6). On average, the AB04T GHs shared 93.9 % similarity with GHs identified in other Anoxybacillus spp. The GHs could be grouped into two types according to their predicted catalytic ability (Table 6). Nine GH enzymes were predicted to be active on α-chain polysaccharides whereas the remaining five GH enzymes were specific for β-linked polysaccharides (i.e., cellulose and xylan).
Interestingly, we found two GH enzymes that were uniquely present in strain AB04T: endo-1,4-β-xylanase (NCBI locus ID: KIP21668) and α-glucuronidase (KIP21917) (Table 6). The closest homologs of endo-1,4-β-xylanase and α-glucuronidase were found in Geobacillus thermoglucosidans and Geobacillus stearothermophilus with 81.9 % and 87.1 % sequence similarity, respectively [27].

Other A. ayderensis AB04T enzymes with potential applications
Apart from the GHs, we found that A. ayderensis AB04T has genes coding for other industrially important enzymes such as xylose isomerase, esterase, and aldolase. Xylose isomerase (EC 5.3.1.5) catalyzes the isomerization of xylose to xylulose and of glucose to fructose, which is important in the industrial production of high-fructose corn syrup [20]. A xylose isomerase from A. gonensis G2T was characterized earlier, and that enzyme displays 96.8 % amino acid sequence similarity to the one identified in strain AB04T (KIP21927) [20].
In addition, strain AB04T carries genes for an arsenate reductase (KIP20402) and an arsenic efflux pump protein (KIP20401). The function of these genes will be studied in the near future.

Conclusions
Knowledge of the genomics, industrial enzymes, and relevant applications of Anoxybacillus spp. is rather limited compared to that of their closest relatives, Geobacillus and Bacillus. In the present work, we presented a whole-genome sequence of A. ayderensis AB04T and its annotation. Additionally, we provided insights into several GHs, under-explored enzymes, and putative applications of strain AB04T.