Draft genome sequence data of plant growth promoting and calcium carbonate precipitating Bacillus velezensis CMU008

A Gram-positive, spore-forming bacterium designated strain CMU008 was isolated from a soil sample collected on the Chiang Mai University campus, Chiang Mai, Thailand. This strain is able to precipitate calcium carbonate and promote the growth of sunflower sprouts. Whole genome sequencing was performed on the Illumina MiSeq platform. The draft genome of strain CMU008 is 4,016,758 bp in length, with 4,220 protein coding sequences and an average G + C content of 46.01 mol%. The ANIb values between strain CMU008 and the type strains of its closest relatives, Bacillus velezensis NRRL B-41580T and B. velezensis KCTC13012T, were 98.52%. The phylogenomic tree also supports the assignment of strain CMU008 to B. velezensis. The genome sequence data of B. velezensis strain CMU008 provide information for the taxonomic characterization and further biotechnological exploitation of this strain. The draft genome sequence data of B. velezensis strain CMU008 have been deposited in the DDBJ/EMBL/GenBank databases under the accession number JAOSYX000000000.


Specifications
Subject: Biology
Specific subject area: Microbiology, genomics
Type of data: Table, Figure, Draft genome sequence
How the data were acquired: Genome sequencing was performed on an Illumina MiSeq sequencer at the Omics Science and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
Data format: Raw, analyzed and assembled genome sequence
Description of data collection: A pure culture of Bacillus velezensis CMU008 was routinely cultured on tryptic soy agar (TSA) at 37 °C. Genomic DNA was extracted from a 24 h culture on TSA and used as the template for the sequencing reaction
Data source location:

Value of the Data
• The draft genome of Bacillus velezensis CMU008 can provide insights into its calcium carbonate precipitation and plant growth promotion potential.
• These data are a valuable resource for researchers working in the fields of microbiology, genomics, and molecular biology.
• These genome data can be used for comparative genomics of members of the genus Bacillus for biotechnological and taxonomic purposes, and allow in-depth analysis of Bacillus velezensis CMU008 via genome mining.

Objectives
The draft genome sequence of Bacillus velezensis CMU008 was generated to identify protein-encoding genes that are responsible for calcium carbonate precipitation and plant growth promotion. Identifying such genes in the B. velezensis CMU008 genome is important for understanding the molecular mechanisms that underlie the calcium carbonate precipitating and plant growth promoting abilities of this bacterium, and for its further application.

Data Description
Bacillus sp. strain CMU008 was isolated from garden soil collected at Chiang Mai University, Chiang Mai, Thailand (18.48 N 98.57 E). Topsoil (5 cm depth) was collected using a sterile spoon, and the strain was isolated by the dilution spread plate method on nutrient agar. Strain CMU008 was able to precipitate calcium carbonate (CaCO3) via the production of urease. It also promoted the growth of sunflower sprouts and increased CaCO3 precipitation in soil. Here we report the genome sequence of Bacillus sp. strain CMU008 to facilitate further study of genes related to CaCO3 precipitation and plant growth promotion.

Table 1 summarizes the genome characteristics of Bacillus sp. strain CMU008. The draft genome contains 52 contigs with a total length of 4,016,758 bp, and N50 and L50 values of 174,703 bp and 7, respectively. The genome contains 4220 protein coding sequences (CDS), 75 tRNA genes and 3 rRNA genes, with a G + C content of 46.01%. The data have been deposited in GenBank and can be viewed at https://www.ncbi.nlm.nih.gov/bioproject/886473.

The annotated genome of Bacillus sp. strain CMU008 was analyzed using the PATRIC genome analysis server (https://www.patricbrc.org/) [1] to identify genome and protein features, which revealed two types of protein families [2]: 3899 proteins belong to genus-specific protein families (PLFams) and 4024 proteins belong to cross-genus protein families (PGFams). A circular map of the Bacillus sp. strain CMU008 genome shows the distribution of the genome annotation (Fig. 1). Fig. 2 displays the subsystems of the proteins, which were categorized into 239 subsystems across 11 biological processes. The number of genes assigned to each biological process is as follows: metabolism (731), cellular processes (233), protein processing (223), energy (213), stress response, defense and virulence (132), DNA processing (83), membrane transport (79), RNA processing (52), cell envelope (15), miscellaneous (10) and regulation and cell signaling (10).

Fig. 1. The distribution of annotated genomic features. This includes, from outer to inner rings, the contigs, CDS on the forward strand, CDS on the reverse strand, RNA genes, CDS with homology to known antimicrobial resistance genes, CDS with homology to known virulence factors, GC content and GC skew. The color of the CDS on the forward and reverse strands indicates the subsystem that these genes belong to (Fig. 2).

The PATRIC genome annotation of Bacillus sp. strain CMU008 uses a k-mer-based antimicrobial resistance (AMR) gene detection method, which utilizes PATRIC's curated collection of representative AMR gene sequence variants [1]. Table 2 lists, for each AMR gene, the functional annotation, the broad mechanism of antibiotic resistance, the drug class and, in some cases, the specific antibiotic to which it confers resistance.
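The assembly statistics reported for strain CMU008 (N50, L50 and G + C content) follow standard definitions, and the short Python sketch below illustrates how such values can be derived from a set of contigs. It is not the procedure used by the assembly or annotation software in this study; the function names n50_l50 and gc_percent and the toy contig data are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly;
    L50 is the number of contigs in that set.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches total")


def gc_percent(sequences: Iterable[str]) -> float:
    """Return the G + C content (%) over all unambiguous A/C/G/T bases."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / acgt


# Toy data (hypothetical, not the CMU008 contigs).
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_sequences = ["GGCC", "ATAT"]

n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} bp, L50 = {l50}")
print(f"G + C content = {gc_percent(toy_sequences):.2f}%")
```

Applied to the 52 contig lengths of the deposited assembly, the same definitions would be expected to give the N50 of 174,703 bp and L50 of 7 listed in Table 1, although different tools may handle ambiguous bases slightly differently when computing G + C content.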

Bacterial cultivation and genomic DNA extraction
Bacillus velezensis CMU008 was cultured on tryptic soy agar (TSA) at 37 °C for 24 h. Genomic DNA was extracted as follows. Bacterial cells were lysed in extraction buffer. Phenol was added to remove protein, and the mixture was centrifuged at 15,000 rpm for 5 min at 4 °C. The phenol extraction step was then repeated on the supernatant. To precipitate DNA, sodium acetate, isopropanol and absolute ethanol were added and the mixture was incubated at −20 °C for 15 min. After incubation, the precipitated DNA was harvested by centrifugation at 15,000 rpm for 5 min at 4 °C. The DNA was then washed with 70% (v/v) ethanol and centrifuged at 15,000 rpm for 5 min at 4 °C. Finally, the DNA was dried for 30 min and dissolved in sterile ultrapure water.

Whole genome sequencing, assembly, annotation and analysis
The genomic DNA of Bacillus velezensis CMU008 was sequenced by the Omics Science and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, Thailand.
The genomic DNA library was prepared using the QIAseq FX DNA library preparation kit (Qiagen, USA). The libraries were sequenced on an Illumina MiSeq sequencer in 2 × 250 bp paired-end mode. Raw read quality was checked using FastQC version 0.11.9 [3]. Adaptors and poor-quality reads were removed using fastp version 0.23.2 [4], and the filtered reads were used as input for the genome assembly program Unicycler [5]. The assembled genome was annotated using the PATRIC RASTtk-enabled Genome Annotation Service [6]. In addition, ANIb values were calculated and compared using the JSpeciesWS web server, version 3.9.7 [7]. The phylogenomic tree was constructed using the Type (Strain) Genome Server (TYGS) [8,9] (https://tygs.dsmz.de/). All software was run with default parameters.
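The read processing steps described above (quality check, trimming, assembly) can be sketched as a sequence of command lines. The Python snippet below only builds and prints the commands; it does not run them. It assumes the standard paired-end options of each tool (fastqc with -o, fastp with -i/-I/-o/-O, unicycler with -1/-2/-o) and otherwise default parameters, as stated in the text. The input file names, the output directory layout and the function name build_commands are hypothetical, and the exact invocations used by the sequencing service are not given in the source.

```python
import shlex
from pathlib import Path
from typing import Dict, List


def build_commands(r1: str, r2: str, outdir: str) -> Dict[str, List[str]]:
    """Build argument lists for a FastQC -> fastp -> Unicycler workflow.

    r1 and r2 are the raw paired-end FASTQ files; outdir is a working
    directory (hypothetical layout). All tools keep default parameters.
    """
    out = Path(outdir)
    trimmed_r1 = str(out / "trimmed_R1.fastq.gz")
    trimmed_r2 = str(out / "trimmed_R2.fastq.gz")
    return {
        # Quality report for the raw reads; FastQC expects the -o
        # directory to exist before it is run.
        "fastqc": ["fastqc", r1, r2, "-o", str(out / "fastqc")],
        # Adapter and low-quality read removal on the read pair.
        "fastp": ["fastp", "-i", r1, "-I", r2,
                  "-o", trimmed_r1, "-O", trimmed_r2],
        # Short-read-only assembly from the filtered reads.
        "unicycler": ["unicycler", "-1", trimmed_r1, "-2", trimmed_r2,
                      "-o", str(out / "assembly")],
    }


# Hypothetical file names, for illustration only.
commands = build_commands("CMU008_R1.fastq.gz", "CMU008_R2.fastq.gz", "work")
for name, argv in commands.items():
    print(f"{name}: {shlex.join(argv)}")
```

Keeping each command as an argument list makes it straightforward to pass it to subprocess.run once the tools are installed, and makes explicit that the trimmed reads written by fastp are the input to Unicycler.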

Ethics Statements
This study did not involve any human subjects or animal experiments. No ethical approval was required.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.