De novo genome sequence of endophytic biocontrol strain Bacillus sp. ST24 isolated from rice seed

ABSTRACT In this study, we present the genome sequence of Bacillus sp. strain ST24, an endophytic bacterium isolated from rice seeds. The genome assembly comprises a total of 5,799,877 bp, with a GC content of 34.81%. Furthermore, our analysis revealed the presence of various genes associated with antibiotic production, as well as genes involved in polyketide biosynthesis and non-ribosomal polyketide-like clusters.

Bacillus species are endospore-forming, aerobic or facultatively anaerobic, Gram-positive bacteria that are ubiquitous in the environment and have recognized antagonistic ability against diverse phytopathogens (1–4). Bacillus sp. ST24, an endophytic bacterium, was isolated from rice seeds collected from a research field in Beaumont, Texas. Approximately 20 g of seeds was ground in phosphate-buffered saline (PBS) using a sterile mortar and pestle and plated on tryptic soy agar (TSA). Strain ST24 exhibited strong antagonistic activity against Rhizoctonia solani AG11 and R. solani AG4, the causative agents of rice seedling blight (5), in our in vitro dual-culture plate assays (6).
Genomic DNA was extracted using a Zymo Quick Fungal/Bacterial DNA kit, following the manufacturer's instructions, from pure colonies that had been grown on TSA at 27°C for 48 h, suspended in PBS, and adjusted to 10^8 cells/mL. DNA integrity was evaluated by electrophoresis on a 1% agarose gel. DNA quality was assessed with a QuickDrop micro-volume spectrophotometer (Molecular Devices), and DNA was quantified using a Qubit double-stranded DNA high-sensitivity assay kit (Thermo Fisher Scientific). Illumina libraries were prepared using the NEB Ultra II DNA Library Prep Kit, and genome sequencing was conducted by Novogene Corporation, Inc., on an Illumina NovaSeq 6000. Sequencing of strain ST24 generated 9,70,156 paired-end reads with an insert size of 150 bp, giving 220-fold genome coverage. Raw reads below Q30 were trimmed using TrimGalore v0.6.7 (7). High-quality reads were assembled de novo using MEGAHIT v1.2.8 (8). The reads were assembled into 53 contigs totaling 5,799,877 bp. The largest contig and the N50 value of the assembly were 1,153,785 bp and 462,236 bp, respectively. The GC content of the assembly was 34.81%. Pilon v1.23 (9) and QUAST v5.0.2 (10) were used to improve and assess the quality of the assembly, respectively. BUSCO (11) estimated the genome assembly to be 99.2% complete, with 95.2% complete and single-copy BUSCOs. Finally, the assembly was annotated using the NCBI Prokaryotic Genome Annotation Pipeline (12). Annotation predicted 5,816 genes, including 5,585 protein-coding genes, 19 rRNA genes, 45 tRNA genes, 5 ncRNAs, and 1 tmRNA gene. Default parameters were used for all software unless otherwise stated.
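The assembly metrics reported above (total length, largest contig, N50, and GC content) can be illustrated with a minimal sketch. This is not the QUAST implementation used in this study; the function name `assembly_stats` and the toy contigs are hypothetical, and N50 is taken here as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly.

```python
def assembly_stats(contigs):
    """Summarize a list of contig sequences (hypothetical helper, not QUAST)."""
    # Sort contig lengths from longest to shortest.
    lengths = sorted((len(seq) for seq in contigs), reverse=True)
    total = sum(lengths)

    # Walk down the sorted lengths until at least half of the total is covered.
    n50 = 0
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    # GC content as a percentage of all assembled bases.
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in contigs)
    gc_percent = 100.0 * gc / total if total else 0.0

    return {
        "total_bp": total,
        "largest_contig": lengths[0] if lengths else 0,
        "n50": n50,
        "gc_percent": gc_percent,
    }


# Toy contigs for illustration only; these are not ST24 sequences.
toy_contigs = ["ATGCATGCAT", "GGGCCC", "ATAT"]
stats = assembly_stats(toy_contigs)
```

Applied to the contigs of a real assembly read from a FASTA file, the same calculation would be expected to yield the kind of values reported here, although tool-specific filtering (for example, a minimum contig length) can change the result.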
The complete 16S rRNA gene sequence of strain ST24 showed 100% identity to those of multiple members of the B. cereus group in a BLAST search on the EzTaxon server (13). The average nucleotide identity (ANI) (14) values for strain ST24 were 96.44% and 91.1% when compared to two B. cereus strains, ATCC 14579 and strain 30075, respectively. Additionally, the ANI values for strain ST24 were 96.38% and 96.39% when compared to two B. thuringiensis strains, ATCC 10792 and MYBT18246, respectively. Furthermore, the Type Genome Server (15) placed strain ST24 in the same species cluster as B. cereus ATCC 14579 and B. thuringiensis ATCC 10792 but in a different subspecies cluster.
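As a reading aid for the ANI values above, the following sketch compares them against a species boundary. The 95% cutoff is an assumption: it is a commonly used ANI threshold for prokaryotic species delineation and is not stated in this announcement. The function name `same_species_candidates` is likewise hypothetical.

```python
# Assumed species-level ANI cutoff (commonly cited as about 95 to 96%);
# this value is not taken from the announcement itself.
ANI_SPECIES_THRESHOLD = 95.0

# ANI values (%) for strain ST24 against each reference, as reported in the text.
ani_values = {
    "B. cereus ATCC 14579": 96.44,
    "B. cereus 30075": 91.1,
    "B. thuringiensis ATCC 10792": 96.38,
    "B. thuringiensis MYBT18246": 96.39,
}


def same_species_candidates(ani, threshold=ANI_SPECIES_THRESHOLD):
    """Return reference strains whose ANI with the query meets the threshold."""
    return sorted(name for name, value in ani.items() if value >= threshold)


candidates = same_species_candidates(ani_values)
```

Under this assumed cutoff, three of the four references fall at or above the boundary and B. cereus strain 30075 falls below it, which is consistent with the difficulty of assigning ST24 to a single species within the B. cereus group on ANI alone.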

ACKNOWLEDGMENTS
Portions of this research were conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing.
This work was supported, in part, by the Texas Rice Research Foundation (grants M2101762 and M2202432) and USDA-Hatch (TEX09714).