Complete Genome Sequence of the Prototrophic Bacillus subtilis subsp. subtilis Strain SP1

Here, we present the complete genome sequence of the Bacillus subtilis strain SP1. This strain is a descendant of the laboratory strain 168. The strain is suitable for biotechnological applications because the prototrophy for tryptophan has been restored. Due to laboratory cultivation, the strain has acquired 24 additional sequence variations.

The Bacillus subtilis strain SP1 is a descendant of the highly transformable laboratory strain 168. After the creation of B. subtilis strain 168 by X-irradiation, the strain developed genetic competence under laboratory conditions (1) and thereby became a model bacterium for basic and applied research (2, 3). The plasmid-free strain was also subjected to genome reduction in order to identify the genes required for growth under defined conditions (4–6). However, the X-ray treatment made strain 168 auxotrophic for tryptophan due to a 3-bp deletion in the trpC gene (7, 8). The trpC gene encodes indole-3-glycerol phosphate synthase, which catalyzes an essential step in the de novo synthesis of tryptophan. Tryptophan prototrophy was restored in B. subtilis strain SP1 by transforming strain 168 with a DNA fragment containing the wild-type trpC gene of the B. subtilis Marburg strain ATCC 6051 (9). The simplified medium requirements make strain SP1 suitable for industrial applications such as vitamin production (9, 10) and for basic research such as examining the influence of substances that inhibit the biosynthesis of the aromatic amino acids phenylalanine, tryptophan, and tyrosine (11). To uncover all sequence variations between strain 168 and strain SP1 that might have occurred during the construction of SP1 (9), we have sequenced its genome.
The chromosomal DNA was isolated from growing cells using a commercially available kit (peqGOLD bacterial DNA kit; VWR International GmbH). Illumina (San Diego, CA, USA) paired-end sequencing libraries were generated with the Nextera XT DNA sample preparation kit and sequenced with the MiSeq system and reagent kit v.3 (2 × 300 bp) as recommended by the manufacturer (Illumina). Base calling was performed with MiSeq Control Software v.2.6.2.1. Default parameters were used for all software unless otherwise specified. The paired-end reads obtained (2.4 million) were quality processed and adapter trimmed with Trimmomatic v.0.39 (12). The 2.3 million high-quality paired-end reads recovered were used for single-nucleotide polymorphism analysis with Geneious Prime v.2020.0.5 (Biomatters, Ltd., Auckland, New Zealand), employing the genome of strain 168 (GenBank accession number NC_000964) as the reference, as described previously (13, 14). Sequence deviations, identified with an average coverage of 135-fold and a minimum variant frequency of 0.98, confirmed the recovery of the trpC locus in strain SP1, which is responsible for its prototrophic phenotype, and revealed 24 additional genome modifications, which are summarized in Table 1. The identified sequence deviations were applied to the genome sequence of strain 168, so that the resulting sequence represents the genome of strain SP1. The final genome sequence of SP1 was extracted with Geneious Prime v.2020.0.5 (Biomatters, Ltd.) and cross-verified with breseq v.0.35.1 (15) to ensure genome consistency and to confirm the absence of structural rearrangements. The single circular genome of SP1 resembled the genome of the ancestor strain 168 in all genetic properties except for the genome size, which, at 4,215,613 bp, was 7 bp larger for SP1. Automated gene annotation was carried out by the Prokaryotic Genome Annotation Pipeline (PGAP) (16).
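The consensus step described above (keeping only deviations at or above the minimum variant frequency and applying them to the reference) can be illustrated with a minimal Python sketch. This is not the Geneious Prime workflow used in this study; the `Variant` record, the `apply_variants` function, and the toy sequences are hypothetical and serve only to show the logic under the assumption of non-overlapping variants with 0-based coordinates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record (not a Geneious or breseq data type)."""
    pos: int      # 0-based start position in the reference
    ref: str      # reference allele ("" for a pure insertion)
    alt: str      # observed allele ("" for a pure deletion)
    freq: float   # fraction of reads supporting the alternative allele


def apply_variants(reference: str,
                   variants: Iterable[Variant],
                   min_freq: float = 0.98) -> str:
    """Return the reference with all variants at or above min_freq applied.

    Assumes the variants do not overlap each other.
    """
    kept = [v for v in variants if v.freq >= min_freq]
    seq = reference
    # Apply from the highest coordinate downwards so that insertions and
    # deletions do not shift the coordinates of variants still to be applied.
    for v in sorted(kept, key=lambda v: v.pos, reverse=True):
        if seq[v.pos:v.pos + len(v.ref)] != v.ref:
            raise ValueError(f"reference allele mismatch at position {v.pos}")
        seq = seq[:v.pos] + v.alt + seq[v.pos + len(v.ref):]
    return seq


# Toy data only; these are not variants from Table 1.
reference = "ACGTACGTAC"
calls = [
    Variant(pos=2, ref="G", alt="T", freq=1.00),    # substitution, kept
    Variant(pos=5, ref="", alt="GGA", freq=0.99),   # 3-bp insertion, kept
    Variant(pos=8, ref="A", alt="", freq=0.50),     # below threshold, dropped
]
derived = apply_variants(reference, calls)
size_difference = len(derived) - len(reference)
```

In this toy case the derived sequence is 3 bp longer than the reference, which is the same kind of bookkeeping that yields the 7-bp size difference reported for SP1 relative to strain 168.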
Data availability. The genome sequence of the B. subtilis subsp. subtilis strain SP1 has been deposited in GenBank under the accession number CP058242.1. The raw sequence reads have been submitted to the NCBI Sequence Read Archive (SRA) database (17) under the accession number SRR12076664. The BioProject accession number is PRJNA641411, and the BioSample accession number is SAMN15352467.

ACKNOWLEDGMENTS
This project received funding from the European Union Horizon 2020 research and innovation program under grant agreement 720776 and from the Federal Ministry of Education and Research (Hessen Agentur, German BioIndustry 2021 Program to DSM Nutritional Products Ltd.). The funders had no role in the study design, data collection and interpretation, or the decision to submit the work for publication.
We thank Melanie Heinemann and Sarah-Teresa Schüßler for technical assistance.