Short Genome Communications
Complete genome sequence of the nematicidal Bacillus thuringiensis MYBT18247

Abstract
The Gram-positive, spore-forming bacterium Bacillus thuringiensis MYBT18247 encodes three cry toxin genes (cry6Ba2, cry6Ba3 and cry21-like), which are active against nematodes. For a better understanding of the evolution of virulence and cry toxins, we present here the complete genome sequence of Bacillus thuringiensis MYBT18247. Various additional virulence factors such as bacteriocins, proteases and hemolysins were identified. In addition, the methylome and the metabolic potential of the strain were analyzed and the strain was phylogenetically classified.
Bacillus thuringiensis is a ubiquitous, Gram-positive, spore-forming bacterium. Strains of the species are used as biopesticides because of their ability to produce parasporal protein crystals (Bechtel and Bulla, 1976; Ibrahim et al., 2010). These protein crystals consist of δ-endotoxins, which are active against a broad spectrum of invertebrates including species of the orders Lepidoptera, Diptera, Coleoptera, Hymenoptera, Homoptera, Orthoptera and Mallophaga, as well as mites, protozoa and nematodes (Feitelson, 1993; Schnepf et al., 1998). Here we present the annotated genome sequence of B. thuringiensis MYBT18247, which was isolated and used in single- and multiple-infection co-evolution experiments with the nematode Caenorhabditis elegans (Masri et al., 2015; Schulte et al., 2010).
Genomic DNA was isolated using the DNeasy blood and tissue kit (Qiagen, Hilden, Germany) and the Genomic-Tip 100/G Kit (Qiagen, Hilden, Germany). For 454 pyrosequencing, the genomic DNA was sheared (∼700 bp) and end-repaired, and universal barcoded sequencing adaptors were ligated (Rapid-GSFLX Titanium, Roche 454, Branford CT). The library preparation was done with the GS Titanium Sequencing Kit XLR70t (Roche 454, Branford CT). Illumina sequencing libraries were generated with the Nextera XT DNA Library Prep Kit (Illumina, San Diego, USA). For SMRT sequencing (Pacific Biosciences, Menlo Park, USA), C2/P4 chemistry was used for three SMRT cells and C4/P6 chemistry for two additional SMRT cells. Whole-genome sequencing was performed using the 454 GS-FLX instrument, the Genome Analyzer IIx (Illumina, San Diego, USA) and the PacBio RSII system (Pacific Biosciences, Menlo Park, USA). The 454 shotgun sequencing produced 335,141 single-end reads with an average read length of 420 bp. The Newbler 2.8 de novo assembler (Roche Diagnostics) assembled the reads into 411 contigs with 18-fold coverage. A hybrid assembly was performed with Mira 4.0.3 (http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html) using 4,000,000 (112 bp) paired-end Illumina reads and 30,952 PacBio reads (C2 chemistry) with an average read length of 5053 bp. An HGAP 2.3.0 assembly using 67,045 PacBio reads (P6 chemistry) with a mean length of 13,802 bp resulted in an average coverage of 124.46-fold. The assemblies were manually combined, and contradictions were resolved by Sanger sequencing using BigDye 3.0 chemistry and an ABI3730XL capillary sequencer (Applied Biosystems, Life Technology GmbH, Darmstadt, Germany). All sequence positions were manually checked using Gap4 (v4.11) of the Staden package (Staden et al., 1999) to ensure sequence quality.
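As a rough plausibility check on the read sets described above, the raw sequencing depth can be estimated as C = N × L / G (number of reads × mean read length / genome size). The following is a minimal sketch, not part of the original workflow. It assumes that the stated 4,000,000 Illumina reads are individual reads rather than read pairs, and it uses the total genome size of 6,138,199 bp. The coverage values reported in the text come from the assemblers and are expected to be lower than these raw estimates, because not every read or base ends up in the assembly.

```python
# Raw sequencing depth: C = N * L / G
# (number of reads x mean read length / genome size).
GENOME_SIZE_BP = 6_138_199  # total genome size reported for MYBT18247


def raw_depth(num_reads: int, mean_read_len_bp: float,
              genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Return the fold coverage if every sequenced base were placed on the genome."""
    return num_reads * mean_read_len_bp / genome_size_bp


# Read counts and mean read lengths as stated in the text.
# Assumption: the Illumina figure counts single reads, not pairs.
datasets = {
    "454 single-end": (335_141, 420),
    "Illumina, 112 bp": (4_000_000, 112),
    "PacBio, C2 chemistry": (30_952, 5053),
    "PacBio, P6 chemistry": (67_045, 13_802),
}

for name, (num_reads, mean_len) in datasets.items():
    print(f"{name}: ~{raw_depth(num_reads, mean_len):.1f}x raw depth")
```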
Annotation was conducted with Prokka v1.9 (Seemann, 2014). The initial annotation was performed using Bacillus thuringiensis Bt407 as a high-quality species reference (Sheppard et al., 2013) and a comprehensive toxin protein database (including all Cry, Cyt, Vip and Sip toxins) as a feature reference set. The annotations of detected cry toxin genes were manually corrected and confirmed by Crickmore.
The genome size of B. thuringiensis MYBT18247 is 6,138,199 bp with an average GC content of 35% (Table 1). The genome consists of a chromosome of 5.6 Mbp and six plasmids, which vary in size from 12.5 kb to 175 kb. One plasmid contig of 130 kb could not be circularized due to large repetitive regions at both contig ends. The chromosome encodes 6210 protein-coding and 156 RNA genes, including 14 rRNA clusters (5S, 16S, 23S rRNA). The plasmids encode 556 protein-coding genes. A total of 3727 genes were assigned to the COG database (Table 2) and 1114 genes were identified in MetaCyc metabolic pathways (Caspi et al., 2016). Three nematicidal cry toxin genes were identified on plasmids p174778 (cry6Ba2; BTI247_60340), p113275 (cry6Ba3; BTI247_62380) and p15092 (cry21-like; BTI247_64000) (Table 3).
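The average GC content quoted above is the pooled fraction of G and C bases across all replicons. A minimal sketch of that calculation is given below. The FASTA reader is a simple stand-in written for this illustration, the input file name in the usage comment is hypothetical, and ambiguous bases are ignored by assumption.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T); other symbols are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def read_fasta(path):
    """Yield (record_id, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                fields = line[1:].split()
                name, chunks = (fields[0] if fields else ""), []
            elif line and name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def genome_gc(records) -> float:
    """Pooled GC content over all replicons, so longer replicons weigh more."""
    return gc_content("".join(seq for _, seq in records))


# Usage with a hypothetical multi-FASTA file holding chromosome and plasmids:
# for name, seq in read_fasta("MYBT18247.fasta"):
#     print(name, len(seq), f"{gc_content(seq):.1%}")
```

Pooling the replicons before counting gives a length-weighted average; averaging the per-replicon percentages instead would overweight the small plasmids.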
B. thuringiensis MYBT18247 encodes a plethora of virulence factors, all of which might contribute to pathogenicity. The chromosome encodes two chitinases, which enable the degradation of chitin in the midgut peritrophic membrane of many insects and have been identified as insect virulence factors (Sampson and Gooday, 1998). Additionally, proteases including four camelysins, six collagenases, and four phospholipases were identified and may play an important role in activating protoxins (Nisnevitch et al., 2010), in destroying the intestine of C. elegans (Peng et al., 2016) and in hydrolyzing phospholipids of host cell membranes (Hergenrother and Martin, 1997). Furthermore, five immune inhibitor A metalloproteases, four bacillolysins and an N-acyl homoserine lactonase were detected; these are suspected to boost nematicidal or insecticidal activity and to help overcome the host immune system by cleaving host antibacterial peptides (Fedhila et al., 2002; Park et al., 2008; Raymond et al., 2010). The host gut microbiome is part of the immune response, and bacterial secondary metabolites such as bacteriocins and microcins are able to suppress other pathogenic microbes. In the B. thuringiensis MYBT18247 genome, eleven biosynthetic clusters including bacteriocin, siderophore, nonribosomal peptide synthetase (NRPS) and terpene clusters were detected with antiSMASH 3.0.5 (Weber et al., 2015) (Table 4). Four potential bacteriocins were further validated with BAGEL3 (van Heel et al., 2013) (Table 3). On the chromosome, a putative class III bacteriocin and a linear azol(in)e-containing peptide (LAP) were classified. On each of the plasmids p174778 and p81952, one head-to-tail cyclized class IId bacteriocin (PF09221) was identified. The cyclic bacteriocins are active against a broad spectrum of Gram-positive and Gram-negative bacteria (Finn et al., 2016).
B. thuringiensis MYBT18247 also encodes genes for hemolysins and non-hemolytic enterotoxins (Table 3), which have an important effect during the infection of hosts (Argolo-Filho and Loguercio, 2014; Kim et al., 2015). However, a test on Columbia agar revealed a non-hemolytic phenotype. This might be related to the observation that the corresponding genome locus lacks the hemolysin BL lytic component L2, an essential part of the tripartite toxin. The plasmids encode a number of genes associated with genome fluidity, such as transposases, insertion sequences, transcriptional regulators and recombinases.
The metabolic versatility of B. thuringiensis MYBT18247 was evaluated using BIOLOG Phenotypic Microarrays (PM1-PM2). The strain utilized substrates that feed into glycolysis, the citrate cycle and the pentose phosphate pathway. It metabolizes various carbon sources including sugars, sugar alcohols and sugar acids, which are found in fruits, fungi, and insect or plant compartments. The strain thus appears to be a generalist that is able to adapt to various ecological niches.
The methylome analysis of B. thuringiensis MYBT18247 was performed using the SMRT Portal v2.3.0 analysis platform. Four N6-methyladenine (m6A) motifs were observed. The non-palindromic motif (CRTANNNNNNNRTTNC/GNAAYNNNNNNNTAYG) was found to be methylated in more than 65% of all instances. Additionally, three N4-methylcytosine (m4C) motifs were detected with a methylation grade of 6-21%, and for four putative motifs the type of methylation could not be identified due to insufficient coverage. The DNA methyltransferase recognition motifs derived from the SMRT data are deposited in REBASE at http://rebase.neb.com/rebase/private/pacbio_Liesegang23.html (Roberts et al., 2015).
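The two halves of the non-palindromic motif given above are reverse complements of each other under the IUPAC nucleotide code (R = A/G, Y = C/T, N = any base). The sketch below illustrates this and shows one possible way to locate motif instances in a sequence with a regular expression. It is an illustration only and does not reproduce the SMRT Portal motif finder; the function names are chosen for this example.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (R <-> Y, K <-> M, B <-> V, D <-> H;
# S, W and N are their own complements).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif: str) -> str:
    """Reverse complement of a sequence or motif written in IUPAC codes."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(f"[{IUPAC[code]}]" for code in motif.upper())
    return re.compile(f"(?=({body}))")


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of all motif instances on the given strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


print(reverse_complement("CRTANNNNNNNRTTNC"))  # GNAAYNNNNNNNTAYG
```

Searching a genome with both the motif and its reverse complement covers both strands; the fraction of instances flagged as modified then corresponds to the methylation grade quoted in the text.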
Phylogenetic classification of B. thuringiensis MYBT18247 within the Bacillus cereus sensu lato group was performed by multi-locus sequence typing (MLST) according to Priest et al. (2004) (Fig. 1). The strain clusters within a subgroup comprising B. thuringiensis MYBT18246, B. thuringiensis YBT-1518, B. thuringiensis Bt407, and B. thuringiensis serovar chinensis CT-43. Strikingly, all members of the cluster except B. thuringiensis Bt407, which is an artificially cured laboratory strain, encode at least one nematicidal or insecticidal cry toxin gene.
In summary, B. thuringiensis MYBT18247 encodes a variety of promising genes, including the rare cry6Ba, which show high potential for the development of new biotechnologically relevant nematicidal and insecticidal control agents.
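MLST-based trees of this kind start from pairwise distances between the concatenated, aligned housekeeping gene fragments. The sketch below shows the simplest such measure, the uncorrected p-distance (proportion of differing sites). This is a deliberately simplified stand-in: the tree in this study was built in MEGA7 with Maximum Composite Likelihood distances and Neighbor-Joining, which are not reproduced here, and the toy sequences and strain labels in the usage comment are invented for illustration.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore alignment gaps
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(aligned: dict) -> dict:
    """Pairwise p-distances for a dict of {name: aligned sequence}."""
    names = list(aligned)
    return {
        (p, q): p_distance(aligned[p], aligned[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Usage with invented toy sequences (not real MLST alleles):
# toy = {"strainA": "ACGTACGT", "strainB": "ACGTACGA", "outgroup": "TCGAAC-A"}
# print(distance_matrix(toy))
```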

Nucleotide sequence accession numbers
The whole genome sequence has been deposited at DDBJ/EMBL/GenBank under the accession numbers CP015250.1-CP015256.1. The strain is available from DSMZ (Braunschweig, Germany) under accession 104068.

Acknowledgements
We thank Dr. Andrea Thürmer, Nicole Heyer and Simone Severitt for sequencing support. This project was funded by the German

Table footnotes
a All protein sequences of the genome were scanned with generic HMMs constructed from representative sequences of the known Cry/Cyt/Vip toxins extracted from SwissProt and the Bt toxin database (http://www.btnomenclature.info/). Identified sequences were characterized by a procedure described by the international committee for Cry toxins, verified by Crickmore and deposited at the Bt toxin database.
b antiSMASH 3.0.5 (Weber et al., 2015) and BAGEL3 (van Heel et al., 2013) were used for identification.
c Predicted by Prodigal (Hyatt et al., 2010) and annotated by BLAST+ (Camacho et al., 2009) comparisons to reference proteins from the Prokka genome annotation pipeline (Seemann, 2014). Annotation was primarily derived from Bacillus thuringiensis Bt407 as a high-quality species reference (Sheppard et al., 2013).

Fig. 1. GenBank accession numbers are given in parentheses. Comparison includes representative strains of Bcsl group members (blue). Bacillus subtilis subsp. subtilis str. 168 was used as an outgroup to root the tree. Sequences were aligned using ClustalW 1.6 (Thompson et al., 1994). The phylogenetic tree was constructed using the Neighbor-Joining method (Saitou and Nei, 1987), and evolutionary distances were computed by the Maximum Composite Likelihood method (Tamura et al., 2004) within MEGA7.0 (Kumar et al., 2016). Numbers at the nodes are bootstrap values calculated from 1000 replicates. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)