Identification and characterization of a novel botulinum neurotoxin

Botulinum neurotoxins are known to have seven serotypes (BoNT/A–G). Here we report a new BoNT serotype, tentatively named BoNT/X, which has the lowest sequence identity with other BoNTs and is not recognized by antisera against known BoNTs. Similar to BoNT/B/D/F/G, BoNT/X cleaves vesicle-associated membrane proteins (VAMP) 1, 2 and 3, but at a novel site (Arg66-Ala67 in VAMP2). Remarkably, BoNT/X is the only toxin that also cleaves non-canonical substrates VAMP4, VAMP5 and Ykt6. To validate its activity, a small amount of full-length BoNT/X was assembled by linking two non-toxic fragments using a transpeptidase (sortase). Assembled BoNT/X cleaves VAMP2 and VAMP4 in cultured neurons and causes flaccid paralysis in mice. Thus, BoNT/X is a novel BoNT with a unique substrate profile. Its discovery poses a challenge to develop effective countermeasures, provides a novel tool for studying intracellular membrane trafficking, and presents a new potential therapeutic toxin for modulating secretions in cells.

BoNTs are a family of bacterial toxins, classified as one of the six most dangerous potential bioterrorism agents (Category A and Tier 1 select agent in the United States) 1. They are also widely used to treat a growing list of medical conditions 2,3, including muscle spasms, chronic pain and overactive bladder, as well as in cosmetic applications. There are seven well-established serotypes of BoNTs (BoNT/A-G), traditionally defined based on a lack of cross-neutralization by different antisera raised against each toxin type. All BoNTs share the same structure and function [4][5][6]. They are composed of a light chain (LC, ~50 kDa) and a heavy chain (HC, ~100 kDa) connected by an inter-chain disulfide bond. The HC contains two sub-domains: the C-terminal H C that mediates binding to receptors, and the N-terminal H N that mediates translocation of the LC across endosomal membranes. The LC acts as a protease in neurons to cleave a set of proteins: BoNT/A, C and E cleave at distinct sites on a peripheral membrane protein known as SNAP-25; BoNT/B, D, F and G cleave at different sites on homologous vesicle proteins VAMP1, 2 and 3 (vesicle-associated membrane proteins); and BoNT/C also cleaves the plasma membrane protein syntaxin 1. These proteins are prototypes of the SNARE (soluble NSF attachment protein receptor) protein family, whose members mediate various membrane fusion events in eukaryotic cells 7,8. Cleavage of any one of the three neuronal SNARE proteins blocks fusion of synaptic vesicles to plasma membranes, thus preventing neurotransmitter release from neurons.
Recognizing all distinct serotypes of BoNTs is essential for developing effective countermeasures against this family of toxins. BoNT/A and BoNT/B were first identified in 1919 by Georgina Burke 9. The last of the seven serotypes, BoNT/G, was discovered in 1969 (ref. 10), and no new BoNT serotype has been found for the past four decades. Recent progress in genomic sequencing has revealed multiple subtypes (designated with Arabic numbers, for example, BoNT/A1), which can be recognized by the same antiserum but contain substantial sequence variations [11][12][13]. Furthermore, there are also multiple mosaic toxins. For instance, a 'type H' was reported in 2013 but was later designated as a mosaic toxin, as its LC shares ~80% identity with the LC of a BoNT/F subtype, BoNT/F5, and its H C shares ~84% identity with the H C of BoNT/A1 (refs [14][15][16][17]). Consistently, this toxin can be neutralized by antibodies against BoNT/A (ref. 16).
The genes encoding BoNTs can be on a plasmid, a phage, or the chromosome, indicating that these genes are mobile and capable of horizontal gene transfer [18][19][20][21][22][23]. Some strains contain two or even three different BoNT genes 15,24,25. These strains are usually designated with a capital letter for the toxin type that is expressed at higher levels than the other one, followed by a lower case letter for the second toxin type (for example, BoNT/Af strain). In addition, it has also been reported that some BoNT/A strains contain a complete BoNT/B gene, but only BoNT/A is expressed [26][27][28]. Thus, the BoNT/B gene is considered a silent gene and the strains are known as BoNT/A(B) strains. A recent survey of infant botulism cases reported that ~8% of isolates are BoNT/A(B) strains 29.
Here we searched published genomic sequences and identified a novel BoNT gene encoded on the chromosome of Clostridium botulinum strain 111. This strain was originally identified from an infant botulism patient in Japan in 1996 (ref. 30). The initial characterizations indicated that the toxicity of this strain is due to BoNT/B 30 . Later studies confirmed that this strain expresses a subtype of BoNT/B (BoNT/B2) encoded on a plasmid 31,32 . The sequence of this novel BoNT gene was deposited into the GenBank database in February 2015, as a part of the genomic sequence of C. botulinum 111. We characterized the protein encoded by this gene at functional levels and established it as a new BoNT serotype with a unique substrate profile.

Results
Searching genomic databases revealed a novel BoNT gene.
In an attempt to survey the evolutionary landscape of BoNTs, we performed iterative Hidden Markov model searches of the Uniprot sequence database. Our search identified all known BoNT subtypes and mosaic toxins, as well as the related tetanus neurotoxin ( Fig. 1a; Supplementary Fig. 1). To our surprise, the search revealed a potentially new BoNT, tentatively designated BoNT/X (Fig. 1a, GenBank no.: BAQ12790.1), from the recently reported genomic sequence of C. botulinum strain 111. BoNT/X showed the least protein sequence identity with the other BoNTs in pairwise comparisons (Fig. 1b). Furthermore, the low sequence similarity is evenly distributed along the entire BoNT/X sequence (Fig. 1c), indicating that it is not a mosaic toxin. Despite this low sequence identity, the overall domain arrangement of BoNTs is conserved in BoNT/X (Fig. 1c), including a zinc-dependent protease motif HEXXH (residues 227-231, HELVH) in the LC (ref. 33), and a SXWY motif in the H C (residues 1,274-1,277, SAWY), which recognizes the lipid receptor gangliosides 34 .
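The two conserved motifs named above (HEXXH in the LC and SXWY in the H C) can be located in any protein sequence with a simple pattern scan. The following is a minimal sketch, not the authors' analysis pipeline; the function name `find_motifs` and the toy sequence are illustrative assumptions, and the toy sequence is not the BoNT/X sequence.

```python
import re

# Motif patterns taken from the text; "X" (any residue) becomes "." in a regex.
# A lookahead is used so that overlapping occurrences are also reported.
MOTIFS = {
    "HEXXH": re.compile(r"(?=(HE..H))"),  # zinc-dependent protease motif
    "SXWY": re.compile(r"(?=(S.WY))"),    # ganglioside-binding motif
}


def find_motifs(sequence):
    """Return {motif name: [(start, end, matched text), ...]}.

    Positions are 1-based and inclusive, matching the residue numbering
    convention used in the paper (for example, residues 227-231).
    """
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (m.start(1) + 1, m.start(1) + len(m.group(1)), m.group(1))
            for m in pattern.finditer(seq)
        ]
    return hits


# Hypothetical toy sequence for illustration only.
toy_sequence = "MKTAHELVHGGSAWYKL"
print(find_motifs(toy_sequence))
```

Running the same scan on the full BoNT/X protein sequence (GenBank BAQ12790.1) would be expected to report the positions quoted in the text, although that has not been checked here.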
Similar to the other BoNTs, the BoNT/X gene is located in a gene cluster 23 . All seven established BoNTs are co-expressed with another 150 kDa protein known as NTNHA (non-toxic non-hemagglutinin protein), which forms a pH-dependent complex with BoNTs and protects them from proteases in the gastrointestinal tract 35 . The BoNT/X gene is also preceded by a potential NTNHA gene (Fig. 1d). Besides BoNT and NTNHA, a typical BoNT gene cluster contains genes encoding one of the two types of accessory proteins: (1) the HA cluster encoding three conserved proteins HA17, HA33 and HA70, which form a complex with BoNT/NTNHA and facilitate absorption of toxins across the intestinal epithelial barrier [36][37][38] ; or (2) the OrfX cluster encoding conserved OrfX1, OrfX2, OrfX3 and P47 proteins with unknown function 23 . The BoNT/X gene is located in an OrfX gene cluster, as are BoNT/E, F and members of BoNT/A. Interestingly, the BoNT/X cluster has two unique features ( Fig. 1d): (1) there is an additional OrfX2 gene that does not exist in any other BoNT clusters (we designated it OrfX2b); (2) the reading frame of OrfX genes is usually opposite to BoNT/NTNHA genes, but it has the same direction as the BoNT/X gene in the BoNT/X cluster (Fig. 1d). These findings suggest that BoNT/X is a unique branch of the BoNT family.
The LC of BoNT/X cleaves VAMP2 at a novel site. To characterize BoNT/X, we first focused on its LC (X-LC, residues 1-439) and produced it as a His6-tagged protein in Escherichia coli. LCs of BoNT/A (A-LC) and BoNT/B (B-LC) were produced and assayed in parallel as controls. Incubation of X-LC with rat brain detergent extracts (BDE) did not affect syntaxin 1 or SNAP-25, but abolished VAMP2 immunoblot signals (Fig. 2a). LCs of BoNTs are zinc-dependent proteases 33 . As expected, EDTA prevented cleavage of SNARE proteins by X-, A-and B-LCs (Fig. 2a). Furthermore, incubation of X-LC with the purified recombinant cytosolic domain of VAMP2 (residues 1-93) converted VAMP2 into two lower-molecular-weight bands (Fig. 2b), confirming that X-LC cleaves VAMP2.
To identify the cleavage site, we analysed the VAMP2 (1-93) protein, with or without pre-incubation with X-LC, by liquid chromatography-tandem mass spectrometry (LC-MS/MS, Fig. 2c-e). A single dominant peptide peak appeared after incubation with X-LC (Fig. 2c,e; Supplementary Fig. 2). Its molecular weight is 3,081.7, which fits only the peptide sequence of A67-L93 of VAMP2 (Fig. 2c,e). Consistently, another fragment from the beginning of the His6-tag to residue R66 of VAMP2 was also detected (Fig. 2d). To further confirm this finding, we repeated the assay with a different VAMP2 fragment: glutathione S-transferase (GST) tagged VAMP2 (33-86) ( Supplementary  Fig. 3). Incubation with X-LC generated a single dominant peptide peak with a molecular weight of 2,063.1, which fits only A67-R86 of VAMP2 ( Supplementary Fig. 3). Together, these results demonstrate that X-LC has a single cleavage site on VAMP2 between R66 and A67.
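The reasoning above, that an observed mass "fits only" one fragment of the substrate, amounts to enumerating candidate fragments and comparing their calculated masses with the measured value. The sketch below illustrates that idea under stated assumptions: it uses standard monoisotopic residue masses (the text does not say whether the reported values are monoisotopic or average), an arbitrary tolerance, and a toy sequence rather than VAMP2. The names `peptide_mass` and `matching_fragments` are hypothetical.

```python
# Standard monoisotopic residue masses in Da for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def matching_fragments(sequence, observed_mass, tolerance=0.5, first_residue=1):
    """List every contiguous fragment whose mass matches the observed mass.

    Returns (start, end, fragment) tuples; positions are inclusive and count
    from `first_residue`, so a construct that begins at residue 33 can keep
    the numbering of the full-length protein.
    """
    matches = []
    for i in range(len(sequence)):
        running = WATER
        for j in range(i, len(sequence)):
            running += RESIDUE_MASS[sequence[j]]
            if abs(running - observed_mass) <= tolerance:
                matches.append((i + first_residue, j + first_residue, sequence[i:j + 1]))
    return matches


# Toy example: which fragment of "GASPVR" has the mass of "SPV"?
target = peptide_mass("SPV")
print(matching_fragments("GASPVR", target, tolerance=0.01))
```

A single match, as in the toy case, corresponds to the situation described in the text where the measured mass is consistent with only one fragment.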
R66-A67 is a novel cleavage site on VAMP2, distinct from all established target sites of BoNTs (Fig. 2f). It is also the only BoNT cleavage site located within a region previously known as the SNARE motif (Fig. 2f, shaded regions) 39 . The VAMP protein family includes VAMP1, 2, 3, 4, 5, 7 and 8, as well as related Sec22b and Ykt6. R66-A67 is conserved in VAMP1 and 3, which are highly homologous to VAMP2. To validate the specificity of X-LC, we expressed HA-tagged VAMP1, 3, 7, 8 and Myc-tagged Sec22b and Ykt6 in HEK293 cells via transient transfection. Cell lysates were incubated with X-LC. Both VAMP1 and 3 were cleaved by X-LC, whereas VAMP7, VAMP8 and Sec22b were resistant to X-LC (Fig. 2g).
BoNT/X cleaves VAMP4, VAMP5 and Ykt6. Unexpectedly, Ykt6 was also cleaved by X-LC (Fig. 2g). This finding was confirmed using a purified GST-tagged Ykt6 fragment, which shifted to a lower-molecular-weight band after incubation with X-LC (Fig. 2h). The cleavage site was determined to be K173-S174 by mass spectrometry analysis of the intact Ykt6 versus Ykt6 cleaved by X-LC ( Supplementary Fig. 4). This site is homologous to the cleavage site of BoNT/X on VAMP2 (Fig. 2f). Among VAMP family of proteins, VAMP4 contains the same pair of residues (K87-S88) at this site as Ykt6. We found that X-LC cleaved both purified GST-tagged cytoplasmic domain of VAMP4 (Fig. 2i), as well as native VAMP4 in BDE (Fig. 2j). As a control, Sec22b was not cleaved by X-LC in BDE. In addition, the GST-tagged cytoplasmic domain of VAMP5 was also cleaved (Fig. 2i). The cleavage sites were determined by mass spectrometry analysis to be K87-S88 in VAMP4 and R40-S41 in VAMP5 ( Supplementary  Fig. 5). Both sites are homologous to the cleavage site of BoNT/X on VAMP2 (Fig. 2f), demonstrating that the location of the cleavage site is conserved across different VAMPs. The ability of X-LC to cleave VAMP4, VAMP5 and Ykt6 is highly unusual, as their sequences are substantially different from VAMP1/2/3. BoNT/X is the first and the only BoNT known that can cleave VAMPs beyond the canonical targets VAMP1, 2 and 3 (ref. 40).
Proteolytic activation of BoNT/X. We next examined the linker region between the LC and the HC, which must be cleaved by bacterial or host proteases to convert the toxin to an 'active' di-chain form. We produced a recombinant X-LC-H N fragment (residues 1-891) in E. coli and subjected it to limited proteolysis by endoproteinase Lys-C. Samples were analysed using Tandem Mass Tag (TMT) labelling and tandem mass spectrometry. TMT labels free N-termini (and lysines). Limited proteolysis by Lys-C [...]
Figure 1 (legend, partial). [...] BoNT/X and other BoNTs/TeNT is evenly distributed along the entire BoNT/X sequence. The X axis represents the query sequence position at the center of a 100-amino-acid moving sequence-comparison window. The Y axis shows the percentage of identity between that sequence window and each of the aligned background sequences. The two bars at the top of the graph illustrate the best matching sequence (lower bar) and whether the best match is significantly separated from the second-best match (upper bar). (d) A schematic drawing of the orf gene cluster that hosts the BoNT/X gene (upper panel), which has two distinct features compared with other known orfX clusters (middle and lower panels): (1) there is an additional orfX2 protein (designated orfX2b) located next to the BoNT/X gene; (2) the reading frame of orfX genes has the same direction as the BoNT/X gene.
Figure 2 (legend, partial). [...] (1-70) were incubated with X-LC (100 nM). Samples were analysed by SDS-PAGE and Coomassie Blue staining. X-LC cleaved both VAMP4 and VAMP5. We note that VAMP5 protein contains a contaminant band that runs close to the cleavage product. (j) Experiments were carried out as described in a, except that VAMP4 and Sec22b were detected. Synaptotagmin I (Syt I) is a loading control. X-LC cleaved native VAMP4 in BDE. One of two (b,g,j) or three (a,h,i) independent experiments is shown.
appears to be more potent than activated LC-H N of BoNT/A (A-LC-H N ) and BoNT/B (B-LC-H N ), which did not show any detectable cleavage of their substrates under the same assay conditions (Fig. 3b).
The inter-chain disulfide bond in BoNT/X. Like other BoNTs, the linker region of BoNT/X contains two conserved cysteines, but there is also an additional cysteine (C461) unique to BoNT/X (Fig. 3a). To determine the cysteine residues that form the essential inter-chain disulfide bond, we generated three X-LC-H N mutants, each with one of the three cysteine residues mutated (C423S, C461S and C467S). These mutants, as well as wild-type (WT) X-LC-H N , were subjected to limited proteolysis with Lys-C and then analysed via SDS-PAGE and Coomassie Blue staining, with or without the reducing agent dithiothreitol (DTT; Fig. 3c).
Mutating the only cysteine on the LC (C423S) is expected to abolish the inter-chain disulfide bond. Consistently, C423S mutant separated into two ~50 kDa bands without DTT. In contrast, both C461S and C467S mutants showed as a single band at 100 kDa in the absence of DTT and separated into two ~50 kDa bands in the presence of DTT. These results suggested that C423 on the LC can form the inter-chain disulfide bond with either C461 or C467 on the HC. We also found that Lys-C treatment degraded a significant portion of C423S mutant as compared with C461S or C467S mutants (Fig. 3c, +DTT), suggesting that losing the inter-chain disulfide bond makes the molecule more susceptible to proteases. We noticed that a portion of WT X-LC-H N formed aggregates at the top of the SDS-PAGE gel (Fig. 3c, marked by an asterisk). These aggregates disappeared in the presence of DTT. C423/C461/C467 are the only three cysteines in the X-LC-H N ; mutating any one of them abolished formation of aggregates (Fig. 3c, −DTT), suggesting that these aggregates are formed by inter-molecular disulfide bonds due to the existence of an extra cysteine in the linker region. Interestingly, the majority of activated WT X-LC-H N separated into two ~50 kDa bands without DTT (Fig. 3c), which is similar to C423S mutant. On the other hand, WT X-LC-H N did not show increased degradation by Lys-C compared with C423S mutant (Fig. 3c, +DTT). One possible explanation is that WT X-LC-H N contains an inter-chain disulfide bond under native conditions, but this bond can rearrange to intra-chain C461-C467 pair under denaturing conditions in the SDS buffer. This phenomenon is known as disulfide bond shuffling, which often occurs among adjacent cysteines. To test this hypothesis, we utilized an alkylating reagent, N-Ethylmaleimide (NEM), which permanently blocks free cysteines and prevents disulfide bond shuffling. As shown in Fig. 3d, WT X-LC-H N pretreated with NEM showed as a single band at 100 kDa in the absence of DTT, and separated into two ~50 kDa bands in the presence of DTT. These results confirm that WT X-LC-H N contains mainly an inter-chain disulfide bond, but it is susceptible to disulfide bond shuffling due to an extra cysteine in the linker region (Fig. 3e). We further examined the activity of the three X-LC-H N cysteine mutants on cultured neurons. As expected, C423S mutant was inactive, whereas C461S and C467S mutants both showed similar levels of activity as WT X-LC-H N (Fig. 3f). These results confirm that the inter-chain disulfide bond is critical for the activity of BoNT/X.
Figure 3 (legend, partial). [...] except that neurons were exposed to WT or X-LC-H N mutants. C423S mutation abolished the activity of X-LC-H N , whereas mutating C461 or C467 did not affect the activity of X-LC-H N . These results confirmed that the inter-chain disulfide bond is essential for the activity of X-LC-H N , and this inter-chain disulfide bond can be formed via either C423-C461 or C423-C467. One of two (b) or three (b,c,f) independent experiments is shown.
Generating full-length BoNT/X via sortase-mediated ligation. We then sought to determine whether full-length BoNT/X is a functional toxin. As no antisera against BoNT/X are available, we decided to avoid generating the full-length active toxin gene. Instead, we developed an approach to generate a limited amount of full-length BoNTs in test tubes by enzymatic ligation of two non-toxic fragments of BoNTs. This method utilizes a transpeptidase known as sortase 42,43, which recognizes the peptide motif LPXTG, cleaves between T-G, and concurrently forms a new peptide bond with other proteins/peptides containing N-terminal glycine (Fig. 4a). We produced two non-toxic fragments of BoNT/X: (1) LC-H N with a LPETGG motif and a His6-tag fused to the C terminus; and (2) the H C of BoNT/X (X-H C ) with a GST tag, thrombin cleavage site, and an additional glycine residue at its N terminus. Cutting by thrombin releases X-H C with a free glycine at its N terminus. Incubation of these two fragments with sortase generated a small amount of ~150 kDa full-length BoNT/X (X-FL, Fig. 4a,b). We note that X-H C showed poor solubility and a strong tendency towards aggregation, which might be the reason for the low ligation efficiency (Fig. 4b). In contrast, ligation of X-LC-H N with the H C of BoNT/A (A-H C ) achieved a better efficiency, with the majority of X-LC-H N ligated into an XA chimeric toxin (Supplementary Fig. 6a). To ensure biosafety, the amount of precursor fragments in the reaction is strictly limited to generate the minimum amount of ligated toxin necessary for functional assays.
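At the sequence level, the ligation described above can be pictured as a string operation: the donor is cut between T and G of its C-terminal LPXTG motif, the tag downstream of the cut is released, and an acceptor beginning with glycine is joined to the new C terminus. The sketch below models only this bookkeeping, not the enzymology or reaction yields; the function name `sortase_ligate` and the short toy sequences are hypothetical, and using the last LPXTG occurrence is a simplifying assumption for a C-terminally tagged donor.

```python
import re

# LPXTG recognition motif from the text; "X" is any residue.
SORTASE_MOTIF = re.compile(r"LP.TG")


def sortase_ligate(donor, acceptor):
    """Return (ligated product, released fragment) for a string-level model.

    donor:    sequence carrying an LPXTG motif near its C terminus
              (for example, ...LPETGG followed by a His6-tag).
    acceptor: sequence that must start with glycine.
    """
    if not acceptor.startswith("G"):
        raise ValueError("acceptor needs an N-terminal glycine")
    sites = list(SORTASE_MOTIF.finditer(donor))
    if not sites:
        raise ValueError("donor lacks an LPXTG motif")
    # Assumption: the relevant motif is the last one (the engineered C-terminal tag).
    cut = sites[-1].end() - 1  # index of the G; the bond between T and G is cleaved
    product = donor[:cut] + acceptor
    released = donor[cut:]
    return product, released


# Hypothetical toy fragments, not real BoNT/X sequences.
toy_donor = "MKAAALPETGGHHHHHH"   # "LC-H N"-like part + LPETGG + His6
toy_acceptor = "GSSYW"            # "H C"-like part with an N-terminal glycine
print(sortase_ligate(toy_donor, toy_acceptor))
```

Note that the product retains an LPETG sequence at the junction, which corresponds to the additional linker between the H N and H C in ligated toxins.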
We first analysed the activity of ligated BoNT/X using cultured rat cortical neurons. Neurons were exposed to the sortase ligation mixture and control mixtures in culture medium. As shown in Fig. 4c, X-LC-H N alone cleaved some VAMP2 due to its high concentration in the reaction mixture. Mixing X-H C with X-LC-H N without sortase slightly enhanced cleavage of VAMP2 compared with X-LC-H N alone, suggesting that X-H C might be associated with X-LC-H N via non-covalent interactions. This interaction appears to be specific, as mixing A-H C with X-LC-H N did not enhance cleavage of VAMP2 in neurons (Supplementary Fig. 6b). Ligating X-LC-H N with X-H C by sortase clearly enhanced cleavage of VAMP2 compared with the mixture of X-LC-H N and X-H C without sortase (Fig. 4c). These results demonstrated that the X-H C is functional for targeting cells and that ligated full-length BoNT/X entered neurons and cleaved VAMP2. Similarly, ligated XA also entered neurons and cleaved VAMP2 (Supplementary Fig. 6b).
Figure 4 (legend, partial). [...] BoNT/X is composed of X-LC-H N and X-H C at 1:1 molar ratio. These antisera recognized their corresponding target toxins, yet none recognized BoNT/X. The antisera against BoNT/DC and BoNT/C cross-react, as these two toxins share a high degree of similarity within their H C domains. (e) Cultured rat cortical neurons were exposed to ligated X-FL in culture medium for 12 h, with or without two combinations of anti-sera. Ab1: trivalent anti-BoNT/A/B/E, anti-BoNT/C and anti-BoNT/F. Ab2: anti-BoNT/G and anti-BoNT/D. The trivalent anti-BoNT/A/B/E was used at 1:50 dilution. All other anti-sera were used at 1:100 dilution. None of the antisera affected the cleavage of VAMP2 and VAMP4 by X-FL. The specificity and potency of these antisera were validated for their ability to neutralize target serotypes in the same assay as described in Supplementary Fig. 7. (f) X-FL linked by sortase reaction (0.5 μg) was injected into the gastrocnemius muscles of the right hind limb of mice (n = 4). The injected limb developed typical flaccid paralysis, and the toes failed to spread within 12 h. The left limb was not injected with toxins, serving as a control. (g) Full-length inactive form of BoNT/X (BoNT/X RY ) was purified as a His6-tagged recombinant protein in E. coli. Further purified BoNT/X RY is shown in Supplementary Fig. 8b. One of two (e) or three (c,d) independent experiments is shown.
BoNT/X was not recognized by antisera against known BoNTs. We next carried out dot blot assays using antisera raised against known BoNTs, including all seven serotypes as well as one mosaic toxin (BoNT/DC), to confirm that BoNT/X is serologically unique. Four horse antisera were utilized (trivalent anti-BoNT/A, B and E, anti-BoNT/C, anti-BoNT/DC, and anti-BoNT/F) as well as two goat antisera (anti-BoNT/G and anti-BoNT/D). The specificity and potency of these antisera were first validated by analysing their ability to neutralize BoNTs on cultured neurons. As expected, all antisera neutralized their target BoNTs, without affecting the activity of a different serotype ( Supplementary  Fig. 7). We found that these antisera recognized their corresponding BoNTs in the dot blot assay, yet none recognized BoNT/X (Fig. 4d). We further analysed whether the toxicity of BoNT/X on neurons can be neutralized by these antisera. X-FL generated by sortase-mediated ligation was first activated with limited proteolysis using trypsin. We used trypsin to activate X-FL instead of Lys-C for functional assays, as trypsin allows us to stop proteolysis using trypsin inhibitors. Activated X-FL entered cultured rat cortical neurons and cleaved both VAMP2 and VAMP4 in a concentration-dependent manner (Fig. 4e). Combinations of antisera against known BoNTs (Ab1 (horse antisera): trivalent anti-BoNT/A, B and E, anti-BoNT/C, and anti-BoNT/F; Ab2 (goat antisera): anti-BoNT/G and anti-BoNT/D) did not affect the activity of ligated X-FL, as evidenced by similar degrees of VAMP2 and VAMP4 cleavage in the presence of these antisera (Fig. 4e). These results confirmed that BoNT/X is a new BoNT serotype.
BoNT/X induced flaccid paralysis in vivo in mice. We next sought to determine whether BoNT/X is active in vivo using a well-established non-lethal assay in mice, known as the Digit Abduction Score (DAS) assay, which measures local muscle paralysis following injection of BoNTs into mouse hind limb muscles 44. BoNTs cause flaccid paralysis of limb muscles, which is manifested as the failure to spread the toes in response to a startle stimulus. We injected ligated X-FL (0.5 μg, activated by trypsin treatment) into the gastrocnemius muscles of the right hind limb in mice, which induced typical flaccid paralysis and the failure of toes to spread (Fig. 4f), indicating that BoNT/X is capable of causing flaccid paralysis in vivo. We note that the potency of ligated X-FL appears to be much lower than other BoNTs in this assay. To further confirm the low toxicity of ligated X-FL, we injected mice with 1 μg of ligated X-FL intraperitoneally (n = 3). No mice showed any systemic effects and all survived at this dose. Thus, ligated X-FL has a rather low toxicity in vivo in mice compared with other native BoNTs, which usually have lethal doses at low picogram levels per mouse.
Full-length inactive BoNT/X. Finally, we developed an inactive mutant of BoNT/X as a potential reagent for generating neutralizing antibodies. Mutations at two residues (R362A/Y365F) in BoNT/A inactivate the protease activity of the LC and abolish the toxicity of BoNT/A in vivo 45,46. These two residues are conserved in BoNT/X. We introduced the corresponding mutations (R360A/Y363F) in BoNT/X and generated a full-length inactive form, designated as BoNT/X RY . As shown in Fig. 4g, BoNT/X RY was purified as a His6-tagged protein in E. coli, and it had no activity on cultured neurons (Supplementary Fig. 8a). Furthermore, intraperitoneal injection of mice with 30 μg BoNT/X RY (activated by trypsin treatment) did not cause any adverse effects (n = 5), demonstrating that it is not toxic in vivo. A substantial portion of BoNT/X RY formed aggregates at the top of the SDS-PAGE gel (Fig. 4g). Adding DTT reduced these aggregates to monomeric BoNT/X RY (Fig. 4g). Thus, full-length BoNT/X is susceptible to forming intermolecular disulfide bonds. Nevertheless, the monomeric form of BoNT/X can be purified and is stable in solution (Fig. 4g). Furthermore, we developed a scale-up purification protocol, which generated BoNT/X RY with a yield of ~3 mg per liter of culture and ~90% purity (Supplementary Fig. 8b). Highly purified BoNT/X RY remained stable in solution up to 10 mg ml−1 in the presence of reducing agent. This atoxic BoNT/X RY will be a valuable reagent for generating neutralizing antibodies.

Discussion
BoNT/X is the first serotype of BoNTs identified by genomic sequencing and bioinformatics approaches. It remains unknown whether BoNT/X is ever produced in C. botulinum strain 111. BoNT/X could be a silent gene, or it may not be expressed at detectable toxicity levels under culture conditions in the lab. Thus, this toxin was revealed only by sequencing C. botulinum 111. This illustrates the importance of genomic sequencing and bioinformatics approaches for understanding microbial virulence factors. Whether BoNT/X could be expressed and exhibit toxicity under certain environmental conditions remains an intriguing question.
A remarkable feature of BoNT/X is its unique ability to cleave VAMP4 and Ykt6. VAMP4 is widely expressed and is known to mediate vesicle fusion between the trans-Golgi network (TGN) and endosomes, as well as homotypic fusion of endosomes 47,48. Ykt6 is an atypical SNARE without a transmembrane domain 49. It is anchored to membranes via lipidation, which allows dynamic regulation of its membrane association. Ykt6 is an essential protein in yeast and is implicated in multiple membrane fusion events including ER-Golgi, intra-Golgi, endosome-Golgi-vacuolar, and autophagosome formation. Its function in mammalian cells remains to be established. BoNT/X is the first and only BoNT to cleave these SNAREs that mediate various intracellular membrane-trafficking events.
Interestingly, both VAMP4 and Ykt6 are enriched in neurons. Recent studies suggested that VAMP4 contributes to asynchronous synaptic vesicle exocytosis, enlargeosome exocytosis and activity-dependent bulk endocytosis (ADBE) in neurons [50][51][52]. The role of Ykt6 in neurons remains to be established, but it has been shown to suppress the toxicity of α-synuclein in Parkinson's disease models 53,54. The other substrate of BoNT/X, VAMP5, is mainly expressed in muscle cells and its function remains to be established 55. BoNT/X will be a useful tool to investigate the function of VAMP4, Ykt6 and VAMP5, as well as related membrane trafficking events. In addition, because VAMP4 has been implicated in granule release in immune cells 56, BoNT/X may have the potential to modulate inflammatory secretion in immune cells.
The X-LC-H N fragment showed a higher level of activity in neurons than either A-LC-H N or B-LC-H N , suggesting that its membrane translocation and/or protease activity might be more efficient than the corresponding fragments in BoNT/A and BoNT/B. X-H C is functional for targeting cells, as its presence enhanced cleavage of VAMP2 in neurons over LC-H N alone (Fig. 4c). When present without the translocation and LC domains, X-H C is prone to aggregation. This solubility issue is likely due to separation of X-H C from X-LC-H N , as full-length BoNT/X RY remains stable at high concentrations. The X-FL generated by sortase-mediated ligation has a rather low toxicity in vivo in mice. It remains unknown whether this low in vivo toxicity was intrinsic to BoNT/X. It is also possible that the sortase linking method resulted in an attenuated toxin, as the H C and LC-H N folded separately and there is an additional linker between the H N and H C in ligated toxins. Nevertheless, X-FL produced by sortase-mediated ligation is active on neurons and induced typical flaccid paralysis when injected locally in mice, demonstrating that BoNT/X is a functional toxin (Fig. 4f). It will be necessary to produce native BoNT/X to characterize its in vivo potency and determine its biosafety risk. It will be important to generate neutralizing antisera against BoNT/X before producing any native toxin.
Introducing a full-length active toxin gene into any expression system/organism is always a significant biosafety concern. Sortase-mediated ligation assembles a small quantity of full-length toxin from two complementary and non-toxic fragments expressed and purified individually. The amount of the precursor fragments in the reaction can be strictly controlled, so the amount of ligated toxin is precisely limited to ensure biosafety. The ligated toxin should possess the same mode of action as native toxins, but may exhibit lower toxicity in vivo, possibly due to the addition of the sortase linker and/or reduced compactness of the molecule. Thus, the ligated toxin could be used for functional studies, but may not be suitable for estimating the lethal dose of native toxin. On the other hand, this reduced toxicity from the sortase linking method could be an advantage for creating attenuated toxins for research use. It might also be possible to deliberately attenuate the toxicity of ligated toxins in vivo by including additional linkers, such as peptide sequences that are sensitive to serum proteases, to further mitigate biosafety concerns.
cDNA and constructs. The cDNAs encoding X-LC (residues 1-439), X-H C (residues 893-1,306), A-LC-H N (residues 1-874, GenBank no. M30196), and B-LC-H N (residues 1-860, GenBank no. AB232927) were synthesized by GenScript (New Brunswick, NJ, USA). The cDNA encoding X-H N was generated in-house using the Gibson assembly method. X-LC, A-LC (residues 1-425) and B-LC (residues 1-439) were cloned into pET28 vectors with His6-tag on their N-termini. X-H C and A-H C (residues 875-1,297, GenBank No. AF488749) were cloned into pGEX4T to express as GST-tagged proteins. One extra glycine was introduced into the N terminus of X-H C to increase the sortase ligation efficiency. X-LC-H N , A-LC-H N and B-LC-H N were cloned into pET22b vector, with the peptide sequence LPETGG fused to their C-termini, followed by a His6-tag, and purified as His6-tagged proteins. The cDNA encoding rat VAMP2 was generously provided by E. Chapman (Madison, WI, USA). VAMP2 (1-93) was cloned into pET28 vector with a His6-tag on the N terminus. VAMP2  was cloned into pGEX4T vector and expressed as a GST-tagged protein. The cDNAs encoding mouse VAMP1, VAMP3, VAMP4, human VAMP5, rat VAMP7 and VAMP8 were generously provided by C. Hu (Louisville, KY, USA). Full-length VAMP1, 3, 7 and 8 were cloned into modified pcDNA3.1 vectors, with an HA tag fused to their C termini. Constructs expressing full-length rat Ykt6 and mouse Sec22b, both in pcDNA3.1 vector with a Myc tag fused to the N terminus of the protein, were generously provided by J. Hay (Missoula, MT, USA). The cytoplasmic domains of VAMP4 (1-115) and VAMP5  were cloned between BamHI/XhoI sites in pGEX4T and expressed as GST-tagged proteins. We note that there are seven extra residues from the pGEX4T vector fused to the C-termini of VAMP4 and VAMP5 fragments. Ykt6 fragment (residues 1-192) was also cloned into pGEX4T and expressed as a GST-tagged protein. The construct encoding His6-tagged sortase (SrtA*) was generously provided by B. Pentelute (Boston, MA, USA) 43.
Bioinformatics. The UniProt database was searched with Jackhmmer on the HMMER web server, using a BoNT/A1 sequence as the seed (UniProt accession number A5HZZ9), until convergence. Returned sequences were aligned with Clustal Omega, and a NeighborNet phylogenetic network was estimated with SplitsTree.
Protein purification. E. coli BL21 (DE3) was utilized for protein expression. In general, induction of expression was carried out with 0.1 mM IPTG at 22°C overnight. Bacterial pellets were disrupted in lysis buffer (50 mM Tris pH 7.5, 150 mM NaCl) by sonication, and supernatants were collected after centrifugation at 20,000 g for 30 min at 4°C. Protein purification was carried out using AKTA Prime FPLC system (GE), and purified proteins were further desalted with a PD-10 column (GE, 17-0851-01).
Large-scale production and purification of BoNT/X RY . cDNA encoding BoNT/X RY was assembled in-house from mutated X-LC (R360A/Y363F), X-H N and X-H C . It was cloned into a pET22b vector, with the His6-tag on its C terminus. The corresponding plasmid was transformed into E. coli BL21 (DE3). Cultures for expression were first grown using a LEX Bioreactor (Epiphyte3, Ontario, Canada) at 37°C in 1.5 l of medium until OD600 reached 0.8. The temperature was then reduced to 18°C for induction of expression with 1 mM IPTG, and cultures were grown for 16-17 h. Bacteria were harvested, re-suspended in HEPES buffer (50 mM HEPES pH 7.2, 500 mM NaCl, 25 mM imidazole, 5% glycerol, 2 mM TCEP), and lysed with an Emulsiflex-C3 (Avestin, Mannheim, Germany) at 20,000 p.s.i. Lysates were ultra-centrifuged at 200,000 g for 45 min. Supernatant was loaded onto a 15 ml Ni-NTA agarose column and washed with wash buffer (50 mM HEPES pH 7.2, 500 mM NaCl, 100 mM imidazole, 5% glycerol, 1 mM TCEP). Proteins were eluted with elution buffer (50 mM HEPES pH 7.2, 500 mM NaCl, 250 mM imidazole, 5% glycerol, 1 mM TCEP) and then dialyzed overnight in 50 mM HEPES, 500 mM NaCl, 5% glycerol, and 0.5 mM TCEP. Dialysate was concentrated using a Vivaspin concentrator before being loaded on a Superdex200-16/60 column pre-equilibrated in the same buffer used for dialysis. The elution peak corresponding to BoNT/X was collected and concentrated to ~10 mg ml⁻¹. The sample was aliquoted and flash-frozen in liquid nitrogen for storage at −80°C.
Cleavage of SNARE proteins in rat BDE. Rat brain was homogenized in 15 ml 320 mM sucrose buffer, followed by centrifugation at 5,000 r.p.m. for 2 min at 4°C. Supernatants were collected and centrifuged at 11,000 r.p.m. for 12 min. The pellet was collected and solubilized for 30 min in 15 ml Tris-buffered saline (TBS: 20 mM Tris, 150 mM NaCl) plus 2% of Triton X-100 and a cocktail of protease inhibitors (Roche, CA). Samples were subsequently centrifuged at 17,000 r.p.m. for 20 min to remove insoluble materials. The final BDE concentration was ~2 mg ml⁻¹. BDE (60 µl) was incubated with X-LC (0.5 µM), A-LC (1 µM), or B-LC (1 µM) for 1 h at 37°C, and then analysed by immunoblot using the enhanced chemiluminescence (ECL) method (Pierce). As controls, LCs were pre-incubated with 20 mM EDTA for 20 min at room temperature before adding to BDE. Full-blot scans are shown in Supplementary Fig. 9. (1-70) and Ykt6  were expressed and purified as GST-tagged proteins. These proteins (0.3 mg ml⁻¹) were incubated with 0.1 µM X-LC in TBS buffer at 37°C. Samples were either analysed by SDS-PAGE gels and Coomassie Blue staining, or subjected to mass spectrometry analysis.
Identification of cleavage sites in VAMPs by LC-MS/MS. Samples were analysed at Taplin Biological Mass Spectrometry Core Facility at Harvard Medical School. For VAMP2, whole-protein samples were loaded onto a 100 µm internal diameter C18 reverse-phase HPLC column packed with 3 cm of beads off-line using a pressure cell. The column was re-attached to an Accela 600 Pump (Thermo Fisher Scientific). A rapid gradient of increasing acetonitrile was used to elute the protein/peptide from the HPLC column. As peptides eluted, they were subjected to electrospray ionization and then placed into an LTQ Orbitrap Velos Pro ion-trap mass spectrometer to acquire a high-resolution FTMS scan at 60,000 resolution, a second scan at low resolution in the ion trap, and a final scan to perform data-dependent MS/MS. The charge state envelopes were de-convoluted manually to obtain mono-isotopic masses when possible or average masses for the proteins. Peptide and protein identities were determined by matching the acquired fragmentation patterns against protein databases using the software program Sequest (Thermo Fisher Scientific). All databases include a reversed version of all the sequences, and the data were filtered to 1-2% peptide false-discovery rate.
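The reversed-sequence (decoy) filtering mentioned above can be illustrated with a minimal Python sketch. It is not the Sequest pipeline used in this study; it only shows the common target-decoy estimate, in which the false-discovery rate at a score cutoff is approximated by the number of decoy hits divided by the number of target hits passing that cutoff. The function names, the `(score, is_decoy)` tuple layout and the example scores are all hypothetical.

```python
from typing import List, Optional, Tuple

# Each peptide-spectrum match is (score, is_decoy); the layout is hypothetical.
PSM = Tuple[float, bool]


def decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate FDR at a score cutoff as (#decoy hits) / (#target hits)."""
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


def threshold_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    best = None
    for score in sorted({s for s, _ in psms}, reverse=True):
        if decoy_fdr(psms, score) <= max_fdr:
            best = score  # keep lowering the cutoff while the FDR is acceptable
    return best


# Made-up scores for illustration only.
example = [(5.0, False), (4.5, False), (4.0, False),
           (3.5, True), (3.0, False), (2.0, True)]
cutoff = threshold_for_fdr(example, max_fdr=0.30)
```

With real data, `max_fdr` would be set to 0.01-0.02 to match the 1-2% peptide-level filter stated above.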
For Ykt6, VAMP4 and VAMP5, samples were first separated on SDS-PAGE. Protein bands were excised and cut into ~1 mm³ pieces. Gel pieces were incubated with 50 mM ammonium bicarbonate solution containing 12.5 ng µl⁻¹ modified sequencing-grade chymotrypsin (Roche Diagnostics). Samples were digested overnight at room temperature. Peptides were then extracted and separated with reverse-phase HPLC. As peptides were eluted, they were subjected to electrospray ionization and transferred into an LTQ Orbitrap Velos Pro ion-trap mass spectrometer (Thermo Fisher Scientific). Eluted peptides were detected, isolated and fragmented to produce a tandem mass spectrum of specific fragment ions for each peptide.
Identification of the protease cleavage site between LC and H N . His6-tagged recombinant X-LC-H N fragment (residues 1-891) was purified in E. coli and subjected to limited proteolysis by endoproteinase Lys-C (Sigma P2289, 100:1 (toxin:Lys-C) molar ratio, 25 min at room temperature). The cleavage site was determined by TMT labelling and tandem mass spectrometry. Briefly, intact X-LC-H N samples were labelled with the light TMT, and equal amounts of X-LC-H N samples treated with Lys-C were labelled with the heavy TMT. Both samples were then digested with chymotrypsin, combined and subjected to quantitative mass spectrometry analysis.
Cysteine alkylation by NEM. Lys-C-activated X-LC-H N fragment was diluted into sodium phosphate buffer (10 mM, pH 6.5) at a final concentration of 0.3 mg ml⁻¹, with or without NEM at the indicated concentrations (20, 10 and 5 mM), and incubated for 10 min at room temperature. NEM was freshly prepared in sodium phosphate buffer. Samples were mixed with 3× neutral loading dye (200 mM Tris pH 6.8, 30% glycerol, 6% lithium dodecyl sulfate, 10 mM NEM, and 0.06% BPB) at room temperature for 10 min, heated for 10 min at 55°C, and then analysed by SDS-PAGE and Coomassie Blue staining.
Neuron culture and immunoblot analysis. Primary rat cortical neurons were prepared from E18-19 embryos using a papain dissociation kit (Worthington Biochemical) following the manufacturer's instruction 57 . Neurons were exposed to either BoNT/X fragments or sortase ligation mixtures in culture medium for 12 h. Cells were then lysed with RIPA buffer plus a protease inhibitor cocktail (Sigma-Aldrich). Lysates were centrifuged for 10 min at maximum speed using a microcentrifuge at 4°C. Supernatants were subjected to SDS-PAGE and immunoblot analysis.
Dot blot. BoNTs (0.2 µg in 1 µl) were spotted onto nitrocellulose membranes and dried (10 min at room temperature). The membranes were blocked with 5% milk in TBST (TBS plus 0.05% Tween20) for 30 min and then incubated with appropriate antisera (1:500 dilution) for 30 min. The membranes were then washed three times with TBST and incubated with HRP (horseradish peroxidase)-conjugated secondary antibodies for 30 min, washed three more times with TBST, and analysed by the ECL method. The BoNT/X sample was composed of X-LC-H N and GST-X-H C at a 1:1 molar ratio.
Sortase-mediated ligation. GST-X-H C or GST-A-H C was cleaved overnight at 4°C by thrombin before being added into the ligation reaction mixture. The ligation reaction was set up in 50 µl TBS buffer with X-LC-H N (8 µM), X-H C (4 µM) or A-H C (25 µM), Ca²⁺ (10 mM) and sortase (10 µM), for 40 min at room temperature.
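Because the ligated product cannot exceed the amount of the limiting fragment, the reaction set-up itself caps the quantity of full-length toxin. The following Python sketch works through that arithmetic under stated assumptions: fragment concentrations are taken as micromolar and the reaction volume as microlitres, ligation is 1:1 and treated as going to completion (an upper bound, not a measured yield), and the product mass of about 150 kDa is a rough figure from the approximate LC (~50 kDa) and HC (~100 kDa) sizes. The function name is hypothetical.

```python
def max_ligated_toxin_ug(volume_ul: float,
                         conc_a_um: float,
                         conc_b_um: float,
                         product_mw_kda: float) -> float:
    """Upper bound on ligated product (in µg) for a 1:1 ligation.

    Assumes complete conversion of the limiting fragment, which
    overestimates the real yield of a sortase reaction.
    """
    # µM x µl = pmol, so the limiting fragment sets the maximum pmol of product.
    limiting_pmol = min(conc_a_um, conc_b_um) * volume_ul
    # pmol x kDa = ng; divide by 1000 to report µg.
    return limiting_pmol * product_mw_kda / 1000.0


# Assumed values: 50 µl reaction, 8 µM and 4 µM fragments, ~150 kDa product.
upper_bound_ug = max_ligated_toxin_ug(50.0, 8.0, 4.0, 150.0)
```

Under these assumptions the limiting fragment is 200 pmol, so no more than about 30 µg of ligated toxin could form per reaction; the actual amount is lower because sortase ligation is incomplete.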
DAS assay. All procedures were conducted in accordance with the guidelines approved by the Institute Animal Care and Use Committee (IACUC) at Boston Children's Hospital (#3030). Briefly, X-FL generated by sortase-mediated ligation was first activated with limited proteolysis using trypsin (60:1 (toxin:trypsin) molar ratio, 30 min at room temperature). We chose trypsin instead of Lys-C here, as it allows us to stop proteolysis by adding trypsin inhibitors (soybean trypsin inhibitor, 1:10 ratio (trypsin:trypsin inhibitor)). Mice (CD-1 strain, male, purchased from Charles River, 5-6 weeks old, 21-25 g, n = 4) were anesthetized with isoflurane (3-4%) and injected with X-FL (0.5 µg) using a 30-gauge needle attached to a sterile Hamilton syringe, into the gastrocnemius muscles of the right hind limb. Muscle paralysis and the spread of hind paw in response to a startle stimulus were observed 12 h after injection as previously described 44 .
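The molar ratios quoted for limited proteolysis (for example 60:1 toxin:trypsin) translate into protease mass only once molecular weights are fixed. The Python sketch below shows the conversion; the function name is hypothetical, and the molecular weights in the example (about 150 kDa for the toxin, about 23.3 kDa for trypsin) and the 150 µg toxin amount are illustrative assumptions rather than values reported in this study.

```python
def protease_mass_ug(substrate_ug: float,
                     substrate_mw_kda: float,
                     protease_mw_kda: float,
                     molar_ratio: float) -> float:
    """Protease mass (µg) needed for a substrate:protease molar ratio."""
    # µg / kDa = nmol of substrate.
    substrate_nmol = substrate_ug / substrate_mw_kda
    # A ratio of 60 means 60 mol substrate per 1 mol protease.
    protease_nmol = substrate_nmol / molar_ratio
    # nmol x kDa = µg of protease.
    return protease_nmol * protease_mw_kda


# Illustrative numbers only: 150 µg of a ~150 kDa toxin is 1 nmol, so a
# 60:1 ratio calls for 1/60 nmol of a ~23.3 kDa protease.
trypsin_ug = protease_mass_ug(150.0, 150.0, 23.3, 60.0)
```

The same function applies to the 100:1 toxin:Lys-C ratio by passing `molar_ratio=100.0` and the molecular weight of the Lys-C preparation in use.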
Biosafety and biosecurity. All procedures were approved by the Institute of Biosafety Committees at Boston Children's Hospital and at Stockholm University. To ensure biosafety and biosecurity, no active full-length toxin gene was produced in any form. The scale of the sortase linking reaction was strictly controlled to ensure that only a minimal amount of ligated toxin was produced, which was immediately used for functional studies. The loss of toxicity of BoNT/X RY and the reduced toxicity of sortase-linked X-FL were confirmed using the mouse lethality assay. Inactive BoNT/X RY was prepared and is available for developing neutralizing antibodies.
Data availability. The data and materials that support the findings of this study are available from the corresponding authors upon request.