High resolution crystal structures of the receptor-binding domain of Clostridium botulinum neurotoxin serotypes A and FA

The binding specificity of botulinum neurotoxins (BoNTs) is primarily a consequence of their ability to bind to multiple receptors at the same time. BoNTs consist of three distinct domains: a metalloprotease light chain (LC), a translocation domain (HN) and a receptor-binding domain (HC). Here we report the crystal structure of HC/FA, complementing an existing structure through the modelling of a previously unresolved loop that is important for receptor binding. Our HC/FA structure also contains a previously unidentified disulphide bond, which we have also observed in one of two crystal forms of HC/A1. This may have implications for receptor binding and future recombinant toxin production.


INTRODUCTION
Botulinum neurotoxin (BoNT) is the active agent that causes the deadly condition botulism. It is expressed as a single polypeptide chain of approximately 150 kDa and is cleaved post-translationally to yield an active di-chain molecule held together by a single disulphide bond. The smaller 50 kDa light chain (LC) possesses a single zinc-endopeptidase domain, whereas the larger 100 kDa heavy chain comprises two domains: a target cell receptor-binding (HC) domain and a translocation (HN) domain (Schiavo et al., 1992; Montecucco, 1986). BoNTs are the most poisonous biological substances known and their remarkable toxicity is a result of their highly specific and modular mechanism of action. The toxins target neuronal cell membranes through the formation of a dual receptor binding complex (Montecucco, 1986), which allows for internalisation by endocytosis (Colasante et al., 2013). The HN domain then translocates the LC into the cytoplasm, where it cleaves a specific SNARE (soluble N-ethylmaleimide-sensitive factor attachment protein receptor) protein, which consequently inhibits vesicle release. Many immunologically distinct BoNTs have been discovered over the years (serotypes /A through to /G), although only serotypes /A, /B, /E and /F have been reported to cause botulism in humans (Coffield et al., 1997).
Each serotype can be further subdivided into different subtypes based on their amino acid sequence (Rossetto, Pirazzini & Montecucco, 2014).
The BoNT HC domain is responsible for targeting the toxin to a specific cell type, and the specific receptors involved have been identified for most serotypes. For example, serotype /A binds to a glycosylated luminal domain of the synaptic vesicle 2 (SV2) protein, preferentially to the C isoform (SV2C) (Dong et al., 2006; Mahrhold et al., 2006), as well as a ganglioside, namely GT1b (Rummel et al., 2004; Yowler & Schengrund, 2004; Stenmark et al., 2008). X-ray crystallography has revealed protein-backbone hydrogen-bond interactions between β-strands of the BoNT/A1 HC and the fourth luminal domain of SV2C (SV2C-LD4), as well as electrostatic interactions between charged surfaces (Benoit et al., 2014). This is supplemented by additional interactions with the N-linked glycans present on the native SV2 molecule; this network of glycan interactions is key for BoNT potency (Yao et al., 2016).
BoNTs have been exploited for therapeutic use in many neurological indications such as dystonia and overactive bladder (Dressler, 2012). More recently, BoNTs have been re-engineered to target different cell types and treat specific clinical indications (Masuyer et al., 2014; Masuyer et al., 2015); these re-engineered molecules are known as targeted secretion inhibitors (TSIs). A detailed understanding of the structural aspects of the different BoNT subtypes will prove useful in identifying regions of variability that may help uncover conserved mechanisms of binding, which in turn will guide efforts in developing novel BoNT therapeutics. Therefore, high-resolution, three-dimensional structural analysis of naturally occurring BoNTs, their mosaics and subtypes, would be of great value. Of particular interest are the structural differences arising from sequence differences between subtypes.
Recently, a new bivalent strain of Clostridium botulinum, IBCA10-7060, was reported to produce BoNT/B2 and a previously unknown BoNT serotype, "BoNT/H" (Barash & Arnon, 2014). This novel toxin is a mosaic molecule and is now more commonly referred to as BoNT/FA (as well as BoNT/HA) because it has an LC similar to that of BoNT/F5, an HN domain similar to that of BoNT/F1, and an HC domain similar to that of BoNT/A1 (Gonzalez-Escalona et al., 2014; Kalb et al., 2015). The crystal structure of the BoNT/FA binding domain was recently reported (Yao et al., 2017). Here, we present a new crystal structure of HC/FA at 1.95 Å resolution, which reveals structural information that is unresolved in the reported structure. Specifically, our structure reveals a loop that was previously unmodelled owing to a lack of electron density and that is important for receptor binding; we also observe a disulphide bond that was not present in the previous structure. We have also produced two crystal forms of HC/A1 (determined to 1.45 Å and 1.7 Å respectively) that differ by the presence of this disulphide bond.

Cloning and constructs
The genes encoding the binding domain of BoNT/A1 (HC/A1) and BoNT/FA (HC/FA) were provided by Ipsen Bioinnovation Ltd. Each was cloned into the pJ401 expression vector (DNA 2.0, Menlo Park, CA, USA) with an N-terminal 6× histidine tag using standard molecular biology techniques and confirmed by sequencing (Eurofins Genomics, Germany).

Protein expression
His6-HC/A1 and His6-HC/FA were expressed in E. coli strain BL21(DE3) (Novagen, Madison, WI, USA) using the following protocol. A glycerol stock was used to inoculate 100 mL TB medium containing 50 µg/mL kanamycin, which was grown at 37 °C for 16 h. From this, 10 mL of culture was used to inoculate 1 L TB medium containing 50 µg/mL kanamycin, which was grown at 37 °C to an OD600 of 0.6. The temperature was reduced to 16 °C and the cells were grown to an OD600 of 1.0, at which point 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to induce expression. Cells were grown for an additional 16 h at 16 °C and then harvested by centrifugation at 4,000 × g for 30 min.

Protein purification
Expression cell pastes were resuspended in 0.5 M NaCl, 50 mM Tris-HCl pH 7.4, 20 mM imidazole and lysed using a Constant Systems cell disruptor at 20 kPSI. Cell debris was removed by centrifugation at 80,000 × g for 30 min and the supernatant was filtered through a 0.22 µm membrane syringe filter. The clarified lysate was loaded onto a 5 mL HisTrap column (GE Healthcare, Little Chalfont, UK), washed, and the target protein eluted with 0.5 M imidazole. His6-HC/A1 and His6-HC/FA were further purified by size-exclusion chromatography (SEC) using a Superdex 200 16/60 column (GE Healthcare, Little Chalfont, UK) in 0.5 M NaCl, 50 mM Tris-HCl pH 7.4. Purified samples were concentrated to 10 mg/mL using a 10 kDa MWCO centrifugal filter (Millipore, Billerica, MA, USA).

Crystallography
Crystals of His6-HC/FA and His6-HC/A1 were grown at 16 °C by the sitting-drop vapour-diffusion method using a 1:1 ratio of protein solution (10 mg/mL) to well solution. The well solution was 4 M sodium formate, 0.1 M sodium acetate pH 5.5 for the former, and 0.1 M MIB pH 4.0, 25% w/v PEG 1500 for the latter. Crystals were soaked in cryoprotectant (equal volumes of reservoir solution and 50% glycerol) before vitrification in liquid nitrogen. Complete datasets were collected on beamlines I03 and I04, respectively, at the Diamond Light Source (Didcot, UK). Diffraction images were processed using DIALS (Gildea et al., 2014) and scaled using AIMLESS (Evans & Murshudov, 2013) from the CCP4 suite (Winn et al., 2011). Data collection statistics are summarised in Table 1. A combination of Rpim and CC1/2 values was used to determine the resolution cut-offs of 1.95 Å and 1.45 Å, respectively. Phase information was determined by molecular replacement using PHASER (McCoy et al., 2007) with a previous structure of HC/A1 (PDB: 2VUA; Stenmark et al., 2008) as the initial search model. Multiple rounds of structure refinement were performed by manual correction in COOT (Emsley et al., 2010) followed by restrained refinement with REFMAC5 (Murshudov et al., 2011). Final validation was performed with MolProbity (Chen et al., 2010). Secondary structures were annotated using Stride (Frishman & Argos, 1995) and figures were prepared using PyMol (Schrödinger, LLC, New York, NY, USA) and CCP4mg (McNicholas et al., 2011).
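As a rough illustration of the mixing ratios described above, the following Python sketch works out the nominal starting concentrations in the sitting drop and in the cryoprotectant. It assumes ideal, additive volumes and describes the drop only at set-up, before any vapour-diffusion equilibration; the function name `mix_equal_volumes` is hypothetical and is not part of any crystallography package.

```python
def mix_equal_volumes(conc_a, conc_b=0.0):
    """Concentration of a component after mixing equal volumes of two
    solutions that contain it at conc_a and conc_b (same units).
    Assumes volumes are additive."""
    return (conc_a + conc_b) / 2.0


# Sitting drop: 1:1 protein solution to well solution (values from the text).
protein_in_drop = mix_equal_volumes(10.0)   # mg/mL protein at set-up
formate_in_drop = mix_equal_volumes(4.0)    # M sodium formate (HC/FA drop)
peg_in_drop = mix_equal_volumes(25.0)       # % w/v PEG 1500 (HC/A1 drop)

# Cryoprotectant: equal volumes of reservoir solution and 50% glycerol.
glycerol_in_cryo = mix_equal_volumes(50.0)  # % glycerol
formate_in_cryo = mix_equal_volumes(4.0)    # M sodium formate (HC/FA crystals)

print(protein_in_drop, formate_in_drop, peg_in_drop)
print(glycerol_in_cryo, formate_in_cryo)
```

Under these assumptions the drop starts at half the stock protein and precipitant concentrations, and the cryoprotectant contains 25% glycerol with the reservoir components at half strength.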

Crystal structure of the BoNT/FA HC domain
We identified crystallisation conditions that yielded crystals of HC/FA in space group P422, with two molecules related by non-crystallographic symmetry in the asymmetric unit. This crystal form is different to that of a recently reported structure (PDB: 5V38; Yao et al., 2017) and reveals an important loop that is involved in receptor binding (average temperature factor (B-factor) 74 Å²). A high-multiplicity dataset was collected, comprising 360 degrees of data over 3,600 images. No significant radiation damage was observed over the course of data collection and thus all data were used. The CC1/2 value for the outer shell was 0.832, indicating that the data were still usable at this resolution (Evans & Murshudov, 2013). The overall structure of HC/FA is shown in Fig. 1C and the crystallographic statistics are listed in Table 1. As with all reported structures of the BoNT receptor-binding domain, there is a characteristic β-jellyroll fold in the N-terminal half and a predominantly β-trefoil fold in the C-terminal half of the protein (Figs. 1A-1D). The two molecules in the asymmetric unit overlay well, with a root mean square deviation (RMSD) of 0.35 Å between all atoms. The B-factors for each chain are low overall (24.74 Å² and 26.84 Å², respectively), with a corresponding overall Wilson B-factor of 24.55 Å². As expected, our structure is highly similar to PDB: 5V38, with RMSD values for combinations of chains between the two structures ranging from 0.36 Å to 0.54 Å (Fig. 1D). However, it further reveals the presence of a loop (R1261-R1268) that has been shown in other subtypes to be involved in ganglioside receptor binding (Fig. 2A). Crystal packing has enabled neighbouring chains to interact directly with this loop, providing sufficient stabilisation to produce good electron density. Yao et al. (2017) suggested that the lack of density was due to high flexibility, which is consistent with the high B-factors we observed in this region relative to the rest of the protein.
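For readers unfamiliar with the RMSD values quoted above, the quantity can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this work: it assumes the two chains have already been superposed and that their atoms are listed in matching order, and it uses hypothetical toy coordinates rather than atoms from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the units of the input, e.g. Å)
    between two equally sized lists of (x, y, z) tuples.
    The coordinate sets are assumed to be already superposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates): every atom displaced by 0.3 Å along x.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
chain_b = [(0.3, 0.0, 0.0), (1.3, 0.0, 0.0), (0.3, 1.0, 0.0)]
value = rmsd(chain_a, chain_b)
print(f"RMSD = {value:.2f} Å")
```

In practice the superposition step (for example a least-squares fit) is carried out first by the structure-comparison software, and the reported RMSD depends on which atoms are included in the comparison.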
The specific ganglioside receptor for BoNT/FA is not yet known; however, considering that GT1b binds with high affinity to BoNT/A1 (Fig. 2B) and that HC/FA and HC/A1 are structurally very similar, we propose that BoNT/FA possesses a similar binding specificity. The overall conformation of this region is also similar to that of BoNT/A1 in complex with a ganglioside receptor (Fig. 2). The detailed conformation of this loop is important for understanding receptor binding, and our structure confirms that BoNT/FA could bind to gangliosides in a similar manner to BoNT/A. Proximate to the ganglioside binding region, we observe the presence of a disulphide bond between Cys1227 and Cys1272 (Fig. 3C), which is also not present in the structure 5V38. The equivalent bond has been observed

Crystal structures of the BoNT/A1 binding domain
We have identified a single crystallisation condition that produced two crystal forms of HC/A1: one possessed the equivalent disulphide bond whereas the other did not. Using 25% w/v PEG 1500 and 0.1 M MIB pH 4.0, HC/A1 crystallised into orthorhombic crystals with the space group P2₁2₁2₁ (crystal form 1) that diffracted to a resolution of 1.45 Å.
Using the same conditions, monoclinic crystals were obtained six months later with the space group P2₁ (crystal form 2) that diffracted to 1.8 Å resolution. Inspection of this structure (Fig. 1B) revealed the presence of a disulphide bond between C1235 and C1280, the equivalent of which was also observed in our HC/FA crystal structure (Figs. 3A and 3B) and in a full-length BoNT/A1 crystal structure (PDB: 3BTA; Lacy & Stevens, 1999).
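One simple way to flag a candidate disulphide bond in a refined model is to measure the distance between cysteine Sγ (SG) atoms. The Python sketch below illustrates the idea under stated assumptions: a 2.3 Å cutoff is chosen for illustration (a typical S-S bond length is about 2.05 Å), the function name `find_disulphides` is hypothetical, and the coordinates are toy values rather than atoms from the structures described here. In a real model the bond would also be judged against the electron density.

```python
import math
from itertools import combinations

SS_CUTOFF = 2.3  # Å; assumed threshold, slightly above a typical S-S bond length


def find_disulphides(sg_atoms, cutoff=SS_CUTOFF):
    """Return (label_a, label_b, distance) for every pair of cysteine SG
    atoms closer than `cutoff`. `sg_atoms` is a list of
    (residue_label, (x, y, z)) tuples with coordinates in Å."""
    bonds = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sg_atoms, 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance <= cutoff:
            bonds.append((res_a, res_b, round(distance, 2)))
    return bonds


# Toy coordinates (hypothetical), labelled with the HC/A1 residue numbers.
oxidised_form = [("C1235", (0.0, 0.0, 0.0)), ("C1280", (2.04, 0.0, 0.0))]
reduced_form = [("C1235", (0.0, 0.0, 0.0)), ("C1280", (3.6, 0.0, 0.0))]

print(find_disulphides(oxidised_form))  # one pair within the cutoff
print(find_disulphides(reduced_form))   # no pair within the cutoff
```

A distance check of this kind only suggests where a bond may exist; it cannot distinguish a genuine disulphide from two free thiols that happen to lie close together.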
Our findings suggest that the crystallisation condition is not the only determinant of whether the bond is present. The conservation of these cysteine residues across the BoNT sub-serotypes suggests they are very important (Fig. 4); however, their precise