Towards new cholera prophylactics and treatment: Crystal structures of bacterial enterotoxins in complex with GM1 mimics

Cholera is a life-threatening disease in many countries, and new drugs are clearly needed. C-glycosidic antagonists may serve such a purpose. Here we report atomic-resolution crystal structures of three such compounds in complexes with the cholera toxin. The structures give unprecedented atomic details of the molecular interactions and show how the inhibitors efficiently block the GM1 binding site. These molecules are well suited for development into low-cost prophylactic drugs, due to their relatively easy synthesis and their resistance to glycolytic enzymes. One of the compounds links two toxin B-pentamers in the crystal structure, which may yield improved inhibition through the formation of toxin aggregates. These structures can spark the improved design of GM1 mimics, either alone or as multivalent inhibitors connecting multiple GM1-binding sites. Future developments may further include compounds that link the primary and secondary binding sites. Serving as decoys, receptor mimics may lessen symptoms while avoiding the use of antibiotics.


Introduction
The secreted enterotoxins from Vibrio cholerae and enterotoxigenic E. coli (ETEC) cause millions of diarrhea episodes each year (Qadri et al., 2005;Ali et al., 2015).
Cholera is responsible for approximately 100,000 deaths annually, a number that has been predicted to increase with climate change (Holmner et al., 2010). ETEC mortality is estimated to be significantly higher, although it is difficult to determine accurate numbers due to underreporting and misdiagnosis (Qadri et al., 2005). With major epidemics in recent history, there is still a requirement for rapid-acting drugs. The cholera toxin (CT) and the heat-labile enterotoxin (LT) consist of one catalytically active A-subunit bound to five non-toxic B-subunits that are arranged in a homopentamer (Merritt & Hol, 1995). The B-pentamers (CTB and LTB, respectively) are responsible for the binding to epithelial cells in the small intestine, facilitating the endocytosis of the toxin (Chinnapen et al., 2007;Heggelund et al., 2015). Inside the intestinal cell, the A-subunit causes a signaling cascade leading to watery diarrhea by the opening of ion channels. The resulting diarrhea can be up to 1 liter/hour, leading to life-threatening dehydration if left untreated (Sack et al., 2004;Harris et al., 2012). Treatment is accomplished with the application of oral rehydration therapy, but this requires medical competence and large quantities of clean water, both of which can be limited resources during an epidemic. While general antibiotics can be used against cholera infections, they are only used in serious cases and have been shown to limit the duration of the disease by 50% (Nelson et al., 2011).
At present there are three cholera vaccines on the market: Shanchol (Shantha Biotechnics, India), Euvichol (EuBiologics, Korea) and Dukoral (Crucell, Netherlands), the last of which has been shown to be effective against both cholera and ETEC-induced diarrhea (Jelinek & Kollaritsch, 2008). All three are inactivated vaccines that have to be taken in two spaced doses, and are hence impractical in a situation where rapid protection is required, such as during a cholera outbreak. They are most frequently used for travelers from non-endemic areas, and are not effective in children under the age of 1-2 years (WHO, 2010). The only live attenuated vaccine, CVD 103-HgR (Berna Biotech, formerly Swiss Serum and Vaccine Institute, Switzerland), was taken off the market in 2003 for financial reasons (Herzog, 2016). This vaccine is currently under re-assessment, now produced in the U.S. (Vaxchora, PaxVax), and has been shown to have great potential for being more effective than the other vaccines. It is also more suitable for use after outbreaks since it only requires one dose (Chen et al., 2016; Harris, 2016).
The primary receptor of both CT and LT is the GM1 ganglioside (Holmgren et al., 1975). The binding of CT to the GM1 oligosaccharide (GM1-os; Figure 1) is one of the strongest protein-carbohydrate interactions known, with a dissociation constant (K_D) of 43 nM (Kuziemko et al., 1996; Turnbull et al., 2004). Binding has been described as a "two-fingered grip", provided by the two terminal residues, galactose (Gal) and sialic acid (NeuAc) (Merritt et al., 1994). The specificity of this interaction is mainly determined by the terminal galactose residue, which is buried in a deep pocket at the rugged underside of the toxin (distant from the A-subunit). Methyl β-galactopyranoside (GalOMet) alone has been shown to bind to CT with a K_D of 15 mM (Turnbull et al., 2004).
Binding of the sialic acid residue is not as strong (>200 mM), with its contribution resulting mainly from the conformational preorganization of GM1-os. CT and LT, with their five equivalent binding subunits, are known to be multivalent proteins. By binding five GM1 molecules simultaneously, the binding strength can be increased by at least an order of magnitude (Lauer et al., 2002).
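To put the dissociation constants quoted above on a common energy scale, the following minimal sketch converts them into standard binding free energies using ΔG° = RT ln(K_D / 1 M). The temperature of 298.15 K, the 1 M standard state and the function name are assumptions made for illustration; they are not taken from the cited measurements.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) for a dissociation constant in mol/L.

    Assumes a 1 M standard state; the default temperature is an
    illustrative assumption, not a value from the cited studies.
    """
    if kd_molar <= 0:
        raise ValueError("K_D must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Dissociation constants quoted in the text, converted to mol/L
kd_values = {
    "GM1-os": 43e-9,   # 43 nM
    "GalOMet": 15e-3,  # 15 mM
}

for name, kd in kd_values.items():
    print(f"{name}: {binding_free_energy(kd):.1f} kcal/mol")
```

The difference between the two values gives a rough sense of how much the rest of the oligosaccharide contributes beyond the galactose anchor, under the stated assumptions.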
In addition to GM1, the CT and related toxins have been shown to interact with fucosylated blood group antigens and derivatives (Holmner et al., 2004;Holmner et al., 2007;Heggelund et al., 2012;Mandal et al., 2012;Vasile et al., 2014;Heggelund et al., 2016). Toxin binding to the blood group H-determinant Lewis-y (Fucα2Galβ4[Fucα3]GlcNAc) characteristic of blood group O, and its blood group A-counterpart, have recently been characterized both crystallographically and by quantitative binding analyses (Heggelund et al., 2016). The blood group antigens bind at the lateral side of the toxin, approximately 10 Å from the primary binding site.
In all structures, the ligands are anchored to the toxin via a fucose residue. The binding strength of the Lewis-y blood group determinants is in the millimolar range, and the ligand has been suggested to serve as a secondary receptor for cell entry (Holmner et al., 2004;Holmner et al., 2007;Mandal et al., 2012;Heggelund et al., 2016). Indeed, recent cell biology experiments have shown that fucosylated carbohydrate structures can serve as functional receptors in cells in which GM1 synthesis is inhibited (Wands et al., 2015). The cholera toxins come in two varieties: classical (c) and El Tor (ET). The two biotypes differ at two amino acid residues in the B-subunit, 18 and 47 (His18 and Thr47 in cCTB, and Tyr18 and Ile47 in ET CTB). While ET CTB displays reduced affinity for blood group A and B antigens (Mandal et al., 2012;Heggelund et al., 2016), both CT variants bind equally strongly to GM1 (Dubey et al., 1990). The classical CT was by far more abundant in ancient times, and is undergoing a resurgence again today (Nair et al., 2006).
The search for an effective CT inhibitor was accelerated by the publication of the crystal structures of CT and LT in the 1990s (Sixma et al., 1991; Merritt et al., 1994).
The major strategy in inhibitor design has been to use the terminal galactose as an anchor, for example in the promising molecule m-nitrophenyl-α-D-galactopyranoside (MNPG). MNPG showed enhanced binding capability compared to D-galactose, presumably through the favorable displacement of a water molecule in the binding site, enabling the direct interaction of the nitro group of MNPG with Gly33 (Merritt et al., 1997). Several crystal structures were solved to investigate the potential of MNPG and its derivatives as potent inhibitors (Minke et al., 2000; Pickens et al., 2002; Mitchell et al., 2004). Lately the search has turned towards multivalent inhibitors, using pentavalent GM1-os on different scaffolds, creating a 1:1 interaction of toxin and inhibitor (Mattarella et al., 2013). An interesting approach is to "let CT fight itself" using CTB modified with GM1-os residues as penta-GM1-os-CTB neoglycoprotein inhibitors (Branson et al., 2014). This potent glycoprotein inhibitor was shown to bind 1:1 to native CTB with picomolar affinity.
Other potential CT inhibitors are polyphenolic compounds from grape-seed extract, suggesting that antagonists do not necessarily have to be based on carbohydrate structures (Reddy et al., 2013;Cherubin et al., 2016).
Despite the progress in chemo-enzymatic synthesis of GM1-os, the synthesis of this oligosaccharide is still expensive, and polyvalent GM1-os inhibitors are unlikely to result in a low-cost drug (Zuilhof, 2016). GM1 mimics have the potential to overcome this problem, and to result in materials that are relatively inexpensive and sufficiently active. Various approaches have been adopted to mimic GM1-os (Bernardi & Cheshev, 2008;Cheshev et al., 2010;Ramos-Soriano et al., 2013). In particular, searching for non-hydrolyzable analogs, we explored both simple C-galactosides These molecules are significantly simpler to synthesize than GM1-os and are stable to glycolytic enzymes, due to the absence of proper O-glycosidic linkages. In this paper, we describe the X-ray crystal structures of three such inhibitors (Figure 1, 1-3), binding to the cholera toxin and the heat-labile enterotoxin. Compound 1 (Podlipnik et al., 2007) is a 3,4,5-trimethoxycinnamic acid galactoconjugate that can be synthesized without the need for protective groups, and has the potential for further extensions. Compounds 2 and 3 are bidentate ligands that combine the two terminal residues of the GM1 oligosaccharide, i.e. galactose and sialic acid (Cheshev et al., 2010).
The binding strengths of all three ligands have been measured, and found to be in the upper µM range (Podlipnik et al., 2007;Cheshev et al., 2010). While this is probably not enough for sufficient GM1 inhibition alone, the ligands have the potential for being combined into multivalent receptor-binding antagonists, linking several or all five binding sites.

Figure 1. Schematic representation of GM1 pentasaccharide and inhibitors 1-3.
Carbohydrate residues are labeled, and the reducing end of GM1-os is indicated by a waved line. Compound 3 was used as a mixture of two stereoisomers (R and S) at the center, indicated by a waved line.

Results
We present four high-resolution structures of toxin-inhibitor complexes (Table 1; Figures 2 and 3). To improve the chance of successful crystallization, we used three homologous toxins: cCTB, ET CTB and porcine LTB (pLTB; with single-site mutation R13H). All three toxins have the same amino acid sequence in the primary binding site, and they also have essentially identical 3D structures. Inhibitor 1 was crystallized with ET CTB, inhibitor 2 with cCTB and ET CTB, and inhibitor 3 with pLTB R13H. The cholera toxin structures presented here were solved to atomic resolution, allowing for detailed analysis of the interactions. They are only surpassed by the recently published cCTB structure in complex with blood group determinants, solved to 1.08 Å resolution (PDB ID: 5ELB; Heggelund et al., 2016). The pLTB structure is only surpassed by a structure solved to 1.3 Å (PDB ID: 1DJR; Minke et al., 2000), and is the first deposited structure of the R13H variant.

Relevant amino acids are shown as blue sticks and labeled. Residues from neighboring subunits are indicated with a hash (#). Water molecules are shown as red spheres, and hydrogen bonds by red dashed lines (restricted to bond lengths less than 3.6 Å, and with favorable angles). Carbohydrate residues are labeled in italics. Shown are toxin complexes with (A) GM1-os (white sticks; PDB ID 3CHB; Merritt et al., 1998), (B) 1 (yellow sticks), (C) 2 (green sticks), and (D) 3 (cyan sticks). All ligands bind to the same binding site; B-D show structures solved in this work.

Crystal structure of pLTB R13H in complex with inhibitor 3
The toxin complex with inhibitor 3 was solved to a resolution of 1.6 Å in space group P2₁2₁2₁ with one B-pentamer in the asymmetric unit. The inhibitor was present in all five binding sites (Figure 2E,F), and binds in the same manner as 2, with the galactose residue binding to Glu51, Gln61, Asn90 and Lys91, and the sialic acid residue interacting with Glu11 and His13 (Figure 3D). The benzylamido extension, the feature distinguishing this inhibitor from 2, is flexible and sometimes adopts different conformations. Interestingly, in two of the binding sites, the inhibitor is stretched out and links two adjacent toxin pentamers (Figure 4). In these two cases, the sialic acid moiety extends into the sialic acid binding site of another B-pentamer, creating a cross-over with another ligand. The inhibitor solution was a mixture of two stereoisomers, as indicated by the waved line in Figure 1. Both isomers displayed very similar retention factors on weak affinity chromatography with immobilized CTB (Bergström et al., 2009), suggesting a similar affinity for the protein (Cheshev et al., 2010). The S-isomer is predominantly seen in the crystal structure, such that only one of the five binding sites contains the R-isomer.

Discussion
Here we present the X-ray crystal structures of four enterotoxin inhibitor complexes at atomic resolution. As expected, all three inhibitors bind in the primary binding site, acting as decoys for the toxins' main receptor, the GM1 ganglioside. One of the inhibitors shows the surprising capability to facilitate bridging to nearby toxins, thus potentially allowing the formation of larger aggregates. Multivalent constructs of GM1-os have been proposed before as effective antagonists of CT action in Vibrio cholerae infections (Zomer-van Ommen et al., 2016;Zuilhof, 2016). By using these simple C-glycosidic mimics of GM1-os, it might be possible to simplify the organic synthesis process, which is a crucial step in the development of a low cost drug.
Ligand-induced dimerization of toxin B-pentamers has been reported previously, both for the CTB (Zhang et al., 2002) and the shiga-like toxin (Kitov et al., 2000).
More recently, Turnbull, Zuilhof and coworkers (Sisu et al., 2009) found that divalent and tetravalent analogs of GM1 were better inhibitors than pentavalent inhibitors (Zhang et al., 2004; Fu et al., 2015). Likewise, earlier reports found a 47,500-fold increase in binding for octavalent GM1-os dendritic glycoconjugates, resulting in an IC₅₀ of (5 ± 1) × 10⁻¹¹ M (Pukin et al., 2007). This is conceivably achieved through linking more than two B-pentamers together, resulting in the formation of aggregates. The bivalent inhibitor described in this paper connects receptor binding sites from different pentamers in the crystal unit cell, creating a chain of toxins (Figure 4). This would have been difficult to predict by molecular modeling, which only deals with one B-pentamer at a time. Aggregating the soluble toxin could be a very effective strategy for preventing fluid accumulation during cholera infection.
By exploiting the blood-group antigen binding site of the toxin (Figure 5), it might be possible to create even more potent inhibitors that function by promotion of the aggregation effect. It was recently shown that the two binding sites for GM1-os and blood group antigens can be occupied simultaneously (Vasile et al., 2014;Heggelund et al., 2016). Dual-binding site inhibitors could have the potential to induce an aggregation event by linking the primary site from one B-pentamer to the secondary site of another. Although B-pentamers in a crystal are likely positioned closer together than in the gut of a cholera-infected individual, inhibitor-induced linking of pentamers in the human gut is conceivable. The concentration of CT in human stool has been measured at 10 µg/ml (Turnbull et al., 1985), and it is likely that the concentration in the small intestine is similar. Other reported inhibitors may also work by linking different binding sites in different B-pentamers, for example the high-molecular weight polysaccharide from garlic water extract that has been shown to be bioactive against CTB (Politi et al., 2006).
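For a rough sense of scale, the stool concentration of 10 µg/ml quoted above can be converted into a molar concentration. The sketch below assumes an approximate holotoxin mass of 84 kDa (one A-subunit plus five B-subunits); this mass and the function name are illustrative assumptions, not figures from the cited study.

```python
def mass_to_molar(conc_ug_per_ml, mol_weight_da):
    """Convert a mass concentration in µg/mL to mol/L."""
    grams_per_litre = conc_ug_per_ml * 1e-6 * 1e3  # µg/mL -> g/mL -> g/L
    return grams_per_litre / mol_weight_da


# Assumed approximate mass of the AB5 holotoxin in Da (illustrative value)
CT_HOLOTOXIN_MW_DA = 84_000

holotoxin_molar = mass_to_molar(10.0, CT_HOLOTOXIN_MW_DA)
binding_sites_molar = 5 * holotoxin_molar  # five GM1 binding sites per pentamer

print(f"holotoxin: {holotoxin_molar * 1e9:.0f} nM")
print(f"primary binding sites: {binding_sites_molar * 1e9:.0f} nM")
```

Under these assumptions the toxin concentration lies in the sub-micromolar range, which is the regime any pentamer-linking inhibitor would have to work in.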
The structural insights offered in this paper can help to spark further developments in the design of potent and cost-effective cholera toxin inhibitors, especially by exploiting the promising tactic of dual-site binders. The development of such inhibitors will be crucial for making a prophylactic treatment against cholera and ETEC-induced diarrhea, while at the same time avoiding antibiotics. More than 1.8 billion people use a drinking-water source contaminated with fecal matter (WHO, 2015).
The development of a low-cost drug that is independent of a cold-chain delivery will have a great potential to lessen the incredible toll these diseases have on developing nations.

Syntheses of GM1 mimics 1-3
Inhibitor 1 (Podlipnik et al., 2007) uses one component of the binding epitope, galactose, mounted on a cinnamic acid-based spacer. Inhibitor 2 (Cheshev et al., 2010) mimics the GM1 oligosaccharide by connecting a sialic acid residue and galactose with a spacer consisting of a triazole group and an amide bond. Inhibitor 3 is identical to 2 except for an added benzylamido moiety.

Expression of ET CTB
The gene for ET CTB (UniProt: P01556) was previously introduced into the nonpathogenic Vibrio sp. 60 under an IPTG-inducible promoter (Aman et al., 2001), an expression system kindly provided by Professor Timothy Hirst. The gene includes an N-terminal signal sequence directing the protein to the periplasmic space and subsequent secretion into the growth medium, facilitating high expression and easy purification. The bacteria were grown in high-salt LB medium (15 g/L NaCl) supplemented with 0.1 mg/ml ampicillin at 30 °C with shaking. Expression was induced with 0.5 mM IPTG at an OD₆₀₀ of 0.2, followed by protein production for 16-20 hours. The medium was separated from the bacterial pellet by centrifugation at 40,000 g at 20 °C, and the secreted protein was purified further as described for all toxin B-pentamers.

Expression of cCTB and pLTB R13H
The genes for cCTB (UniProt: Q57193) and pLTB R13H (UniProt: P32890 with an Arg to His substitution) were previously synthesized, cloned into the pET21b+ plasmid, and introduced into E. coli BL21(DE3) (Holmner et al., 2011; Heggelund et al., 2016). The sequence contains the signal sequence for secretion, but the protein is retained within the periplasmic space of E. coli, requiring purification by periplasmic lysis. The bacteria were grown in LB medium containing 0.1 mg/ml ampicillin at 37 °C until an OD₆₀₀ of 0.5 was reached. After the temperature was lowered to 25 °C, expression was induced with 0.5 mM IPTG, and protein was produced for 14-18 hours. The supernatant was separated from the bacterial pellet by centrifugation at 6900 g and the pellet was re-suspended in ice-cold periplasmic lysis buffer (30 mM Tris pH 8, 20% (w/v) sucrose, 1 mM EDTA, 5 mM MgSO₄), with 150 µg lysozyme per gram of cell pellet added after re-suspension. The periplasmic solution was kept cold while stirring for 10-30 minutes. The solution was centrifuged at 8500 g for 15 minutes and dialyzed against PBS in a Snakeskin tube (Thermo Scientific, 3500 MWCO). The resulting protein solution was centrifuged at 45,000 g for 20 minutes and purified further as described for all toxin B-pentamers.

Purification of toxin B-pentamers
The protein was applied to a D-galactose-sepharose affinity column (Thermo Scientific) and eluted using 300 mM galactose in PBS. The fractions were concentrated using Vivaspin 20 ml concentrator tubes (5000 MWCO, PES membrane, Sartorius), and subjected to size-exclusion chromatography on a Superdex75 column mounted on an Äkta FPLC machine, pre-equilibrated with a Tris running buffer (20 mM Tris, 200 mM NaCl at pH 7.5). Fractions with toxin were dialyzed against the Tris running buffer in a Snakeskin dialysis tube (3500 MWCO, Thermo Scientific), concentrated using concentrator tubes to 3-10 mg/ml, and stored at -80 °C.

Co-crystallization of toxins with inhibitors
Two hours before crystallization, toxins and inhibitors were mixed at a molar ratio of 1:10 (B-subunit:inhibitor). Initial co-crystallization experiments were carried out at 20 °C with a crystallization robot (Oryx4, Douglas Instruments, UK). First hits were obtained in both the Morpheus screen (Gorrec, 2009; conditions A1, A4, A9, A10 and A12) and the PGA-LM screen (conditions D5 and D10) (both Molecular Dimensions). The hits were optimized by hanging-drop vapor diffusion in 24-well trays. Variations in the conditions identified by preliminary screening were subsequently explored with the use of microseeding, where seeds were prepared by crushing crystals from earlier screens with a seed bead. pLTB R13H was chosen for co-crystallization with inhibitor 3 due to its relative ease of crystallization.
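The 1:10 mixing ratio can be illustrated with a short calculation of the inhibitor concentration required for a given protein concentration. The sketch assumes an approximate B-subunit mass of 11.6 kDa and uses the 3-10 mg/ml storage range from the purification section as example input; both the mass and the assumption that crystallization was set up at these concentrations are illustrative, and dilution on mixing is ignored.

```python
def inhibitor_conc_mM(protein_mg_per_ml, subunit_mw_da, molar_excess=10.0):
    """Inhibitor concentration (mM) giving `molar_excess` inhibitor per B-subunit."""
    # mg/mL equals g/L, so dividing by g/mol gives mol/L of B-subunit
    subunit_molar = protein_mg_per_ml / subunit_mw_da
    return subunit_molar * molar_excess * 1e3  # mol/L -> mM


# Assumed approximate mass of one mature B-subunit in Da (illustrative value)
B_SUBUNIT_MW_DA = 11_600

for protein in (3.0, 10.0):
    needed = inhibitor_conc_mM(protein, B_SUBUNIT_MW_DA)
    print(f"{protein:.0f} mg/ml toxin -> {needed:.1f} mM inhibitor")
```

Because the ratio is defined per B-subunit rather than per pentamer, the subunit mass, not the pentamer mass, enters the calculation.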
The ET CTB + 1 data set was collected from a crystal found in an almost dried-out drop containing MES pH 6, 30% PEG 400, 3% PGA-LM (optimization of PGA-LM condition D5). The drop was hydrated with the same buffer, and no additional cryoprotection was necessary.
Crystals of the pLTB R13H + 3 complex were obtained from several conditions of the Clear Strategy 1 crystal screen (Molecular Dimensions). Diffraction-quality crystals were grown in 0.1 M sodium cacodylate pH 6.5, 0.2 M lithium sulfate, and 15% PEG 4000, and cryo-protected using the original buffer with 25% glycerol added.

Data collection and refinement
Samples were mounted in loops, flash-frozen in a nitrogen cryo stream, and subjected to data collection at the ESRF, Grenoble, France (Nurizzo et al., 2006; Gabadinho et al., 2010; de Sanctis et al., 2012) (Table 1). The CTB data sets were processed and scaled with XDS (Kabsch, 2010), whereas the pLTB data set was processed and scaled with Mosflm (Battye et al., 2011). Diffraction cut-offs were chosen based on the assessment of CC1/2 (Karplus & Diederichs, 2012; Karplus & Diederichs, 2015) (Table 1). The pLTB + 3 data were collected using remote access, to a resolution of 1.60 Å, which in hindsight turned out to be a very conservative cutoff with CC1/2 = 90.2. The structures were solved by molecular replacement using MOLREP from the CCP4 software suite. The search model used for the CTB structures was a 1.25 Å crystal structure of cCTB (PDB ID: 3CHB; Merritt et al., 1998). The pLTB R13H structure was solved using the native pLTB crystal structure (PDB ID: 1EFI; Fan et al., 2001) as a search model. The search models were prepared by pruning unconserved residues and removing water molecules with the program CHAINSAW (Stein, 2008).
The inhibitors were modeled in MarvinSketch and MarvinSpace (ChemAxon.com), and the corresponding PDB and library files were created using PRODRG (Schüttelkopf & van Aalten, 2004).
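The CC1/2 statistic used for the resolution cut-off is the Pearson correlation between the mean intensities of two randomly chosen half-datasets. The sketch below illustrates the idea on synthetic data; the data generator, the noise levels and all function names are hypothetical, and it is not the implementation used by XDS or Mosflm.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def cc_half(observations, rng):
    """CC1/2: correlate mean intensities of two random half-datasets.

    `observations` maps each unique reflection to its repeated measurements.
    """
    half_a, half_b = [], []
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # a single measurement cannot be split
        shuffled = list(intensities)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return pearson(half_a, half_b)


def synthetic_shell(n_reflections, sigma, rng, multiplicity=4):
    """Hypothetical resolution shell: exponential true intensities plus Gaussian noise."""
    shell = {}
    for index in range(n_reflections):
        true_intensity = rng.expovariate(1 / 100.0)
        shell[index] = [rng.gauss(true_intensity, sigma) for _ in range(multiplicity)]
    return shell


rng = random.Random(0)
strong = cc_half(synthetic_shell(500, 10.0, rng), rng)   # low-noise shell
weak = cc_half(synthetic_shell(500, 300.0, rng), rng)    # high-noise shell
print(f"CC1/2 low-noise shell: {strong:.2f}, high-noise shell: {weak:.2f}")
```

A shell whose CC1/2 is still high, as in the low-noise case, contains signal that a stricter cut-off would discard, which is the sense in which a cut-off can be called conservative.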
After initial rigid body refinement using REFMAC5, setting 5% of the reflections aside for calculating R_free, the structures were surveyed and patched with Coot (Emsley et al., 2010) and further refined. At later stages of the refinement, water molecules were manually added with Coot. The inhibitors were included last. All inhibitors were modeled with 100% occupancy after assessment of the difference electron density and B-factors of the ligand and the nearby protein chain.
The data sets with inhibitor 2 showed significant anisotropy, which resulted in disproportionately high R-factors, a common effect of anisotropy. Although the Hamilton R ratio test showed that the structures could also be refined with anisotropic B-factors, this was abandoned in favor of an isotropic B-factor model with TLS refinement, since anisotropic B-factors resulted in inferior electron density in the loop areas (residues 50-60), and made model building harder. The application of TLS refinement resulted in well-defined density also in the loop areas, along with R/R_free values comparable to those obtained from anisotropic B-factor refinement.
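The role of the reflections set aside for R_free can be sketched with the standard crystallographic R-factor, R = Σ||F_obs| - |F_calc|| / Σ|F_obs|, evaluated separately on the working and free sets. The code below is a minimal illustration with hypothetical helper names; it assumes the calculated amplitudes are already on the same scale as the observed ones and does not reproduce how REFMAC5 selects or uses the free set.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic R-factor for paired observed and calculated amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Return the indices of reflections reserved for R_free (5% by default)."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    return set(indices[:n_free])


def r_work_and_free(f_obs, f_calc, free_indices):
    """Compute R on the working set and on the held-out free set."""
    work = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc)) if i not in free_indices]
    free = [(o, c) for i, (o, c) in enumerate(zip(f_obs, f_calc)) if i in free_indices]
    r_work = r_factor(*zip(*work))
    r_free = r_factor(*zip(*free))
    return r_work, r_free
```

Because the free reflections never enter refinement, a widening gap between the two values is a common warning sign of overfitting, for example when too many parameters are introduced.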
In the structure of cCTB in complex with 2, some of the intramolecular disulfide bridges (Cys9-Cys86) show oxidation, suggesting radiation damage. Cys9 has previously been reported to have two conformations, one of which points away from Cys86 and Thr15 (Merritt et al., 1998; Heggelund et al., 2016). However, in this structure, strong positive difference density (10 r.m.s.d.) was observed on the opposite side of the disulfide link, towards Thr15, indicating a modification of the sulfur atom rather than an alternative conformation of the residue. Partial oxidation of the residues was modeled by replacing the cysteines with S-oxy cysteine (CSX), with 50% occupancy of the oxygen in chains A, B and D.
The pLTB R13H + 3 data set is of high quality with close to 100% completeness, while the ET CTB + 1 data set exhibited some anisotropy. Both structures were refined with standard isotropic B-factors.