Structure aided design of a Neu5Gc specific lectin

Subtilase cytotoxin (SubAB) of Escherichia coli is an AB5 class bacterial toxin. The pentameric B subunit (SubB) binds the cellular carbohydrate receptor, α2–3-linked N-glycolylneuraminic acid (Neu5Gc). Neu5Gc is not expressed on normal human cells, but is expressed by cancer cells. Elevated Neu5Gc has been observed in breast, ovarian, prostate, colon and lung cancer. The presence of Neu5Gc is prognostically important, and correlates with invasiveness, metastasis and tumour grade. Neu5Gc binding by SubB suggests that it may have utility as a diagnostic tool for the detection of Neu5Gc tumour antigens. Native SubB shows 20-fold less binding to N-acetylneuraminic acid (Neu5Ac), and over 30-fold less binding if the Neu5Gc linkage is changed from α2–3 to α2–6. Using molecular modeling approaches, site directed mutations were made to reduce the α2–3 ≫ α2–6-linkage preference, while maintaining or enhancing the selectivity of SubB for Neu5Gc over Neu5Ac. Surface plasmon resonance and glycan array analysis showed that the SubBΔS106/ΔT107 mutant displayed improved specificity towards Neu5Gc and bound to α2–6-linked Neu5Gc. SubBΔS106/ΔT107 could discriminate Neu5Gc- over Neu5Ac-glycoconjugates in ELISA. These data suggest that improved SubB mutants offer a new tool for the testing of biological samples, particularly serum and other fluids from individuals with cancer or suspected of having cancer.

AB5 toxins exert their effects in a two-step process: (i) binding of the pentameric B subunit to specific glycan receptors on the target cell surface; (ii) internalisation of the AB5 toxin, followed by A subunit-mediated inhibition or corruption of essential host functions 1 . The B subunits of AB5 toxins recognize cell surface glycan receptors, directing internalization and intracellular trafficking of the holotoxin. Specificity of these protein-glycan interactions is critical for pathogenesis, as it determines host susceptibility and tissue tropism. Moreover, the pentavalent interactions between AB5 toxin B subunits and their cognate glycans result in very high affinity binding, making them powerful ligands for glycan detection, a noteworthy example being use of the cholera toxin B subunit for detection of the ganglioside GM1 in histopathological sections 2 and for labelling of lipid rafts in membranes 3 .
In 2004 Paton et al. described the discovery and initial biological characterization of a new sub-family of bacterial AB5 toxins with the prototype termed subtilase cytotoxin (SubAB) 4 . In the case of SubAB, the A subunit (SubA) was found to be a subtilase family serine protease with exquisite specificity for the essential endoplasmic reticulum chaperone BiP/GRP78 5 . Structural studies revealed that unlike most subtilases, SubA possessed an unusually deep active site cleft, explaining its exquisite substrate specificity 5 . SubA has proven to be a powerful tool for examining the role of BiP in diverse cellular processes and it also has potential as a cancer therapeutic 6,7 . Significantly, glycan array analysis has shown that the B subunit of the toxin (SubB) has a high degree of binding specificity for glycans terminating with α2-3-linked N-glycolylneuraminic acid (Neu5Gc), a sialic acid that humans cannot synthesise 8 . Of all the glycans on the array, the best binding occurred with Neu5Gcα2-3 Galβ1-4GlcNAcβ-. Binding of labelled toxin to the array was reduced 20-fold if the Neu5Gc was changed to Neu5Ac; over 30-fold if the Neu5Gc linkage was changed from α2-3 to α2-6; and 100-fold if the sialic acid was removed. The overall pattern of binding to structures represented on the array indicated that SubB has a high affinity for terminal α2-3-linked Neu5Gc with little discrimination for the penultimate moiety. The crystal structure of the SubB-Neu5Gc complex revealed the basis for this specificity. The additional hydroxyl on the methyl group of the N-acetyl moiety that distinguishes Neu5Gc from Neu5Ac interacts with Tyr78 OH of SubB and hydrogen bonds with the main chain of Met10 8 . These key interactions could not occur with Neu5Ac, thus explaining the marked preference for Neu5Gc. Guided by the structural data, key residues were mutagenized in the predicted binding pocket, and this abrogated glycan recognition, cell binding and toxicity. 
SubB amino acids S12 and Y78 form crucial stabilizing bonds with Neu5Gc 8 . An S12A mutation abolished glycan binding completely, while a Y78F mutation, which prevents interactions with the C11 OH group that distinguishes Neu5Gc from Neu5Ac, reduced glycan binding by 90% and abolished the preference of the mutant SubB protein for Neu5Gc over Neu5Ac 8 .
Interestingly, the most prominent form of aberrant glycosylation in human cancers is the expression of glycans terminated by Neu5Gc. Neu5Gc is not expressed at significant levels on normal healthy human cells 9-12 as humans cannot synthesise Neu5Gc due to an inactivating mutation in the CMAH gene 13 . Nevertheless, research suggests that Neu5Gc presentation in cancer patients can be explained by Neu5Gc absorption through dietary intake of red meat and dairy products, which are the richest sources of Neu5Gc 14 . The presence of Neu5Gc is prognostically important, because its expression frequently correlates with invasiveness, metastasis and tumour grade 10 . Preferential display of Neu5Gc glycans on cancer cells may be at least partly explained by the hypoxic tumour environment, which markedly induces expression of the sialic acid transporter sialin, resulting in increased display of Neu5Gc and other sialic acids on the cell surface 15 . Because sialyl-conjugates regulate adhesion and promote cell mobility, such alterations in surface sialylation may influence the colonisation and metastatic potential of tumour cells 16 . Elevated levels of abnormal sialic acids such as Neu5Gc have been observed in breast, ovarian, prostate, colon and lung cancer 11,12 . Importantly, incorporation of Neu5Gc in cancer cells is most prominent in soluble glycoproteins found both in the extracellular space and inside the cell, and Neu5Gc is the dominant sialic acid in glycoproteins secreted from cancer cells into the surrounding tissues 9 . The expression of Neu5Gc in cancer is also known to drive production of xenoautoantibodies against Neu5Gc 17, 18 . These anti-Neu5Gc antibodies are being investigated to determine their potential for novel diagnostics, prognostics, and therapeutics in human carcinomas 17 .
Due to its known involvement in cancer and its normally low level in non-cancerous human tissues, detection of high levels of Neu5Gc in serum and in tissues would be considered abnormal and would be indicative of the presence of a tumour. This raises the possibility of exploiting the specificity of SubB for Neu5Gc to develop a high-throughput diagnostic screening test for a range of cancers. However, the poor affinity of SubB for α2-6-linked Neu5Gc might limit the sensitivity of such a test. In the present study, we have examined the interaction between SubB and glycans terminating in either α2-3-linked or α2-6-linked Neu5Gc, with a view to designing a SubB mutant with the capacity to recognise both types of structures with high affinity.

Results
Structure-guided mutation of the glycan binding site of SubB. In order to understand the molecular basis for the preference for α2-3-linked structures, we compared the interaction between SubB and Neu5Gcα2-3Galβ1-3GlcNAc (determined by X-ray crystallography) with that between SubB and Neu5Gcα2-6Galβ1-3Glc (modelled) (Fig. 1). Whereas the sub-terminal sugars of the former glycan extend freely out into the solvent, as reported previously 8 , the tertiary sugar of the α2-6 structure is folded back onto the SubB surface, making close contact with a loop comprising SubB residues T104-E108. This loop is stabilised by a disulphide bond between C103 and C109. The resultant steric hindrance distorts the docking of the terminal Neu5Gc into the binding pocket, accounting for the significantly poorer binding of α2-6-linked Neu5Gc structures observed in the original glycan array analysis.
Since α2-6-linked sialic acids are common markers of colon cancer 19,20 and are linked to prognosis in a range of cancers 21 , we used molecular engineering to improve binding of α2-6-linked Neu5Gc structures to SubB by designing a series of substitution and/or deletion mutants to reduce the height of the T104-E108 loop. We modelled the interactions between these SubB mutants and Neu5Gcα2-6Galβ1-3Glc and predicted that they would have improved recognition of α2-6-linked Neu5Gc without significantly impacting on α2-3-linked Neu5Gc binding, as shown in Fig. 2. We then constructed recombinant subB genes and expressed and purified the various proteins as C-terminal His6-tagged fusion proteins from recombinant E. coli (see Materials and Methods). SubB proteins with single or double amino acid substitutions (T107A and S106A/T107A), a double deletion mutant (ΔS106/ΔT107) and a triple mutant (ΔS106/ΔT107/E108D) were successfully purified.
Figure 1. Surface representation of SubB in complex with (A) Neu5Gcα2-3Galβ1-3GlcNAc (determined from an X-ray crystal structure (Byres et al. 8 )) and (B) Neu5Gcα2-6Galβ1-3Glc (modeled with the X-ray crystal structure). Trisaccharides are shown as green or cyan sticks, with red and blue representing oxygen and nitrogen, respectively.
Surface plasmon resonance of engineered SubB mutants. Purified SubB and the various mutant derivatives were then immobilized on Biacore chips and tested for binding affinities to a range of Neu5Ac- or Neu5Gc-terminating structures (free sialic acid, sialic acid-α2-3-lactose and sialic acid-α2-6-lactose), as well as to human and bovine α1-acid glycoprotein (AGP), by surface plasmon resonance (SPR) (Table 1). The human AGP glycans contain Neu5Ac 22, 23 and the bovine AGP glycans contain both Neu5Ac and Neu5Gc 23 . MS glycoproteomic analysis (Fig. S1) was performed to confirm the Neu5Ac and Neu5Gc distribution in the human and bovine AGP used in the SPR study. Wild-type SubB was found to have high affinity for α2-3-linked Neu5Gc-lactose and free Neu5Gc, as predicted from the glycan array result, with nanomolar binding affinities observed. No binding was observed for the α2-6-linked Neu5Gc-lactose (tested to a maximum concentration of 25 µM), and an affinity of 2.2 µM was observed for α2-3-linked Neu5Ac, a more than 300-fold decrease in binding compared to the equivalent Neu5Gc structure. Wild-type SubB also had a 13-fold reduced binding affinity for human AGP compared to bovine AGP, and showed no binding to any of the non-sialylated glycans tested (Table 1). The T107A mutation had no significant effect on binding to any of the tested structures compared to the wild-type protein. SubB S106A/T107A had improved binding to α2-6-linked structures, but this improvement was seen for both Neu5Ac and Neu5Gc. The nanomolar range affinities observed for all linked sugars tested, including Neu5Acα2-8 (GT2; Table 1), and the binding to sulfated chondroitin (chondroitin-6-sulfate; Table 1) reveal that SubB S106A/T107A has a relaxed specificity range that includes non-sialic acid structures. The SubB ΔS106/ΔT107/E108D mutant had improved recognition of α2-6-linked Neu5Gc without changing the binding to the α2-6-linked Neu5Ac structures.
However, the difference in affinity between α2-3-linked Neu5Ac and α2-3-linked Neu5Gc was reduced to 50-fold compared to the 300-fold observed for the wild-type. The SubB ΔS106/ΔT107 mutant showed significantly improved Neu5Gc vs Neu5Ac discrimination compared to the wild-type protein, and bound α2-3-linked Neu5Gc and α2-6-linked Neu5Gc with affinities that were not significantly different (15.3 nM vs 8.5 nM, respectively; P = 0.12). Thus, SubB ΔS106/ΔT107 exhibited the optimum combination of enhanced Neu5Gc vs Neu5Ac discrimination and the capacity to recognise both α2-3- and α2-6-linked Neu5Gc structures. SubB ΔS106/ΔT107 also demonstrated no binding to any of the non-sialylated glycans tested (Table 1). The anti-Neu5Gc antibody produced in chicken was used as a control and showed less selectivity and lower affinity for Neu5Gc-containing glycans than any of the SubB proteins tested.
Figure 2. Surface representation of the wild-type and SubB mutants modeled with Neu5Gcα2-6Galβ1-3Glc (shown as a cyan stick). The mutated SubB residues are shown as grey sticks, with red and blue representing oxygen and nitrogen, respectively.
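The fold differences quoted in the SPR results are simple ratios of affinities. The short Python sketch below reproduces that arithmetic for the two values stated in the text for SubB ΔS106/ΔT107, and for the upper bound implied by the wild-type comparison. It assumes the quoted affinities are equilibrium dissociation constants (KD); the function and variable names are illustrative and are not taken from the original analysis.

```python
def fold_difference(kd_weaker: float, kd_stronger: float) -> float:
    """Return the ratio of two dissociation constants given in the same units."""
    if kd_weaker <= 0 or kd_stronger <= 0:
        raise ValueError("KD values must be positive")
    return kd_weaker / kd_stronger


# Values quoted in the text for SubB ΔS106/ΔT107 (assumed to be KD, in nM).
kd_a23_neu5gc_nm = 15.3  # α2-3-linked Neu5Gc
kd_a26_neu5gc_nm = 8.5   # α2-6-linked Neu5Gc
linkage_ratio = fold_difference(kd_a23_neu5gc_nm, kd_a26_neu5gc_nm)
print(f"alpha2-3 vs alpha2-6 Neu5Gc: {linkage_ratio:.1f}-fold")

# Wild-type SubB: 2.2 µM for α2-3-linked Neu5Ac, stated to be more than
# 300-fold weaker than the equivalent Neu5Gc structure. That implies the
# Neu5Gc KD lies below this bound (in nM).
kd_neu5ac_nm = 2200.0
neu5gc_upper_bound_nm = kd_neu5ac_nm / 300
print(f"implied wild-type Neu5Gc KD: below {neu5gc_upper_bound_nm:.2f} nM")
```

The less than two-fold ratio between the two linkages agrees with the statement that the affinities were not significantly different, although the significance test itself depends on replicate data not reproduced here.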

ELISA of engineered SubB against human and bovine proteins/serum.
To assess the ability of the engineered mutants to detect the presence of Neu5Gc in biological samples, ELISAs were performed. Using plates coated with a dilution series of SubB, labelled serum proteins from human and bovine sources were tested. A two-fold improvement in differential recognition of the Neu5Gc-containing bovine serum proteins was identified with SubB ΔS106/ΔT107 (Fig. 3).

Detection of human vs bovine AGP.
To independently verify the capacity to discriminate between human and bovine AGP (only bovine AGP displays significant levels of Neu5Gc-terminating glycans), serially diluted glycoproteins were spotted onto nitrocellulose filters and, after washing and blocking, filters were overlaid with purified biotinylated SubB ΔS106/ΔT107. Bound lectin was then detected on washed filters using streptavidin-AP (Fig. 4). SubB ΔS106/ΔT107 binding to bovine AGP was detectable down to approximately 200 ng/spot, while significant binding to human AGP was not detectable even at the maximum amount tested (12.5 μg/spot). This discriminatory power is consistent with the SPR data above.

Discussion
Neu5Gc is an important diagnostic and prognostic marker in human carcinomas, with elevated Neu5Gc expression detected in breast, ovarian, prostate, colon and lung cancer 11,12 . Wild-type SubB had unprecedented specificity for glycans terminating in Neu5Gc, but bound poorly to α2-6-linked Neu5Gc and still recognised α2-3-linked Neu5Ac structures, albeit weakly 8 . To improve the recognition of α2-6-linked Neu5Gc by SubB and make it more specific for Neu5Gc, we engineered SubB using structure-aided modifications, with specific focus on the T104-E108 loop. Manipulation of this loop had two distinct outcomes through the modification of the same two amino acids. Firstly, alanine substitution of S106 and T107 (S106A/T107A) led to a loss of specificity for Neu5Gc, producing a lectin capable of binding to all tested terminally sialylated glycans regardless of linkage (α2-3, α2-6 and α2-8) or sialic acid type (Neu5Ac or Neu5Gc) in SPR studies. Glycan array analysis confirmed the relaxed specificity and revealed binding to additional, sulfated glycans. Secondly, deletion of the same two amino acids (ΔS106/ΔT107) produced a lectin with exquisite specificity for Neu5Gc regardless of linkage (α2-3 and α2-6). The SubB ΔS106/ΔT107 mutant was significantly improved for the recognition of Neu5Gc-containing structures compared to the wild-type SubB. SubB ΔS106/ΔT107 also showed no difference in its ability to bind α2-3-linked or α2-6-linked Neu5Gc structures, making it a significant improvement over the wild-type protein. Further modifications of the SubB protein outside of the S106 and T107 amino acids produced no significant improvement in specificity.
The SubB ΔS106/ΔT107/E108D mutant protein, which is the SubB ΔS106/ΔT107 protein with an additional E108D mutation, was less able to distinguish α2-3-linked Neu5Gc from α2-3-linked Neu5Ac than SubB ΔS106/ΔT107 and had stronger binding to human α1-acid glycoprotein than the SubB ΔS106/ΔT107 mutant (24-fold more protein bound by SubB ΔS106/ΔT107/E108D than by SubB ΔS106/ΔT107).

These improved SubB mutants offer a new tool for the testing of biological samples, particularly serum and other fluids from individuals with cancer or suspected of having cancer.

Methods
Structural modeling of SubB. The three-dimensional structures of the SubB mutants were modeled using Phyre2 24 . Neu5Gcα2-6Galβ1-3Glc was acquired from PDB ID: 4EN8 25 and modeled into the SubB and SubB mutant structures manually using Coot 26 .
Construction and expression of SubB mutants. Mutations were introduced into the subB coding sequence (close to the 3′ end) by direct high-fidelity PCR using the forward primer pETSubBF and the respective mutant-specific reverse primers listed in Table 2. PCR products were cloned into the BamHI and XhoI sites of pET-23(+) (Novagen) and transformed into E. coli BL21(DE3). SubB derivatives were expressed and purified as His6-tagged fusion proteins by Ni-NTA affinity chromatography, as previously described 4 . Proteins were >95% pure as judged by SDS-PAGE and Coomassie blue staining.

Surface plasmon resonance of SubB and engineered SubB mutants. Surface plasmon resonance (SPR) was run using the Biacore T100 system (GE) as described previously 27 . Briefly, SubB, SubB mutants and anti-Neu5Gc IgY (SiaMab; formerly Sialix/GC-Free Inc., San Diego, CA, USA) were immobilized onto flow cells 2-4 of a series S sensor chip CM5 (GE) using the NHS capture kit, and flow cell 1 was run as a blank immobilization. Monosaccharides, disaccharides, oligosaccharides and α1-acid glycoprotein from human and bovine sources (Sigma-Aldrich; see Table 1) were flowed over at 0.01-100 µM in initial range-finding experiments. Concentrations were adjusted and all data were analysed by single cycle kinetics using the Biacore T100 Evaluation software.
Glycan array analysis of SubB and engineered SubB mutants. Glycan array slides were printed on SuperEpoxy 3 (Arrayit) activated substrates using an Arrayit Spotbot Extreme contact printer as previously described 28 . For each subarray, 2 μg of SubB protein was pre-complexed with anti-His tag antibody (Cell Signaling) and Alexa555 secondary and tertiary antibodies (rabbit anti-mouse; goat anti-rabbit) at a ratio of 2:1:0.5:0.25 in a final volume of 500 μL. This 500 μL antibody protein complex was added to a 65 μL gene frame (Thermo Scientific) without a coverslip. Washing and analysis were performed as previously described 27 .

ELISA analysis of SubB and the engineered SubB ΔS106/ΔT107 mutant. Wells of black 96-well NUNC Maxisorp plates were coated overnight at 4 °C with SubB or SubB ΔS106/ΔT107 protein, two-fold serially diluted in 100 mM bicarbonate/carbonate coating buffer (pH 9.6) starting at 1.25 μg of protein. Wells were washed 3 times with phosphate-buffered saline, 0.05% Tween-20 (PBS-T) before blocking solution (3% BSA) was added for 1 hour at room temperature. Proteins in normal human serum and bovine serum were fluorescently labelled by combining neat serum with 100 µM FITC dye (Pierce) and incubating on ice for 1 hour. Excess dye was removed using a 1 kDa size exclusion spin column. 100 μl of FITC-labelled normal human serum or bovine serum was added to wells coated with SubB or SubB ΔS106/ΔT107 and wells were incubated for 1 hour at room temperature. Wells were washed 3 times with PBS-T. 100 μl of PBS was added to each well before the fluorescence was measured at 485/535 nm. Fluorescence unit values are shown as the mean of duplicates ± SD, after subtraction of the mean fluorescence units obtained for wells containing all reagents except the SubB proteins. Any negative value was considered as 0.
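The background correction described for the ELISA readout (mean of duplicates, subtraction of the mean no-SubB control, negative values set to 0) can be sketched as follows. This is a minimal illustration, not the authors' analysis script. The function name and the fluorescence readings are hypothetical, and the use of the sample standard deviation for the duplicate wells is an assumption.

```python
from statistics import mean, stdev


def background_corrected(duplicates, blank_wells):
    """Return (corrected mean, SD) for one coating amount.

    duplicates  -- raw fluorescence units from the duplicate SubB-coated wells
    blank_wells -- raw fluorescence units from wells with every reagent except SubB
    """
    blank = mean(blank_wells)
    corrected = mean(duplicates) - blank
    # Negative corrected values are reported as 0, as stated in the Methods.
    # Subtracting a constant does not change the SD of the duplicates.
    return max(corrected, 0.0), stdev(duplicates)


# Hypothetical readings, for illustration only.
signal, sd = background_corrected([1200.0, 1000.0], [150.0, 50.0])
print(signal, round(sd, 1))

below_background, _ = background_corrected([80.0, 100.0], [150.0, 50.0])
print(below_background)
```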
SubB overlay experiments. Purified SubB ΔS106/ΔT107 was labelled with biotin using the EZ-Link® Sulfo-NHS-Biotinylation Kit (Thermo Scientific) according to the manufacturer's instructions. Purified human and bovine α1-acid glycoprotein (Sigma cat. nos G9885 and G3643) were dissolved in water at 5 mg/ml and 5 μl volumes of serial two-fold dilutions were spotted onto nitrocellulose filters and air dried at 37 °C overnight. Filters were then blocked with 5% skim milk in Tris-buffered saline with 0.05% Tween 20 (TTBS) for 2 h. After washing three times in TTBS, filters were overlaid with 1 μg/ml biotin-SubB ΔS106/ΔT107 in TTBS and incubated overnight at 4 °C. Filters were then washed three times in TTBS and bound biotin-SubB ΔS106/ΔT107 was detected using streptavidin-alkaline phosphatase conjugate (Roche). Filters were developed using a chromogenic nitro-blue tetrazolium/X-phosphate substrate system (Roche).
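As a worked example of the two-fold dilution series used in the overlay, the sketch below lists the amount of glycoprotein per spot and finds the first spot at or below the approximate 200 ng detection limit reported for bovine AGP. It assumes the series starts at the 12.5 μg/spot maximum stated in the Results. The number of spots (8) and the function names are hypothetical.

```python
def twofold_series(start_ug: float, n_spots: int) -> list:
    """Amounts (µg) in a serial two-fold dilution, highest first."""
    if start_ug <= 0 or n_spots < 1:
        raise ValueError("start amount must be positive and n_spots at least 1")
    return [start_ug / 2 ** i for i in range(n_spots)]


def first_spot_at_or_below(series, limit_ug):
    """Index of the first spot whose amount is at or below limit_ug, else None."""
    for index, amount in enumerate(series):
        if amount <= limit_ug:
            return index
    return None


# Assumed starting amount: 12.5 µg/spot (maximum stated in the Results).
series = twofold_series(12.5, 8)  # 8 spots is a hypothetical series length
index = first_spot_at_or_below(series, 0.2)
print(index, series[index] * 1000)  # spot index and amount in ng
```

Under these assumptions, the spot at roughly 200 ng is the seventh in the series, six halvings below the top spot, which corresponds to a 64-fold range over which bovine AGP remained detectable.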