Computationally guided conversion of the specificity of E-selectin to mimic that of Siglec-8

Significance Rational design assisted by computational approaches is a powerful and efficient tool for creating proteins specific for the recognition of novel glycans. We engineered a double mutant of E-selectin to eliminate specificity for the endogenous ligand (sLex) while introducing affinity for the 6′-sulfated form of that ligand (6′-sulfo-sLex). The specificity, defined by glycan array screening, identically mimics that of the unrelated protein Siglec-8, which naturally prefers to bind 6′-sulfo-sLex and its unfucosylated form (6′-sulfo-sLacNAc). We show that the rational design for this highly specific recognition for 6′-sulfation requires three core features: 1) removal of unfavorable interactions for 6′-sulfation, 2) introduction of favorable interactions for 6′-sulfation, and 3) removal of interactions that favor binding to the endogenous ligand.

Sulfation is a ubiquitous and important posttranslational modification of many biological molecules, including proteins (1), carbohydrates (2,3), lipids (4), and glycolipids (5), and mediates many biological functions. Sulfated glycans are associated with various diseases, such as cancers (6,7), cystic fibrosis (8-10), and osteoarthritis (11,12), and have great potential in molecular pathology as biomarkers. However, the isolation and detection of sulfated glycans is challenging because of their low abundance in cells, their low ionization efficiency for detection by mass spectrometry, and the fact that the modification is labile under even relatively mild isolation conditions (13,14). Although lectins are often used to detect glycans (for example, in histology) or to enrich them chromatographically before further analysis, their application to sulfated glycans is limited by the paucity of sulfate-recognizing lectins and by their broad or mixed specificities. For example, lectins from Maackia amurensis recognize both 3′-sulfated and 3′-sialylated oligosaccharides (15,16), which is perhaps not unexpected given that sulfate and sialic acid are both anionic. More surprising is the observation that the lectin langerin cross-reacts with 6′-sulfated glycans and mannose (17). Antibodies can also be used as glycan detection reagents, including antibodies with specificity for sulfated oligosaccharides (18-20).
There are also a limited number of endogenous mammalian lectins that have been found to preferentially bind to sulfated oligosaccharides, most notably members of the Siglec and selectin families (21-23). Siglecs are primarily located on the surfaces of immune cells and share a binding preference for sialylated oligosaccharides. In addition, at least 8 of the 15 known Siglecs (24), namely Siglecs 2, 3, 5, 7, 8, 9, 14, and 15, show enhanced binding to sialylated glycans that are additionally sulfated (25,26); however, many of these have broad specificities, with the notable exceptions of Siglec-8 and Siglec-9. Siglec-8 displays a strong preference for ligands that contain sulfation at the O6 position of galactose in sialyl Lewis X (6′-sulfo-sLex; Neu5Acα2-3Gal[6S]β1-4[Fucα1-3]GlcNAcβ1) or at that position in sialyl LacNAc (6′-sulfo-sLacNAc; Neu5Acα2-3Gal[6S]β1-4GlcNAcβ1) (27). In contrast to Siglec-8, Siglec-9 prefers ligands that are sulfated at the O6 position of GlcNAc and shows enhanced glycan array binding when fucose is present; that is, 6-sulfo-sLex ≫ 6-sulfo-sLacNAc (27). These two Siglecs display orthogonal ligand specificities, and they are also selectively expressed on different leukocytes. Siglec-8 is found on eosinophils, basophils, and mast cells, where it regulates their function and survival (28), while Siglec-9 is expressed by neutrophils and monocytes, where it modulates the function of neutrophils during infection (29). Understanding the biological mechanisms by which subtle differences in sulfation patterns govern these specificities is an area of active research. It is worth mentioning here that the specificity of monoclonal antibody S2 parallels that of Siglec-9, although, in contrast to Siglec-9, the presence of fucose in the ligand does not enhance S2 binding (30). Thus, different proteins may display unique binding modes for the same oligosaccharide ligand.
Much of the latest data pertaining to the roles of glycan sulfation (25,31-33) have come somewhat indirectly, and with considerable effort, from genetic studies in which activity is inferred from the impact of the transfection or deletion of sulfotransferase genes in model cell lines. Given the emerging evidence that sulfation can dramatically enhance (25,34) or abrogate (35,36) protein binding, the paucity of reagents that are able to detect specific sulfation patterns in vitro or in vivo creates a barrier to advancing this already challenging field. The narrow specificity of Siglec-8 presents a remarkable example of a highly specific interaction between an endogenous protein and 6′-sulfated oligosaccharides. Because such a degree of specificity is rare among naturally occurring lectins, there is a need for an alternative to serendipitous lectin discovery for the generation of novel carbohydrate detection reagents. One potential approach would be to engineer the desired specificity into an existing lectin scaffold. To be truly specific, however, the reagent should also display reduced or eliminated binding to the endogenous glycan(s). Several examples of lectin specificity engineering have been reported (37-50), with varying degrees of success.
In an early example of carbohydrate specificity engineering, based on domain swapping, Drickamer (37) introduced galactose-binding activity into a C-type lectin by substituting two amino acids that are conserved in the carbohydrate recognition domain in the mannose-specific lectin (E185 and N187) with two that are conserved in related lectins that prefer galactose. The double mutant (E185Q/N187D) indeed preferred galactose over mannose by 3.5-fold, whereas the wild type preferred mannose over galactose by almost 14-fold. Nevertheless, the double mutant retained significant affinity for the endogenous ligands of the parent lectin and unexpectedly introduced high affinity for N-acetylgalactosamine. Subsequently, in the quest to develop a lectin with improved detection capability for the Thomsen-Friedenreich (TF) tumor antigen (Galβ1-3GalNAc), Adhikari et al. (38) introduced point mutations into a recombinant version of peanut agglutinin at a position (N41) that was known to stabilize a water-mediated hydrogen bond with the ligand. Replacing N41 with a glutamine enhanced affinity for the TF antigen by approximately fourfold, rationalized as arising from the replacement of the mediating water by a direct interaction with the Q41 side chain. Although the N41Q point mutation improved affinity for the target ligand, it did not narrow the endogenous specificity of the lectin. In the design of a probe for the disease marker 6-sulfo-galactose, Hu et al. (40) applied error-prone PCR to a recombinant form of the ricin B-domain lectin with endogenous affinity for galactose, building on their earlier success in applying this approach to introduce specificity for 6′-sialyl-galactose into the same system (45). One mutation (E20K) in particular was identified as being critical for introducing sulfate binding; however, no clones were reported that significantly reduced binding to the endogenous nonsulfated ligands.
The three cases introduced above represent common approaches to lectin specificity engineering, namely, sequence-based domain swapping, structure-based point mutagenesis, and random mutation (directed evolution). Each successfully achieved the goal of introducing either novel or enhanced affinity; however, none were able to simultaneously reduce or remove affinity for the endogenous ligands. This latter property is essential to fully exploit the engineered lectin as a reagent in diagnostic or therapeutic applications.
In the present work, we sought to use computational methods to guide the design of a protein that could recognize 6′-sulfo-sLex (a ligand for Siglec-8) based on introducing sulfate specificity into a lectin (E-selectin) known to bind the nonsulfated congener. Selectins recognize the unsulfated core tetrasaccharide sLex, which is found in glycoproteins, such as P-selectin glycoprotein ligand-1 (PSGL-1) (51), E-selectin-ligand-1 (52), and some CD44 isoforms (53). Sulfation of sLex can enhance, attenuate, or switch selectin specificity (21,35,54,55). However, in the case of E-selectin, neither direct sulfation of sLex nor sulfation of its associated peptide enhanced its affinity (21,56). No members of the selectin family recognize 6′-sulfo-sLex (35).
A computational approach to protein engineering has several benefits over purely experimental techniques, including the ability to predict the effect of hypothetical point mutations on ligand binding. Further, to achieve high specificity, we wanted to test the hypothesis that in addition to introducing affinity for the sulfate group, mutations could be introduced that would reduce or eliminate affinity for the endogenous nonsulfated oligosaccharide. E-selectin was chosen to demonstrate this approach as its specificity and three-dimensional (3D) structure have been reported previously, and it has been shown to have no measurable affinity for 6′-sulfo-sLex (35). The results from this study may offer insight into the rules governing oligosaccharide specificity and provide a rational and generalizable approach to developing carbohydrate-specific reagents.

Results
Molecular Models for E-Selectin and Siglec-8 Complexes. To confirm the validity of the molecular modeling protocol, molecular dynamics (MD) simulations (200 ns each) were first performed on complexes of E-selectin and Siglec-8 with their cognate ligands sLex and 6′-sulfo-sLex, respectively, as reported from experimental structural studies (57,58). Additionally, to permit a statistical assessment of the variability in the data, three independent MD simulations were performed for each complex. The MD simulation data reproduced the experimentally observed ligand binding poses and glycosidic linkage values (SI Appendix, Fig. S1) and the interatomic interactions (57,58) (Table 1 and SI Appendix, Table S1). In the case of E-selectin, the endogenous sLex ligand maintained stable hydrogen-bond interactions with the protein via the sialic acid, galactose, and fucose residues, as observed in the crystal structure (57). In the case of Siglec-8, the endogenous 6′-sulfo-sLex ligand also maintained the key experimentally observed interactions during the simulations (Table 1 and SI Appendix, Table S1), particularly with R56, R109, and K116, residues that are known to be critical for affinity based on experimental alanine scanning (58). Notably, MD simulations of the R56A, R109A, and K116A mutants (computational alanine scanning) in the complex with 6′-sulfo-sLex showed that, consistent with the experimental affinity data (58), the loss of R109 completely abolished affinity for the 6′-sulfo-sLex ligand, as evidenced by the ligand diffusing out of the binding site (SI Appendix, Fig. S2). Data from molecular mechanics generalized Born surface area (MM-GBSA) binding energy analyses (59) also supported the experimental observations that the R56A and K116A mutations weaken affinity but do not abolish it (SI Appendix, Fig. S2 and Table S2).
Having thus confirmed the ability of the molecular modeling protocols to reproduce the experimentally observed conformations and interactions for the known E-selectin-sLex and Siglec-8-6′-sulfo-sLex complexes, we applied the computational mutagenesis method to design 6′-sulfation recognition into E-selectin.
Engineering 6′-Sulfation Recognition into E-Selectin. To initiate the engineering of E-selectin to recognize 6′-sulfo-sLex, MD simulations of E-selectin in complex with the target ligand (6′-sulfo-sLex) were performed, in which the sulfated ligand was generated by replacing the 6-OH with a sulfate moiety (Materials and Methods). Consistent with the experimental observation that E-selectin does not show measurable affinity for this sulfated ligand (35), the complex was unstable during each of the three independent simulations (SI Appendix, Fig. S3). A closer examination of the trajectories showed that ligand instability was accompanied by distortions of the glycosidic linkages into high-energy conformations (SI Appendix, Fig. S4). Instability of the complex precluded calculation of the interaction energy for this system; however, examination of the E-selectin-6′-sulfo-sLex complex suggested that the ligand instability likely arose from the presence of unfavorable van der Waals and electrostatic interactions between the sulfate moiety and the side chains of glutamate residues E92 and E107. In the E-selectin-sLex crystal structure, the 6-OH group in the galactose residue is in close proximity to the side chains of E92 and E107, with a hydrogen bond present between it and the E92 carboxylate group (Fig. 1). Consequently, sulfation at O6 would result in unfavorable van der Waals and electrostatic interactions with the side chains of at least one of the glutamate residues. To fully investigate the impact of each of the negatively charged side chains in E92 and E107, ligand complexes with two single mutations (E92A and E107A) and a double mutation (E92A/E107A) were computationally analyzed, with the expectation that the smaller uncharged side chain of alanine would remove the unfavorable interactions with the sulfate moiety.
Table 1. Stable intermolecular hydrogen-bond pairs observed in the MD simulations for E-selectin and Siglec-8 complexes with their endogenous ligands and the E92A/E107A-6′-sulfo-sLex complex.
[Figure caption, partially recovered] (Table 1 and SI Appendix, Table S1). (C) Schematic representation of the putative unfavorable interactions involving the sulfate moiety (dashed red curves).
putative repulsions between the sulfate moiety in 6′-sulfo-sLex. This mutation is therefore potentially important for both decreasing the affinity of the endogenous ligand and enhancing that of the target ligand. Indeed, when complexed with E92A, the 6′-sulfo-sLex ligand remained bound, although disordered, throughout each independent MD simulation, in contrast to the high degree of instability observed in the complex with wild-type E-selectin. Nevertheless, the ligand populated three distinct poses (pose1, pose2, and pose3 in Fig. 2), indicative of a high degree of disorder and instability. In the complex with E92A, pose1 and pose3 adopted a similar ligand shape (Fig. 2 and SI Appendix, Fig. S3), which is equivalent to that seen in the experimental co-complex with its cognate receptor Siglec-8 (58), but each pose adopted different orientations relative to the mutant protein surface (Fig. 2 and SI Appendix, Table S3). By contrast, the unique shape of the ligand in pose2 resulted from an unexpected flip of the GlcNAc ring from ⁴C₁ to ¹C₄, likely induced by the initial placement of the ligand in the hypothetical E92A binding site. Thus, this single point mutation was predicted to be insufficient to lead to a complete conversion in ligand specificities.
To quantify these predictions, the interaction energies for E92A with 6′-sulfo-sLex and with the nonsulfated ligand were computed. Pose1 and pose2 both displayed unfavorable binding interaction energies, 0.4 and 5.0 kcal/mol, respectively (SI Appendix, Table S4). Although pose3 showed a strongly favorable binding interaction energy (−7.6 kcal/mol; SI Appendix, Table S4), the average binding interaction energy from the three independent MD trajectories (−0.7 kcal/mol) indicated a significantly weaker affinity than the wild-type E-selectin-sLex complex (−3.9 ± 1.0 kcal/mol; Table 2). Without the binding contribution from the 6′-sulfo moiety, the complex of E92A-sLex displayed a negligible binding interaction (−0.2 ± 2.1 kcal/mol; SI Appendix, Table S4 and Fig. S5). In the present energy calculations, we applied a value of 3.0 for the internal dielectric constant of the proteins in the MM-GBSA analysis. It should be noted that values greater than 1 for the internal dielectric constant are proposed to serve as an approximate method to account for the effect of charge polarization induced by ligand binding and are therefore somewhat system dependent, with values of 2 to 4 being proposed (64,65). In the present case, the interaction energy computed with an internal dielectric constant of 3.0 (−3.9 ± 1.0 kcal/mol) agrees well with the known binding affinity (−4.2 kcal/mol) (66) for the E-selectin-sLex complex. Interestingly, an internal dielectric constant of 3.0 also reproduced the interaction energy for Siglec-8 with its cognate ligand 6′-sulfo-sLex reasonably well (theoretical binding energy of −5.1 ± 0.6 kcal/mol [Table 2] and experimental affinity of −4.8 kcal/mol [58]).
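The averaging described above can be illustrated with a minimal Python sketch. It assumes that each of the three poses corresponds to one independent trajectory and that the reported average is an unweighted mean; neither assumption is stated explicitly in the text, and the variable names are purely illustrative.

```python
from statistics import mean

# Per-pose binding interaction energies (kcal/mol) quoted in the text
# for E92A in complex with 6'-sulfo-sLex.
pose_energies = {"pose1": 0.4, "pose2": 5.0, "pose3": -7.6}

# Wild-type E-selectin-sLex value quoted in the text (kcal/mol).
WILD_TYPE_SLEX = -3.9


def average_binding_energy(energies):
    """Unweighted mean over trajectories (assumption: one pose per trajectory)."""
    return mean(energies.values())


e92a_average = average_binding_energy(pose_energies)
loss_vs_wild_type = e92a_average - WILD_TYPE_SLEX

print(f"E92A average: {e92a_average:.1f} kcal/mol")
print(f"Weaker than wild type by {loss_vs_wild_type:.1f} kcal/mol")
```

Under these assumptions the unweighted mean rounds to the −0.7 kcal/mol quoted above, which suggests that the single favorable pose3 is largely offset by the two unfavorable poses.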
E107A in E-Selectin-6′-sulfo-sLex. In the E-selectin-sLex crystal structure (Fig. 1), the negatively charged carboxylate group in the side chain of E107 forms only a very weak interaction with the O6 hydroxyl group in galactose (3.7 Å), in contrast to the strong hydrogen bond formed between that hydroxyl group and the carboxylate in E92. Consequently, we would infer that were a sulfate group present at O6, both the carboxylate groups in E92 and E107 would be repulsive, with the former being more so. The MD simulations confirmed that the E107A complex was only marginally stable (SI Appendix, Fig. S6), presumably due to repulsions from interactions with E92. Over the three independent MD simulations of the E107A complex, the sulfated ligand adopted approximately four poses (SI Appendix, Fig. S7), with the oligosaccharide pose equivalent to that seen in the wild-type E-selectin-sLex complex being present for only ∼20% of the simulation. The three other poses displayed large positional and conformational variations, in which the glycosidic linkages adopted higher energy orientations than those in the stable E-selectin-sLex complex. Due to the instability of the E107A complex in each of the three independent 200-ns simulations, the entropy values failed to converge. For this reason, the energy values were not computed.
Collectively, each of the single mutations of E-selectin only partially stabilized the binding of 6′-sulfo-sLex to E-selectin, and it appeared that mutating only one glutamate (E92A or E107A) was insufficient to fully eliminate the unfavorable electrostatic and van der Waals interactions with the 6′-sulfate moiety in the ligand. While each single point mutation weakened binding to the endogenous ligand (sLex), the simulations showed that the ligand remained bound, if also highly disordered. With an internal dielectric value of 3.0, the E92A mutation weakened binding to sLex by 3.7 kcal/mol, nearly abolishing it, and for the E107A mutation, the affinity was reduced by 2.9 kcal/mol (SI Appendix, Tables S4 and S5).
E92A/E107A in E-Selectin-6′-sulfo-sLex. Unlike the wild-type E-selectin and its two single mutants, the complex for the double mutant with 6′-sulfo-sLex was stable during each of the three independent MD simulations, with the ligand adopting the same single binding pose as seen in the endogenous sLex ligand bound to wild-type E-selectin (SI Appendix, Figs. S6 and S8). The stability of this complex indicated that the E92A/E107A double mutation significantly reduced any unfavorable electrostatic or van der Waals interactions with the 6′-sulfate moiety. The stability of the ligand correlated with the presence of strong hydrogen bonds between the ligand and the protein, particularly those involving the sialic acid (carboxylate group and 4-OH), fucose (2-OH, 3-OH, and 4-OH), and galactose (3-OH and 4-OH) residues (Table 1). While the double mutation caused the loss of a hydrogen bond between the 6-OH group in galactose and the side chain of E92, new interactions were formed between the 6′-sulfo group and the side chains of polar residue N105 and positively charged residues K111 and K113 (Table 1).
The binding energies for the E92A/E107A-6′-sulfo-sLex complex were computed to permit quantitative comparison with the wild-type E-selectin-sLex complex. The per-residue MM-GBSA energy analysis of the E92A/E107A-6′-sulfo-sLex complex showed that the fucose residue contributed more to the total binding energy (48%) than did the sialic acid residue (11%; Table 2). This energy distribution was similar to that seen in the E-selectin-sLex complex (fucose, 52%; sialic acid, 14%; Table 2). These MM-GBSA analyses are consistent with the experimental observation that removing the fucose residue from the sLex glycan in the PSGL-1 glycopeptide ligand reduced binding below the detection limit (65,67). The binding energy for the oligosaccharide component (that is, the ligand not including the sulfate group) in the E92A/E107A-6′-sulfo-sLex complex (−24.8 ± 1.0 kcal/mol) appeared to be slightly weaker than the endogenous sLex in the wild-type E-selectin complex (−25.4 ± 0.7 kcal/mol); however, this difference is not statistically significant (P = 0.4425). Thus, the enhanced binding of the sulfated ligand (−2.7 kcal/mol, not including entropy) can be attributed predominantly to new interactions formed with the sulfate moiety (−3.3 ± 0.6 kcal/mol). The MM-GBSA analysis quantifies the magnitude of the energy gained from the formation of hydrogen-bond interactions between the sulfo group and N105, K111, and K113 and that lost with the abolition of the hydrogen bond between the 6-OH group in galactose and the side chain of E92. The impact of the double mutations on the binding affinity of the endogenous sLex ligand was determined from MD simulations of the putative complex of E92A/E107A with sLex. This complex was stable in each of the three independent MD simulations (SI Appendix, Fig. S5) and gave rise to a binding energy of −24.1 ± 0.5 kcal/mol, not including entropy. Although stable, the interaction energy was reduced by ∼4 kcal/mol compared to that of the E92A/E107A-6′-sulfo-sLex (SI Appendix, Table S5). With the inclusion of entropic effects, the absolute binding energy of the E92A/E107A-sLex complex was −2.4 ± 0.9 kcal/mol compared to −6.0 ± 0.8 kcal/mol for the binding of the 6′-sulfo-sLex ligand.
Table 2. Per-residue interaction energies* and entropic penalties† for wild-type and mutated E-selectins and Siglec-8 complexes with their ligands.
Per-residue interaction energies (MM-GBSA) and percentage of binding (partial; column headers and remaining rows not recovered):
Neu5Ac: −3.6 ± 0.1 (14%); −3.2 ± 0.6 (11%); −12.6 ± 0.2 (64%)
Core-2 Gal: −5.8 ± 0.5 (23%); −4.7 ± 0.4 (17%)
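The statistical comparison of the two oligosaccharide binding energies can be sketched as follows. The text does not state which test was used; this sketch assumes that the ± values are standard deviations over n = 3 independent simulations and that an equal-variance, two-sided Student's t test was applied (4 degrees of freedom, for which the t distribution has a closed-form CDF). The function names are illustrative only.

```python
import math


def pooled_t_statistic(mean1, sd1, mean2, sd2, n):
    """Student's t for two groups of equal size n using the pooled variance."""
    pooled_variance = (sd1 ** 2 + sd2 ** 2) / 2.0
    standard_error = math.sqrt(pooled_variance * 2.0 / n)
    return (mean1 - mean2) / standard_error


def two_sided_p_value_df4(t):
    """Two-sided P value from the closed-form Student t CDF with 4 degrees of freedom."""
    t = abs(t)
    scale = 1.0 + t * t / 4.0
    cdf = 0.5 + 0.375 * (t / math.sqrt(scale)) * (1.0 - (t * t / scale) / 12.0)
    return 2.0 * (1.0 - cdf)


# Oligosaccharide binding energies quoted in the text (kcal/mol).
# Assumption: mean ± SD over three independent simulations per complex.
t_value = pooled_t_statistic(-24.8, 1.0, -25.4, 0.7, n=3)
p_value = two_sided_p_value_df4(t_value)

print(f"t = {t_value:.3f}, P = {p_value:.4f}")
```

Under these assumptions the sketch gives a P value close to the 0.4425 quoted above, which is consistent with, but does not prove, the use of an unpaired t test on three replicates per complex.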
The computational analysis predicted that to engineer recognition of 6′-sulfation into E-selectin would require removing unfavorable electrostatic and van der Waals interactions between the 6′-sulfo group and the negatively charged side chains of E92 and E107. The analysis also predicted that replacing only one of these side chains would only partially stabilize the 6′-sulfo-sLex ligand in the binding site. To fully stabilize the complex, a double mutation (E92A and E107A) was required. In the double mutant, the 6′-sulfo-sLex oligosaccharide bound in the same low-energy pose observed in the endogenous nonsulfated ligand and formed additional strong interactions involving the sulfate group. Moreover, specificity for the novel sulfated ligand over the endogenous oligosaccharide was predicted to arise from loss of a hydrogen bond to 6-OH in galactose after mutation of E92. Having obtained statistically robust data from MD simulations and MM-GBSA analyses, we then undertook the expression of the relevant mutants of E-selectin in HEK293 cells with the aim of experimentally confirming their specificity by glycan array screening.

Glycan Microarray Data for Wild-Type and Mutated E-Selectins.
The recombinant E-selectin mutants were submitted to the National Center for Functional Glycomics (NCFG) for glycan microarray screening. Glycan array data for wild-type human E-selectin have been previously reported (68) and, as expected, showed binding to a limited number of sialic acid-containing glycans, including sLea and sLex (Fig. 3A). By contrast, the E92A/E107A double mutant displayed exclusive specificity for the 6′-sulfo sialylated lactosamine (6′-sulfo-sLacNAc) motif present in 6′-sulfo-sLex (Fig. 3B), a specificity indistinguishable from that of the wild-type Siglec-8. Neither single mutation alone was sufficient to generate a binding signal to 6′-sulfo-sLex (Fig. 3 C and D). The loss of detectable affinity for the endogenous sLex in the double mutant is consistent with the modeling-based interpretation that the double mutation not only enhanced the binding to 6′-sulfo-sLex by introducing new interactions with the 6′-sulfo group but also reduced the affinity for the endogenous sLex ligand by removing a key hydrogen bond between the Gal-O6 hydroxyl group and E92. Therefore, the specificity of the double mutant demonstrates the importance of combining mutations that enhance binding to the target ligand with ones that attenuate binding to the endogenous ligand. That the computed binding energy for the E92A/E107A-sLex complex (ΔGbinding = −2.4 ± 0.9 kcal/mol) was ∼3.6 kcal/mol weaker than the binding of the 6′-sulfo-sLex ligand suggested that the double mutant retained some affinity for the nonsulfated ligand. However, the fact that this interaction was not observed by glycan array screening indicated that any remaining affinity must be below the detection limit of the experimental assay.
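To give a sense of scale for the ∼3.6 kcal/mol difference, the following Python sketch converts a binding free-energy difference into an approximate ratio of association constants using ΔΔG = RT ln(ratio). The temperature of 298.15 K is an assumption (it is not specified in the text), and because the computed energies carry uncertainties of nearly 1 kcal/mol, the result should be read as an order-of-magnitude estimate only.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol K)


def affinity_ratio(delta_delta_g, temperature=298.15):
    """Ratio of association constants implied by a free-energy difference (kcal/mol).

    Assumption: standard thermodynamic relation ddG = RT ln(K_strong / K_weak)
    at the given temperature (298.15 K is assumed, not taken from the text).
    """
    return math.exp(delta_delta_g / (GAS_CONSTANT_KCAL * temperature))


# Computed binding energies for the E92A/E107A double mutant (kcal/mol, from the text).
dg_slex = -2.4         # nonsulfated sLex
dg_sulfo_slex = -6.0   # 6'-sulfo-sLex

delta_delta_g = dg_slex - dg_sulfo_slex
selectivity = affinity_ratio(delta_delta_g)

print(f"ddG = {delta_delta_g:.1f} kcal/mol, approximate affinity ratio = {selectivity:.0f}")
```

A difference of this size corresponds to a preference of several hundred fold for the sulfated ligand under the stated assumptions, which is in line with residual binding to sLex falling below the detection limit of the array.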
The observation that the double mutant bound to 6′-sulfo-sLacNAc was unexpected given that E-selectin-ligand interactions are characterized by a coordination between the O3 and O4 hydroxyl groups of the fucose residue and the Ca2+ ion, leading to the classification of this protein as a C-type lectin (70). To confirm the requirement for Ca2+ in wild-type E-selectin binding and to define this dependence for the recombinant mutants, the glycan array screening experiments were repeated for each system in the presence of 10 mM ethylenediaminetetraacetic acid (EDTA). Under these conditions, no binding to any ligands was observed (SI Appendix, Fig. S9). This result would be expected for binding that depends on coordination of the fucose ring to the Ca2+ ion but suggested that some alternative Ca2+-dependent mode of interaction must be present to explain loss of binding of 6′-sulfo-sLacNAc in the presence of EDTA.
To establish a molecular mechanism for the binding of 6′-sulfo-sLacNAc to the double mutant, an initial model for this complex was generated based on the structure of this mutant bound to 6′-sulfo-sLex. The fucose was then removed, and the complex was subjected to three independent MD simulations (200 ns). During the MD simulations, the ligand remained bound to the mutant but adopted a modified orientation (ring atom RMSD of ∼4 Å) compared to the initial ligand position (SI Appendix, Figs. S6 and S10), arising from a change in the conformation of the Neu5Acα2-3Gal glycosidic linkage angles (SI Appendix, Fig. S11). In the absence of the fucose residue, the reoriented ligand retained its characteristic interactions between the sialic acid and the protein (SI Appendix, Table S6) but was now able to form a stable water-mediated interaction between the sulfate moiety and the Ca2+ ion. This interaction was detected by performing an analysis of high-occupancy water positions in the MD trajectory and was confirmed to be present in all three independent MD simulations (SI Appendix, Fig. S12). For reference, a similar water-mediated interaction between a sulfate group and a Ca2+ ion has been reported in human annexin V. The presence of a persistent interaction between the sulfate moiety and the Ca2+ ion, albeit water mediated, provides a rationalization for the observed Ca2+-dependent nature of the binding between 6′-sulfo-sLacNAc and the double mutant and simultaneously explains the abolition of a requirement for fucose in the ligand. An equivalent interaction was not observed in the complex of the double mutant with 6′-sulfo-sLex.
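The ring atom RMSD used above to describe the ligand reorientation can be sketched as a simple positional deviation. This minimal Python example assumes that the trajectory frames have already been superimposed on the protein, so that no further fitting of the ligand is applied; the coordinates are hypothetical and the function name is illustrative, not taken from the analysis software used in the study.

```python
import math


def ring_atom_rmsd(reference, frame):
    """Root-mean-square deviation (same units as the input) between two
    equally ordered lists of (x, y, z) ring-atom coordinates.

    Assumption: frames were already aligned on the protein, so no ligand
    superposition is performed here.
    """
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xr, yr, zr), (xf, yf, zf) in zip(reference, frame):
        squared_sum += (xf - xr) ** 2 + (yf - yr) ** 2 + (zf - zr) ** 2
    return math.sqrt(squared_sum / len(reference))


# Hypothetical ring-atom coordinates (in angstroms), for illustration only.
reference_ring = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0)]
# A rigid shift of 4 angstroms along x mimics a ligand displaced from its initial position.
shifted_ring = [(x + 4.0, y, z) for x, y, z in reference_ring]

displacement = ring_atom_rmsd(reference_ring, shifted_ring)
print(f"ring atom RMSD = {displacement:.1f} angstroms")
```

Because no ligand fitting is applied, this measure reports both translation and rotation of the ligand relative to the protein, which is the sense in which a value of ∼4 Å indicates a modified orientation in the binding site.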
A closer examination of the MD data indicated that the position of this mediating water molecule corresponds to that of one of the Ca2+-coordinated water molecules in the apo-E-selectin crystal structure [water molecule 315 in PDBID 1ESL (72)] and also to the position that is occupied by fucose hydroxyl group O3 after binding to sLex in the reported crystal structure (PDBID 4CSY). It thus appears that the double mutant displays two modes of ligand recognition, one that parallels the canonical C-type lectin binding of E-selectin with additional interactions between the sulfate moiety and residues N105, K111, and K113 and an alternative mode that may occur in the absence of fucose, wherein the sialic acid and sulfate moieties maintain their key interactions with the protein but in which the sulfo group additionally forms a water-mediated electrostatic interaction with the Ca2+ ion (SI Appendix, Table S6). This fascinating possibility will be the subject of further theoretical and experimental investigation.
Orthogonal Binding Modes Display Equivalent Ligand Specificity.
The selectivity displayed in the glycan microarray for E92A/E107A appeared to be indistinguishable from that reported for wild-type Siglec-8 (28,73) (Fig. 3B), suggesting that the recognition of 6′-sulfo-sLex has been engineered into E-selectin by the double mutation. Examination of the conformations for 6′-sulfo-sLex in the MD simulations of the complexes with E92A/E107A and Siglec-8 showed that the glycosidic linkages of the ligand in both complexes displayed the same distributions (SI Appendix, Figs. S1 and S6), indicating that the 6′-sulfo-sLex ligand adopted the same conformation in each complex (Fig. 4). Remarkably, the engineered 6′-sulfo-sLex recognition motif in E92A/E107A is not equivalent to that in the wild-type Siglec-8 (Fig. 4). Whereas the binding of 6′-sulfo-sLex to the E-selectin double mutant is driven predominantly by interactions with the sulfate moiety, the sialic acid, and the fucose (via coordination to a Ca2+ ion conserved across the selectins), in the complex with Siglec-8, the affinity arose primarily from interactions with the sulfate moiety and the sialic acid (Table 2). Similar to the wild-type E-selectin-sLex complex, in the double mutant, the fucose contributed nearly half of the total binding energy from 6′-sulfo-sLex. By contrast, in the wild-type Siglec-8-6′-sulfo-sLex complex, the fucose made a negligible contribution to binding, while the sialic acid residue contributed more than half of the total binding energy. The sulfate group in the Siglec-8-6′-sulfo-sLex complex was predicted to contribute −1.3 ± 0.2 kcal/mol to the total binding energy, which may be compared to a value of −3.3 ± 0.6 kcal/mol for the interaction with the same moiety in the E92A/E107A mutant.
The higher magnitude of the binding energy attributable to the sulfate in the double mutant complex is consistent with the observation that in the double mutant, the sulfate interacts with two charged residues (K111 and K113), while in the complex with Siglec-8, it interacts only with one charged residue (R56). Overall, the energy decomposition pattern correlated well with the presence of hydrogen bonds observed in the MD simulations (Table 1 and SI Appendix, Table S1).
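The comparison of the two sulfate contributions can be made explicit with a short calculation. The sketch below is illustrative only: it takes the two values quoted above and assumes, as a simplification not stated in the text, that their uncertainties are independent, so that they combine in quadrature. The function name `difference_with_error` is hypothetical.

```python
import math

def difference_with_error(a, sd_a, b, sd_b):
    """Return a - b and its uncertainty, ASSUMING independent errors."""
    return a - b, math.hypot(sd_a, sd_b)

# Sulfate contributions to the binding energy quoted in the text (kcal/mol).
mutant = (-3.3, 0.6)   # E92A/E107A double mutant
siglec8 = (-1.3, 0.2)  # wild-type Siglec-8

delta, delta_sd = difference_with_error(*mutant, *siglec8)
print(f"Difference (mutant - Siglec-8): {delta:.1f} +/- {delta_sd:.1f} kcal/mol")
```

Under this assumption, the sulfate contribution in the double mutant is more favorable by about 2 kcal/mol, a difference larger than the combined uncertainty.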

Discussion
The present modeling study demonstrated the use of MD simulations and per-residue energy analyses to rationally guide the engineering of sulfate-binding specificity into a nonsulfate-binding protein. Specifically, the modeling was used to convert the specificity of E-selectin from a preference for its endogenous ligand (sLex) to the 6′-sulfated form of the same oligosaccharide (6′-sulfo-sLex). MD simulations predicted that E-selectin does not recognize 6′-sulfo-sLex due to repulsions between the negatively charged side chains of E92 and E107 and the 6′-sulfo group. Additionally, simulations suggested that removing only one of the negatively charged side chains was not sufficient to stabilize binding of the sulfated ligand. However, a simultaneous double mutation, which eliminated the unfavorable repulsions by removing negatively charged side chains, reduced affinity to the endogenous sLex ligand with a loss of a key hydrogen bond and introduced new favorable interactions to the 6′-sulfo group. These predictions were confirmed experimentally by screening the double mutant (expressed in HEK293 cells) against a glycan array containing more than 600 glycans, which confirmed not only that the double mutant exclusively bound to glycans terminating in the 6′-sulfo-sialyl motif but also that neither the wild-type E-selectin nor either of the single mutants showed this specificity.
The present study demonstrated a successful example of the rational design of a protein-based probe for a sulfated glycan through modeling the effects of site-directed mutagenesis. The work also led to the serendipitous discovery of a putative interaction mode for nonfucosylated 6′-sulfo-sLex. As predicted computationally and confirmed experimentally, a double mutation was required to introduce the desired specificity. Notably, the computational analyses indicated that engineering the desired specificity required three components: 1) removal of destabilizing steric and electrostatic interactions between the 6′-sulfate and E92 and E107; 2) creation of favorable electrostatic interactions between the 6′-sulfo group and K111, K113, and N105, enabled by the E92A/E107A mutations; and 3) loss of a favorable hydrogen bond from Gal-O6 in the endogenous ligand to E92.
The first two components enhanced affinity for the novel ligand, while the third was required to eliminate affinity for the endogenous glycan. Indeed, it seems reasonable that introducing specificity required not only the creation of favorable interactions with the new ligand but also the removal of interactions that favored binding to the endogenous ligand.

Materials and Methods
Structure Preparation. The initial coordinates for the E-selectin–sLex and Siglec-8–6′-sulfo-sLex complexes were obtained from the PDB (entry codes 4CSY [57] and 2N7B [58], respectively). Chain A was extracted from the E-selectin–sLex complex crystal structure, with the water molecules removed. Mutants of Siglec-8 (R109A, R56A, and K116A) and E-selectin (E92A, E107A, and E92A/E107A) were created by removing the extra atoms in the side chain of the corresponding residues. Both the addition of a sulfate group to the O6 position of the galactose residue in sLex and the removal of the fucose residue from sLex in the E-selectin complexes were performed by using the tLEaP module in AMBER15 (75). Force field parameters for sLex and the 6′-sulfo group were taken from the GLYCAM06 (version j) (76) parameter set, and those for the proteins were taken from AMBER15 (ff99sb) (77). Sodium or chloride counter ions were added to neutralize each protein complex using the tLEaP module before they were solvated in a truncated octahedral box (8-Å buffer with transferable intermolecular potential 3-point [TIP3P] water model).

[Figure caption fragment; beginning not recovered] ... (Table 1) are shown as small green spheres. Monosaccharides are drawn and colored according to 3D-SNFG nomenclature (60) (fucose, red cone; GlcNAc, blue cube; galactose, yellow sphere; Neu5Ac, purple diamond); 6′-sulfo groups are shown in stick model. Protein solvent-accessible surfaces were computed with VMD (74), with Siglec-8 shown in gray and E92A/E107A shown in cyan. Note, the viewing angle is different from that presented in Fig. 2. (B) Schematic representations of binding sites showing hydrogen bonds as dashed lines.
Simulation Setup. Energy minimizations of the solvated complexes were performed under canonical ensemble (nVT) conditions with a two-step procedure. First, the positions of water molecules and counter ions were restrained (100 kcal/mol·Å²). Second, all restraints were removed except for those on the Cα atoms of the protein backbone and the ring atoms of the glycan, and the minimization cycle was repeated. After energy minimization, each system was heated to 300 K over 50 ps (nVT) with a restraint (10 kcal/mol·Å²) on the same atoms as those in the previous step. Before data collection, systems were equilibrated at 300 K under the isothermal-isobaric ensemble (nPT) with a Berendsen thermostat (78) for 10 and 0.5 ns, consecutively. In the first equilibration, the same restraint as that in the heating process was applied. Then, the restraints on the ring atoms of the glycan were removed in the second equilibration.
Production MD simulation for each complex was performed for 200 ns with the graphics processing unit (GPU) implementation of PMEMD from the AMBER15 software package (79) with the same restraints as in the previous step. In all MD simulations, covalent bonds involving hydrogen atoms were constrained using the SHAKE algorithm (80), allowing a simulation time step of 2 fs. A nonbonded cutoff of 8 Å was applied to van der Waals interactions, with long-range electrostatics treated with the particle mesh Ewald approximation. Standard 1-4 nonbonded scale factors (1.0 and 2.0/1.2) were applied within the ligand and protein, respectively (76). MD simulations for all complexes were performed independently three times.
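As a bookkeeping aid, the stage durations above can be converted into integration step counts at the 2-fs time step. The following is a minimal sketch, not part of the published workflow; the function `n_steps` and the dictionary keys are hypothetical names chosen for illustration.

```python
# Durations are taken from the protocol described in the text.
TIME_STEP_FS = 2.0  # time step permitted by SHAKE constraints on bonds to hydrogen

def n_steps(duration_ns: float, dt_fs: float = TIME_STEP_FS) -> int:
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * 1_000_000 / dt_fs)  # 1 ns = 1e6 fs

stages_ns = {
    "heating": 0.05,          # 50 ps to 300 K (nVT)
    "equilibration_1": 10.0,  # nPT, restraints as in heating
    "equilibration_2": 0.5,   # nPT, glycan ring restraints released
    "production": 200.0,      # per replicate; three replicates per complex
}

steps = {name: n_steps(t) for name, t in stages_ns.items()}
for name, count in steps.items():
    print(f"{name}: {count:,} steps")
```

Each 200-ns production run therefore corresponds to 100 million steps, repeated three times per complex.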

Binding Free Energy, Entropy, Carbohydrate Intrinsic (CHI) Energy Calculations, and Representative Ligand Structure Extraction. The MM-GBSA calculations for binding interaction energies and per-residue contributions were performed on 5,000 snapshots extracted evenly from 200 ns of MD simulation using a single-trajectory method with the MMPBSA.py.MPI module in AMBER. The GB-OBC1 model (81) and an internal dielectric constant (εint) of 3.0 were applied in all MM-GBSA calculations. Quasiharmonic (QH) entropies (ΔSRTV) were calculated using the cpptraj module in AMBERTOOLS (82) and fit linearly as a function of inverse simulation period. The intercept with the y axis of the linear fitting function is the extrapolation of QH entropy to an infinite simulation period (83). Conformational entropies associated with changes in the glycosidic torsion angle distributions that occur after binding were computed using the Karplus-Kushick approach (84). CHI energies associated with the glycosidic linkages in the ligand were computed with the corresponding torsion angles from the MD simulation trajectories and the reported energy curves (61). The CHI energies for the glycosidic linkages in Neu5Acα2-3Gal were not included. The CHI energies for the glycosidic linkages in Fucα1-3GlcNAc were computed by using the mirror image of the reported energy curves (61). The conformation of the ligand that was most similar to its average shape in the protein complex, acquired from all three independent MD simulations, was extracted and presented as its representative structure. The analyses of high-occupancy water positions in the MD simulation trajectories were performed with the visual molecular dynamics (VMD) volmap plugin (74), which computed the average densities of water molecules on a grid of cubic voxels (a cell size of 0.5 Å).
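The extrapolation of QH entropy to an infinite simulation period can be illustrated with a short sketch. The code below assumes that entropies have already been computed for several trajectory lengths (for example, with cpptraj) and simply fits them against the inverse period. The function name `extrapolate_qh_entropy` and the input values are hypothetical, and the data are synthetic rather than taken from the study.

```python
import numpy as np

def extrapolate_qh_entropy(periods_ns, entropies):
    """Fit S(t) = slope * (1/t) + intercept and return (intercept, slope).

    The intercept is the fitted value at 1/t -> 0, i.e. the estimate for an
    infinite simulation period.
    """
    inv_t = 1.0 / np.asarray(periods_ns, dtype=float)
    s = np.asarray(entropies, dtype=float)
    slope, intercept = np.polyfit(inv_t, s, 1)  # degree-1 least-squares fit
    return intercept, slope

# Synthetic example (NOT study data): entropies following S(t) = 100 - 200/t.
periods = [50.0, 100.0, 150.0, 200.0]           # trajectory lengths in ns
s_values = [100.0 - 200.0 / t for t in periods]  # arbitrary entropy units

s_inf, slope = extrapolate_qh_entropy(periods, s_values)
print(f"Extrapolated entropy: {s_inf:.2f}")
```

With real data the points will scatter around the line, so the quality of the linear fit is worth inspecting before the intercept is used.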
Cloning and Protein Expression. The gene for human E-selectin (including residues 22 to 558) was designed to include the transferrin secretion signal and C-terminal human Fc and 8×His tag and purchased from Genewiz. The resulting gene was cloned into pcDNA3.1 for expression in mammalian cells. The wild-type construct was used as a template to create the mutants E92A and E107A and the double mutant E92A/E107A using QuikChange mutagenesis. HEK293 Freestyle cells were transiently transfected with the expression construct using polyethylenimine, and culture supernatants were harvested 5 to 7 d after transfection. E-selectins were purified from the culture supernatants by nickel affinity chromatography, dialyzed against storage buffer (20 mM Tris-HCl, 250 mM NaCl, and 5 mM CaCl2, pH 7.4), flash frozen, and stored at −80 °C until use.
Microarray Experiments. The E-selectin proteins were run on Consortium for Functional Glycomics (CFG) version 5.2 microarrays (85,86). Microarray slides were rehydrated for 5 min in TSM buffer (20 mM Tris-HCl, 150 mM NaCl, 2 mM CaCl2, and 2 mM MgCl2) before adding 50 μg/mL of Fc-tagged human E-selectins in TSM binding buffer (TSM buffer with 1% bovine serum albumin). Microarrays were washed with TSM + 0.05% Tween-20, and bound selectins were detected with anti-human IgG-Alexa488 (Invitrogen) at 5 μg/mL. Because selectin binding is Ca2+ dependent, control experiments for all variants were performed, which included 10 mM EDTA in the binding buffer instead of CaCl2. Microarray slides were scanned with a Genepix 4300A (Molecular Devices) and quantified with Genepix Pro-7 software. The results are shown as relative fluorescent units by averaging the background-subtracted signals of four replicate spots (after discarding the highest and lowest values of the six printed spots), with error bars representing the SD of the four averaged values.
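The replicate handling described above (six printed spots, highest and lowest discarded, mean and SD of the remaining four) can be sketched as follows. This is an illustration under stated assumptions and not the Genepix processing itself: the function name `summarize_spots` and the signal values are hypothetical, and the sample SD (n − 1 denominator) is assumed because the text does not specify which SD was used.

```python
import statistics

def summarize_spots(signals):
    """Mean and SD of replicate spots after dropping the highest and lowest."""
    if len(signals) < 4:
        raise ValueError("need at least four replicate spots")
    kept = sorted(signals)[1:-1]  # discard one minimum and one maximum
    # Sample SD (n - 1) is an ASSUMPTION; the text does not specify.
    return statistics.mean(kept), statistics.stdev(kept)

# Hypothetical background-subtracted signals (RFU) for six printed spots.
spots = [1200.0, 950.0, 1010.0, 990.0, 1050.0, 400.0]

mean_rfu, sd_rfu = summarize_spots(spots)
print(f"{mean_rfu:.0f} +/- {sd_rfu:.1f} RFU")
```

Discarding the extremes before averaging limits the influence of a single misprinted or saturated spot on the reported signal.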
Data, Materials, and Software Availability. All study data are included in the article and/or SI Appendix.