The Crystal Structure of the Calystegia sepium Agglutinin Reveals a Novel Quaternary Arrangement of Lectin Subunits with a β-Prism Fold*

The high number of quaternary structures observed for lectins highlights the important role of these oligomeric assemblies during carbohydrate recognition events. Although a large diversity in the mode of association of lectin subunits is frequently observed, the oligomeric assemblies of plant lectins display small variations within a single family. The crystal structure of the mannose-binding jacalin-related lectin from Calystegia sepium (Calsepa) has been determined at 1.37-Å resolution. Calsepa exhibits the same β-prism fold as identified previously for other members of the family, but the shape and the hydrophobic character of its carbohydrate-binding site are unlike those of other members, consistent with surface plasmon resonance analysis showing a preference for methylated sugars. Calsepa reveals a novel dimeric assembly that is markedly dissimilar to those described earlier for Heltuba and jacalin but mimics the canonical 12-stranded β-sandwich dimer found in legume lectins. The present structure exemplifies the adaptability of the β-prism building block in the evolution of plant lectins and highlights the biological role of these quaternary structures for carbohydrate recognition.

Recent structural studies have pointed to a marked evolutionary plasticity within the family of the jacalin-related lectins (JRL). Although all members of this still expanding lectin family share high sequence similarity and are built up of subunits with a similar overall architecture, they differ from each other with respect to their carbohydrate-binding specificities, the molecular structure of their protomers, and their subcellular location. Accordingly, the JRL are now subdivided into two subfamilies, the galactose- and mannose-specific JRL (1). The recent identification of a JRL in a true fern further documents the widespread distribution of this lectin family in the plant kingdom (2).
The galactose-specific jacalin-related lectins (gJRL) form a small, homogeneous subgroup of plant lectins: galactose/T-antigen-binding agglutinins occurring exclusively in the Moraceae. Jacalin, the first member identified in this family, follows the secretory pathway and eventually accumulates in storage protein vacuoles (3). All currently known gJRL closely resemble jacalin, which can be considered the prototype of this subfamily. The native jacalin lectins are built up of four identical protomers, each consisting of a heavy (α) and a light (β) polypeptide chain. A complex series of co- and post-translational proteolytic modifications of a prepro-protein yields two separate chains that remain tightly associated by non-covalent interactions (4). The mannose-specific jacalin-related lectins (mJRL) are a growing group of lectins that occur in a wide range of species from different taxonomic groups and display an exclusive carbohydrate specificity toward mannose (5)(6)(7)(8). Unlike jacalin, native mJRL are built up of two, four, or eight protomers consisting of a single uncleaved polypeptide chain. They are synthesized and located in the cytoplasm and do not undergo co- or post-translational proteolytic modifications (3).
The crystal structures of two gJRL and one mJRL have been reported. The jacalin structure, which consists of a β-prism fold built up of three Greek key motifs, revealed a novel type of carbohydrate-binding site made of the N terminus of the α-chain and two surface loops (9). Moreover, the jacalin structure unambiguously demonstrated that native jacalin consists of the tight association of four identical protomers. Because each protomer possesses a single sugar-binding site, jacalin can be considered a tetravalent lectin. This tetrameric arrangement of protomers can be extended to the Maclura pomifera agglutinin, for which the structure is virtually identical to that of jacalin (10). In contrast, the crystal structure of an mJRL from Helianthus tuberosus (Jerusalem artichoke) (Heltuba) demonstrated that eight protomers of Heltuba are assembled as a donut-shaped octamer with eight solvent-exposed carbohydrate-binding sites (11). Although Heltuba shares the same overall β-prism fold as jacalin, the shape of the carbohydrate-binding site differs because its protomer is not cleaved into an α- and a β-chain and accordingly contains an additional loop that modifies the shape of the sugar-binding site (10,12). Although there is no doubt that the structure of the Heltuba protomer can be considered a model for other mJRL, it is unlikely that its quaternary arrangement into octamers can be extended to all members of this subfamily. Indeed, the occurrence of dimeric and tetrameric mJRL strongly argues for the existence of different quaternary arrangements (1).
In a further step toward understanding the carbohydrate specificity and the oligomeric assembly of a novel mJRL, we present the 1.37-Å resolution crystal structure of a lectin from Calystegia sepium (hedge bindweed) (Calsepa) in the absence of bound sugar. The protomer structure of Calsepa closely resembles that of Heltuba, but the shape and hydrophobic character of its carbohydrate-binding site are markedly modified compared with those of other members of the subfamily, consistent with our surface plasmon resonance data. Most importantly, Calsepa exhibits a novel dimeric assembly that unexpectedly mimics the canonical 12-stranded β-sandwich dimer typically found in legume lectins and may create an enlarged carbohydrate-binding site.
CD Analysis-The far-UV CD spectrum of Calsepa was measured at room temperature with a Jobin and Yvon CD6 spectropolarimeter (Division d'Instrumentation S.A., Longjumeau, France) using 0.1-cm path length quartz cuvettes. Calsepa was dissolved in either HBS (μ = 0.155) or 20 mM phosphate-buffered saline (pH 7.4, μ = 0.155) at a final concentration ranging from 150 to 500 μg·ml⁻¹ (10.4 and 35 μM, respectively). The proportion of β-strand and α-helical secondary structure elements was estimated from the CD spectra (15).
Surface Plasmon Resonance-The specific interaction of Calsepa and Heltuba with immobilized arcelin-1 was analyzed by surface plasmon resonance (SPR) using a BIAcore 1000 biosensor (BIAcore AB, Uppsala, Sweden). The lectins, at a concentration of 100 μg·ml⁻¹ in HBS, were injected for 5 min onto the glycoprotein-bound surface of the sensor chip at a flow rate of 5 μl·min⁻¹. The change of the SPR response (expressed as resonance units or RU) was monitored at 25°C for 9.30 min. For immobilization, glycoproteins were used at a concentration of 100 μg·ml⁻¹ in 5 mM sodium acetate buffer, pH 4.0. According to the change of SPR response as a result of the immobilization of the glycoproteins on the carboxymethylated dextran layer covering the sensor chip, an estimated surface concentration of 10 ng·mm⁻² of dextran was obtained for the immobilized glycoprotein.
Inhibition assays were performed by injecting monosaccharides, used at concentrations ranging from 5 to 25 mM in HBS, at the beginning of the dissociation phase of the arcelin-1-lectin complex for 5 min at a flow rate of 5 μl·min⁻¹, and the change of the SPR response (RU) was monitored at 25°C for 9.30 min. Inhibition was expressed as the percentage of lectin remaining fixed on the arcelin-bound surface. Each experimental value is the mean of duplicate experiments.
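The way inhibition is expressed above can be illustrated with a minimal sketch. It assumes that the percentage of lectin remaining is computed from three baseline-referenced SPR readings; the function names and the RU values in the example are hypothetical and are not taken from the experiments.

```python
def percent_lectin_remaining(ru_baseline, ru_bound, ru_after_inhibitor):
    """Percentage of lectin still fixed on the surface after sugar injection.

    ru_baseline:        response of the glycoprotein surface before lectin injection
    ru_bound:           response at the start of the dissociation phase
    ru_after_inhibitor: response at the end of the monitoring period
    (All three arguments are assumed readings in resonance units.)
    """
    bound = ru_bound - ru_baseline
    if bound <= 0:
        raise ValueError("no lectin bound: ru_bound must exceed ru_baseline")
    remaining = ru_after_inhibitor - ru_baseline
    return 100.0 * remaining / bound


def mean_of_duplicates(first, second):
    """Average two duplicate measurements, as done for each reported value."""
    return (first + second) / 2.0


if __name__ == "__main__":
    # Hypothetical duplicate runs, for illustration only.
    run1 = percent_lectin_remaining(0.0, 400.0, 100.0)
    run2 = percent_lectin_remaining(0.0, 380.0, 114.0)
    print(mean_of_duplicates(run1, run2))
```

A lower percentage corresponds to a stronger inhibitor, because more lectin has been displaced from the arcelin-bound surface.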
Crystallization and Data Collection-For crystallization, ammonium sulfate-precipitated Calsepa was extensively dialyzed against 10 mM Hepes, pH 7.4, and 150 mM NaCl and concentrated to 10 mg/ml. Crystals were obtained at 20°C with the Enrico Stura PEG footprint screen using the vapor diffusion technique (16). Typically, 2 μl of the protein solution were mixed with 2 μl of the reservoir solution made of 20-30% PEG 4K and 0.2 M imidazole/malate, pH 6.0. Calsepa crystals belong to the triclinic space group P1 with unit cell dimensions a = 30.56 Å, b = 51.79 Å, c = 79.8 Å, α = 104.99°, β = 94.36°, and γ = 94.85°, and they contain two Calsepa dimers per asymmetric unit. Crystals selected for data collection were rapidly soaked in the reservoir solution supplemented with 20% ethylene glycol, flash cooled at 100 K in the nitrogen gas stream, and stored in liquid nitrogen. Data were collected on the ESRF beamline ID14-EH3 (Grenoble, France). Oscillation images were integrated with DENZO (17) and scaled and merged with SCALA (18). Structure factor amplitudes were generated with TRUNCATE (18).
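As a plausibility check on the stated cell contents, the sketch below computes the triclinic cell volume from the reported cell dimensions and derives a Matthews coefficient. It is not part of the original analysis. It assumes four chains per cell (two dimers in the single asymmetric unit of P1), the subunit mass of 16,062 Da calculated from the sequence and reported in the Results, and the usual approximation that the solvent fraction is 1 − 1.23/V_M. The function names are illustrative.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (Å^3) of a general triclinic unit cell."""
    ca, cb, cg = (math.cos(math.radians(x))
                  for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg
                                 + 2.0 * ca * cb * cg)


def matthews_coefficient(cell_volume, chains_per_cell, chain_mass_da):
    """Matthews coefficient V_M in Å^3 per Da."""
    return cell_volume / (chains_per_cell * chain_mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    volume = triclinic_cell_volume(30.56, 51.79, 79.8, 104.99, 94.36, 94.85)
    # Assumption: P1 has one asymmetric unit per cell, holding four chains.
    v_m = matthews_coefficient(volume, 4, 16062.0)
    print(volume, v_m, solvent_fraction(v_m))
```

Under these assumptions the cell volume is about 1.2 × 10⁵ Å³ and V_M is close to 1.9 Å³/Da, which corresponds to a densely packed crystal with roughly one-third solvent.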
Structure Determination and Refinement-Initial phases were obtained by molecular replacement using a Heltuba monomer (Protein Data Bank code 1C3K) (11) as a search model with the AMoRe program (19), giving a correlation coefficient and an R-factor value of 38.9 and 47.7%, respectively, in the 15- to 4-Å resolution range. Of 611 residues, 497 (81% of the polypeptide chain) were built without any manual intervention and refined to R-factor and R-free values of 19.8 and 25.6%, respectively, using ARP/wARP version 5.0 (20). Final refinement stages of the ARP/wARP model were then performed with REFMAC (21) using data between 20 and 1.37 Å, and the resulting SigmaA-weighted 2Fo − Fc and Fo − Fc electron density maps were used to correct the model with the graphics program TURBO-FRODO (22). The final model comprises acetylated N-terminal residues Ala 2 to His 152 / Lys 153 for the four Calsepa molecules, two malate molecules, four imidazole moieties, seven ethylene glycol molecules, and 589 solvent molecules. The root mean square deviations between the four Calsepa molecules present in the asymmetric unit are within the 0.3-0.6 Å range for 147 Cα atoms. The root mean square deviation between Calsepa and Heltuba (1C3K) is 1.3 Å for 125 Cα atoms; that between Calsepa and jacalin (1JAC) is 1.6 Å for 103 Cα atoms. The stereochemistry of the model was analyzed with PROCHECK (23); no residues were found in the disallowed regions of the Ramachandran plot. Data collection and refinement statistics are summarized in Table I. The coordinates and structure factors of Calsepa have been deposited with the Protein Data Bank, accession code 1OUW.
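For readers less familiar with the agreement statistics quoted above, the following sketch shows the conventional definition of the crystallographic R-factor, R = Σ||Fo| − |Fc|| / Σ|Fo|. It is an illustration, not the refinement programs' implementation. It assumes the calculated amplitudes have already been scaled to the observed ones, and the amplitude values are invented. R-free uses the same formula restricted to reflections excluded from refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor between observed and calculated amplitudes.

    Both sequences must list the same reflections in the same order;
    f_calc is assumed to be already on the scale of f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean test-set flag and return (R-work, R-free)."""
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


if __name__ == "__main__":
    # Invented amplitudes, for illustration only.
    print(r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 25.0]))
```

Multiplying the returned fraction by 100 gives the percentage form used in the text.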
Molecular Modeling and Docking Experiments-The electrostatic potentials were calculated and displayed with GRASP using the parse3 parameters (24). The solvent probe radius used for molecular surfaces was 1.4 Å, and a standard 2.0-Å Stern layer was used to exclude ions from the molecular surface (25). The inner and outer dielectric constants applied to the protein and the solvent were fixed at 4.0 and 80.0, respectively, and the calculations were performed keeping the ionic strength equivalent to 0.15 M NaCl. No even distribution of the net negative charge of the carboxylic group of negatively charged residues was performed between their two oxygen atoms prior to the calculations.
The docking of Man, α-Met-Man, and Manα1-3Man into the carbohydrate-binding site of Calsepa was performed on a Silicon Graphics O2 R10000 workstation using the programs InsightII, Homology, and Discover (Molecular Simulations, San Diego, CA). The lowest apparent binding energy (E_bind, expressed in kcal·mol⁻¹) compatible with the hydrogen bonds, considering van der Waals interactions and strong, 2.

RESULTS AND DISCUSSION
Quality and Overall View of the Structure-The Calsepa structure has been solved by the molecular replacement method using the coordinates of Heltuba as a search model (Protein Data Bank accession code 1C3K). The final structure has a crystallographic R-factor value of 15.4% (R-free = 18.2%) in the 20 to 1.37 Å resolution range and has good stereochemistry (Table I). Calsepa belongs to the β-prism fold family and consists of three four-stranded β-sheets (β1 to β2 together with β11 to β12, β3 to β6, and β7 to β10) that form three Greek key motifs with overall dimensions of 35 × 25 × 20 Å (Fig. 1). The far-UV CD spectrum of Calsepa exhibits a negative band close to 217 nm (data not shown), indicating that β-sheet secondary structures predominate in the protein, consistent with our structural data. As a signature of the mJRL subgroup, Calsepa consists of a single-chain protomer and thus contains the additional loop that connects β1 to β2 within the first Greek key motif. Overall, the Calsepa monomer closely resembles that of Heltuba, the first representative member of the mannose-specific subfamily (11), whereas it is more distantly related to jacalin or related homologues of the gJRL subfamily (9). Yet the crystal structure of Calsepa displays large structural differences in the vicinity of the carbohydrate-binding site and in loops connecting β-strands compared with the structures of other members of the β-prism fold family.
2 Available on the World Wide Web at: mackerel.tamu.edu/spock/.
The Carbohydrate-binding Site-To assess the carbohydrate specificity of Calsepa and Heltuba, we carried out SPR experiments using inhibition of the lectin-arcelin interaction to study the interaction of these lectins with various monosaccharides (Fig. 2). Both Calsepa and Heltuba preferentially recognize Man at concentrations ranging between 5 and 25 mM but also readily interact with Glc. Unlike Heltuba, Calsepa exhibits a marked preference for methylated sugar derivatives, such as α-MeMan and α-MeGlc, at concentrations down to 5 mM. These differences in sugar specificity between the two lectins suggest that the carbohydrate-binding site of Calsepa may contain additional hydrophobic residues to accommodate the methyl moiety.
Attempts to obtain the crystal structure of Calsepa bound to α-Man or α-Met-Man were unsuccessful, but in the course of the refinement procedure ethylene glycol and imidazole molecules were found to be well ordered in the electron density maps and located within the carbohydrate-binding site in all four Calsepa molecules present in the asymmetric unit (Fig. 3). Structural comparison of Calsepa with Heltuba bound to Manα1-2Man or Manα1-3Man reveals that the ethylene glycol moiety perfectly mimics the O-6-C-6-C-5-O-5 atoms of the primary mannose moiety, whereas the imidazole ring mimics the second mannose ring (Fig. 3).
The carbohydrate-binding site of Calsepa presents significant structural differences compared with Heltuba. We previously identified that the carbohydrate-binding site is delimited by three loops in Heltuba: β1-β2, β7-β8, and β11-β12. In Calsepa, the differences in the position of one of these three loops (β7-β8) along with the large extension of the two adjacent β3-β4 and β5-β6 loops significantly modify the shape of the carbohydrate-binding site. The more pronounced hydrophobic character of the carbohydrate-binding site compared with that of Heltuba is consistent with SPR experiments showing the preference of Calsepa for methylated sugar derivatives. Yet the hydrophobic interaction seen between the pyranose ring of Man and Met 92 in the Heltuba-Man complex has no equivalent in Calsepa because Met 92 is replaced by Asn 96, a residue that, in turn, could establish polar interactions with the second mannose moiety. Unlike in Heltuba, the longer β5-β6 loop in Calsepa rigidifies the Tyr 141 aromatic ring located in the β11-β12 loop, a residue that is well positioned to interact with the B face of the second mannose moiety (Fig. 3). Indeed, the Tyr 141 aromatic ring, which replaces Asp 136 in Heltuba, protrudes from the Calsepa carbohydrate-binding site and helps explain the preference of this lectin for methylated sugar derivatives.
FIG. 3 legend (partial). ... replaces Heltuba Met 92, is within hydrogen bond distance to the O2 hydroxyl group of the second mannose moiety. E, molecular surface of the extended carbohydrate-binding site in Calsepa located at the dimer interface with bound ethylene glycol and imidazole molecules. The two subunits are colored in yellow and blue; the primary binding site is indicated in white, the second in green, and Asn 18, located at the junction, in magenta.
A Novel Dimeric Interface-The Calsepa crystal structure identifies a novel dimer interface for a member of the JRL family. The dimer interface is formed from the tight association of the N-terminal region (residues 2-8) and the β1, β2, and β5 strands from each protomer. Interestingly, although β1 is involved in the jacalin and Heltuba dimer interfaces, and β2 and β5 are also recruited for the second dimer interface necessary to build the donut-shaped octameric assembly of Heltuba, the two subunits adopt a drastically different orientation (Fig. 4). In Calsepa, the buried surface area calculated with a 1.6-Å probe radius encompasses 1327 Å² on each subunit and involves 25 residues, a value larger than that found for the Heltuba (800 Å²) and jacalin (980 Å²) dimers. The Calsepa N-terminal extension (6 residues out of 8 are buried at the dimer interface) plays a key role in dimer stabilization: the two N-terminal arms are exchanged between the subunits and lock the dimeric assembly. With the exception of the tip of the β4-β5 loop, the positions of the β-strands recruited to form this novel dimer interface are conserved within members of the JRL family, suggesting that the nature of the residues within these structural elements, rather than a large conformational change of these secondary structural elements, is a key factor in dictating the type of dimeric association. The differences in length and amino acid composition of the N-terminal region of Calsepa compared with other members of the JRL family support such a hypothesis. Indeed, this region, which protrudes from the core of the β-prism, is also necessary for the octameric assembly seen in Heltuba, as is the corresponding β-chain for the tetrameric assembly of jacalin (Fig. 4).
To verify whether the quaternary assembly into dimers seen in the crystal structure also exists in solution, the molecular assembly of native Calsepa was determined by analytical ultracentrifugation. These experiments yielded a sedimentation coefficient of 2.57861 S and an apparent molecular mass of 29,780 Da (data not shown). Sedimentation equilibrium experiments indicated that dimeric and monomeric forms of Calsepa co-exist in solution. However, the dimeric form is largely predominant, with a Ka value of 4.07 × 10⁻⁶ M. Therefore, we can assume that native Calsepa is a dimer in solution consisting of two identical subunits of 16,062 Da (a value calculated from the amino acid sequence of the protein). These findings are also supported by gel filtration experiments (data not shown).
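The statement that the dimer predominates can be made concrete with a simple two-state sketch. It assumes, and this is only an assumption, that the reported equilibrium constant of 4.07 × 10⁻⁶ M, which carries units of concentration, acts as a dissociation constant K_d for 2 M ⇌ D, so that K_d = [M]²/[D] with the total subunit concentration equal to [M] + 2[D]. The function names and the example concentration are illustrative.

```python
import math


def free_monomer_concentration(total_subunit_m, kd_m):
    """Free monomer concentration (M) for a monomer-dimer equilibrium.

    Solves 2[M]^2 + Kd[M] - Kd*total = 0, which follows from
    Kd = [M]^2 / [D] and total = [M] + 2[D].
    """
    if total_subunit_m <= 0 or kd_m <= 0:
        raise ValueError("concentrations must be positive")
    return (-kd_m + math.sqrt(kd_m * kd_m + 8.0 * kd_m * total_subunit_m)) / 4.0


def fraction_of_subunits_in_dimer(total_subunit_m, kd_m):
    """Fraction of all subunits that are part of a dimer."""
    monomer = free_monomer_concentration(total_subunit_m, kd_m)
    return 1.0 - monomer / total_subunit_m


if __name__ == "__main__":
    KD = 4.07e-6            # assumed dissociation constant, in M
    SUBUNIT_MASS = 16062.0  # Da, calculated from the sequence
    # 10 mg/ml, the concentration used for crystallization, in mol/l of subunits.
    total = 10.0 / SUBUNIT_MASS
    print(fraction_of_subunits_in_dimer(total, KD))
```

Under this assumption more than 90% of the subunits are dimeric at 10 mg/ml, and the apparent mass of 29,780 Da is about 93% of the 32,124 Da expected for two subunits of 16,062 Da.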
The overall architecture of the Calsepa dimer consists of an extended 8-stranded β-sheet and strikingly resembles the so-called canonical dimeric association typically found in the legume lectin family, which consists of a continuous concave-shaped 12-strand β-sheet (29-31). Remarkably, these two structurally similar dimeric assemblies are formed by the quaternary arrangement of two protomers with an unrelated fold (β-prism versus jelly roll) (Fig. 4). In both Calsepa and Lathyrus ochrus isolectin I (LoL I), a representative member of the legume lectin family, the β-strands recruited for this tight association are oriented perpendicular to the axis of the dimer. Yet the two carbohydrate-binding sites are differently positioned: close to the dimer interface in Calsepa and at the distal ends in LoL I (Fig. 4). A similar organization of the carbohydrate-binding sites occurs in the B-chain of the type 2 ribosome-inactivating proteins (e.g. ricin). However, the sugar-binding sites are located on two distinct non-covalently linked protomers in Calsepa and LoL I, whereas they are located within a single polypeptide chain in the case of the ricin B-chain (Fig. 4). Such homology with the canonical dimeric assembly of legume lectins has also been observed for other molecules, such as spermadhesins, which belong to the unrelated CUB domain fold (32).
A large diversity in the oligomeric assembly among members of a single lectin family has not yet been documented, and the biological function of these quaternary structures remains to be fully elucidated. In the Calsepa dimer, the two carbohydrate-binding sites, which are located on the opposite face in each protomer, are separated by 38 Å, a distance preventing the formation of cross-linking interactions between different antennae of a single N-glycan. However, a second small cavity can be identified in the neighboring subunit; it is occupied by an imidazole moiety and surrounded by charged residues from the two subunits (Asn 18 from one subunit and Arg 23, Lys 27, Glu 131, Arg 151, Lys 153 from the second subunit) (Fig. 3). At the dimer interface, the N-terminal arm of a Calsepa subunit stabilizes the β1-β2 loop of the second subunit, a signature of members of the mJRL subfamily, and also contributes, through the Asp 6 side chain, to the formation of this putative second sugar-binding site. Whether this second cavity represents a second sugar-binding site must await further experimental studies, but the nature of the residues that line this small cavity, along with the close vicinity of these two sugar-binding sites, which are only 10 Å apart, argues for such a hypothesis. The identification of this putative second sugar-binding site might represent a unique feature of Calsepa among members of this lectin subfamily and thus supports a functional role of the quaternary structure of plant lectins during carbohydrate recognition events.
In summary, the crystal structure of Calsepa demonstrates that at least three different quaternary structures exist within the JRL family, resulting from different associations of structurally similar protomers. Given that the JRL, and especially the mJRL, currently represent an extended group of proteins, other types of quaternary arrangements may occur within this lectin subfamily. It seems likely that the diversity of quaternary arrangements among members of the JRL family is significantly higher than that of any other family of plant lectins, e.g. the legume lectins (33)(34)(35), the monocot mannose-binding lectins (36), and the type 2 ribosome-inactivating proteins (37,38). Although the biological significance of the differences in quaternary structures within the JRL family remains to be elucidated, it is evident that the number of binding sites per molecule and the corresponding valency profoundly affect the ability to establish multiple interactions and cross-link glycan receptors. The same conclusion can be drawn for the JRL as for other plant lectin families comprising members with a different quaternary structure and valency (e.g. dimeric versus tetrameric legume lectins; dimeric versus tetrameric monocot mannose-binding lectins).