The Covalent and Three-Dimensional Structure of Concanavalin A. III. STRUCTURE OF THE MONOMER AND ITS INTERACTIONS WITH METALS AND SACCHARIDES*

The three-dimensional structure of the lectin concanavalin A (Con A) has been determined at 2.0-Å resolution by x-ray diffraction analysis. The protomers are ellipsoidal domes of dimensions 42 × 40 × 39 Å. Folding of the polypeptide backbone is dominated by the presence of two antiparallel pleated sheets, a twisted sheet of seven strands passing through the center of the molecule and a bowed sheet of six strands which forms the back surface of the monomer. Manganese and calcium ions bind to the protein at adjoining sites to form a binuclear complex of two octahedra sharing a common edge. The ligands for each metal ion are four groups from the NH2-terminal region of the protein and 2 water molecules. The binding site for the inhibitor β-(o-iodophenyl)-D-glucopyranoside is in a deep cavity which contains distinct hydrophobic and hydrophilic binding subsites. Studies of the binding of β-(o-iodophenyl)-D-glucopyranoside to Con A in the crystalline state and in solution have indicated that the binding behavior of the protein is somewhat different in the two states.

for saccharide binding to occur (4,5,8,9). Although transition metals can be bound in the absence of Ca2+, recent nuclear magnetic resonance measurements (10) indicate that Ca2+ has a strong influence on the rate of the Mn2+-binding process. Circular dichroism studies have indicated that upon sequential binding of metal ions and saccharide there are conformational changes affecting aromatic residues, although the gross secondary structure of the protein is apparently not affected (11,12). Similarly, crystallographic experiments with demetallized Con A have shown that the metal-free protein is generally similar in structure to the native protein, although the geometrical relationships among the subunits are different (13).
The stereochemical requirements for the interaction between Con A and saccharides in solution have been studied extensively (4, 14-19) and D-mannose, D-glucose, and related compounds have been found to be particularly potent inhibitors of Con A. The protein apparently binds to glucosyl and mannosyl residues at the nonreducing termini of oligo- or polysaccharides (14,19) or to certain nonterminal mannosyl residues (20). Other stereoisomers of these saccharides, such as D-galactose, apparently do not inhibit the biological activities of Con A (14). Con A has been the subject of extensive structural investigations.
Crystals of Con A have been prepared by several investigators (21-27), and details of the subunit organization have been revealed by low resolution crystallographic studies (24-27). The transition metal binding site was located through diffraction studies on crystals of Con A in which Mn2+ had been replaced by Cd2+ (7) and the binding site of the heavy atom-labeled inhibitor β-IPGlc was also located by diffraction studies (28). We have made a preliminary report of the amino acid sequence and three-dimensional structure of Con A (29). A second crystallographic study of Con A (30) was in general agreement with our results.
We report here the primary crystallographic data and a description of the structure of the Con A protomer and of its known interactions with metals and saccharides.
These results are interpreted in the light of the complete chemical sequence reported in the two previous papers of this series (3,31).
In the following paper we present the atomic coordinates of Con A and an analysis of the hydrogen bonds and other noncovalent interactions involved in the stabilization of the secondary, tertiary, and quaternary structures of the protein (32).

MATERIALS AND METHODS
The protein was isolated from defatted jack bean meal (Schwarz-Mann) by the method of Agrawal and Goldstein (8), and freed of the naturally occurring fragments (1)

RESULTS AND DISCUSSION
Description of Structure-The crystallographic asymmetric unit of Con A contains one protomer of molecular weight 25,500. The protomers are compactly folded to form ellipsoidal domes of approximate height 42 Å and greatest cross section 40 × 39 Å. The domes have a somewhat smaller cross section (40 × 25 Å) at their bases. Two such domes, related by a 2-fold axis parallel to c, are joined base-to-base to form roughly ellipsoidal dimers, slightly constricted in the region of contact.
The dimers are in turn paired across additional crystallographic 2-fold axes to form roughly tetrahedral tetramers of 222 (D2) symmetry. At the center of the molecule, about the 222 point, is a small pocket of solvent having some communication with the outside solution. The arrangement of the four protomers in the Con A tetramer is illustrated schematically in Fig. 1.
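The 222 (D2) arrangement described above can be illustrated with a minimal sketch that generates four symmetry-related positions from one protomer by applying three mutually perpendicular 2-fold rotations. The protomer centre used here is a hypothetical value chosen only for illustration; it is not taken from the coordinates of Con A, and the axes are simply aligned with x, y, and z.

```python
import numpy as np

def twofold(axis):
    """Rotation matrix for a 180-degree rotation about `axis` (R = 2aa^T - I)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    return 2.0 * np.outer(a, a) - np.eye(3)

# Point group 222 (D2): identity plus 2-fold rotations about x, y, and z.
D2_OPS = [np.eye(3), twofold([1, 0, 0]), twofold([0, 1, 0]), twofold([0, 0, 1])]

def tetramer_positions(protomer_center):
    """Apply the four D2 operations to one protomer centre."""
    p = np.asarray(protomer_center, dtype=float)
    return np.array([op @ p for op in D2_OPS])

# Hypothetical protomer centre in angstroms (an assumption, not a Con A coordinate).
centers = tetramer_positions([15.0, 12.0, 20.0])
print(centers)
```

Because each operation flips the sign of two coordinates, the four centres sit on alternate corners of a rectangular box around the 222 point, which is why a tetramer of this symmetry appears roughly tetrahedral.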
In our low resolution studies of Con A, we observed that the molecular surface is relatively smooth and uninterrupted, except for one large depression or cavity extending deep into each protomer.
We surmised that these cavities might contain the carbohydrate binding sites of Con A and we later showed that the cavities are the site of binding of the heavy atom-labeled inhibitor β-IPGlc (28). The locations of the bound inhibitor molecules in the binding cavity of each protomer are indicated in Fig. 1, which also shows the relative locations of the essential Mn2+ and Ca2+ ions, more than 20 Å from the β-IPGlc-binding regions.
The folding of the polypeptide chain in Con A is depicted in Fig. 2, and the following discussion refers to a molecule of Con A oriented as in Fig. 2. The most striking feature of the structure is the presence of two large entirely antiparallel β structures or pleated sheets, which contain more than half the residues in the molecule (32). The large amount of β structure found is consistent with earlier circular dichroism and optical rotatory dispersion studies (11,12,48).
One of the pleated sheets (the "back" sheet, Fig. 3) forms almost the entire back surface of the molecule, including the rear of the β-IPGlc-binding cavity, and is associated with most of the interactions involved in dimer and tetramer formation.
The plane of this sheet is bent or curled back at the top through an angle of about 30°, but it is only slightly twisted.
The sheet includes a small region which turns forward from the plane of the rest of the sheet at the left in Fig. 3. The entire sheet contains about 64 residues arranged in six antiparallel chains and 9 residues in short connecting loops. The three top chains and two bottom chains of the sheet each comprise consecutive residues in the amino acid sequence. The site of the natural cleavage between residues 118 and 119 (1, 3) is located at the end of the loop that joins the bottom two chains of the rear β structure. It appears from the structure that this loop would be sufficiently accessible, especially in the dimeric form of Con A, for the hypothesized enzymatic cleavage (1) to occur. Furthermore, since the structure near this region is stabilized by hydrogen bonding between adjacent strands of the β structure, the cleavage would not be expected to have any significant effect on the folding of the polypeptide chain. This conclusion has been confirmed by examination of 2.8-Å maps of the difference in electron density between crystals grown from purified intact chains and crystals grown from the native mixture, phased either with the MIR phases or with the calculated phases. In either case, the map contains no interpretable structural features. The presence of some disorder in this region of even the intact subunit structure, however, is indicated by the fact that the density at residues 118 to 121 is defined poorly in the 2-Å electron density map. In cleaved Con A, there may be a further general loosening of the β structure in this region, which may be related to differences in the solubility properties (33) and subunit aggregation (49)
Footnote: All coordinates refer to that equivalent position which is most closely associated with the protein monomer whose coordinates are given in Table I.
FIG. 3. View of the Con A protomer as in Fig. 2, but with the "back" β structure highlighted in black. Curling of the structure away from the vertical plane can be seen at top left. The small bent portion of the β structure is at far left. The loop between the two lowest chains at bottom right contains the site of the natural cleavage in Con A between residues 118 and 119.
reminiscent of the β structures in carboxypeptidase A (50) and carbonic anhydrase C (51). The twist amounts to about 90° in the sense of a left-handed screw along an axis perpendicular to the chain segments. The front sheet contains about 57 residues arranged in seven chains which, like those of the back sheet, are antiparallel.
It divides the remainder of the molecule unequally into two regions containing no regular secondary structures.
The left-hand region, consisting of residues 131 to 168, is arranged in three loosely organized turns which include the front strand of the front β structure and part of that portion of the back β structure which is turned forward from the back plane. At the lower left, between this coil and the two pleated sheets, is a large internal region containing mostly hydrophobic side chains and no main chain atoms. To the right of the front pleated sheet is a region which includes the NH2 and COOH termini at the front of the molecule, the metal-binding region at the top, and the front wall of the β-IPGlc-binding cavity at the lower right.
At the far right, just above the binding site, residues 81 to 84 comprise a single turn of approximately α-helical structure, the only such structure in the molecule.
Metal Binding Sites-The Mn2+ and Ca2+ ions are bound close together at the top of the molecule as shown in Fig. 2. The 2 ions are 4.6 Å apart and each metal is surrounded by an approximately octahedral coordination shell containing four ligands from the protein and 2 water molecules.
The metal-binding region is depicted in Figs. 5 and 6. The Mn2+ ion is located in the position deduced by Weinzierl and Kalb (7) from Cd2+ substitution experiments. The six ligands are the side chains of Glu 8, Asp 10, Asp 19, and His 24, and 2 water molecules.
One of the water molecules is involved in a hydrogen-bonding network extending to the carbonyl oxygen of Val 32 and the hydroxyl oxygen of Ser 34. The other is at the inner end of a shallow depression which contains only solvent and which extends to the surface of the molecule.
The observed octahedral coordination of the Mn2+ ion differs from the results of previous magnetic resonance (52) and diffraction (7) studies in which low Mn2+ coordination symmetry was suggested. However, other magnetic resonance experiments which indicate nearly cubic symmetry for the Mn2+ site (53), and the presence of one rapidly exchanging water ligand (54) are consistent with the observed Mn2+ coordination.
The coordination shell of the Ca2+ ion includes at least seven ligands and is less symmetrical than that of the Mn2+. The appearance of the electron density around the Ca2+ suggests that the site is best described as octahedral, with one vertex occupied by both carboxylic oxygens of Asp 10. The other ligands are the side chains of Asn 14 and Asp 19, which probably contribute 1 atom each to the coordination, the carbonyl oxygen of Tyr 12, and 2 water molecules.
The water molecules are in positions to be hydrogen-bonded to the carboxyl group of Asp 208 and the carbonyl oxygen atom of Arg 228 in the COOH-terminal portion of the polypeptide chain. Two of the protein ligands, Asp 10 and Asp 19, are shared by both metal ions. Thus, the entire assembly may be described as a binuclear complex composed of two polyhedra sharing a common edge (Fig. 5).
These structural features of the metal-binding region can explain some of the chemical observations concerning metal ion binding by Con A. First, most of the ligands are carboxylic acids supplied by an acidic portion of the polypeptide chain near the NH2 terminus.
Protonation of these carboxylic acids would account for removal of the metals at low pH (5). Second, the site consists of the NH2-terminal portion of the chain which is folded around both metals and includes two strands of the front β structure.
The site is completed by two portions of the chain from near the COOH terminus, one of which also belongs to the front β structure.
This arrangement, together with the requirement for sequential binding of the metal ions and the evidence that the β structure is present in the metal-free protein, suggests that in demetallized Con A, the front β structure contains a precursor transition metal binding site consisting of residues 8, 10, and 24. Binding the Mn2+ to this site then might induce a conformational change, bringing the connecting loop (residues 12 to 22) into a position such that residue 19 joins the Mn2+ coordination shell and residues 12, 14, and 19 would be in the proper orientation with respect to residues 10, 208, and 228 to create the Ca2+ binding site. Calcium binding would then stabilize the native conformation of the COOH-terminal portion of the chain, and possibly of the saccharide binding site. The exact details of such conformational changes must, of course, await determination of the structure of the demetallized protein. Shoham et al. (6) have shown that the Ca2+ site is specific, in that it can bind Ca2+ and Cd2+, but not Ba2+, transition metals, or Sm3+. In agreement with their results, we observe that Con A crystals soaked in Sm3+ (Table I), Gd3+ (10 mM, 7 days), or Ba2+ (10 mM, 70 days) show no changes in electron density at the Ca2+ site. This specificity is in contrast to the Ca2+ binding sites of several other proteins.
FIG. 6. Stereo drawing of the Ca2+-Mn2+-Con A complex. Folding of the polypeptide chain between residues 8 and 24 around the 2 metal ions is illustrated, as well as three additional pieces of chain associated with residues 32, 208, and 228, which are hydrogen-bonded to water molecules that serve as metal ligands.
FIG. 7. Tracing of the electron density map in the vicinity of the Ca2+ ion with the atomic interpretation superimposed. Contours directly above the Ca2+ ion are omitted for clarity. Projection down the local 3-fold axis at the Ca2+ ion indicates the octahedral nature of the Ca2+ coordination shell.
Hardman and Ainsworth (30) have described the Ca2+ ion in Con A as having five rather than seven or more ligands.
Their description of the Ca2+ site differs from ours in the absence of the two water molecule ligands, in their identification of Asp 208 as a glutamic acid, and in the orientation of Asp 10. Since this site appears to be unusually specific, and since penta-coordination of Ca2+ is extremely rare, the presence or absence of the additional ligands is of some interest.
In Fig. 7 we present the electron density contours in the Ca2+-binding region from which we infer that both carboxylic oxygens of Asp 10 are Ca2+ ligands, and that a water molecule hydrogen-bonded to the carbonyl oxygen of Arg 228 is also present in the coordination shell of Ca2+. The contours are projected approximately along a local 3-fold axis of the coordination shell. The density at the position of the additional water molecule (WS in Fig. 7) is distinctly above the local background.
The map also displays the symmetry of the Ca2+ site, from which we conclude that the site can best be described as an octahedron with 2 oxygen atoms at one vertex. This description is further supported by comparison of the metal-ligand bond distances and angles given in Table IV for the two metal binding sites.
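One way to quantify how closely a coordination shell approaches an octahedron is to compare every ligand-metal-ligand angle with the ideal values of 90° and 180°. The sketch below does this for an ideal octahedron built with an assumed metal-ligand distance of 2.3 Å; the function names, the distance, and the coordinates are illustrative assumptions and are not the values of Table IV.

```python
import numpy as np
from itertools import combinations

def ligand_angles(metal, ligands):
    """Return all ligand-metal-ligand angles in degrees."""
    m = np.asarray(metal, dtype=float)
    vecs = np.asarray(ligands, dtype=float) - m
    units = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    angles = []
    for i, j in combinations(range(len(units)), 2):
        cos_ij = np.clip(units[i] @ units[j], -1.0, 1.0)
        angles.append(np.degrees(np.arccos(cos_ij)))
    return np.array(angles)

def octahedral_rmsd(metal, ligands):
    """RMS deviation (degrees) of each angle from the nearer ideal value, 90 or 180."""
    angles = ligand_angles(metal, ligands)
    ideal = np.where(angles > 135.0, 180.0, 90.0)
    return float(np.sqrt(np.mean((angles - ideal) ** 2)))

# Ideal octahedron with an assumed metal-ligand distance (hypothetical value).
d = 2.3
ideal_ligands = d * np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                              [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
ideal_score = octahedral_rmsd([0.0, 0.0, 0.0], ideal_ligands)

# Moving one ligand off its axis increases the score.
distorted = ideal_ligands.copy()
distorted[0] = [2.0, 0.8, 0.0]
distorted_score = octahedral_rmsd([0.0, 0.0, 0.0], distorted)
```

For a site such as the Ca2+ site, where two oxygen atoms are described as occupying one vertex, one possible treatment is to represent that pair by its midpoint before computing the angles; that choice is an assumption of this sketch, not a procedure stated in the text.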
In addition to the major sites of metal binding we have discussed, native Con A may bind metals at other positions.
For example, we find a peak of electron density at 0.65, 0.62, 0.29, near the side chains of Glu 87, and of Asp 136 in a symmetry-related monomer belonging to the same dimer. His 180 is near enough to serve as a third ligand via a bridging water molecule. This location is the site of substitution of several metal cations used as heavy atom derivatives in the phasing calculation, including Pb2+ and Sm3+, and therefore the degree of metal occupancy in the native protein cannot be assessed reliably. Because this peak is located on the surface of the protein and is associated with a small number of protein ligands we assume that it represents a minor metal binding site that is not necessary for the saccharide-binding activity of Con A.

Saccharides and Hydrophobic Molecules-To date, direct crystallographic observation of Con A-saccharide interactions has been precluded by the fact that treatment of the crystals with high concentrations of inhibitory sugars results in either dissolving of the crystals or loss of the diffraction pattern.5 These effects probably arise from an increase in the stability of Con A in solution (33), and a conformational change (11, 12) induced by the presence of specific saccharides.
High resolution three-dimensional difference electron density maps have been made from data collected from crystals treated with the highest concentrations of several saccharides that leave measurable diffraction patterns.
These maps reveal no features that can be unambiguously identified as a bound carbohydrate. In our low resolution study we observed a large cavity in the surface of the Con A molecule (24,25) and hypothesized that this cavity might contain the saccharide binding site. We later showed that in Con A crystals, the heavy atom-labeled saccharide β-IPGlc is bound in this cavity (28). This compound is an inhibitor of Con A-mediated agglutination of erythrocytes and precipitation of polysaccharides (60). A 2.8-Å resolution, three-dimensional difference electron density map of the β-IPGlc derivative described in Table I reveals no detectable binding of β-IPGlc at any other site in the molecule. During the course of these studies, several lines of evidence suggested that the binding of saccharides by Con A might differ in solution and in the crystal5 (61-64, see below).
To investigate this possibility, we carried out parallel studies of the binding behavior in solution for comparison with the crystallographic data. In agreement with the crystallographic results, these solution studies (Fig. 8) show that, within experimental error, Con A binds 4 molecules of β-IPGlc per tetramer at pH 7 and therefore that 1 β-IPGlc is bound per protomer.
Similar data were obtained for the binding of α-MGlc (Fig. 8). The association constants for the binding of α-MGlc and β-IPGlc to Con A are 1.6 × 10³ and 8.1 × 10² M⁻¹, respectively.
These studies are in good agreement with those reported by Goldstein et al. (15,64). Additional weak binding sites (Ka < 10²) would probably not be detected in our experiments.
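The way an association constant and a stoichiometry are read from a Scatchard plot can be sketched as follows. For n equivalent, independent sites, r/[L] = Ka(n - r), where r is the number of ligand molecules bound per protein molecule and [L] is the free ligand concentration, so a straight-line fit of r/[L] against r gives Ka from the slope and n from the intercept. The data points below are synthetic, generated from the constants quoted above (Ka = 1.6 × 10³ M⁻¹, n = 4 per tetramer); they are not the measurements of Fig. 8, and the function names are illustrative.

```python
import numpy as np

def bound_per_protein(free_ligand, ka, n):
    """Ligand bound per protein for n equivalent, independent sites."""
    free = np.asarray(free_ligand, dtype=float)
    return n * ka * free / (1.0 + ka * free)

def scatchard_fit(free_ligand, r):
    """Fit r/[L] = Ka*n - Ka*r and return (Ka, n)."""
    free = np.asarray(free_ligand, dtype=float)
    r = np.asarray(r, dtype=float)
    slope, intercept = np.polyfit(r, r / free, 1)
    ka = -slope
    n = intercept / ka
    return ka, n

# Synthetic free-ligand concentrations in mol/L (assumed, for illustration only).
free = np.array([1e-4, 2e-4, 5e-4, 1e-3, 2e-3, 5e-3])
r = bound_per_protein(free, ka=1.6e3, n=4)
ka_fit, n_fit = scatchard_fit(free, r)
print(ka_fit, n_fit)
```

With noise-free synthetic data the fit returns the input constants; with real data the scatter about the line and any curvature would indicate experimental error or non-equivalent sites.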
The specificity of binding of α-MGlc and β-IPGlc was tested by competition experiments with a variety of unlabeled compounds.
Con A-bound α-[14C]MGlc can be displaced by β-IPGlc and β-PGlc but not by galactose, β-NPGal, or β-IPGal. A striking parallel was found between the ability of a saccharide to compete with MGlc and the capacity to inhibit the agglutination of erythrocytes by Con A. While the glucose-containing saccharides inhibited agglutination, the galactose-containing sugars did not. The binding of β-[14C]IPGlc can be completely competed with α-MGlc (Fig. 9) but not with galactose, suggesting that the interaction with the protein in solution is through specific binding at the saccharide binding site. Furthermore, when solutions of Con A and β-[14C]IPGlc are saturated with α-MGlc, no further release of β-[14C]IPGlc can be detected by addition of unlabeled β-IPGlc (Fig. 9). These results indicate that in solution there is no secondary site in Con A that binds hydrophobically substituted carbohydrates but not unsubstituted monosaccharides. In agreement with our conclusion that β-IPGlc binds to Con A with the same stoichiometry and specificity as simple glucosides and mannosides, studies of the binding of another phenyl sugar, α-NPMan, to Con A at pH 5.35 also yielded a stoichiometry of one phenyl sugar per protomer of Con A (65). In addition, it has been shown that α-NPMan bound to Con A can be displaced by α-MMan (66) in a fashion similar to the displacement of β-IPGlc by α-MGlc. Finally, recent data have shown that β-IPGlc competes on an equimolar basis with α-NPMan (64). All of these results indicate that 1 molecule of either phenyl-substituted or unsubstituted saccharide is bound per monomer of Con A.
(Figure legend fragment) ... to Con A plotted according to the method of Scatchard (46). The experiments were performed at pH 7 and room temperature by the technique of Colowick and Womack (45).
(Figure legend fragment) ... Finally, unlabeled β-IPGlc was added to a final concentration of 8 × 10⁻⁴ M to chase any residual β-[14C]IPGlc bound to the protein.
6 J. W. Becker and G. N. Reeke, Jr., unpublished observations.
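The displacement behavior expected when two saccharides compete for one site per protomer can be sketched with the standard single-site competitive binding expression, in which the fraction of sites occupied by the labeled ligand L in the presence of a competitor C is Ka,L[L] / (1 + Ka,L[L] + Ka,C[C]). The association constants are those quoted in the text for β-IPGlc and α-MGlc; the free concentrations are assumed values for illustration, and the model itself is an assumption of this sketch rather than the analysis used by the authors.

```python
KA_IPGLC = 8.1e2  # M^-1, beta-IPGlc (value quoted in the text)
KA_MGLC = 1.6e3   # M^-1, alpha-MGlc (value quoted in the text)

def fraction_labeled_bound(labeled, competitor, ka_labeled, ka_competitor):
    """Fraction of sites holding the labeled ligand under single-site competition.

    `labeled` and `competitor` are free concentrations in mol/L.
    """
    numerator = ka_labeled * labeled
    return numerator / (1.0 + numerator + ka_competitor * competitor)

# Hypothetical free concentrations (mol/L): fixed labeled beta-IPGlc,
# increasing unlabeled alpha-MGlc.
labeled_conc = 1e-3
competitor_concs = [0.0, 1e-3, 1e-2, 1e-1, 1.0]
occupancy = [
    fraction_labeled_bound(labeled_conc, c, KA_IPGLC, KA_MGLC)
    for c in competitor_concs
]
for c, f in zip(competitor_concs, occupancy):
    print(f"alpha-MGlc {c:g} M -> fraction of sites with labeled beta-IPGlc {f:.4f}")
```

In this model the labeled ligand is displaced completely at saturating competitor, so any label that remained bound under those conditions would point to a second, non-competing site. That is the logic of the chase experiment described above.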
The region of the Con A molecule where β-IPGlc is bound in the crystal is a deep cavity which extends from the lower right edge of the molecule (Fig. 2) toward the molecular center. The back of the cavity is made up of the large β structure which extends across the rear of the molecule (Fig. 3), while the front is delineated by the second β structure (Fig. 4), the short helical section at residue 81, and a few sections of random coil. The cavity contains two distinct subsites with different characteristics. The first subsite is a large, predominantly hydrophobic region in which the iodine atom of β-IPGlc is observed to bind (Fig. 10)
FIG. 10. Drawing of the side chains in the inner portion of the β-IPGlc-binding cavity, illustrating the hydrophobic nature of this part of the cavity. The location of the bound iodine atom of β-IPGlc is indicated by I. The entrance to the cavity is at the right. The three chains at the rear are the lower three chains of the "back" β structure.
FIG. 11. Drawing of certain side chains and backbone carbonyl groups at the mouth of the β-IPGlc-binding cavity, with the backbone as in Fig. 10. The location of the bound iodine atom of β-IPGlc is indicated by I. The glucosidic ring of β-IPGlc could be hydrogen-bonded to some or all of the hydrophilic groups illustrated.
such as the side chains of Tyr 54, Ser 56, Asn 82, Ser 113, and Ser 189, as well as main chain oxygen atoms associated with Lys 114 and Ile 181 (Fig. 11). Although the glucoside ring of β-IPGlc cannot be located in the difference electron density map, it is reasonable to assume that it might be bound in this second, hydrophilic subsite. It has been observed that a variety of nonpolar molecules such as o-iodophenol,5 o-iodobenzoic acid, and β-IPGal (62) are also bound to crystalline Con A with their respective iodine atoms in the large cavity where the iodine of β-IPGlc was located. These compounds are bound with approximately the same binding strength as β-IPGlc.
These results, as well as competition experiments between β-IPGlc and α-MGlc using nuclear magnetic resonance techniques (61, 63), have led to the suggestion that the specific saccharide binding site must be outside the cavity and that β-IPGlc must be bound by two distinct sites in each protomer, one specific for the carbohydrate moiety, and one for the hydrophobic iodophenyl group (62, 63).
The binding data discussed above, however, indicate that in solution each Con A protomer binds only one saccharide, phenyl-substituted or unsubstituted.
In addition, the observed orientation of the iodophenyl moiety of β-IPGal (62) indicates that the galactose ring may be bound in the inner, hydrophobic region of the cavity (to the left of the iodine in Figs. 10 and 11) and therefore cannot block the hydrophilic region (Fig. 11) near the iodine of β-IPGlc which has been proposed as the saccharide binding site. Thus, the binding of hydrophobic molecules and β-IPGal does not exclude the possibility that the saccharide binding site is within the cavity.
Preliminary data6 indicate that Con A in solution does not bind hydrophobic molecules such as o-iodophenol to any detectable extent.
In addition, β-IPGal does not displace α-NPMan from Con A in solution, while α-MMan does (64). The apparent difference in ability to bind nonpolar molecules suggests that there may be significant structural differences, at least in the hydrophobic binding region of the molecule, between Con A in solution and in the crystalline state. These differences, as well as nuclear magnetic resonance measurements of the Mn2+-saccharide distance to be discussed below (61, 63), raise the possibility that the interactions between Con A and β-IPGlc are significantly different in the two states.
The available data can be interpreted in terms of at least two models. One model assumes that Con A has two distinct binding sites in the two states, a saccharide-specific site unrelated to the β-IPGlc-binding cavity, which binds β-IPGlc and other inhibitory saccharides in solution but not in the crystal, and a predominantly hydrophobic site in the β-IPGlc-binding cavity which is not present in solution.
The relative activities of the two sites are altered on crystallization either by a conformational change or by blockage by adjacent molecules in the crystal lattice.
β-IPGlc would be capable of binding to either site, depending on the state of the protein.
In the second model, the protein has a single binding region, the cavity, which contains two binding subsites, a saccharide-specific site adjacent to a hydrophobic binding site. In solution, the saccharide-specific site is most significant but the hydrophobic subsite is sufficiently strong to account for the fact that certain saccharides bearing hydrophobic aglycones are bound more strongly than the corresponding simple saccharides (60, 67). On crystallization, however, a conformational change may enhance the binding strength of the hydrophobic subsite and reduce the influence of the saccharide-specific subsite, or the saccharide-specific subsite may simply be unobservable because of the instability of the crystals in the presence of saccharides.
Attempts to locate the saccharide binding site in solution by nuclear magnetic resonance techniques have yielded conflicting results.
From a study of 13C line broadening induced by the Mn2+ ion in Con A, it has been deduced that 13C-enriched α- and β-methyl-D-glucopyranosides (61, 63) and β-IPGlc (63) are bound 10 to 12 Å from the Mn2+, and therefore cannot be located in the β-IPGlc-binding cavity we have observed. These results have been confirmed by natural abundance 13C magnetic resonance spectroscopy (68). However, a more recent study, considering the effects on saccharide proton resonances of both the Mn2+ ion and of an added Gd3+ ion bound at Glu 87 (the Sml site, Table III), suggests that the saccharide binding site must be in the cavity where β-IPGlc is bound.
Until these conflicting results can be reconciled, the magnetic resonance data must be considered inconclusive.
The available data of all kinds are insufficient to exclude either the one-site or the two-site model.
Both the resonance data of Brewer et al. (61,63) and the fact that saccharide binding is dependent on a metal-induced conformational change suggest that the saccharide binding site in solution is near the metal-binding region, favoring the two-site model.
On the other hand, there are several considerations that favor the one-site model. First, the stoichiometry of β-IPGlc binding in both solution and crystal requires that if there are two independent sites, both must bind β-IPGlc strongly, but one must not exist in solution and the other must not exist in the crystal.
Such a combination of changes appears unlikely.
Second, careful examination of the Con A model indicates that the possible saccharide binding subsite in the cavity adjacent to the iodine of β-IPGlc would be capable of displaying the observed saccharide-binding specificity of Con A (Fig. 11). Several other regions on the surface of the molecule contain polar groups which might be capable of binding saccharides, but these regions are not as readily consistent with the detailed requirements of the binding specificity. Third, binding of specific saccharides inhibits the denaturation of fragmented Con A (33). As the proposed binding region in the cavity contains several residues near the point of cleavage in fragmented Con A, it is possible that a bound saccharide molecule could contribute to the stability of cleaved monomers. Finally, saccharides bearing a hydrophobic aglycone at C-1 are bound to Con A more strongly than the corresponding unsubstituted sugars (60, 67), suggesting that there is a hydrophobic subsite adjacent to the saccharide binding site in solution, consistent with the presence of the saccharide site in the cavity.
The instability of Con A crystals in the presence of saccharides and the apparent differences between the saccharide-binding activities of Con A in solution and in the crystal suggest that conclusive identification of the saccharide-specific binding region may require crystallographic studies on another crystal form of Con A or an application of other techniques such as affinity labeling. Results of such studies, in combination with the structural data presented here, should provide an explanation for the binding specificity of the protein and make possible a detailed analysis of its interactions with cell surface glycoprotein receptors.