The Crystal Structure of Pea Lectin at 3.0-Å Resolution

The structure of pea lectin has been determined to 3.0-Å resolution based on multiple isomorphous replacement phasing to 6.0-Å resolution and a combination of single isomorphous replacement, anomalous scattering, and density modification to 3.0-Å resolution. The pea lectin model has been optimized by restrained least squares refinement against the data between 7.0- and 3.0-Å resolution. The final model at 3.0 Å gives an R factor of 0.24 and a root mean square deviation from ideal bond distances of 0.02 Å. The two monomers in the asymmetric unit are related by noncrystallographic 2-fold symmetry to form a dimer. Monomers were treated independently in modeling and refinement, but are found to be virtually identical at this resolution. The molecular structure of the pea lectin monomer is very similar to that of concanavalin A, the lectin from the jack bean. Similarities extend from secondary and tertiary structures to the occurrence of a cis-peptide bond and the pattern of coordination of the Ca2+ and Mn2+ ions. Differences between the two lectin structures are confined primarily to the loop regions and to the chain termini, which are different and give rise to the unusual permuted relationship between the pea lectin and concanavalin A sequences.

Among the variety of carbohydrate-binding proteins that have been isolated from a wide range of sources is a group, including both proteins and glycoproteins, that is distinguished solely by in vitro properties. The proteins in this group, the lectins, share the ability to agglutinate cells or precipitate complex carbohydrates (polysaccharides, glycoproteins, glycolipids). Although the lectins are, by definition, not of immune origin, they are characterized by a high degree of specificity in binding to some saccharides in preference to others. Lectins have been isolated from bacteria and molds, lichens, sponges, fish, sera and eggs, animal cell membranes, and, very commonly, from plant seeds. The physiological role of lectins in plants is still debated; however, in the legumes, substantial evidence points to a specific involvement of lectins during infection of plant root hairs by their respective bacterial symbionts, the Rhizobia (1). Rhizobium colonization results in the formation of root nodules that are essential to the conversion of nitrogen into ammonia in the legumes (2). The nitrogen fixed by this symbiotic relationship is of considerable agricultural and economic importance.
Despite the lack of detailed knowledge of the physiological role of lectins in plants, this group of proteins has proved to be exceptionally useful as molecular tools for isolating carbohydrate-containing molecules, for typing blood, and for studying the structure of blood group substances, the mitogenic stimulation of lymphocytes, and the agglutination of cells. Immobilized on a support, lectins have been used to isolate cell populations and to classify cells on the basis of their cell surface glycoproteins and glycolipids. A large number of purified plant lectins are commonly available for these purposes. The first lectin obtained in purified form, concanavalin A (ConA), was isolated from the seeds of a legume, the jack bean (Canavalia ensiformis) (3). Concentrated study of this lectin has resulted in determinations of its amino acid sequence (4, 5) and its three-dimensional crystal structure (6-8). In addition, a considerable volume of literature describes the metal ion requirements and carbohydrate-binding properties (9, 10, and references therein) of ConA from both spectroscopic and structural viewpoints.
With few exceptions, all legume seeds so far examined have been found to contain lectins. While these lectins may differ in carbohydrate specificity and quaternary structure, they are found to share a number of biochemical properties. All are composed of monomers of about 25,000 to 30 lectin (12), and the broadbean lectin, favin (13), have often been grouped together because they show similar (but not identical) preferences for α-methyl-D-glucose and α-methyl-D-mannose monosaccharides. They differ, however, in subunit structure and quaternary structure. ConA exists at physiological pH as a tetramer of four identical continuous polypeptide chains of 237 amino acid residues each. At lower pH, the tetramers dissociate to form dimers. Pea lectin, lentil lectin, and favin monomers are each composed of two different polypeptide chains, an α-chain of about 6,000 daltons and a β-chain of about 18,000 daltons. The quaternary structure of these lectins is dimeric, consisting of two αβ monomers. In addition, favin, unlike the others, is glycosylated. Recent amino acid sequence data from a number of legume lectins, including lentil (11), pea (12), broadbean (13), soybean (14), sainfoin (15), and French bean (16), confirm an unusual aspect of the relationship among the legume lectin sequences first described by Cunningham and co-workers (13) for the broadbean lectin favin. When the sequences are aligned to reflect similarities, they are seen to fall into two groups that differ strikingly in termini. While there is a relatively strong one-to-one correspondence in sequence throughout the chains, one group, ConA and mucana lectin (17), represents a permutation approximately 180° out of phase with that of all the others, a circular permutation of sequences with respect to amino and carboxyl termini. Fig. 1 illustrates the amino acid sequence permutation of pea lectin and ConA.
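The idea of a circular permutation can be made concrete with a toy example. The sketch below is only an illustration of the concept: the strings are invented placeholders (not the pea lectin or ConA sequences), the function names are hypothetical, and exact string rotation is assumed, whereas real lectin sequences are related by similarity rather than identity.

```python
def is_circular_permutation(seq_a: str, seq_b: str) -> bool:
    """Return True if seq_b is an exact rotation of seq_a."""
    return len(seq_a) == len(seq_b) and seq_b in seq_a + seq_a


def rotation_offset(seq_a: str, seq_b: str) -> int:
    """Index in seq_a where seq_b starts, or -1 if seq_b is not a rotation."""
    if not is_circular_permutation(seq_a, seq_b):
        return -1
    return (seq_a + seq_a).find(seq_b)


# Invented strings for illustration only.
toy_a = "ABCDEFGHIJ"
toy_b = "FGHIJABCDE"  # same cyclic order, termini moved halfway around

offset = rotation_offset(toy_a, toy_b)
# Express the shift as an angle on a circle, by analogy with "180 degrees out of phase".
phase_degrees = 360.0 * offset / len(toy_a)
```

In this toy case the new termini fall halfway around the cycle, which is the sense in which one group of sequences is described as out of phase with the other.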
Sequences of cDNAs transcribed from mRNAs of favin (18) and pea lectin (12) correspond directly to the peptide sequences of these proteins and show that synthesis is initiated at the amino terminus of the β-subunit, continues uninterrupted through the β- and α-subunits, and terminates at the carboxyl terminus of the α-subunit, cleavage of subunits occurring post-translationally. Unexpectedly, the recently determined sequence of the ConA mRNA, rather than corresponding directly to the ConA peptide sequence, shows instead the same permutation as the favin and pea lectin messages and protein sequences (19). Among the paucity of explanations for this peculiarity is the suggestion that the permutation corresponding to the ConA peptide sequence is the result of post-translational ligation of the initial amino and carboxyl termini as well as post-translational cleavage to create the new termini (19). This ligation would occur at a position originally thought to be a site of proteolytic cleavage in ConA that may now be interpreted as a partial failure of ligation.
The unusual biochemical complexities of these related lectins, differences in sequence permutation, differences in quaternary structure, and subtle differences in carbohydrate-binding specificity, can now be examined in terms of the molecular structures of pea lectin and ConA. The structure of pea lectin has been determined at 3.0-Å resolution, an extension of the results reported at 6.0-Å resolution (20). Preliminary accounts of this work have been reported (21, 22). A report of the structure at 5.0 Å has been published by a group at the USSR Academy of Sciences in Moscow (23).

EXPERIMENTAL PROCEDURES
Protein Purification and Crystallization-The isolation and crystallization of pea lectin have been described previously (20). Crystals of pea lectin used in this study have space group P2₁2₁2₁ with unit cell dimensions of a = 50.73(2) Å, b = 61.16(2) Å, and c = 136.59(8) Å. There are two pea lectin monomers per asymmetric unit.
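As a quick consistency check on these cell parameters, the unit cell volume and the crystal volume per unit of protein mass (the Matthews coefficient) can be estimated. The sketch below uses the reported cell edges; the mass per asymmetric unit of 48,000 daltons is an assumption made here from the approximate chain masses quoted in the introduction (about 6,000 + 18,000 daltons per monomer, two monomers), not a value reported in this section, and the function names are hypothetical.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume in cubic angstroms of a cell whose three angles are all 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume: float, asu_per_cell: int,
                         mass_per_asu_da: float) -> float:
    """Crystal volume per dalton of protein (cubic angstroms per dalton)."""
    return cell_volume / (asu_per_cell * mass_per_asu_da)


# Reported cell edges in angstroms.
volume = orthorhombic_cell_volume(50.73, 61.16, 136.59)

# P2(1)2(1)2(1) has four asymmetric units per unit cell.
# ASSUMPTION: about 48,000 Da per asymmetric unit (two monomers of about 24,000 Da).
ASSUMED_MASS_PER_ASU = 48_000.0
vm = matthews_coefficient(volume, 4, ASSUMED_MASS_PER_ASU)
```

Under that mass assumption the coefficient comes out a little above 2 cubic angstroms per dalton, which is within the range commonly observed for protein crystals.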
Heavy Atom Derivatives-Conditions for preparation of heavy atom derivatives were screened as described previously (20). The original single-site uranyl nitrate derivative could not be reproduced. Instead, multisite uranyl nitrate derivative crystals with a variety of substitution sites and occupancies were prepared by the following method: native pea lectin crystals were soaked in solutions slowly increased to a uranyl nitrate concentration of 2 mM over a 36-h period, and then left at this final concentration for 12 h more before being mounted for data collection. A second derivative was prepared by soaking native pea lectin crystals for 45 h in 5 mM para-chloromercuribenzene sulfonate (pCMBS) solution in artificial mother liquor.
Data Collection-Three-dimensional diffraction data were collected from native and derivative crystals at room temperature on a Picker diffractometer. Data were collected by an ω-step scan procedure in shells of looa reflections starting at high and going to low resolution.
For the 3.0-Å resolution native, the 6.0-Å resolution pCMBS derivative, and the 4.4-Å resolution uranyl nitrate derivative data sets, each reflection was sampled by 1-s counts at each of 37 ω-steps. Steps were spaced 0.02° apart and were symmetrically disposed about calculated peak positions. Data were collected from six additional uranyl nitrate derivative crystals. One data set extended to 5.6-Å resolution and the other five data sets spanned the range from 6.0- to 3.0-Å resolution. These data sets were collected by a peak top step-scan method (24). Each reflection was measured for 2 s at each of five to seven steps in ω. Steps were spaced 0.03° to 0.05° apart and were symmetrically disposed about predicted peak positions. Background counts for 1 s were taken at displacements of 0.4° in ω to both sides of predicted peak positions. Throughout the range of each data set, Friedel mates for all unique reflections were collected. Crystal alignment was monitored for each reflection. Seven to nine standard reflections were monitored for decay in intensity every 100 to 200 reflections. No data were included for which decay estimates were greater than 30%.
Data Processing-Integrated intensities and variances were determined for the 3.0-Å resolution native, 6.0-Å resolution pCMBS, and 4.4-Å resolution uranyl nitrate data as described previously (20). Integrated intensities and variances for the remaining data were determined by application of a Gaussian-fit method (24). The intensities and variances were corrected for Lorentz and polarization effects. Absorption corrections were applied by the method of North et al. (25). Linear decay corrections were applied to the intensities based on the average decay observed for a subset of the periodically collected standard reflections corresponding to the range of sin θ/λ of the data set. Anomalous differences for the derivatives were computed from the measured intensities of Friedel mates. The uranyl nitrate derivative data sets were not merged together due to differences in substitution sites and occupancies among the crystals from which data were collected. The derivative data were scaled to the native data first with a relative Wilson (26) scaling procedure and then in shells similar to those used for data collection.
Initial Phase Determination-A cross-difference electron density map was calculated based on isomorphous differences between pCMBS derivative data and the native data together with centroid phase angles calculated from the isomorphous and anomalous contributions from the previously determined single-site uranyl nitrate derivative (20). Six substitution sites were found. These sites were consistent with the isomorphous and anomalous difference Patterson maps. Centric least squares refinement (27) yielded a model consisting of four major and two minor sites with an R value ($R = \sum \big| |F_o| - |F_c| \big| / \sum |F_o|$) of 0.49.
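The R value quoted for the heavy atom models is a residual between observed and calculated amplitudes. A minimal sketch of such a residual is given below, assuming the conventional form (sum of absolute amplitude differences divided by the sum of observed amplitudes); the function name and the sample numbers are illustrative only and are not data from this study.

```python
from typing import Sequence


def r_value(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Residual sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Made-up amplitudes, for illustration only.
example_r = r_value([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

A value of 0 means perfect agreement, so residuals near 0.5 for centric heavy atom refinement indicate models that capture only the dominant sites.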
A combination of three methods was used to determine the sites of substitution in the 5.6-Å resolution and 4.4-Å resolution derivative data sets. These methods were: 1) interpretations of isomorphous difference Patterson maps; 2) interpretations of cross-difference electron density maps based on centroid phase angles derived from the isomorphous and anomalous contributions of the pCMBS sites; and 3) direct methods (MULTAN) based on isomorphous differences as described by Wilson (28). A total of eight binding sites was found. The 5.6-Å resolution data set was modeled as three major and four minor sites. Centric least squares refinement of this model gave an R value of 0.54. The 4.4-Å resolution data set was modeled as two major sites and a single minor site not found in the 5.6-Å set; centric least squares refinement yielded an R value of 0.54. The most prominent site in both the 5.6- and 4.4-Å data sets is the original single site in the 6.0-Å data set. When the 6.0-Å data set was re-examined, a minor site was discovered. This site corresponded to a major site common to the 5.6- and 4.4-Å data sets.
The eight different uranyl sites described above were used to model the five partial derivative data sets that spanned the 6.0- to 3.0-Å resolution range. Not all of the sites were found in each data set. Exhaustive centric least squares refinement yielded models consisting of six to eight sites for the individual data sets with R values ranging from 0.49 to 0.58.
In all cases, only the heavy atom position and occupancy parameters were refined; isotropic temperature factors were fixed at 15.0 Å².
Centroid phase angles were calculated by the method of Blow and Crick (29). Anomalous scattering information was incorporated according to the procedure of Matthews (30). Phasing contributions were obtained from all the derivative data that are described above as well as from the 6.0-Å resolution uranyl derivative described previously (20). The final combined phases of 8776 reflections had a mean figure of merit of 0.69, ranging from 0.90 in the lowest resolution shell to 0.58 in the highest resolution shell. The low figure of merit at higher resolution reflects reduced phasing information in the region of 5.6- to 3.0-Å resolution, where only SIRAS phases were available.
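For readers less familiar with centroid phases, the sketch below shows how a centroid phase and its figure of merit follow from a phase probability distribution sampled on a grid. It illustrates the general idea only, under the assumption that the probabilities are already available; it does not reproduce how the probabilities were derived from the derivative data in this work, and the function name and sample distributions are hypothetical.

```python
import cmath
import math
from typing import Sequence, Tuple


def centroid_phase_and_fom(phases_deg: Sequence[float],
                           probabilities: Sequence[float]) -> Tuple[float, float]:
    """Return (centroid phase in degrees, figure of merit) for a sampled distribution."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Probability-weighted mean of unit vectors exp(i*phi).
    centroid = sum(p * cmath.exp(1j * math.radians(phi))
                   for phi, p in zip(phases_deg, probabilities)) / total
    fom = abs(centroid)  # length of the mean vector, between 0 and 1
    best_phase = math.degrees(cmath.phase(centroid)) % 360.0
    return best_phase, fom


# A sharp single peak gives a figure of merit of 1.
sharp = centroid_phase_and_fom([0, 90, 180, 270], [0, 1, 0, 0])
# Two equally likely phases 60 degrees apart give a lower figure of merit.
bimodal = centroid_phase_and_fom([60, 120], [0.5, 0.5])
```

A two-peaked distribution, as is typical when only one derivative contributes, lowers the figure of merit, which is in line with the lower values reported for the shells phased by SIRAS alone.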
Electron Density Map Interpretation, Model Building, and Refinement-Electron density maps were calculated on a 1-Å grid. The maps were contoured at equal but arbitrary intervals at a scale of approximately 2.5 Å/cm and were plotted on acetate sheets with sections perpendicular to the noncrystallographic 2-fold axis relating the two monomers of the dimer in the asymmetric unit.
An initial interpretation of the chain tracing was made from a map calculated based on the MIR-SIRAS phases. With this interpretation, approximately 60% of the residues of the known sequence of pea lectin (12) could be placed in the electron density. A second tentative chain tracing of approximately 70% of the total dimer was obtained with the GRINCH system of programs (31) located at the University of North Carolina. The GRINCH interpretation was in agreement with and extended the initial map interpretation. Amino acid residues were fit to the electron density by application of the FRODO system of programs (32). Bricogne's double-sort technique (34) was used to average the electron density map about the noncrystallographic, molecular 2-fold axis. The averaged map was of lower quality than the MIR-SIRAS map and, therefore, was not used. The reasons for this unsuccessful application of symmetry averaging are not fully understood, but an imprecise placement of the noncrystallographic molecular 2-fold axis may have been a contributing factor at the time.
The initial model was subjected to rigid body refinement with CORELS (35). The 16 segments of the partial model were treated as rigid domains in the refinement. In five cycles (20.0 to 6.0 Å data), the R value between Fobs and Fcalc for the partial model was reduced  (34). The combined phases and observed structure factors were used as input into Wang's (38) system of programs for density modification. The resultant map was compared to a map calculated with Fobs and combined phases and to a second density-modified map calculated with MIR-SIRAS phases and Fobs. The combined phases density-modified map appeared to be the best of the three maps and was used to rebuild the model. All residues were fitted to the electron density except for residues 235 to 239 in one monomer, which were omitted due to the absence of interpretable electron density. (The numbering of pea lectin residues used here is that of Higgins et al. (12) Fig. 4.) The third model was then subjected to restrained least-squares refinement with an overall isotropic temperature factor. The R value was reduced from 0.40 to 0.27 in eight cycles (7.0 to 3.0 Å data with a 3σ cutoff). Four additional cycles of refinement including individual isotropic temperature factors for all atoms resulted in a further decrease in the R factor to 0.22 (5.0 to 3.0 Å data with a 3σ cutoff).
The root mean square deviation from ideal bond distances was 0.021 Å. Structure factors and phases were calculated from the refined model and used to calculate a new electron density map in which an attempt was made to reduce bias due to the model. For this new map, the electron density was set to zero in a slab corresponding to approximately 10% of the volume of the asymmetric unit. The modified asymmetric unit was then back-transformed to obtain new calculated structure factors and phases. Calculated and observed structure factors were scaled together. Calculated phases were combined with MIR-SIRAS phases as before. The combined phases together with coefficients of w(2Fobs - Fcalc), where w is the figure of merit weight, were used to calculate the electron density within the slab described above. This process was repeated with successive slabs of the calculated map until the entire asymmetric unit of the map had been constructed.
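The map coefficients used in this omit-style procedure combine a figure of merit weight, the observed and calculated amplitudes, and the combined phase. The sketch below shows one such coefficient as a complex number; it is a minimal illustration, not the program used in this work, and the function name and numerical inputs are hypothetical.

```python
import cmath
import math


def weighted_map_coefficient(f_obs: float, f_calc: float,
                             fom: float, phase_deg: float) -> complex:
    """Complex Fourier coefficient w * (2*Fobs - Fcalc) * exp(i * phase)."""
    amplitude = fom * (2.0 * f_obs - f_calc)
    return amplitude * cmath.exp(1j * math.radians(phase_deg))


# Made-up values, for illustration only.
coefficient = weighted_map_coefficient(f_obs=100.0, f_calc=80.0,
                                       fom=0.5, phase_deg=90.0)
```

Because the coefficient scales with the figure of merit, poorly phased reflections contribute less to the map, and the 2Fobs - Fcalc amplitude emphasizes features supported by the data over those supplied by the model.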
The model was adjusted to fit the electron density in this new map with FRODO. Residues 235 to 239 were not included in the final model in either monomer as the electron density is not interpretable in these regions. The final model was subjected to restrained least squares refinement in which refinement of thermal parameters was limited to an overall temperature factor. In 16 cycles, the R value was reduced from 0.345 to 0.240 (7.0 to 3.0 Å data with a 3σ cutoff). The root mean square deviation from ideal bond distances is 0.020 Å.

RESULTS AND DISCUSSION
Heavy Atom Substitution Sites-The isomorphous and anomalous contributions from a single-site uranyl nitrate derivative were used to phase the initial 6.0-Å native data set (20). While the molecular boundary of the pea lectin dimer and the β-sheet secondary structure were clearly visible in the 6.0-Å resolution electron density map, it was anticipated that another derivative would be required to determine the structure to 3.0-Å resolution. A pCMBS derivative was prepared and data were collected to 6.0-Å resolution. Six pCMBS sites were located by examining difference Patterson and difference electron density maps. These sites were clustered around the imidazole rings of the two His-230 residues in the dimer. The four major binding sites correspond to mercury binding to both ring nitrogens in each monomer. The two minor mercury binding sites appear to be close to alternative locations of the imidazole rings in both monomers.
Seven overlapping uranyl nitrate derivative data sets were collected. Difference Patterson and difference electron density maps as well as direct methods were used to determine the uranyl binding sites for the two complete lower resolution data sets (5.6 and 4.4 Å). A total of eight binding sites were located. Substitution sites and occupancies varied from data set to data set. The two sites in the 6.0-Å data set, which are found in all of the data sets, are related by the molecular 2-fold axis and are located in charged pockets consisting of residues Glu-205, His-51, Arg-55, and Asp-54. Another pair of 2-fold related sites is located near Glu-158 in both monomers. The other four sites are not paired: one is near Asp-134 and Lys-145 in monomer B, a second is near Thr-27 in monomer A, a third is near Lys-153 in monomer B, and the last is located between the two monomers, near Asp-54, Glu-56, and Glu-205 in monomer A and Ser-12 and Lys-10 in monomer B. The uranyl ion has been found to bind primarily to carboxylate groups and occasionally to hydroxyl side chains (39). The binding sites found in pea lectin are consistent with these observations, except for the site near Lys-153B. (The designations A and B are used arbitrarily but consistently to distinguish between the two monomers in the asymmetric unit.)
The Pea Lectin Monomer-The cDNA complementary to pea lectin mRNA has been cloned and sequenced (12) and indicates that pea lectin is synthesized in pre-pro form. During processing to the mature form, a leader sequence is removed co-translationally and the resultant pro-lectin is cleaved post- in both monomers. These areas all consist of turns on the surface of the molecule. Although the density is weak, the path of the polypeptide is clear. Ramachandran plots for the two monomers (A and B) are shown in Fig. 2. There are 11 residues in each monomer that have torsion angles clearly outside normal values. Five of these residues are glycines. 
One residue is Asn-171, which is the fourth residue of a 4-residue turn. The geometry of this turn is similar to that of a class of 4-residue turns described by Sibanda and Thornton (40), in which the conformation at the fourth residue, usually glycine or asparagine, commonly places it in the +/+ quadrant of the Ramachandran plot. Asn-39, Leu-101, Arg-133, and Asn-142 all appear to be clearly defined by the density. The equivalent for Asn-39 and Arg-133 in ConA due to insertions or deletions. The density for Glu-29, the remaining residue with unusual conformation, is very weak in both monomers, and confirmation of torsion angles in this turn region must await extension to higher resolution.
Another unusual conformational feature in pea lectin is the presence of a cis-peptide bond between residues 80 and 81. The cis-peptide bond is also present in ConA between residues 207 and 208 and is thought to be important for the Ca2+ binding site as described later. This cis-peptide region was omitted from the early refinement runs, but was built into the model when the density of the combined phase maps indicated it was present. The cis-peptide was restrained in the last two rounds of refinement. The electron density in the final map is consistent with the presence of a cis-peptide bond at residues 80-81 in both monomers.
The structure of the pea lectin monomer, shown in Fig. 3, contains 226 amino acid residues. (Figs. 3 and 5-9 were produced by the "ARPLOT" program of Lesk and Hardman (44).) Beginning at the amino terminus of the β-subunit, at the monomer-monomer interface of the dimer, the polypeptide chain traces a convoluted pathway to form the two major and one minor antiparallel β-sheet structures that are the framework of the molecule. Along the way, extensive random coil region is overlaid onto the front face of the molecule, the chain breaks to end the β-subunit and begin the α-subunit at adjacent termini of the β-sheet in the back of the molecule, and the chain of the α-subunit winds its way toward termination adjacent to the beginning of the β-subunit. Altogether, there are six strands in the β-sheet in the back of the molecule; this sheet is referred to as sheet I in ConA (7)  strands of these two sheets may be described as forming a β barrel structure (41). In Richardson's notation, the topology of this β barrel is +1x, +1, +9, -7, +1, +1, +1, +3, -1, -1, -5, +9. The single crossover connection is right-handed. A topology diagram is shown in Fig. 4. There are three strands in a smaller β-sheet seen in the lower left border of the molecule in Fig. 3. Two of the strands in this sheet are extensions of strands from the back sheet.
The Pea Lectin Dimer-The two monomers in the asymmetric unit are related by a molecular 2-fold axis to form the pea lectin dimer. These two monomers were modeled completely independently in an effort to identify structural differences permitted or imposed by crystal packing. The transformation that superimposes the B monomer onto the A monomer was calculated by least squares based on 226 equivalent α-carbon atoms plus the Ca2+ and Mn2+ ions and gave a root mean square difference in atom positions of only 0.57 Å. At only 10 of these positions are differences greater than 1.0 Å, with the largest difference being 3.8 Å for Val-188, the amino-terminal residue of the α-chain. Most of these large differences are located in terminal regions or regions mentioned earlier that are difficult to model due to weak electron density; differences may become smaller as the structure is refined at higher resolution. It is still too early to ascribe any of the deviations between monomers to differences in crystal packing. The transformation in orthogonal coordinates is:
An α-carbon tracing of the dimer is shown in Fig. 5. The two pea lectin monomers of the dimer are joined such that the back β-sheets of both monomers come together to form a 12-stranded β-sheet spanning the back of the dimer. Five main chain hydrogen bonds are formed between residues 1 and 9, 3 and 7, and 5 and 5 between monomers. Several other intermonomer contacts that may involve hydrogen bond formation, both side chain to side chain or side chain to main chain, are possible. These contacts include Thr-1 O to Thr-9 Oγ, Asp-54 Oδ1 to Lys-10 Nζ, Glu-2 Oε1 to Gln-15 Nε2, Trp-206 Nε2 to Asn-17 Oδ1, Ser-48 Oγ to Tyr-46 OH, Tyr-46 OH to Ser-48 Oγ, and Asp-14 O to Trp-206 Nε, where the first atom in each contact comes from monomer A and the second from monomer B. Apart from the interaction between residues 46 and 48, these latter contacts do not reflect the molecular 2-fold symmetry. 
The intermonomer interactions are illustrated in Fig. 6.
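The comparison of the two monomers rests on applying a rigid transformation to one set of coordinates and measuring the residual differences. The sketch below evaluates the root mean square difference for a transformation that is already known; it does not perform the least squares fit itself, and the rotation matrix, coordinates, and function names are invented for illustration (a pure 2-fold rotation about z is assumed).

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def apply_transform(rotation: Sequence[Sequence[float]],
                    translation: Sequence[float], point: Point) -> Point:
    """Return rotation * point + translation for one 3D point."""
    x, y, z = (sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
               for i in range(3))
    return (x, y, z)


def rms_difference(coords_a: Sequence[Point], coords_b: Sequence[Point],
                   rotation: Sequence[Sequence[float]],
                   translation: Sequence[float]) -> Tuple[float, List[float]]:
    """RMS and per-atom distances between coords_a and transformed coords_b."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    distances = [math.dist(a, apply_transform(rotation, translation, b))
                 for a, b in zip(coords_a, coords_b)]
    rms = math.sqrt(sum(d * d for d in distances) / len(distances))
    return rms, distances


# Invented example: an exact 2-fold rotation about z, with one atom displaced by 1 unit.
two_fold_z = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
monomer_a = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
monomer_b = [(-1.0, 0.0, 0.0), (0.0, -2.0, 0.0), (0.0, 0.0, 4.0)]
rms, per_atom = rms_difference(monomer_a, monomer_b, two_fold_z, [0.0, 0.0, 0.0])
outliers = [i for i, d in enumerate(per_atom) if d > 0.5]
```

Listing the per-atom distances alongside the overall value mirrors the way the text reports both a root mean square difference and the few positions that deviate most.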
Comparison of Pea Lectin and ConA Tertiary and Quaternary Structure-As can be seen in Fig. 7, the structures of pea lectin and ConA are very similar, and both molecules have essentially the same overall β-sheet structure. However, the structures of the two molecules differ in several of the loops between β-strands and also in the regions of the carboxyl and amino termini. In the loops that differ between the two molecules, there are either insertions or deletions in pea lectin with respect to ConA. In addition, the amino acid sequence of ConA is circularly permuted with respect to pea lectin and the ConA monomer consists of a single polypeptide chain, while the pea lectin monomer consists of α- and β-chains. Thus pea lectin and ConA have different carboxyl and amino termini and, as expected, these regions differ in the two molecules. The transformation that superimposes the ConA monomer on the A monomer of pea lectin was calculated by least squares based on 185 equivalent α-carbon atoms plus the Mn2+ and Ca2+ ions; the root mean square difference between atom the loop at about position 120, is firmly embedded in the back β-sheet. In pea lectin, the corresponding site, at residues 1 and 234, is not connected by a peptide bond. The close correspondence between pea lectin and ConA structures in this region suggests that, were the ConA chains not ligated, the structure would nevertheless be very similar to that seen with termini lying in close proximity and in a not unfavorable orientation for ligation. The four monomers in the ConA tetramer are paired (I-II and III-IV) by a special relationship between monomers such that the tetramer appears to be a dimer of ConA dimers (6, 7). The relationship between monomers in the ConA dimer involves a 2-fold symmetric assembly of monomer β-sheets to form a 12-stranded antiparallel dimer-wide sheet. 
This relationship is similar to that relating monomers in the dimer of pea lectin and, in fact, the structures of the pea lectin and ConA dimers are also very similar.
Based on the same atoms described above from both monomers, the root mean square difference between superimposed in ConA), and the close contact between loops 46-48 in the two monomers (176-177 in ConA). However, specific side chain interactions between the monomers are not conserved; in most of these cases, the amino acid residues involved in these interactions are different in pea lectin and ConA.
Pea lectin and ConA differ in that the ConA dimers associate to form tetramers at physiological pH, while pea lectin dimers do not. The 12-stranded β-sheets of the ConA dimers are twisted such that the two dimers are slightly wrapped around each other in the tetramer. This twist is different in the pea lectin dimer, as can be seen in Fig. 8 in which the lower monomers of the two dimers were optimally superimposed, the difference in twist being seen in the upper pair. It is possible that the change in twist is in part responsible for the lack of tetramer formation in pea lectin. It will be interesting to see if the twists of the 12-stranded β-sheets in lentil lectin and favin, neither of which form tetramers, are closer to that in pea lectin than to that in ConA. In addition to the twist of the β-sheets, the contacting residues between the sheets are also expected to be important in the formation of the tetramer. In ConA, there are 54 residues involved in the interactions between monomer I and monomers III and IV. Of these 54 residues, only 12 are conserved in pea lectin, but no interacting pairs are conserved. The same situation is observed for lentil lectin and favin.
(Note: The sequence of the β-chain of lentil lectin is found to terminate unexpectedly some 25 residues short of the carboxyl terminus common to the β-chains of the other 2-chain lectins (11). In pea lectin, the segment corresponding to these missing residues is an extended chain-loop-extended chain structure that forms the fourth and fifth antiparallel strands of the β-sheet at the back of the molecule. Without this segment, the back sheet of the dimer would have 8 rather than 12 strands, a radical, and probably untenable, alteration of the structure. A simple explanation is afforded by supposing an additional post-translational cleavage in lentil lectin causes the small peptide to be lost.)
Metal-binding Regions-Fig. 9 shows a comparison of the metal-binding loops in ConA and pea lectin. The protein ligands to the Ca2+ and Mn2+ ions are conserved in the two proteins and, as can be seen in the figure, the two binding sites appear to be very similar. Glu-119, Asp-121, and Asp-129 are unidentate ligands of the Mn2+ ion in pea lectin as are Glu-8, Asp-10, and Asp-19 in ConA. The fourth protein ligand to Mn2+ is provided by His-136 in pea lectin and His-24 in ConA. Asp-121 appears to be a bidentate ligand to Ca2+ in pea lectin as is Asp-10 in ConA. Phe-123 provides a carbonyl oxygen ligand to Ca2+ in pea lectin where Tyr-12 provides this same function in ConA. The remaining two protein ligands to Ca2+ in pea lectin are provided by Asn-125 and Asp-129, Asp-129 serving to bridge the metal ions. In ConA, these ligands are provided by Asn-14 and Asp-19, Asp-19 also bridging the metal ions. During the last two rounds of refinement, the metal-ligand distances were very lightly restrained. The overall ligand pattern did not change significantly during refinement. The electron density in the final map indicates that the ligand assignments listed above are correct and are not biased by this restraining of distances during refinement.
In ConA, there are two water ligands for Mn2+ and two water ligands for Ca2+. At 3.0-Å resolution, it is not possible to tell whether these ligands are also present in pea lectin. In ConA, one of the water ligands to Ca2+ is also hydrogen-bonded to Oδ2 and O of Asp-208. The proximity of these two protein atoms to the binding site is a result of the cis-peptide bond between residues 207 and 208. In pea lectin, an equivalent cis-peptide is located between residues 80 and 81 and the carbonyl and carboxylate oxygen atoms of Asp-81 are at the appropriate distances from the Ca2+ ion for hydrogen bonding to a bridging water molecule.
Carbohydrate Binding Sites-The structure of an α-methyl-D-mannoside-ConA complex has been determined to 6-Å resolution (42, 43). The use of iodinated sugars permitted the location of the α-methyl-D-mannose binding site in the vicinity of the side chains of residues Tyr-12, Tyr-100, Asp-16, and Asp-208, some 10-14 Å from the Mn2+ position. This structure has been used along with the refined coordinates and binding specificity data to build a model with computer graphics to better define the orientation of the mannose in the binding site. The modeling studies suggest that residues Asn-14, Gly-98, Leu-99, Ser-168, Asp-208, Thr-226, and Arg-228 are involved in binding α-methyl-D-mannoside (8, 10). Of these residues, only 3, Asn-125 (Asn-14 of ConA), Gly-216 (Gly-98 of ConA), and Asp-81 (Asp-208 of ConA), are conserved in pea lectin. Studies with α-methyl-D-mannoside-soaked pea lectin crystals have not so far been successful, although x-ray diffraction patterns show them to tolerate soaking in concentrations of sugar as high as 4 mM for 24 h. A pea lectin complex with a trimannoside has been crystallized and data were collected to 2.5-Å resolution. Preliminary results based on a rotation-translation solution with native pea lectin coordinates indicate that the carbohydrate binds in a similar location to that in ConA, but details of the interaction are not yet available.
Summary-The overall similarity of pea lectin and ConA is readily discernible at 3.0-Å resolution. However, at this resolution it is possible to see differences in the structures that are necessitated by both the two-chain nature of pea lectin and the insertions and deletions that are present with respect to ConA. Most of the insertions and deletions occur in surface loops that allow small changes without substantially altering the course of the polypeptide chain. 
The ligands surrounding the metal sites are identical to those found in ConA, even though there are minor alterations in the course of the backbone folding near the metal sites. The disposition of the two monomers in the pea lectin dimer suggests a subtle but significant difference that may contribute to the preservation of the dimeric nature of pea lectin at neutral pH in contrast to ConA, which forms tetramers. The pea lectin structure is likely to be the prototype of all of the presently known two-chain lectins, as well as the single-chain peanut, soybean, and sainfoin lectins.
High resolution (1.83 Å) native data for pea lectin have been collected. A further understanding of the magnitude and the significance of the differences in the crystallographically independent monomers must await refinement at higher resolution, where details of the metal binding sites and disposition of the water molecules should also be discernible with confidence. The ability of pea lectin crystals to diffract to the unusually high resolution of at least 1.2 Å suggests that many ordered water molecules and strong intermolecular contacts are to be expected.