The Covalent and Three-Dimensional Structure of Concanavalin A II. AMINO ACID SEQUENCE OF CYANOGEN BROMIDE FRAGMENT

SUMMARY The amino acid sequence of the COOH-terminal CNBr fragment, F3 (residues 130 to 237), of concanavalin A has been established, completing the determination of the covalent structure of this lectin. Analysis of the chemical sequence showed that the distribution of charged residues is generally more dense in the NH2-terminal half of the polypeptide chain than in the COOH-terminal portion and that in the latter region there is a linear stretch composed of many hydrophobic residues. Correlation with x-ray crystallographic results indicates that the hydrophobic region is located in the interior of the molecule, and that it forms a part of a deep cavity which is the binding site for the inhibitor, β-(o-iodophenyl)-D-glucopyranoside. In conjunction with the three-dimensional structure, the amino acid sequence reported here provides new data for analysis of variables involved in predicting the three-dimensional folding of proteins from the primary structure. The sequence of concanavalin A is the first determined for a lectin and it serves as a reference structure for comparisons with other lectins.


M. EDELMAN
From The Rockefeller University, New York, New York 10021
Studies on the structure and activities of the lectin concanavalin A (Con A) have been pursued in a number of laboratories in order to understand the wide variety of biological and ligand-binding activities exhibited by this protein (see Ref. 1). The overall structure of Con A at physiological pH has been shown to be a tetramer of identical subunits (2-4), each with a single saccharide binding site (5,6).
In addition to studies to characterize the molecular structure of Con A, other experiments have attempted chemical modification [...] These studies include the labeling of the metal and saccharide binding sites (7,8), the dissociation of the subunits of the tetramer (9, 10), and various other derivatizations using group-specific reagents (11,12). The effects of proteolytic cleavages of the polypeptide chain (13,14) have also been examined. Such studies can now be refined further to correlate the modification of specific amino acid residues with corresponding alterations in the activities and to interpret these changes in terms of the complete chemical sequence and three-dimensional structure of the Con A molecule.

* This work was supported by a Teacher Scholar Grant from the Camille and Henry Dreyfus Foundation, United States Public Health Service Grants AI 11378, AI 09921, AI 09999, and AI 09273 from the National Institutes of Health, and Grant GB 33403 from the National Science Foundation.
In the preceding paper of this series (1), we presented the amino acid sequence of the first 129 residues of the Con A polypeptide chain. We now report the complete covalent structure of the protein.
This sequence represents the first complete chemical structure of a lectin and thus provides a reference for comparison with other mitogenic and non-mitogenic lectins. Correlation of the amino acid sequence with the results of x-ray crystallographic studies is reported in the following papers (15,16).

MATERIALS AND METHODS
The isolation of the intact subunit of Con A (13) and the preparation of CNBr Fragment F3 (17) have been described. Fragment F3 was separated from Fragments F1 and F2 by gel filtration on Sephadex G-75 in 5% formic acid and purified by repeated gel filtration on the same column. Peptide T15,16 was succinylated by suspension in saturated sodium acetate (9) and addition of a 100-fold molar excess of succinic anhydride.
After 1 hour at room temperature, an equal volume of 2 M propionic acid was added and the reaction mixture was stirred overnight.
The resulting suspension was centrifuged and the precipitate was dissolved in AmPro and separated from any unreacted reagent by gel filtration on Sephadex G-50 in AmPro.
Enzymatic digests were performed as described previously (1). Partial acid hydrolysis of peptides was carried out in 0.03 M HCl, 0.1 M HCl, or 0.25 M acetic acid at 110° for the indicated times in tubes sealed in vacuo. Tubes were then opened and the contents were dried by evaporation in vacuo over NaOH or by lyophilization prior to fractionation.

RESULTS
Amino Acid Sequence of CNBr Fragment F3-Previous studies (1) have established the amino acid sequence of CNBr Fragments F1 and F2, representing residues 1 to 129 of Con A. The remainder of the amino acid sequence was established by the analysis of peptides isolated from enzymatic digests of Fragment F3, which contains the rest of the peptide chain (residues 130 to 237) with an NH2-terminal sequence of Phe-Asx-Glx-Phe-Ser-Lys and a COOH-terminal sequence of Asx-Ala-Asn (17). The numbering system used here is identical and continuous with that used in the preceding paper (1). Few peptides were required to establish the order of the tryptic peptides and each of the segments that were difficult to analyze is contained within the regions represented by a tryptic peptide.
The determination of the amino acid sequence of Fragment F3, therefore, is discussed in terms of the tryptic peptides as outlined in Fig. 1 and T15,16 as described below. The soluble fraction of the tryptic digest was chromatographed on DEAE-cellulose (Fig. 2). Material in Fraction A was further fractionated by gel filtration on Sephadex G-25 in AmPro to give peptides T11b and T14. Material in Fractions B, G, and F (Fig. 2) yielded T12, T13, and T17, respectively, without further purification.
Although peptides T15,16 and T15,16a presumably resulted from the failure of trypsin to cleave the peptide bond between Lys 200 and Ser 201, T15,16 differed from T15,16a. The latter peptide apparently resulted from chymotryptic cleavage of the peptide bond between Phe 212 and Phe 213. Peptides T15,16a and T16b were obtained from the same digests, although T15,16a was frequently contaminated with T15,16. Peptide T16b was soluble in 0.02 M Tris, pH 8.0, and was also obtained from the soluble fraction of a tryptic digest of the intact subunit of Con A by gel filtration on Sephadex G-50 in AmPro followed by high voltage electrophoresis at pH 4.7 and 1.9. When rigorous efforts were made to reduce the extent of chymotryptic cleavage (using trypsin with minimal chymo[...]

[...] G-75 in 5% HCOOH followed by gel filtration on Sephadex G-50 in 8 M urea, 1 M in propionic acid (Fig. 3). Material in Fraction A included peptides T15,16a or T15,16 which were [...] acid. Peptide T15,16 was also obtained by treating the material that was insoluble at pH 8.0 with succinic anhydride followed by isolation of the succinylated derivative by gel filtration on Sephadex G-50 in AmPro. In most cases, T15,16 or T15,16a could not be obtained as homogeneous peptides. Because of their strong tendency to aggregate in solution, they were contaminated with each other and with small amounts of undigested [...]

FIG. 1 (fragment recovered from the extracted text). Residues 140 to 199:
140 Leu-Ile-Leu-Gln-Gly-Asp-Ala-Thr-Thr-Gly-Thr-Asp-Gly-Asn-Leu-Glu-Leu-Thr-Arg-Val 159
160 Ser-Ser-Asn-Gly-Ser-Pro-Glu-Gly-Ser-Ser-Val-Gly-Arg-Ala-Leu-Phe-Tyr-Ala-Pro-Val 179
180 His-Ile-Trp-Glu-Ser-Ser-Ala-Thr-Val-Ser-Ala-Phe-Glu-Ala-Thr-Phe-Ala-Phe-Leu-Ile 199

Amino Acid Sequence of Tryptic Peptides-The amino acid sequence of tryptic peptides T11b, T12, and T17 (Table I) was determined directly by Edman degradation and dansylation (Fig. 1). Peptides T13, T14, and T15,16 were too large to be analyzed directly and cleavage into smaller peptides was required.
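The tryptic peptide boundaries within the part of the chain shown above can be illustrated with a minimal sketch. It assumes the residue 140 to 199 sequence as transcribed from the figure fragment in the text, and it uses the common simplification that trypsin cleaves after Lys or Arg unless the next residue is Pro. The function name and the rule are illustrative assumptions, not the procedure used in the paper, and the first and last pieces are bounded by the ends of the transcribed fragment rather than by true tryptic sites.

```python
# Residues 140-199 of Con A, transcribed from the figure fragment in the text
# (assumed correct after repair of the extraction damage).
FRAGMENT_START = 140
FRAGMENT = (
    "Leu-Ile-Leu-Gln-Gly-Asp-Ala-Thr-Thr-Gly-Thr-Asp-Gly-Asn-Leu-Glu-Leu-Thr-Arg-Val-"
    "Ser-Ser-Asn-Gly-Ser-Pro-Glu-Gly-Ser-Ser-Val-Gly-Arg-Ala-Leu-Phe-Tyr-Ala-Pro-Val-"
    "His-Ile-Trp-Glu-Ser-Ser-Ala-Thr-Val-Ser-Ala-Phe-Glu-Ala-Thr-Phe-Ala-Phe-Leu-Ile"
).split("-")


def tryptic_fragments(residues, start):
    """Idealized trypsin: cut after Lys or Arg unless the next residue is Pro.

    Returns (first_residue_number, last_residue_number, residues) tuples.
    """
    pieces, begin = [], 0
    for i, res in enumerate(residues):
        last = i == len(residues) - 1
        cut = res in ("Lys", "Arg") and not last and residues[i + 1] != "Pro"
        if cut or last:
            pieces.append((start + begin, start + i, residues[begin:i + 1]))
            begin = i + 1
    return pieces


for first, last, piece in tryptic_fragments(FRAGMENT, FRAGMENT_START):
    print(first, last, "-".join(piece))
```

Under these assumptions the cuts fall after Arg 158 and Arg 172, which matches the boundaries of T13, T14, and T15,16 implied by the text. The idealized rule does not model the incomplete cleavage at Lys 200 or the chymotryptic cleavage at Phe 212 described above.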
Peptide T13-Direct analysis of peptide T13 (Fig. 1) gave the sequence of the first 15 residues of T13, with the exception of position 146 where no dansyl-amino acid was detected. The residue at this position and the remainder of the sequence were established by digestion of peptide T13 with subtilisin as well as by partial acid hydrolysis of the tryptic peptide.
The subtilisin peptides, S7 to S10 (Table II), were isolated by high voltage paper electrophoresis (pH 4.7 and 1.9). Edman analysis of S7 indicated that residue 146 was alanine and similar treatment of peptides S8 to S10 extended the sequence of peptide T13 to residue 156 (Fig. 1). The remaining sequence was established by the isolation and characterization of peptide PA1 (Table II). Peptide PA1 was one of a mixture of peptides obtained after partial acid hydrolysis (0.03 M HCl, 11 hours) and high voltage electrophoresis at pH 4.7. The amino acid sequence of peptide PA1 completed the determination of the amino acid sequence of the tryptic peptide T13 (Fig. 1).
Peptide T14-The major difficulty associated with the determination of the sequence of peptide T14 (Table I) stemmed from the fact that Edman degradation as routinely used released no new residues after removal of residue 161 (Fig. 1). This difficulty was circumvented by exposing the peptide to acid for a minimal amount of time during each step of the degradation (18, 25), which allowed extension of the sequence to residue 165. The sequence was confirmed by the isolation and analysis of peptides PA2 and PA3 (Table III).
These peptides were obtained after treatment of T14 with 0.1 M HCl for 1 hour at 110° by high voltage paper electrophoresis (pH 6.5 and 1.9). The amino acid sequence of the COOH-terminal portion of T14 (residues 166 to 172, Fig. 1) was determined by analysis of peptides S11 and Pm 4 (Table III).
Peptide S11 was obtained from a subtilisin digest of T14 by high voltage electrophoresis at pH 4.7 followed by electrophoresis at pH 1.9. Peptide Pm 4 was isolated from a pronase digest of T14 by electrophoresis at pH 1.9. The amino acid sequences of these peptides completed the determination of the sequence of T14.
Peptide T15,16-The determination of the amino acid sequence of peptide T15,16 proved to be the most difficult task in the determination of the primary structure of Con A. As described above, the failure of trypsin to cleave at Lys 200 and the chymotryptic cleavage at Phe 212 resulted in a family of similar peptides of which only T16b could be obtained in homogeneous form. In addition, peptides T15,16a and T15,16 were insoluble at neutral pH and therefore were often resistant to digestion with other proteolytic enzymes. Direct analysis of T15,16a or T15,16 (Table I) gave the sequence of residues 173 to 186 with the exception of positions 178 and 182, where no dansyl-amino acids were detected. These data indicated that the presence of heterogeneity in the isolated peptides would be associated with the COOH-terminal portion of the peptide.
The remaining sequence of T15,16 was established by isolating and characterizing peptides obtained after digestion with other enzymes and after partial acid hydrolysis (Table IV) and by analyzing peptide T16b (Table I).
Peptide P14 was obtained from material in Fraction D by high voltage electrophoresis at pH 4.7 followed by electrophoresis at pH 1.9. Sequence analysis of P13 indicated that residue 178 is proline and confirmed the sequence of residues 176 to 185. The sequence of P14 provided the sequence of residues 192 to 195 and the partial sequence of P15 gave residues 199 to 210, with the exception of positions 200 and 209, where no dansyl-amino acids were detected. Partial acid hydrolysis proved particularly valuable in the analysis of this region of the molecule.
Peptides PA5, PA5a, PA6, and PA7 (Table IV) were obtained by hydrolysis (0.03 M HCl, 4 hours) of peptic peptide P15, followed by two-dimensional paper electrophoresis at pH 4.7 and 1.9. Analysis of these peptides (Fig. 1) [...] lysine and glycine, respectively, confirmed the sequence results on peptide P15 and extended the sequence of P15 to residue 212, the COOH terminus of T15,16a. Partial acid hydrolysis also proved valuable in filling the gap between residues 186 and 189. Peptide PA4 (Table IV) was isolated after partial acid hydrolysis (0.25 M acetic acid, 8 hours at 110°) of T15,16 by high voltage electrophoresis at pH 6.5 followed by paper chromatography. The amino acid sequence of peptide PA4 (Fig. 1) extended the sequence from the COOH terminus of P13 to residue 189. The sequences of the remaining gaps in peptide T15,16a were provided by analysis of peptides obtained from digestion of the tryptic peptide with thermolysin and subtilisin.
The thermolysin peptides Th16 and Th17a (Table IV) were obtained from digests of T15,16 and peptides Th15 and Th17 (Table IV) were obtained from digests of succinylated T15,16. In both cases, gel filtration of these digests on Sephadex G-25 gave an elution profile similar to the one shown in Fig. 5. High voltage electrophoresis at pH 6.5 of material in Fraction D gave Th16 and of that in Fraction F gave Th17a.
Peptides Th15 and Th17 were obtained from material corresponding to that in Fractions D and G, respectively. Peptide Th15a (Table IV) was obtained from a direct thermolysin digest of Fragment F3 by ion exchange chromatography on DEAE-cellulose followed by high voltage electrophoresis at pH 6.5. Peptide Th15b (Table IV) was obtained from a thermolysin digest of T15,16 by gel filtration on Sephadex G-25 in AmPro followed by electrophoresis at pH 6.5. The subtilisin peptide S12a (Table IV) was isolated from a digest of T15,16 by gel filtration on Sephadex G-25 in AmPro followed by paper electrophoresis at pH 6.5 and 1.9.
Treatment of peptide Th15a with leucine aminopeptidase released isoleucine, followed by a slower release of tryptophan, establishing residue 182 as tryptophan (Fig. 1). Treatment of Th15 with carboxypeptidase A indicated that residue 190 was alanine.
Peptide Th15b was also treated with leucine aminopeptidase, which indicated that the NH2-terminal residue of this peptide was valine.
Edman analysis of Th16, Th17, Th17a, and S12a completed the amino acid sequence of the portion of T15,16 represented by peptide T15,16a.
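The way gaps in a direct Edman run are closed by smaller peptides can be sketched as a merge of partial reads. The example below encodes three statements from the text: direct analysis of T15,16 gave residues 173 to 186 except positions 178 and 182; peptide P13 showed residue 178 to be proline and confirmed residues 176 to 185; and leucine aminopeptidase released isoleucine and then tryptophan from Th15a. The residue identities are taken from the residue 140 to 199 sequence fragment that appears earlier in the text. The data layout and the function names are illustrative assumptions, and treating position 182 as unread in P13 is also an assumption.

```python
def read(start, residues):
    """Number a run of residues from `start`; None marks a cycle with no identification."""
    return {start + i: r for i, r in enumerate(residues)}


def merge_reads(reads):
    """Combine position -> residue reads, filling gaps and rejecting contradictions."""
    merged = {}
    for name, partial in reads:
        for pos, res in partial.items():
            if res is None:
                merged.setdefault(pos, None)  # remember the gap, keep any known residue
            elif merged.get(pos) not in (None, res):
                raise ValueError(f"conflict at {pos}: {merged[pos]} vs {res} ({name})")
            else:
                merged[pos] = res
    return merged


# Illustrative encoding of the reads described in the text (hypothetical layout).
reads = [
    ("T15,16 direct", read(173, ["Ala", "Leu", "Phe", "Tyr", "Ala", None, "Val",
                                 "His", "Ile", None, "Glu", "Ser", "Ser", "Ala"])),
    ("P13", read(176, ["Tyr", "Ala", "Pro", "Val", "His", "Ile", None,
                       "Glu", "Ser", "Ser"])),
    ("Th15a, leucine aminopeptidase", read(181, ["Ile", "Trp"])),
]

merged = merge_reads(reads)
print("-".join(merged[p] for p in sorted(merged)))
```

A conflicting identification at any position raises an error instead of being silently overwritten, which mirrors the requirement that independent peptides agree wherever they overlap.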
The remaining sequence of T15,16 was established by analysis of peptide T16b (Table I).
Direct analysis of T16b yielded the sequence of the first 6 residues; the sequence of the peptide was completed by the isolation and characterization of peptide Th18. This peptide was obtained directly by gel filtration of a thermolysin digest of T16b on Sephadex G-25 (Fig. 6) and also by gel filtration of a thermolysin digest of T15,16 (Fig. 5, Fraction C) followed by high voltage electrophoresis at pH 6.5. Edman analysis of Th18 (Fig. 1) completed the determination of the amino acid sequence of T15,16 and the sequence of all the tryptic peptides of F3.
Order of Tryptic Peptides-Direct sequence analysis of CNBr Fragment F3 in the automatic sequenator placed peptides T11b, T12, and T13 (Fig. 1). This assignment was confirmed by Edman analysis of peptide S6a (Table V) which was obtained from a subtilisin digest of Fragment F3 by gel filtration on Sephadex G-25 in AmPro followed by ion exchange chromatography on DEAE-cellulose. The amino acid composition and partial sequence of peptide C6 (Table V) established the positions of peptides T14 and T15,16.
Peptide C6 was obtained from a chymotryptic digest of Fragment F3 by gel filtration on Sephadex G-50 in AmPro followed by high voltage paper electrophoresis at pH 4.7, or alternatively, by ion exchange chromatography on DEAE-cellulose. The amino acid composition and partial sequence of this peptide indicated that it began at residue 157, spanned all of peptide T14, and extended into peptide T15,16. Treatment with carboxypeptidase A (Fig. 1) placed the COOH terminus of C6 at Phe 175, thus confirming the fact that it overlapped T14 and T15,16.
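The overlap argument for C6 can be sketched as a search over possible orders of the tryptic peptides. The example assumes the residue 140 to 199 sequence fragment that appears earlier in the text, from which the COOH-terminal part of T13 (residues 152 to 158), all of T14 (159 to 172), and the NH2-terminal part of T15,16 (173 to 182) are taken. C6 is taken as residues 157 to 175, as stated above. The function names and the brute-force permutation search are illustrative assumptions, not the authors' method.

```python
from itertools import permutations


def contains(seq, sub):
    """True if `sub` occurs as a contiguous run inside `seq`."""
    n = len(sub)
    return any(seq[i:i + n] == sub for i in range(len(seq) - n + 1))


def consistent_orders(peptides, overlaps):
    """Return every ordering of the peptides whose concatenation contains all overlap peptides."""
    hits = []
    for order in permutations(peptides):
        chain = [res for name in order for res in peptides[name]]
        if all(contains(chain, o) for o in overlaps):
            hits.append(order)
    return hits


# Partial tryptic peptides, listed out of order on purpose.
peptides = {
    "T15,16": "Ala-Leu-Phe-Tyr-Ala-Pro-Val-His-Ile-Trp".split("-"),            # 173-182 only
    "T13": "Gly-Asn-Leu-Glu-Leu-Thr-Arg".split("-"),                           # 152-158 only
    "T14": "Val-Ser-Ser-Asn-Gly-Ser-Pro-Glu-Gly-Ser-Ser-Val-Gly-Arg".split("-"),  # 159-172
}
# Chymotryptic overlap peptide C6, residues 157-175.
C6 = ("Thr-Arg-Val-Ser-Ser-Asn-Gly-Ser-Pro-Glu-Gly-Ser-Ser-Val-Gly-Arg-"
      "Ala-Leu-Phe").split("-")

print(consistent_orders(peptides, [C6]))
```

Because C6 contains two internal arginines, it must span two tryptic junctions, so only the order T13, T14, T15,16 survives the search.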
Peptide Th18a was obtained from a thermolysin digest of Fragment F3 by gel filtration first on Sephadex G-50, then on Sephadex G-25, and finally by ion exchange chromatography on DEAE-cellulose. The amino acid sequence (Fig. 1) of peptide Th18a (Table V) placed T17 adjacent to T15,16 at the COOH terminus of the polypeptide chain. This is consistent with the fact that the COOH terminus of T17 was identical with that of the Con A molecule (2,17).

FIG. 6. Gel filtration of a thermolysin digest of T16b on a column (1.5 × 100 cm) of Sephadex G-25 in AmPro. Each tube contained 2.0 ml of effluent; the absorbance of effluent fractions at 230 nm is indicated by the solid line.
Asparaginyl and Glutaminyl Residues in F3-The assignment of the positions of all asparaginyl and glutaminyl residues was made on the basis of the electrophoretic mobilities and amino acids released after enzymatic digestion of peptides.
Those peptides used in determining the amino acid sequence of F3 were described above and the amino acid compositions are given in Tables I to V. The amino acid compositions of the other peptides are listed in Table VI, and the sequence represented by each peptide is indicated in Fig. 1. The electrophoretic mobilities of all peptides are given in Table VII.
Peptide C5 (Tables VI and VII) was obtained from a chymotryptic digest of Fragment F3 by gel filtration on Sephadex G-50 in AmPro followed by high voltage paper electrophoresis at pH 6.5. This peptide was neutral at pH 6.5, indicating that residue 131 was asparagine and residue 132 was glutamine.
Peptide T12 (Table I) was also neutral at pH 6.5 (Table VII) [...] The assignment of the positions of asparaginyl and glutaminyl residues in peptide T13 was difficult because there were 6 such residues spaced closely within this peptide.
The assignment of Asp 139 was made on the basis of the electrophoretic mobility (Table VII) of a peptide (CC1, Table VI) obtained by electrophoresis at pH 6.5 from a chymotrypsin C digest of tryptic peptide T13. This finding also allowed the assignment of Gln 143 on the basis of the electrophoretic mobility of peptide S6b (Table VI).
Peptide S6b and peptides S7,8 and S9,10 were isolated from subtilisin digests of peptide T13 by two-dimensional high voltage electrophoresis at pH 6.5 and 1.9. The electrophoretic [...] Peptide S6b gave 1 residue each of aspartic acid and glutamine and S7,8 yielded 1 residue of aspartic acid.
The remaining three positions (Asx 151, Asx 153, and Glx 155) [...] these positions were occupied by acidic residues. Peptide SE1 was isolated by paper electrophoresis at pH 6.5 of peptide S9 after three steps of the Edman degradation. This peptide lacked residue 151 and had a net charge of -1. This result showed that either Asx 153 or Glx 155, but not both, was an acidic residue, and that residue 151 was aspartic acid. Peptide P12 (Table VI), obtained from a peptic digest of peptide T13 by ion exchange chromatography on DEAE-cellulose, contained arginine and was neutral (Table VII) at pH 6.5, indicating that residue 155 was glutamic acid. Residue 153 therefore must be asparagine.

Table footnote a: Values presented are amino acid residues. Amino acids present at a level less than 0.2 residue are omitted. The assumed integral numbers of residues are given in parentheses.
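The reasoning that assigns residues 153 and 155 can be sketched as an enumeration of acid or amide choices against the observed net charges. The sketch assumes that at pH 6.5 each Asp or Glu contributes -1, each Lys or Arg contributes +1, and the terminal amino and carboxyl groups cancel. The exact spans of peptides SE1 and P12 are not given in the text, so the spans used here (residues 152 to 156 and 154 to 158) are hypothetical, chosen only so that SE1 lacks residue 151 and contains both ambiguous positions, and P12 contains Arg 158 and position 155.

```python
from itertools import product

# Side-chain charge at pH 6.5 (simplifying assumption; His and termini ignored).
CHARGE = {"Asp": -1, "Glu": -1, "Lys": +1, "Arg": +1}


def net_charge(residues):
    """Sum of side-chain charges under the simplified model above."""
    return sum(CHARGE.get(r, 0) for r in residues)


def consistent_assignments(peptides, observed, options):
    """Enumerate acid/amide choices that reproduce every observed net charge."""
    positions = sorted(options)
    solutions = []
    for choice in product(*(options[p] for p in positions)):
        assign = dict(zip(positions, choice))
        if all(
            net_charge([assign.get(pos, res) for pos, res in peptides[name]])
            == observed[name]
            for name in peptides
        ):
            solutions.append(assign)
    return solutions


# Hypothetical spans; Asx and Glx mark the positions still to be assigned.
peptides = {
    "SE1": [(152, "Gly"), (153, "Asx"), (154, "Leu"), (155, "Glx"), (156, "Leu")],
    "P12": [(154, "Leu"), (155, "Glx"), (156, "Leu"), (157, "Thr"), (158, "Arg")],
}
observed = {"SE1": -1, "P12": 0}  # net charges reported in the text
options = {153: ("Asp", "Asn"), 155: ("Glu", "Gln")}

print(consistent_assignments(peptides, observed, options))
```

Only one combination satisfies both observations, Asn 153 with Glu 155, in agreement with the conclusion reached in the text.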
Within peptide T14 the electrophoretic mobilities (Table VII) of peptides S Pm 1, S Pm 2, and SE2 (Table VI) allowed the assignments of Asn 162 and Glu 166. Peptides S Pm 1 and S Pm 2 were obtained after digestion of tryptic peptide T14 first with subtilisin and then with pronase, followed by two-dimensional paper electrophoresis at pH 4.7 and 1.9. Peptide SE2 was obtained by two-dimensional paper electrophoresis (pH 6.5 and 1.9) of peptide S Pm 1 after two steps of the Edman degradation.
Within the region represented by tryptic peptide T15,16, Glu 183 and Glu 192 were established on the basis of the electrophoretic mobilities of peptide Th15a (Table IV) and peptide C7 (Table VI).
Peptide C7 was obtained from a chymotryptic digest of Fragment F3 by gel filtration on Sephadex G-50 in AmPro followed by electrophoresis at pH 6.5. Similarly, Asp 203 was assigned on the basis of the electrophoretic mobilities of peptides Pm 5 and Pm 6 (Table VI); these data and the electrophoretic mobilities of peptides Pm 7, Pm 8, Pm 8a, and S12 indicated that residue 208 was aspartic acid. Peptides Pm 5 through Pm 8a were obtained from pronase digests of peptide T15,16 by two-dimensional paper electrophoresis (pH 6.5 and 1.9), and peptide S12 was isolated from a subtilisin digest of peptide T15,16 by gel filtration on Sephadex G-25 in AmPro followed by electrophoresis at pH 6.5.
The assignment of Asn 216 was made on the basis of the electrophoretic mobility of peptide S13 (Table VI), which was obtained from a subtilisin digest of peptide T16b by two-dimensional paper electrophoresis (pH 6.5 and 1.9). The electrophoretic mobility of peptide Th18 (Table IV) suggested that residue 218 was aspartic acid. This assignment was confirmed by digestion of the peptide with aminopeptidase M, which released 1 residue of isoleucine, 1 residue of aspartic acid, and 2 residues of serine.
There are 2 aspartyl or asparaginyl residues in peptide T17. The electrophoretic mobility (Table VII) of peptide C8 (Table VI), obtained in the same manner as described above for peptides C6 and C7, suggested that either residue 235 or 237 was aspartic acid. The COOH-terminal residue has been reported previously as asparagine (2); therefore, residue 235 was assigned as aspartic acid.

DISCUSSION
The complete amino acid sequence of Con A is presented in Fig. 7. This sequence differs from the tentative sequence we reported previously (28) in the deletion of an aspartyl or asparaginyl residue at position 72, in the reassignment of residue 87 as glutamic acid, and in two transpositions, one between Ile 106 and Leu 107 and the other between Trp 109 and Phe 111 (1). In addition, three regions of the polypeptide chain (residues 163 to 166, 187 to 190, and 196 to 198) have now been established chemically. Any remaining ambiguities in the sequence data are contained in the regions represented by peptide T15,16, particularly residues 187 to 190 and 196 to 198. However, the residues assigned by the chemical sequence are in excellent agreement with the trace of the polypeptide chain in this region of the electron density map. Finally, this study completes the assignment of the positions of all asparaginyl and glutaminyl residues.

Sequence fragment from a figure (COOH-terminal end of the chain): Ile-Pro-Ser-Gly-Ser-Thr-Gly-Arg-Leu-Leu-Gly-Leu-Phe-Pro-Asp-Ala-Asn
Previous structural studies have shown that the molecular weight of the Con A polypeptide chain is about 26,000 (2, 3) and that Con A contains no covalently bonded carbohydrate, lipid, or other detectable prosthetic group (29, 30). The amino acid sequence of the 237 residues shown in Fig. 7, therefore, accounts for the entire Con A protomer, in agreement with the contents of the crystallographic asymmetrical unit (31, 32). Examination of the sequence of Con A shows that it is in general agreement with the amino acid compositions reported for the individual CNBr fragments (17) as well as for the intact polypeptide chain (2, 3). In addition, there is no evidence for the presence of unusual amino acids such as canavanine, known to be present in jack beans (33). While a number of difficulties were encountered in the sequence determination, the sequence reported here is in excellent agreement with the x-ray crystallographic data (15,16).
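The agreement between a chain of 237 residues and a molecular weight of about 26,000 can be checked with a one-line estimate. The sketch assumes an average residue mass of about 110 daltons, a common rule of thumb for proteins that is not taken from the paper; an exact value would require summing the individual residue masses of the full sequence in Fig. 7.

```python
AVERAGE_RESIDUE_MASS = 110.0  # daltons; rule-of-thumb assumption, not from the paper


def approx_chain_mass(n_residues):
    """Rough polypeptide mass from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS


print(approx_chain_mass(237))
```

The estimate falls close to the reported value of about 26,000, consistent with the statement that the 237 residues account for the entire protomer.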
More detailed analysis of the sequence shows a striking distribution of certain amino acids in Con A (Fig. 8). The distribution of charged residues is generally more dense in the NH2-terminal half of the polypeptide chain than in the COOH-terminal portion (Fig. 8A). There are three clusters of charged groups which stand out: a group of 6 negatively charged residues between Asp 2 and Asp 28, a closely spaced group of positive charges between Lys 30 and Arg 60, and another cluster of negative charges between Asp 71 and Glu 87. The majority of these charged clusters are exposed on the surface of the Con A protomer (16) with the interesting exception of the first negatively charged cluster, in which 3 residues (Glu 8, Asp 10, and Asp 19) are involved in metal ion binding (Fig. 8A) (1, 15, 28). In contrast to the NH2-terminal portion, there are fewer charged residues in the COOH-terminal portion, and the charged residues are usually counterbalanced by the presence of oppositely charged residues in their vicinity.
These observations seem to delineate the linear sequence of the Con A molecule into two distinct regions, arbitrarily separated at about residue 110. This division between the two halves of the Con A protomer is even more striking when the distribution of aromatic amino acids is studied (Fig. 8B). Con A contains 4 tryptophans, 7 tyrosines, and 11 phenylalanines. The tryptophanyl residues are evenly distributed. In contrast, 6 of the 7 tyrosines are located in the NH2-terminal half and all of the phenylalanines are between residue 111 and the COOH terminus of the molecule (Fig. 8B). The predominance of such hydrophobic residues as well as the sparse distribution of charged residues in the last half of the linear sequence may account for the strongly aggregating behavior of CNBr Fragment F3 and of tryptic peptide T15,16. In accord with these observations, examination of the x-ray crystallographic model of Con A indicates that these hydrophobic residues form a cluster in the interior of the molecule (16, 28). In addition, within the sequence of Fragment F3, residues 130 to 132, 136 to 139, and 175 to 178 all appear to be involved in noncovalent interactions between halves of the ellipsoidal dimers and, therefore, are relatively shielded from the solvent (16).
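The kind of tally summarized in Fig. 8 can be sketched for the one stretch of sequence available in this text. The example counts charged and aromatic residues in consecutive windows of the residue 140 to 199 fragment that appears earlier in the text. The window width of 20, the function names, and the exclusion of histidine from the charged set are illustrative assumptions, and a fragment of 60 residues cannot reproduce the whole-chain comparison made in the paper.

```python
# Residues 140-199 of Con A, transcribed from the sequence fragment in the text
# (assumed correct after repair of the extraction damage).
FRAGMENT_START = 140
FRAGMENT = (
    "Leu-Ile-Leu-Gln-Gly-Asp-Ala-Thr-Thr-Gly-Thr-Asp-Gly-Asn-Leu-Glu-Leu-Thr-Arg-Val-"
    "Ser-Ser-Asn-Gly-Ser-Pro-Glu-Gly-Ser-Ser-Val-Gly-Arg-Ala-Leu-Phe-Tyr-Ala-Pro-Val-"
    "His-Ile-Trp-Glu-Ser-Ser-Ala-Thr-Val-Ser-Ala-Phe-Glu-Ala-Thr-Phe-Ala-Phe-Leu-Ile"
).split("-")

AROMATIC = {"Phe", "Tyr", "Trp"}
CHARGED = {"Asp", "Glu", "Lys", "Arg"}  # His left out: assumed mostly uncharged near neutral pH


def positions_of(residues, start, kinds):
    """Residue numbers at which a residue of the given kinds occurs."""
    return [start + i for i, r in enumerate(residues) if r in kinds]


def window_counts(residues, start, kinds, width):
    """Counts of the given kinds in consecutive, non-overlapping windows."""
    counts = []
    for begin in range(0, len(residues), width):
        chunk = residues[begin:begin + width]
        counts.append((start + begin, sum(r in kinds for r in chunk)))
    return counts


print("Phe at", positions_of(FRAGMENT, FRAGMENT_START, {"Phe"}))
print("charged per window ", window_counts(FRAGMENT, FRAGMENT_START, CHARGED, 20))
print("aromatic per window", window_counts(FRAGMENT, FRAGMENT_START, AROMATIC, 20))
```

Within this fragment the charged residues thin out toward the COOH-terminal side while the aromatic residues accumulate, in line with the trend described above for the COOH-terminal portion of the chain.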
These features of the linear structure of Con A may prove particularly valuable for comparison with other mitogenic and nonmitogenic lectins, although it is clear that such features may not be readily apparent from partial sequence data. Furthermore, analysis of the relationship between primary and three-dimensional structure is obviously much more complex than is implied by the simple distribution of certain amino acid residues within the linear structure.
Detailed features of the folding of the Con A polypeptide chain are described by Reeke et al. (16). Nevertheless, the fact that both the complete amino acid sequence and the three-dimensional structure of Con A have now been reported in detail (1, 15, 16) makes this protein a valuable source of new data for studying the factors that might be involved in predicting tertiary structure from the amino acid sequence of a protein.
The tentative sequence of Con A (28) has already stimulated an analysis of the influence of nearest neighbor residues on the conformation of the intermediate amino acid residues (34), the development of empirical rules for predicting the initiation and termination of helical and β structures (35), and the analysis of nucleation points for protein folding (36).
Above all, however, the sequence should be particularly useful in correlating the modification of specific amino acid residues of chemically derivatized Con A with corresponding alterations in biological activity.
As we have discussed in the previous paper (1), Lys 114 and Lys 116 have been implicated as residues affected by succinylation and acetylation of Con A (9), resulting in dissociation of the tetramer into dimers and alteration of the effects of the protein on lymphocyte surface receptors.
Similar mechanisms may apply to the dissociation of the tetramer by maleylation (10).
In contrast to these chemical derivatives in which carbohydrate-binding specificity is preserved, reaction of Con A with N-acetylimidazole results in loss of carbohydrate-binding capacity, thus implicating the participation of tyrosyl residues in the binding of saccharides (11,37). It has also been shown that derivatization of carboxyl groups using glycine methyl ester and carbodiimide resulted in inactivation of the carbohydrate-binding activity of Con A (8). This and the results of potentiometric titration studies have suggested that a carboxyl group is involved in the saccharide binding site of the lectin.
In view of the facts that saccharide binding is dependent upon metal binding (38) and that the protein ligands in the metal binding site are predominantly carboxyl groups (1, 15, 28), the loss in carbohydrate-binding activity on derivatization with glycine methyl ester could occur as a result of the abolition of the metal-binding capacity of the protein rather than through direct action at the carbohydrate binding site. In any case, similar studies using specific affinity-labeling techniques (39) will now be particularly valuable in the precise location and detailed description of the saccharide binding site.
Finally, knowledge of the structure of Con A and the chemical characterization of its cell surface receptors should help in the design of experiments to localize other key features of the protein responsible for mitogenesis and cell membrane alteration in lymphoid and neoplastic cells.