Isolation from Lima Bean Lectin of a Peptide Containing a Cysteine Residue Essential for Carbohydrate Binding Activity*

The location of, and amino acid sequence surrounding, a cysteine residue required for carbohydrate binding in the lima bean lectin (LBL) were determined. Following selective conversion of the sulfhydryl group to its S-cyano derivative, LBL was cleaved at the essential cysteine residue to give two fragments, estimated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis in two buffer systems to have molecular masses of 16.5-19 kDa and 10.5-11 kDa. The larger fragment, which contained the glycosyl moiety of the lectin, was shown by sequence analysis to contain the NH2-terminal sequence of LBL. The smaller COOH-terminal fragment was found to contain the cysteine residue involved in the intersubunit disulfide bond of LBL. Digestion of LBL with pepsin and trypsin yielded four peptides containing the essential cysteine. Sequence analysis of three of these peptides gave a single consensus sequence, Val-Glu-Phe-Asp-Thr-Cys-His-Asn-Leu-Asp-, for the sequence surrounding the cysteine. The peptide sequence and site of cyanylation cleavage were used to predict alignment of the LBL peptide with the primary sequence of concanavalin A. Maximum homology was found with a sequence in concanavalin A beginning at valine 7, a region associated with carbohydrate and metal binding.

of LBL was shown to be sensitive to modification of the single free cysteine sulfhydryl group present on each subunit of the lectin (7, 8). Modification by several cysteine-specific reagents, including dithiobis-(2-nitrobenzoic acid) (Nbs2), N-ethylmaleimide, mercurials, and Cu2+, inactivated the lectin (8). Furthermore, the modification and inactivation of LBL by Nbs2 and N-ethylmaleimide was inhibited in the presence of D-GalNAc but not D-GlcNAc (8). These findings established that this cysteine is required for carbohydrate binding and suggested that it may be located in the carbohydrate binding site.
In the accompanying paper (9), we have used a kinetic assay to further examine the reactivity of this thiol, its interaction with carbohydrate, and its participation in metal ion binding in LBL. These studies confirmed the earlier work of Gould and Scheinberg (8) and established that one sulfhydryl group on each subunit is required for and protected by carbohydrate binding. This is intriguing in light of the stoichiometry of carbohydrate binding as measured by equilibrium dialysis using [14C]methyl-α-D-GalNAc. Bessler and Goldstein (10) found two binding sites/tetramer of component III, which is equivalent to one sugar binding site for two subunits. To account for this behavior, a sugar binding site may be situated between two subunits in close proximity to a sulfhydryl group from each subunit, or sugars may bind with negative cooperativity so that binding to one subunit induces a conformational change blocking the second site on a subunit pair and concomitantly masking the second sulfhydryl. Alternatively, equilibrium dialysis may yield an incorrect stoichiometry due to the low affinity of binding. In the latter case, 4 mol of GalNAc may bind to component III with one sugar protecting each sulfhydryl group. However, Pandolfino and Magnuson (5) also reported apparent half-of-sites metal ion binding to LBL, which is consistent with the observed sugar binding stoichiometry of one site/two subunits.
In order to gain further insight into the role of the sulfhydryl groups in carbohydrate binding to LBL, we determined the position and alignment of the essential and disulfide cysteines. In the present paper, we report the localization of these cysteines and the isolation and primary sequence of a peptide containing the cysteine required for carbohydrate binding. The homology between this peptide and other known lectin sequences was also examined.

EXPERIMENTAL PROCEDURES
Chemical Modification of LBL-Modification with iodoacetamide was done in 0.1 M NH4HCO3, pH 8.2. Native lectin or lectin reduced with 3 eq of dithiothreitol for 24 h under N2 was treated with 1.5 eq of iodo[14C]acetamide (relative to total thiol) for 4 h at room temperature in the dark. Excess reagents were removed by dialysis. Incorporation of label was determined by scintillation counting in ACS (Amersham). Lectin concentration was determined by absorbance at 280 nm using E(1%, 1 cm) = 12.3 (7).
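The two quantities in this paragraph, lectin concentration from absorbance and label incorporation per subunit, reduce to simple arithmetic. The sketch below is illustrative only. It assumes that the extinction value refers to a 1% (10 mg/ml) solution in a 1-cm path, takes the 31-kDa subunit mass from the Discussion, and uses hypothetical function names and input values that do not come from the paper.

```python
def lectin_mg_per_ml(a280, e_1pct=12.3, path_cm=1.0):
    """Concentration in mg/ml from A280, assuming E(1%, 1 cm) = 12.3."""
    # A 1% solution is 10 mg/ml, so c = A / (E * l) * 10.
    return a280 / (e_1pct * path_cm) * 10.0


def label_per_subunit(dpm, dpm_per_nmol, protein_mg, subunit_da=31_000.0):
    """Moles of label incorporated per mole of subunit.

    dpm_per_nmol is the specific activity of the labeling reagent;
    subunit_da is the assumed subunit mass (31 kDa).
    """
    nmol_label = dpm / dpm_per_nmol
    # mg -> g is 1e-3, g / (g/mol) = mol, mol -> nmol is 1e9.
    nmol_subunit = protein_mg * 1e6 / subunit_da
    return nmol_label / nmol_subunit


# Hypothetical inputs chosen only to exercise the arithmetic.
example_conc = lectin_mg_per_ml(1.23)
example_ratio = label_per_subunit(dpm=9000, dpm_per_nmol=1000, protein_mg=0.31)
```

With these made-up inputs the ratio comes out close to the 0.9 mol/mol reported under "Results" for the reduced lectin, but only because the inputs were chosen that way.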
Preparation of CN-LBL was done using the indirect method of Vanaman and Stark (14). The free sulfhydryl of LBL was converted to the thionitrobenzoate disulfide by reaction with excess Nbs2 and isolated by gel filtration on Sephadex G-50 in sodium phosphate (pH 7.0, Γ/2 = 0.1) containing 1 mM EDTA. Following adjustment to pH 8.2 with Tris base, the S-cyano derivative was prepared by reaction with 0.01 M KCN for 16 h at room temperature. The reaction mixture was dialyzed against 0.1 M acetic acid and the CN-LBL recovered by lyophilization.
CN-LBL was cleaved at the modified cysteine by incubation in 0.1 M Tris acetate, pH 9.0, containing 0.5% SDS for 48 h at 37 °C. The LBL fragments were separated by preparative SDS-polyacrylamide gel electrophoresis on 12% acrylamide gels (15) and electrophoretically eluted (16) into dialysis tubing (Spectra Por 2000 molecular weight cutoff, Pierce Chemical Co.). Following dialysis into water, the fragments were recovered by precipitation with 8 volumes of acetone in the presence of 0.01 M HCl.
Isolation of Cysteine-containing Peptides-Lyophilized salt-free LBL (40 mg) was dissolved at 50 mg/ml in 88% formic acid. The denatured lectin was diluted to 1 mg/ml in 1 mM HCl containing 10 μg/ml of pepsin and digested with stirring for 30 min. The reaction mixture was brought to pH 6.0 by addition of 2 M NH4OH and applied to a column (0.7 X 6.5 cm) of mercurial agarose. Unbound peptides were eluted successively with 0.1 M ammonium acetate, pH 6, 2 M guanidine HCl in 0.1 M ammonium acetate, and 0.1 M ammonium acetate, pH 6. Bound peptides were eluted with 50 mM β-mercaptoethanol in the same buffer and recovered by lyophilization. The peptides were dissolved in 0.1 M NH4HCO3, reduced with 1 eq of dithiothreitol, and alkylated with iodo[14C]acetamide.
Trypsin digests of the alkylated pepsin fragments were done in 0.1 M NH4HCO3 using 10 μg/ml of trypsin. After 2 h at room temperature, the digestion was stopped by addition of acetic acid. The mixture was used directly for HPLC.
Analytical and preparative HPLC of pepsin and tryptic fragments was done on a Varian 5000 HPLC using a Spherisorb 5-μm octadecyl silica column (0.4 X 25 cm) (Brownlee Laboratories). The column was equilibrated with 1% H3PO4 adjusted to pH 2.3 with KOH. A gradient was run to 70% CH3CN at a flow rate of 1.5 ml/min. Peptides were detected by absorbance at 210 nm.
Sequencing and Amino Acid Analysis-End group analysis of purified peptides was done by the dansyl chloride method (17) with identification of the dansyl amino acids on polyamide thin layers (Pierce Chemical Co.). Sequence analysis was performed by the University of Michigan Protein Sequencing Facility using manual Edman degradation (18). 3-Phenyl-2-thiohydantoin amino acids released were quantified by reversed phase HPLC (19). Identification of 3-phenyl-2-thiohydantoin-[14C]carboxamidomethylcysteine was confirmed by scintillation counting of an aliquot of the released residues.
Electrophoresis-Analytical SDS-gel electrophoresis was done in 12 and 15% acrylamide gels using either a Tris glycine (15) or a sodium phosphate, pH 7.2, buffer system containing 6 M urea (22). Gels were stained for protein with Coomassie brilliant blue R-250 and for carbohydrate using the thymol-sulfuric acid stain (23). Radioactivity in gel slices was determined as described in Ref. 24.
Isoelectric focusing in the presence of 8 M urea was performed as described previously (11).
Lectin Activity-Activity of LBL was determined by hemagglutination of human Type A1 red blood cells (3). Activity was also determined by assaying the ability of D-GalNAc to protect LBL from modification by Nbs2 in a kinetic assay (9).

RESULTS
Our strategy for determining the position of the essential cysteine of LBL and the sequence surrounding it was first to locate the cysteine on the LBL sequence and then to determine the amino acid sequence surrounding it. Incubation of polypeptides containing S-cyano derivatives of cysteine at alkaline pH often results in specific cleavage of the polypeptide on the amino side of the modified cysteines (25). By determining the size of the resulting fragments and location of iminothiazolidine derivatives of cysteine, the approximate position of the cysteine in the original polypeptide may be assigned. To determine the amino acid sequence surrounding the cysteine, the lectin was subjected to proteolytic digestion to give fragments with a free NH2 terminus containing the essential cysteine.
In order to distinguish the cysteines required for carbohydrate binding and the disulfide cysteines, methods were developed for selective modification of the two thiols on each subunit. The failure of iodoacetamide to react with LBL was reported previously (8). Iodo[14C]acetamide was used to confirm this finding and to establish conditions for labeling. Native LBL did not react with iodoacetamide (0.001 mol/mol of subunits incorporated). Following reduction of the lectin with dithiothreitol, however, treatment with iodo[14C]acetamide resulted in incorporation of 0.9 mol/mol of subunit. Specific labeling of cysteine was verified by determining the radioactivity associated with dansyl-carboxymethylcysteine following hydrolysis, reaction with dansyl chloride, and separation on polyamide thin layer sheets. Determination of radioactivity in the three subunit forms of LBL (11) following separation by isoelectric focusing in 8 M urea indicated identical incorporation of label into the three subunits. The modified lectin still contained 1 -SH/subunit by Nbs2 assay, which was protected by D-GalNAc. The lectin was also completely active as a hemagglutinin. From these data, it was concluded that the disulfide cysteine was specifically labeled whereas the essential cysteine in both dithiothreitol-reduced and native LBL was unreactive towards iodoacetamide under the conditions examined. It may also be concluded that the disulfide bond is not required for the carbohydrate binding and hemagglutinating activity of LBL.
With the disulfide cysteine specifically labeled and protected, we next attempted to cyanylate the essential cysteine. Initial experiments using 2-nitro-5-thiocyanatobenzoic acid were unsuccessful. Treatment with 1 mM reagent at pH 7.0 gave no release of thionitrobenzoate as monitored by absorbance at 412 nm. Some reverse reaction with the reagent to release CN- may occur (26), although the lectin was still active after the above reaction. The indirect method of Vanaman and Stark (14) was used to form the S-cyano derivative. LBL modified with Nbs2 was reacted with excess KCN to give S-cyano-LBL, releasing 1 eq of thionitrobenzoate. Reactions with [14C]KCN verified incorporation of 1 mol of cyanide/subunit. This lectin derivative was inactive in a hemagglutination assay but could be reactivated by reduction with 0.01 M dithiothreitol.
Several conditions were examined for cleavage of S-cyano-LBL. Optimum cleavage was obtained using a 48-h incubation period at pH 9 in the presence of 0.5% SDS. SDS-gel electrophoresis of the cleaved lectin showed intact lectin and two fragments (Fig. 1, lane a) (Fig. 2A). During cleavage, the labeled carbon on S-cyano-cysteinyl residues is retained in the new NH2-terminal residue. Therefore, the 10.5-kDa fragment is the carboxyl-terminal fragment of LBL and the 19-kDa fragment is the NH2-terminal fragment of LBL. To confirm this assignment, the two polypeptides were isolated by preparative SDS-gel electrophoresis (Fig. 1, lanes b and c). The 10.5-kDa fragment was blocked to Edman degradation (25) but the 19-kDa fragment would be expected to contain the same NH2-terminal sequence as the intact lectin. Ten cycles of manual Edman degradation gave the same sequence for the 19-kDa fragment as determined previously for the three subunit forms of LBL (11).
S-cyano-LBL labeled with iodo[14C]acetamide in the disulfide cysteine was then used to determine the location of the disulfide cysteine. Labeling was found exclusively in the 10.5-kDa fragment (Fig. 2B), indicating that the disulfide cysteine is located in the COOH-terminal region of the lectin. Staining of the fragments resolved on an SDS gel for carbohydrate indicated that only the NH2-terminal fragment contained carbohydrate (Fig. 2C). The 19-kDa fragment was also selectively retained when a mixture of fragments was chromatographed on concanavalin A-Sepharose in 0.05% SDS. The bound fragment was eluted using 0.1 M methyl-α-D-mannoside. These results indicated that the glycosyl moiety is located on the NH2-terminal portion of LBL.
Pepsin-generated fragments of LBL containing the essential cysteine residue were isolated on a mercurial agarose column at low pH to minimize disulfide interchange. The isolated peptides were alkylated with iodo[14C]acetamide and analyzed by reversed phase HPLC, giving a broad multiplet eluting at 45-55% CH3CN. On SDS-gel electrophoresis, a single band of apparent molecular weight 4300 was seen. Several peaks were isolated by HPLC using a shallow gradient. Dansyl end group analyses of these peptides gave multiple NH2-terminal residues. The peptide mixture was therefore redigested with trypsin. HPLC now gave well resolved peaks, four of which contained 14C label (Fig. 3). The three major labeled peptides, denoted a, b, and c in Fig. 3, were isolated and subjected to manual Edman degradation (Tables I-III). Amino acid compositions of the three peptides (Table IV) indicated that sequencing terminated before reaching the carboxyl end of all three peptides. Low yields of tryptophan were released on cycle 9 of peptide b and cycle 11 of peptide c. Thus, tryptophan was tentatively assigned at this position. The three peptides can be aligned to give the following consensus sequence, from which the three peptides could be derived by staggered pepsin cleavage at the NH2 terminus and a unique trypsin cleavage site following the COOH-terminal lysine: Val-Glu-Phe-Asp-Thr-Cys-His-Asn-Leu-Asp-Trp-(Asx, Pro)-Lys.
Pepsin digests were also conducted using the three isolated subunit forms of LBL prepared by isoelectric focusing in 8 M urea (11). Sulfhydryl-containing peptides were purified on mercurial agarose and analyzed by HPLC. Identical elution profiles were obtained for all three subunits, indicating that there are probably no differences in primary sequence of the three subunits in the vicinity of the essential cysteine.

FIG. 2. [...] in the disulfide cysteine with iodo[14C]acetamide as described under "Experimental Procedures." The modified lectin was then cyanylated and cleaved, and the fragments were resolved by electrophoresis. The gel was cut and counted as before. C, densitometric scan at 525 nm of cleaved S-cyano-LBL stained for carbohydrate using thymol-sulfuric acid (23).

FIG. 3. HPLC separation of cysteine-containing LBL peptides. Cysteine-containing peptic fragments of LBL were purified on mercurial agarose. Following alkylation with iodo[14C]acetamide, the peptides were redigested with trypsin. Peptides were resolved on a C18 column with an 80-min gradient to 70% CH3CN in 1% H3PO4, pH 2.3, at a flow rate of 1.5 ml min-1. Absorbance (solid line) was monitored at 210 nm and radioactivity (dashed line) was determined by scintillation counting of aliquots of the eluent fractions. Abscissa: elution time in minutes.
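The staggered NH2 termini of the peptic peptides can be inferred from the Edman cycle at which tryptophan was released. The following is a minimal sketch of that bookkeeping, assuming tryptophan occupies position 11 of the consensus sequence as tentatively assigned above. The function name is hypothetical, and the start positions it returns are an inference from the reported cycles, not values taken from Tables I-III.

```python
# Consensus sequence through the tentatively assigned tryptophan (position 11).
CONSENSUS = ["Val", "Glu", "Phe", "Asp", "Thr", "Cys",
             "His", "Asn", "Leu", "Asp", "Trp"]


def start_in_consensus(residue, cycle, consensus=CONSENSUS):
    """1-based consensus position at which a peptide begins, given that
    `residue` was released at Edman cycle number `cycle`.

    Uses the first occurrence of `residue`, so it is only unambiguous for
    residues that occur once in the consensus (Trp does; Asp does not).
    """
    position = consensus.index(residue) + 1
    start = position - cycle + 1
    if start < 1:
        raise ValueError("cycle number is too large for this residue")
    return start


peptide_b_start = start_in_consensus("Trp", 9)   # tryptophan on cycle 9
peptide_c_start = start_in_consensus("Trp", 11)  # tryptophan on cycle 11
```

Under these assumptions peptide c begins at Val 1 of the consensus and peptide b begins two residues later, at Phe 3, which is the kind of staggering expected from pepsin cleavage at adjacent sites.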
Homology between the LBL sequence surrounding the essential cysteine and other lectin sequences was examined by a computer program using two PAM matrices to test all possible alignments, including amino acid insertions and deletions (27,28). Comparing the first 10 residues of the LBL peptide, which sequenced with high yields, with the primary sequence of ConA (29) gave one significant alignment of Val 1 of the peptide with Val 7 of ConA (4.1 standard deviations from mean alignment score). Comparison with the primary sequence of favin (30,31) gave the best alignment with Val 119 of favin (4.0 standard deviations) and a secondary alignment with Ile 165 (3.4 standard deviations).
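The idea of scoring every possible alignment and expressing the best score in standard deviations from the mean can be illustrated with a much simpler scheme than the one used here. The sketch below is not the program of Refs. 27 and 28: it assumes identity scoring in place of PAM matrices, allows no insertions or deletions, and runs against a hypothetical toy target (not the ConA or favin sequence) into which a near-match has been planted.

```python
from statistics import mean, pstdev


def window_scores(peptide, sequence):
    """Identity score of the peptide against every ungapped window."""
    n = len(peptide)
    return [
        sum(a == b for a, b in zip(peptide, sequence[i:i + n]))
        for i in range(len(sequence) - n + 1)
    ]


def best_alignment(peptide, sequence):
    """Return (1-based start, score, z) for the top-scoring window, where z
    is the number of standard deviations above the mean window score."""
    scores = window_scores(peptide, sequence)
    mu, sigma = mean(scores), pstdev(scores)
    best = max(range(len(scores)), key=scores.__getitem__)
    z = (scores[best] - mu) / sigma if sigma else 0.0
    return best + 1, scores[best], z


# First 10 residues of the LBL peptide in one-letter code.
PEPTIDE = "VEFDTCHNLD"
# Hypothetical target: filler residues plus a planted window that differs
# from the peptide at one position. This is NOT a real lectin sequence.
TOY_TARGET = "GSAKQPRGSA" + "VEFDTYHNLD" + "KQPRGSAKQPRGSAKQ"

start, score, z = best_alignment(PEPTIDE, TOY_TARGET)
```

In this toy case the planted window starts at position 11 and matches at 9 of 10 positions, so it stands well clear of the background. A substitution matrix and gap handling, as used in the paper, would change the scores but not the logic of comparing the best alignment with the distribution of all alignments.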

DISCUSSION
Assignments of the relative positions for the essential and disulfide cysteinyl residues and the glycosyl moiety of LBL, based on cleavage of S-cyano-LBL, are summarized schematically in Fig. 4. A disulfide-linked dimer of two 31-kDa subunits is depicted. Only the position of the essential cysteine is known relative to the NH2 and COOH termini of LBL. The glycosyl moiety (CHO) and disulfide cysteine have been assigned to the 19- and 10.5-kDa fragments, respectively, but their relative positions within the respective fragments are unknown. Attempts to locate the disulfide cysteine by cleaving LBL cyanylated at this residue were unsuccessful. Previous investigators noted NH2-terminal sequence homology among many legume lectins, including LBL (11,32). In lectins composed of two chains, e.g. favin and the lectins from lentil and pea, the region of homology is located at the NH2 terminus of the β-chain (30,33). The amino acid sequence of concanavalin A is circularly permuted relative to that of other legume lectins (30). Homology with other lectins was observed when the NH2 termini of these lectins were aligned with residues 120-123 of ConA (30). Determination of complete amino acid sequences for ConA, favin, and lentil lectin demonstrated that regions of homology extended throughout the lectin sequences. Since the three-dimensional structure of ConA is also known and some of the residues involved in carbohydrate binding have been tentatively identified (34-36), we were interested in determining whether the essential cysteine of LBL could be aligned with regions of ConA thought to function in carbohydrate binding. The position of the essential cysteine is now known relative to the NH2 and COOH termini of LBL. Therefore, the approximate position on the ConA sequence corresponding to the cysteine of LBL can be predicted. The NH2 terminus of LBL was aligned with residue 123 of ConA. This alignment placed the site of cleavage of S-cyano-LBL near the NH2 terminus of ConA.
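Because the ConA sequence is circularly permuted, mapping an LBL position onto ConA involves wrapping past the COOH terminus of ConA back to its NH2 terminus. The sketch below shows that index arithmetic under loudly stated assumptions: a gapless alignment with LBL residue 1 at ConA residue 123 (from the text), and a ConA chain length of 237 residues, which is not stated in this paper. Real alignments contain insertions and deletions, so the numbers are illustrative and the function names are hypothetical.

```python
CONA_LENGTH = 237  # assumed length of the ConA chain (not given in the text)
CONA_ANCHOR = 123  # ConA residue aligned with LBL residue 1 (from the text)


def lbl_to_cona(lbl_pos, anchor=CONA_ANCHOR, length=CONA_LENGTH):
    """Map a 1-based LBL position to a 1-based ConA position, wrapping
    around the end of the ConA sequence (gapless alignment assumed)."""
    return (anchor - 1 + lbl_pos - 1) % length + 1


def cona_to_lbl(cona_pos, anchor=CONA_ANCHOR, length=CONA_LENGTH):
    """Inverse mapping, returning an LBL position in the range 1..length."""
    return (cona_pos - anchor) % length + 1


# LBL positions that would correspond to ConA Val 7 and Tyr 12
# if the alignment contained no gaps.
lbl_at_val7 = cona_to_lbl(7)
lbl_at_tyr12 = cona_to_lbl(12)
```

Under the gapless assumption, any LBL position past roughly residue 115 wraps onto the NH2-terminal region of ConA, which is the qualitative point made above about the cyanylation cleavage site.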
Sequence homology between the cysteine-containing peptide and ConA allowed better alignment of the peptide. The best fit with the ConA sequence was also near the NH2 terminus of ConA. The statistics for this alignment were insufficient to unambiguously assign the peptide to this position. Sequence homology at the level seen here for a 10-residue peptide aligned with a randomly generated 112-residue sequence of average composition was found to occur at a frequency of 32% (37). However, cyanylation results restricted the location of the peptide to a small region of the ConA primary sequence within which the homologous sequence was found. The coincidence of these two independent assignments makes the alignment highly significant.
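The force of combining the two assignments can be sketched numerically. The calculation below is a rough illustration, not the analysis of Ref. 37: it assumes that ungapped windows score independently, converts the 32% chance of a match anywhere in a 112-residue sequence into a per-window chance, and then asks how often such a match would fall inside a restricted region. The region size of 10 start positions is hypothetical, since the text does not give the size of the region defined by cyanylation.

```python
def per_window_probability(p_any, n_windows):
    """Per-window chance of a match, given the chance p_any of at least one
    match among n_windows windows that are assumed to be independent."""
    return 1.0 - (1.0 - p_any) ** (1.0 / n_windows)


def chance_within_region(p_any, n_windows, region_windows):
    """Chance that at least one match of this quality falls within a region
    containing region_windows candidate start positions."""
    p = per_window_probability(p_any, n_windows)
    return 1.0 - (1.0 - p) ** region_windows


# Ungapped 10-residue windows in a 112-residue sequence.
N_WINDOWS = 112 - 10 + 1
# Hypothetical restricted region of 10 candidate start positions.
restricted = chance_within_region(0.32, N_WINDOWS, region_windows=10)
```

With these assumptions the chance of a coincidental match inside the restricted region drops from 32% to a few percent, which is the sense in which the two independent lines of evidence reinforce each other.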
Further confirmation of this alignment was obtained by testing homology of the peptide with the primary sequence of favin. A best fit was obtained with the region beginning at Val 119 of the favin β-subunit. This residue was previously shown to align with Val 7 of ConA when the favin and ConA sequences were compared (30). The amino acid sequence of the β-chain of lentil lectin is identical to that of favin in this region (33). Thus, homology is observed between the LBL peptide and corresponding regions of three other legume lectins.
Sequences of ConA, lentil lectin, and favin and their homologies with the cysteine peptide from LBL are depicted in Fig. 5. The cysteine of LBL is aligned with Tyr 12 of ConA. The carbonyl oxygen of this residue was found to be a calcium ligand (39). The side chain of Tyr 12 is directed towards the carbohydrate binding site of ConA (36). We have established that both binding of carbohydrate and removal of metal ions alter the reactivity of the essential cysteine in LBL (9). Thus, the essential cysteine of LBL aligns with a residue in ConA which may have a similar function. Additional residues involved in metal binding by ConA, including Glu 8, Asp 10, and Asn 14, are also conserved in the LBL peptide.
The histidine at position 7 of the LBL peptide may affect the reactivity of the LBL essential thiol. The pH dependence for reaction of this thiol with Ellman's reagent suggested that the thiol may be in an ion pair with a nearby positively charged group. The imidazole group of the histidine could provide this charge. Examination of molecular models reveals that interaction between the imidazole and the thiol is possible when the two residues are linked by a trans peptide bond. The sequence in ConA homologous to the LBL peptide is found in a loop surrounding the metal ion binding sites and was shown to adopt a different conformation in ConA crystallized in the absence of metals (40). Movement of this region in LBL upon demetalization is also suggested by the effect of EDTA on the thiol. The reactivity of the essential thiol with Ellman's reagent increased 50-fold upon removal of Ca2+ and Mn2+ from the lectin (9). Determination of the specific interactions between the essential cysteine and carbohydrate or metal ligands of LBL will require crystallographic analysis of the lectin. The present work provides strong indirect evidence, however, for localization of the cysteine residue between the carbohydrate and metal binding sites of LBL. Inactivation of LBL following cyanylation, which adds only two atoms to the sulfhydryl group, also strengthens the argument for an intimate role of the cysteine sulfhydryl group in carbohydrate binding.

D D Roberts and I J Goldstein