Characterization of the B-chain of human plasma alpha 2HS-glycoprotein. The complete amino acid sequence and primary structure of its heteroglycan.

alpha 2HS-Glycoprotein, a normal human plasma protein, was recently shown to consist of two polypeptide chains. In the present study, we have separated these two chains from one another and have elucidated the complete primary structure of the B-chain. Employing automated Edman degradation, the polypeptide moiety of this chain was shown to consist of 27 amino acid residues with an unequal distribution of the neutral and charged amino acid residues. The first 20 residues are uncharged, whereas the carboxyl-terminal heptapeptide contains all charged residues. Utilizing 500-MHz 1H-NMR spectroscopy, the carbohydrate unit proved to be a trisaccharide consisting of sialic acid, galactose, and N-acetylgalactosamine O-glycosidically linked to serine (residue 6). The structure of the B-chain was found to be as follows. (formula; see text) Thus, the molecular weight of the B-chain is 3386. Evaluation of the polypeptide chain by the procedure of Chou and Fasman (Chou, P.Y., and Fasman, G.D. (1979) Adv. Enzymol. 47, 45-148) predicts that the B-chain has two beta-turns. Thereby, the carbohydrate unit which is linked to the Ser residue located in the first beta-turn appears to be directed away from the protein. The second beta-turn probably includes the Cys residue which links the B- to the A-chain. In agreement with the CD analysis, the B-chain lacks beta-conformation but possesses a short alpha-helical region.

α2HS-Glycoprotein, a normal human plasma globulin (1, 2), has been demonstrated to be associated with a considerable number of biological functions. The plasma level of this protein was reported to be significantly decreased in certain malignant (3) and inflammatory diseases (4), malnutrition (5), and Paget's disease (6). Furthermore, in bone, α2HS-glycoprotein was found to be concentrated up to 300-fold with respect to other plasma glycoproteins (7, 8). This glycoprotein also displays opsonic properties (9) and promotes endocytosis (10). Of particular interest is the observation that in certain cancer patients the serum level of this protein correlates with diminished lymphocyte activity, suggesting a relationship between this protein and a depression of cell-mediated immunity (11).
The physicochemical and chemical properties of α2HS-glycoprotein have been described earlier (2, 12) and, pertinent to the present study, revealed two NH2- and two carboxyl-terminal amino acids/mol of protein, indicating the presence of two polypeptide chains designated A and B. In this investigation, the separation of these two chains from one another and the elucidation of the amino acid and monosaccharide sequences of the B-chain of α2HS-glycoprotein are reported.

Complete Structure of the B-chain of α2HS-Glycoprotein
Reduction and Alkylation-Human α2HS-glycoprotein was prepared as described earlier (2) and further purified by a method published recently (12). The glycoprotein (28 mg) was reduced with dithiothreitol (20-fold molar excess over the disulfide bonds) in 1.2 ml of 0.13 M Tris-HCl buffer, pH 7.6, containing 6 M guanidine HCl and 0.2% EDTA at room temperature for 4 h. Subsequently, the reduced protein was alkylated with 4-vinylpyridine (3-fold molar excess over the reducing agent) at room temperature for 2 h (13). The reaction mixture was then passed through a Sephadex G-75 column (1.2 × 150 cm), previously equilibrated with 0.1% ammonium bicarbonate.
Molecular Weight Determination-The apparent molecular weight of the B-chain was determined by polyacrylamide gel electrophoresis in 0.1% sodium dodecyl sulfate and 8 M urea (14). Reference proteins or peptides were lysozyme (Mr = 14,600) and cyanogen bromide fragments I (Mr = 8,279), II (Mr = 6,420), and III (Mr = 2,550) of sperm whale myoglobin. The gels were stained with either Coomassie brilliant blue or periodic acid-Schiff stain.
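An apparent molecular weight of this kind is usually read from a calibration line of log10(Mr) against electrophoretic mobility. The following is a minimal sketch of that calculation, not the authors' procedure: the reference Mr values are those quoted above, but the relative mobilities are hypothetical placeholders, since no mobilities are reported in the text.

```python
import math

# Reference molecular weights quoted in the text (lysozyme and the
# cyanogen bromide fragments I, II, III of sperm whale myoglobin).
STANDARD_MR = [14600, 8279, 6420, 2550]


def fit_log_mr(mobilities, mr_values):
    """Least-squares line: log10(Mr) = slope * mobility + intercept."""
    ys = [math.log10(m) for m in mr_values]
    n = len(mobilities)
    mean_x = sum(mobilities) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in mobilities)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mobilities, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mr(mobility, slope, intercept):
    """Read an apparent Mr off the fitted calibration line."""
    return 10 ** (slope * mobility + intercept)


# HYPOTHETICAL relative mobilities, for illustration only.
example_mobilities = [0.30, 0.48, 0.56, 0.85]
slope, intercept = fit_log_mr(example_mobilities, STANDARD_MR)
```

With measured mobilities in place of the placeholders, `estimate_mr` would return the apparent Mr of an unknown band. For a glycopeptide such a value can overestimate the true molecular weight.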
Sugar Analysis-After methanolysis, N-(re)acetylation, and trimethylsilylation of the B-chain, the quantification of each monosaccharide (15) was carried out by capillary gas-liquid chromatography on an SE-30 column (25 m × 0.35 mm inner diameter). The oven temperature was programmed from 130 to 220 °C at 2 °C/min.
Circular Dichroism-Circular dichroism measurements of the B-chain were made with a Cary Model 61 spectropolarimeter. As solvents, 0.1 M NaF, 50% 2-chloroethanol, and 6 M urea were used. The results were expressed as reduced mean residue ellipticity (16).
Amino Acid Analysis-The B-chain and peptides were hydrolyzed with twice-distilled 6 N HCl in evacuated sealed glass tubes under N2 at 110 °C for 20 h. Hydrolysates were analyzed with the aid of a Beckman 119 CL automated amino acid analyzer in combination with a Model 126 data system. Values for threonine and serine were not corrected for destruction during hydrolysis, and no correction was applied for the partial liberation of valine.
NH2- and Carboxyl-terminal Amino Acid Analyses-The NH2-terminal amino acid of the B-chain was determined by the 5-dimethylaminonaphthalene-1-sulfonyl (dansyl) technique (17). The carboxyl-terminal sequence was established by digestion with carboxypeptidase A (18). For this experiment, the B-chain (0.3 mg) was treated with 0.1 mg of this enzyme (for explanation, see "Results") in 0.2 M N-ethylmorpholine acetate buffer, pH 8.5, at 25 °C. The liberated amino acids were determined by amino acid analysis.
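The enzyme and substrate amounts given in these procedures can be reduced to simple ratios, which helps when comparing them with the enzyme/substrate ratios cited under "Results". The sketch below assumes the ratios are weight ratios, which the text does not state explicitly. The function name is illustrative.

```python
from fractions import Fraction


def enzyme_substrate_ratio(enzyme_mg, substrate_mg):
    """Return the enzyme/substrate weight ratio as a string 'a:b'.

    Amounts are passed as decimal strings so that Fraction keeps them exact.
    """
    ratio = Fraction(enzyme_mg) / Fraction(substrate_mg)
    return f"{ratio.numerator}:{ratio.denominator}"


# Carboxypeptidase A digestion: 0.1 mg enzyme, 0.3 mg B-chain.
print(enzyme_substrate_ratio("0.1", "0.3"))   # 1:3
# Tryptic digestion: 0.08 mg enzyme, 4 mg B-chain.
print(enzyme_substrate_ratio("0.08", "4"))    # 1:50
```

On this reading, the carboxypeptidase A amounts above correspond to the higher of the two enzyme/substrate ratios (1:3) reported under "Results".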
Tryptic Digestion-The B-chain (4 mg) was digested with 0.08 mg of L-1-tosylamino-2-phenylethyl chloromethyl ketone-trypsin in 0.9 ml of 0.1 M N-ethylmorpholine acetate buffer, pH 8.6, at 25 °C for 6 h. The reaction was terminated by adding 9% formic acid to lower the pH of the solution to 2.5, and the resulting solution was then applied to a Bio-Gel P-4 column (1.2 × 220 cm) equilibrated with 0.1 M formic acid, pH 2.3.
Automated Edman Degradation-Automated Edman degradation was carried out by stepwise degradation with the aid of a Beckman 890C Sequencer equipped with a cold trap using program 121078 with 0.25 M Quadrol and a combined S1 and S2 wash (19). Samples were applied to the cup in 15% acetic acid (0.3 ml) containing 2 mg of Polybrene using Beckman Sample Application program 02772. PTH-norleucine (33 nmol) was utilized as an internal standard. PTH derivatives were obtained as described earlier (20) and identified by high pressure liquid chromatography (21) or, in some cases, by gas-liquid chromatography (22) or by back hydrolysis (23). PTH-S-pyridylethylcysteine, which is water soluble, was identified by high pressure liquid chromatography.
Preparation of the Glycopeptide Fraction-The B-chain (6 mg) was dissolved in 0.5 ml of water and subjected to exhaustive pronase digestion at pH 8.0. Pronase (0.3 mg) was added three times at 4-h intervals and the solution was incubated at 60 °C. The reaction mixture was then concentrated to 0.4 ml in vacuo and applied to a Sephadex G-10 column (0.9 × 105 cm) equilibrated and eluted with water. The effluent was monitored at 230 nm, and an aliquot of each fraction (0.6 ml) was assayed for neutral sugars (24). The fractions containing sugar were combined, concentrated, and lyophilized.
500-MHz 1H-NMR Spectroscopy-Prior to NMR analysis (25), the glycopeptide preparation was exchanged repeatedly in D2O employing intermediate lyophilization. For 1H-NMR spectroscopic analysis, a Bruker WM-500 spectrometer (SON Facility, Nijmegen, The Netherlands) was employed operating in the Fourier transform mode at a probe temperature of 300 K and equipped with a Bruker Aspect-2000 computer. Resolution enhancement of the spectrum was achieved by Lorentzian to Gaussian transformation from quadrature phase detection, followed by a complex Fourier transformation (25). Chemical shifts are given relative to sodium-2,2-dimethyl-2-silapentane-5-sulfonate (indirectly to acetone in D2O: δ = 2.225 ppm).
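A Lorentzian to Gaussian transformation is, in general terms, a weighting of the time-domain signal before Fourier transformation: a rising exponential cancels the natural exponential decay and a Gaussian envelope replaces it. The sketch below illustrates that general idea only. The rate parameters and function names are hypothetical, and the settings actually used on the spectrometer are not given in the text.

```python
import math


def lorentz_to_gauss_window(t, lorentz_rate, gauss_rate):
    """Weight for one time-domain point at time t (seconds).

    exp(+lorentz_rate * t) cancels an exponential (Lorentzian) decay;
    exp(-gauss_rate * t**2) imposes a Gaussian decay in its place.
    Both rates are hypothetical, user-chosen parameters.
    """
    return math.exp(lorentz_rate * t) * math.exp(-gauss_rate * t * t)


def apply_window(fid, dwell_time, lorentz_rate, gauss_rate):
    """Apply the window to a list of complex time-domain points."""
    return [
        point * lorentz_to_gauss_window(i * dwell_time, lorentz_rate, gauss_rate)
        for i, point in enumerate(fid)
    ]
```

If `lorentz_rate` matches the decay rate of a signal, the weighted signal has a purely Gaussian envelope, which after Fourier transformation gives a Gaussian rather than a Lorentzian line shape.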

RESULTS
α2HS-Glycoprotein, after reduction and alkylation, yielded two fractions on gel filtration (Fig. 1). The major fraction (A) was eluted near the void volume and contained the A-chain. The minor fraction (B) comprising the B-chain eluted just before the salt peak (S). The B-chain preparations of several experiments were pooled and lyophilized. On sodium dodecyl sulfate-polyacrylamide gel electrophoresis, a single band with an apparent Mr = 5400 was observed. This band stained with both Coomassie brilliant blue and periodic acid-Schiff stain, which demonstrated that this chain contains carbohydrate with sialic acid. NH2- and carboxyl-terminal amino acid analyses showed the presence of only Thr and Val, respectively, thus confirming the homogeneity of the B-chain preparation.
Circular Dichroism-The circular dichroism spectrum of the B-chain indicated the absence of β-pleated sheet conformation (Fig. 2). In the presence of 2-chloroethanol, the percentage of α-helical structure increased slightly, and in 6 M urea the CD spectrum revealed essentially random conformation. Evaluation of the B-chain by the method of Chou and Fasman (26) predicts the presence of two β-turns (residues 4 to 7 and 18 to 21) and a short α-helix (residues 7 to 12) and the absence of β-conformation.
Footnotes to Table I: This low value is probably due to partial hydrolysis of the Val-Val bond. Determined as S-pyridylethylcysteine. Asp, Leu, and Tyr were found to be present in small amounts and the corresponding values were 0.18, 0.07, and 0.04, respectively. Met and Trp were not detected. Traces of Man, GlcN, xylose, and Glc were also noted. Gal assumed to be 1.0.

Amino Acid and Carbohydrate Composition of the B-Chain-The amino acid and carbohydrate composition (Table I) suggests the presence of an O-glycosidic carbohydrate unit. The molecular weight of the B-chain calculated from the amino acid and carbohydrate composition was 3386, in contrast to the apparent Mr = 5400 mentioned above. This observation is consistent with the fact that glycoproteins are known to afford erroneously higher molecular weight values on sodium dodecyl sulfate-polyacrylamide gel electrophoresis.
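A molecular weight calculated from composition is a sum of residue masses plus one water for the free chain termini, with the oligosaccharide adding the sum of its own residue masses. The sketch below shows that arithmetic under stated assumptions: the residue masses are standard average reference values rather than figures from this paper. It does not reproduce the 3386 value, because the full composition is not listed in the text. It uses only the carboxyl-terminal tetrapeptide and the trisaccharide constituents named in the text.

```python
# Standard average residue masses in Da (reference values, not from the paper).
AMINO_ACID_RESIDUE_MASS = {
    "His": 137.14, "Phe": 147.18, "Lys": 128.17, "Val": 99.13,
    "Thr": 101.10, "Ser": 87.08, "Ile": 113.16, "Arg": 156.19,
}
SUGAR_RESIDUE_MASS = {"NeuAc": 291.26, "Gal": 162.14, "GalNAc": 203.19}
WATER = 18.02


def peptide_mass(residues):
    """Mass of a linear peptide: residue masses plus one water."""
    return sum(AMINO_ACID_RESIDUE_MASS[r] for r in residues) + WATER


def glycan_increment(sugars):
    """Mass added to a peptide by an attached oligosaccharide."""
    return sum(SUGAR_RESIDUE_MASS[s] for s in sugars)


# Carboxyl-terminal tetrapeptide named in the text: about 529.6 Da.
tetrapeptide = peptide_mass(["His", "Phe", "Lys", "Val"])
# Trisaccharide constituents named in the text: about 656.6 Da.
trisaccharide = glycan_increment(["NeuAc", "Gal", "GalNAc"])
```

On these reference masses the trisaccharide alone accounts for roughly a fifth of the calculated molecular weight of 3386.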
Amino Acid Sequence-Automated Edman degradation of the B-chain provided the complete amino acid sequence (Fig. 3 and Table II). Thr, which is located in position 1, was identified as PTH-Thr in the organic phase and obtained in the expected yield. This finding indicates that the oligosaccharide chain is not bound to this amino acid. Cycle 6 yielded unusually small amounts of PTH-serine in the organic phase, suggesting that the carbohydrate unit is linked to this amino acid and that the B-chain contains two species of polypeptide chains, the predominant one glycosylated at residue 6, and a minor one lacking a carbohydrate unit. Furthermore, the single serine residue of this chain must reside in position 6, as the residues in all other positions were positively identified. Digestion with carboxypeptidase confirmed the sequence His-Phe-Lys-Val-COOH (the corresponding data are summarized below).
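The carboxypeptidase argument rests on release order: residues closer to the carboxyl terminus are liberated first and in higher yield. A minimal sketch of that reasoning follows, using the release values reported later in this section for the 1:3 enzyme/substrate ratio. It is a simplification, since it ignores enzyme specificity and would not handle repeated residues. The function name is illustrative.

```python
# Residues released per mol of B-chain, as reported in the text
# (enzyme/substrate ratio 1:3).
RELEASE = {
    "1 h": {"Val": 0.86, "Lys": 0.52, "Phe": 0.26, "His": 0.24},
    "3 h": {"Val": 0.91, "Lys": 0.65, "Phe": 0.32, "His": 0.29},
}


def c_terminal_sequence(yields):
    """Order residues from most to least released, then reverse so the
    result reads from the amino side toward the carboxyl terminus."""
    by_release = sorted(yields, key=yields.get, reverse=True)
    return "-".join(reversed(by_release))


for time_point, yields in RELEASE.items():
    print(time_point, c_terminal_sequence(yields))
# 1 h His-Phe-Lys-Val
# 3 h His-Phe-Lys-Val
```

Both time points give the same order, matching the sequence His-Phe-Lys-Val-COOH obtained by Edman degradation.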
The entire sequence was confirmed by automated Edman degradation of the tryptic peptides derived from this chain. These peptides were separated from each other by gel filtration affording six fractions (Fig. 4), and their amino acid compositions are given in Table III. The amino acid sequences of the peptides T-1a and T-1b were found to be identical (Table IV and Fig. 3) except that residue 6 in T-1a could not be positively identified. It was concluded that residue 6 in T-1a is a serine O-glycosidically linked to a carbohydrate unit.

This is consistent with the fact that T-1a was eluted before T-1b on gel filtration. Peptide T-5 was the dipeptide Ile-Arg, which could be placed at residues 22 and 23. Peptide T-3, the carboxyl-terminal peptide, represents residues 24 to 27. Peptide T-4 could be identified as residues 24 to 26. Fraction T-2 contained only Val (Table III), which was also identified as PTH-Val (Fig. 3).
As expected from the carboxyl-terminal sequence -Lys-Val, carboxypeptidase A liberated Val only to a molar yield of 30% at an enzyme/substrate ratio of 1:30 after 1 h of incubation. At an enzyme/substrate ratio of 1:3 and a short incubation time of 5 min, again only Val was released (0.48 residue/mol of B-chain). However, when the incubation period was extended to 1 h, Val, Lys, Phe, and His were liberated (0.86, 0.52, 0.26, and 0.24 residues/mol of B-chain, respectively), and after 3 h the corresponding values were 0.91, 0.65, 0.32, and 0.29, respectively. These data are in full agreement with the complete covalent sequence mentioned above.
Carbohydrate Structure-The 500-MHz 1H-NMR spectrum (Fig. 5) of the glycopeptide fraction appears relatively complex, because many signals stemming from amino acid protons of the relatively large peptide portion occur in the same spectral regions as structural reporter group signals of the oligosaccharide moiety (in the ranges from δ = 4.7 to 3.9 ppm and from δ = 2.8 to 1.7 ppm). However, since the carbohydrate composition of the glycopeptide had been determined (Table I), the carbohydrate structural reporter group signals (Table V) could be recognized and assigned to the primary structure of the carbohydrate units. The NeuAc ... (25, 27-29). In agreement with this interpretation, the H-1 signal of Gal was found at δ = 4.564 ppm. This conclusion was further substantiated by the chemical shift of Gal H-3 being δ = 4.070 ppm. However, the characteristic shape of the latter signal could not be observed because of the partial overlap with an amino acid signal. The glycosidic linkage between Gal and GalNAc was found to be of the β-type. This conclusion was based on the value of the coupling constant J1,2 being 8.0 Hz for the H-1 signal of Gal. In turn, the GalNAc residue is α-glycosidically linked to Ser. The α-linkage is indicated by the chemical shift of the GalNAc H-1 (δ = 4.920 ppm) in conjunction with the value for its coupling constant J1,2 being 4.3 Hz. This anomeric signal is accompanied by a lower intensity doublet at δ = 4.911 ppm having the same coupling constant J1,2 of 4.3 Hz, pointing to heterogeneity of the peptide moiety of the glycopeptide fraction. This heterogeneity was also evident from the N-acetyl proton region; GalNAc of the major peptide component possessed its N-acetyl signal at δ = 2.015 ppm, while the minor constituent had this singlet at δ = 2.008 ppm (Fig. 5). These findings are in agreement with the

DISCUSSION
In the present study, the two polypeptide chains of α2HS-glycoprotein were separated from each other and the complete primary structure of the B-chain of this glycoprotein was elucidated. As to the amino acid sequence of the B-chain, the following is noteworthy. The first 20 residues are uncharged; 5 of the 6 valine (including two Val-Val sequences) and all proline and alanine (Ala-Ala-Ala) residues are within this region. However, this hydrophobic segment carries the carbohydrate unit linked to serine, which probably renders this section of the peptide chain relatively hydrophilic. All charged residues are located within the seven amino acids at the carboxyl terminus. Evaluation of the B-chain according to Chou and Fasman (26) indicated that this chain does not assume β-conformation and that it probably possesses a short α-helical region. These findings are in agreement with results obtained by circular dichroism analysis. Two β-turns (33, 34) are predicted in the B-chain, with the mentioned helix projected to immediately follow the first β-turn (35). According to this prediction, the carbohydrate would be located on the first β-turn. Similar observations have been reported for other proteins (36, 37). It is also of interest that the second β-turn includes the cysteine residue which links the B- and the A-chain.
The carbohydrate moiety elucidated by high resolution 1H-NMR spectroscopy was found to be a trisaccharide linked to Ser. This procedure affords simultaneously the monosaccharide sequence, the positions of the glycosidic linkages between the monosaccharides, and the anomeric configurations. This type of carbohydrate unit has recently been found in several other proteins (e.g. Refs. 25 and 36-38). It should be noted, however, that this is the first time that the structure of a mucin-type (O-glycosidic) oligosaccharide chain which is still linked to a peptide has been determined completely by 1H-NMR spectroscopy. To date, structural studies have been carried out employing the corresponding oligosaccharide alditol (25) prepared from the protein by alkaline borohydride cleavage. The approach that has been chosen here affords the precise structure of the carbohydrate-peptide linkage, information which is difficult to obtain by chemical techniques.
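The anomeric assignments reported under "Results" follow a simple rule for galacto-configured pyranoses: a large H-1/H-2 coupling constant (about 8 Hz) indicates a β-linkage and a small one (about 3 to 4 Hz) an α-linkage. The sketch below encodes that rule. The 6.0 Hz cut-off is an illustrative assumption, not a value taken from the paper, and the function name is hypothetical.

```python
def anomeric_configuration(j12_hz, threshold_hz=6.0):
    """Classify a galacto-configured pyranose by its H-1/H-2 coupling.

    The 6.0 Hz default cut-off is an illustrative assumption.
    """
    return "beta" if j12_hz >= threshold_hz else "alpha"


# Coupling constants reported in the text.
print("Gal H-1:", anomeric_configuration(8.0))     # Gal H-1: beta
print("GalNAc H-1:", anomeric_configuration(4.3))  # GalNAc H-1: alpha
```

In practice the coupling constant is read together with the chemical shift of the anomeric proton, as was done for GalNAc H-1 in this study.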