Human Chorionic Gonadotropin: LINEAR AMINO ACID SEQUENCE OF THE α SUBUNIT

The linear amino acid sequence of the α subunit of human chorionic gonadotropin (hCG-α) has been derived from a study of the tryptic and cyanogen bromide peptides. The total number of amino acids varies from 89 to 92 because of NH2-terminal heterogeneity probably caused by proteolysis of hCG. The molecular weight of hCG-α is estimated to be 14,900 as calculated from its chemical composition, approximately 10,200 for the protein and 4,700 for the carbohydrate part of the molecule. The carbohydrate moiety consists of 2 bulky carbohydrate units attached to the amide groups of asparagine residues 52 and 78 by N-glycosidic bonds. Assignment of the five disulfide bonds and some of the amide groups has not been made. There are no free sulfhydryl groups or tryptophan present in hCG-α. The amino acid sequence of hCG-α exhibits considerable homology with the α subunits of luteinizing (LH) and thyroid-stimulating (TSH) hormones. The sequences of human LH-α and hCG-α differ only in a 2-residue inversion and a 3-residue deletion at the NH2 terminus of hLH-α. There are 25 amino acid substitutions in hCG-α when compared with ovine or porcine LH-α or bovine TSH-α. The structure of the carbohydrate units of hCG-α appears to be different from that of the carbohydrate units in the α subunits of the other hormones.


Developments in the chemistry of human chorionic gonadotropin have been quite rapid. First, simple procedures suitable for large scale preparation of hCG¹ were developed (1, 2). The hCG preparations obtained by these procedures possessed similar physicochemical properties. Subsequently, initial studies on the monosaccharide sequence of the carbohydrate moiety of native hCG, based on the sequential removal of monosaccharides by specific glycosidases, were reported (3).

* This research was supported by grants from the United States Public Health Service (AM-10273) and the Population Council of New York (M72-39). One of us (R.B.C.) is a recipient of a fellowship from the Population Council. This is the third paper in a series. Papers I and II are References 1 and 3.
Initial studies based on NH2- and COOH-terminal analyses and/or gel filtration of reduced and alkylated hCG suggested the presence of two identical chains (1, 2). However, polyacrylamide gel electrophoresis of S-carboxymethylated hCG indicated the presence of two dissimilar chains, α and β (then called A and B chains, respectively), which differed in their electrophoretic mobility and amino acid compositions (4). Further evidence for the nonidentical nature of the chains came from the observation that during the S-carboxamidomethylation of hCG, the alkylated derivative of hCG-α precipitated during dialysis (5). Whether the two chains were bonded noncovalently or by disulfide bonds remained unresolved.
The dissimilarity of the subunits and the noncovalent nature of their attachment were unequivocally established by the separation of the α and β subunits from the native hormone on DEAE-Sephadex (5) and by their recombination (6). The reconstituted hCG was indistinguishable from native hCG in electrophoretic, immunological, and biological properties (7). This separation procedure was found suitable for the preparation of the subunits (5, 8) on a large scale. As a consequence, it became possible to initiate the work on their complete amino acid sequences (9). A preliminary report of the amino acid sequences of hCG-α and hCG-β has been made (10).
In this communication details of the isolation of the tryptic and cyanogen bromide peptides of hCG-α and their amino acid compositions and sequences are described.² Comparisons between the amino acid sequences of hCG-α and the α subunits of bovine TSH (11) and human (12, 13), bovine (14, 15), ovine (14, 16), and porcine (17) LH are presented. Similar studies on hCG-β are reported in the succeeding paper (18).

² Some of the data are presented as a miniprint supplement immediately following this paper. Tables III, IV, [...]

MATERIALS AND METHODS

Trypsin (three times crystallized) and leucine aminopeptidase (treated with diisopropylphosphorofluoridate) were obtained from Worthington; chymotrypsin (three times crystallized) was supplied by Sigma, and thermolysin was purchased from Daiwa Kasei, Osaka, Japan. All Sephadex preparations were from Pharmacia.

The hCG used in these studies was purified as reported earlier (1) from a crude commercial preparation, 2900 i.u., purchased from Organon, West Orange, N. J. The purified hCG was dissociated into subunits with 8 M urea and the subunits were separated by successive chromatography on DEAE-Sephadex and Sephadex G-100 (5, 9).

Desialyzation--A 2% solution of hCG-α in 0.025 N HCl was desialyzed at 80° for 1 hour. The desialyzed hCG-α was freed of sialic acid and salts by chromatography on coarse Sephadex G-25 with an elution buffer of 0.5% NH4HCO3. The solution of desialyzed hCG-α was concentrated by ultrafiltration with a UM-2 membrane (Amicon Corp.) and was lyophilized.

Reduction and S-Carboxamidomethylation and S-Aminoethylation of hCG-α--The S-carboxamidomethyl derivative of the desialyzed, reduced α subunit was prepared as described in a succeeding paper (18). S-Aminoethylation of reduced hCG-α was carried out essentially as described by Raftery and Cole (26). A solution of 107 mg of hCG-α in 20 ml of 8 M urea (ultrapure, Mann), containing 0.2% EDTA and 0.7 M Tris-HCl buffer, pH 8.5, was flushed with nitrogen, and 77 mg of dithiothreitol were added. The reaction mixture was left at room temperature for 1 hour and a total of 1.2 ml of ethylenimine (Pierce Chemical Co.) was then added in three equal portions at 10-min intervals. The S-aminoethyl derivative was obtained free of salts with quantitative recovery by chromatography on coarse Sephadex G-25 in 0.5% NH4HCO3, followed by lyophilization.

Amino Acid Analyses--Samples of peptides were hydrolyzed at 110° with 6 N HCl for 24 hours in evacuated, sealed tubes. The oxidation of S-carboxymethylcysteine and the destruction of tyrosine were minimized by the addition of 2 µl of thioglycolic acid and 50 µl of a 5% solution of phenol to samples prior to hydrolysis (21). The tryptophan content of hCG-α was determined by hydrolysis with 3 N p-toluenesulfonic acid containing 0.2% tryptamine at 110° in an evacuated sealed tube for 24 hours (22). Analyses were performed on a Spinco model 120C amino acid analyzer with both columns maintained at 55°. However, in samples which contained homoserine, the 55-cm column was initially run at 50° for 85 min before automatically increasing the temperature to 55° in order to effect separation of homoserine from glutamic acid. The amounts of homoserine and homoserine lactone were based on integration constants of 0.85 and 0.65 that of leucine, respectively, and an integration constant of 0.92 that of lysine was used in calculating the amount of S-aminoethylcysteine (23). Amino acid values less than 0.10 residue are not reported in the tables giving amino acid compositions.

Edman Degradation and NH2-terminal Analysis--Most of the sequences were determined by a modification of the Edman procedure described by Gray (24) coupled with dansylation. Dansylamino acids were identified by thin layer chromatography on polyamide sheets (Cheng Chin Trading Co., Taipei, Taiwan, obtained through Gallard-Schlesinger Chemical Co.) by the method of Woods and Wang (25).

Ten nanomoles of peptide were used per cycle of Edman degradation, which was carried out in 10 × 75-mm test tubes. For the coupling reaction the peptide was dissolved in 200 µl of pyridine-water-triethylamine buffer (4:2.2:0.3), and 100 µl of 5% phenylisothiocyanate in pyridine was added. The coupling reaction was carried out at 45° under nitrogen for 1 hour. The mixture was then dried over NaOH and P2O5 at 60° in vacuo for 30 min. Cleavage was accomplished by incubation of the coupled peptide with 200 µl of anhydrous trifluoroacetic acid at 45° under nitrogen for 30 min. Following this the mixture was again dried over NaOH and P2O5 at 60° in vacuo for 20 min. The dried residue was dissolved in 150 µl of water and extracted [...] Suitable aliquots (10 nmoles) were taken from the aqueous phase for dansylation and the remainder dried over NaOH and P2O5 at room temperature in vacuo prior to the next cycle.

Hydrolysis of Cyanogen Bromide Cleavage Fragments of S-Aminoethyl hCG-α--The cleavage was carried out by treating 107 mg of S-aminoethyl hCG-α with an equal weight of cyanogen bromide (50-fold excess over methionine) in 2.5 ml of 80% formic acid for 24 hours at 4°. Prior to use, the cyanogen bromide was sublimed and the formic acid was purified by fractional crystallization (27). The reaction mixture was diluted with 40 ml of glass-distilled water and lyophilized.
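The 50-fold excess of cyanogen bromide over methionine quoted for the cleavage can be checked from the stated weights. The following is only a rough sketch: it assumes that the molecular weight of 14,900 given for native hCG-α also applies to the S-aminoethyl derivative (which is somewhat heavier), and it uses a formula weight for BrCN computed from standard atomic weights rather than a value taken from the paper.

```python
# Rough check of the molar excess of cyanogen bromide over methionine.
PROTEIN_MG = 107.0     # S-aminoethyl hCG-alpha taken for cleavage (from the text)
CNBR_MG = 107.0        # "an equal weight" of cyanogen bromide (from the text)
SUBUNIT_MW = 14_900.0  # value quoted for native hCG-alpha; ASSUMED for the derivative
MET_PER_CHAIN = 3      # methionine residues per chain (from the text)
CNBR_MW = 105.92       # BrCN: 79.904 + 12.011 + 14.007 (standard atomic weights)


def molar_excess(protein_mg, reagent_mg, protein_mw, reagent_mw, sites_per_chain):
    """Return moles of reagent per mole of reactive site."""
    protein_umol = protein_mg / protein_mw * 1000.0
    site_umol = protein_umol * sites_per_chain
    reagent_umol = reagent_mg / reagent_mw * 1000.0
    return reagent_umol / site_umol


excess = molar_excess(PROTEIN_MG, CNBR_MG, SUBUNIT_MW, CNBR_MW, MET_PER_CHAIN)
print(f"approximately {excess:.0f}-fold molar excess over methionine")
```

Under these assumptions the ratio comes out slightly below 50, in reasonable agreement with the rounded figure in the text.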
The resulting 130 mg of dried residue were dissolved in 5 ml of 1% propionic acid and fractionated by gel filtration on Sephadex G-75 as shown in Fig. 4. The cyanogen bromide glycopeptides, found in Fraction IV of Fig. 4, were further separated from non-carbohydrate-containing peptides by countercurrent distribution as shown in Fig. 5. The solvent system of Howard and Pierce (al), formed by equilibrating equal volumes of 0.05 M p-toluenesulfonic acid and sec-butyl alcohol, was utilized. The peptide sample (usually 25 mg) was lyophilized and dissolved in 1.0 ml of 0.05 M p-toluenesulfonic acid saturated with sec-butyl alcohol. Twenty-five transfers utilizing 1 ml of each phase were carried out at room temperature in 12-ml glass-stoppered centrifuge tubes. Upon completion, the two phases were broken by the addition of 0.1 ml of methyl alcohol, and aliquots (10 µl) were taken from each tube and analyzed with ninhydrin following alkaline hydrolysis (28). After neutralization with 0.1 ml of 0.5 M NH4OH, the absorbance of each tube was read at 282 nm. The pooled fractions were evaporated to remove butyl alcohol and chromatographed on a column (1.9 × 74 cm) of coarse Sephadex G-25 equilibrated with 0.05 M NH4HCO3 to remove p-toluenesulfonic acid. The desalted Sephadex fractions were lyophilized, dissolved in water, and stored at -20°.
Final purification of the CNBr peptides was obtained by gel filtration as shown later.
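The separation achieved by the 25-transfer countercurrent distribution described above can be pictured with the usual binomial model for a discontinuous distribution with equal phase volumes. The sketch below is illustrative only: the partition coefficients are hypothetical values chosen for the example, not measurements from this work, and ideal behavior (a constant partition coefficient and complete equilibration at every transfer) is assumed.

```python
from math import comb

N_TRANSFERS = 25  # number of transfers used in the text


def craig_distribution(n_transfers, k, volume_ratio=1.0):
    """Fraction of solute in tubes 0..n after n transfers.

    k is the partition coefficient (upper phase / lower phase) and
    volume_ratio is upper volume / lower volume (1.0 for equal volumes).
    """
    p = k * volume_ratio / (k * volume_ratio + 1.0)  # fraction moving on per transfer
    return [comb(n_transfers, r) * p**r * (1.0 - p) ** (n_transfers - r)
            for r in range(n_transfers + 1)]


def peak_tube(fractions):
    """Index of the tube holding the largest fraction of solute."""
    return max(range(len(fractions)), key=fractions.__getitem__)


# HYPOTHETICAL partition coefficients for two solutes.
slow = craig_distribution(N_TRANSFERS, k=0.25)  # favors the aqueous (lower) phase
fast = craig_distribution(N_TRANSFERS, k=4.0)   # favors the alcohol (upper) phase
print(peak_tube(slow), peak_tube(fast))
```

Solutes whose partition coefficients differ this widely peak in well separated tubes, whereas solutes with similar coefficients overlap and must be resolved by another method.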
Hydrolysis of Peptides with Proteolytic Enzymes--Peptides were about 1 mM in 1% NH4HCO3 and were hydrolyzed with thermolysin, trypsin, or chymotrypsin (2 to 20% by weight) at 37° for periods of 2 to 24 hours.

Determination of Amides--A 25-nmole sample of peptide was hydrolyzed with 12.5 µg of leucine aminopeptidase in 25 µl of 0.1 M Tris-HCl and 0.0025 M MgCl2, pH 8.6, for 40 hours at 37°. The reaction mixture was analyzed for glutamine or asparagine as described in the following paper (18).

Nomenclature of Peptides--The major tryptic and cyanogen bromide peptides are numbered according to their sequence order; αT-1 and CNBr-1 are the respective NH2-terminal peptides. The minor tryptic peptides are designated by adding the suffix of a lower case letter, such as αT-1a. Peptides obtained from a secondary enzymatic digest with thermolysin (Th), chymotrypsin (C), or trypsin (T) are designated by adding the respective suffixes Th-, C-, or T- to the designation of the major peptide, again numbering them in the order of their sequence. Example: αT-1, Th-1 is the NH2-terminal thermolysin peptide of tryptic peptide αT-1.

Isolation and Sequencing of Tryptic Peptides from Reduced S-Carboxamidomethylated, Desialyzed hCG-α--The Sephadex G-25 elution pattern of the tryptic hydrolysate of the reduced S-carboxamidomethylated, desialyzed hCG-α is shown in Fig. 1. Fraction 1 from the Sephadex G-25 column was further fractionated on Sephadex G-50 (Fig. 2) to effect a more complete separation of the tryptic glycopeptides from other large tryptic peptides.
Each of the fractions from the Sephadex columns, shown in Figs. 1 and 2, was further fractionated by high voltage paper electrophoresis and/or paper chromatography as described for each tryptic peptide given below. The amino acid compositions and yields of the tryptic peptides are given in Tables I and II. Residues sequenced by Edman degradation with dansylation are given with an arrow (→) over the residue in the text. Where they were established, the proper amide assignments are indicated in the peptide sequences.

Peptide αT-1 (Residues 1 to 28): Ala-Pro-Asx-Val-Glx-Asx-Cys(Ca)-Pro-Glx-Cys(Ca)-Thr-Leu-Glx-Glx-Asx-Pro-Phe-Phe-Ser-Glx-Pro-Gly-Ala-Pro-Ile-Leu-Gln-Cys(Ca)

High voltage electrophoresis at pH 4.7 of Fractions 1 and 2 of the G-50 column (Fig. 2) yielded three acidic peptides with amino acid compositions similar to αT-1. Analysis by the dansyl technique showed that each of the fractions exhibited NH2-terminal heterogeneity; the most acidic fraction had NH2-terminal aspartic acid with a background of alanine, the middle fraction had about equal amounts of aspartic acid and alanine, and the slowest migrating fraction had a strong NH2-terminal alanine with a faint background of valine. This is in agreement with analysis of the intact hCG-α by the dansyl technique, which showed NH2-terminal alanine with a background of aspartic acid and valine. Edman degradation with dansylation of the slowest migrating fraction of peptide αT-1 yielded the NH2-terminal sequence Ala-Pro-Asx-Val-Glx.
Since peptide αT-1 was isolated as a mixture of peptides that were difficult to separate, presumably resulting from NH2-terminal heterogeneity and partial deamidation of asparagine and glutamine residues, no attempt was made to study peptide αT-1 further. The remaining sequence of αT-1 was deduced from the minor tryptic peptides, αT-1a and αT-1c, given below, and from thermolysin fragments of the NH2-terminal cyanogen bromide peptide, CNBr-1 (Table VII).
The sequence thus obtained for peptide αT-1 indicates that an anomalous tryptic cleavage at S-carboxamidomethylcysteine occurred.

The total sequence of peptide αT-1a was determined from thermolysin peptides obtained from the NH2-terminal cyanogen bromide peptide CNBr-1 shown in Table VII.

Peptide αT-1c (Residues 18 to 28): Phe-Ser-Glx-Pro-Gly-Ala-Pro-Ile-Leu-Gln-Cys(Ca)

Peptide αT-1c, arising from a chymotryptic-like cleavage of peptide αT-1 at phenylalanine (residue 17), was isolated from Fraction 2 (Fig. 1) and further purified by high voltage electrophoresis at pH 1.9 and paper chromatography using Solvent I. During the sequencing of this peptide, it was discovered that the NH2-terminal phenylalanine residue did not react completely with phenylisothiocyanate. This problem was overcome by carrying out the coupling reaction two times during Step 1 of the Edman procedure before cleaving the coupled peptide as usual with trifluoroacetic acid. An anomalous partial cleavage at serine was also experienced during Step 1 of the Edman procedure such that the dansyl analysis detected a background of glutamic acid appearing one step early in addition to the expected serine residue. Following Step 1, the degradation went smoothly such that at each step, in addition to the dansylated residue being examined, a background of the next residue was always evident on the thin layer plates.

Peptide αT-2a, methionine, was isolated from Fraction 4 (Fig. 1) by high voltage electrophoresis and paper chromatography with Solvent I. Peptides αT-2a and αT-2b presumably resulted from a chymotryptic-like cleavage of peptide αT-2 at methionine (residue 29).

The NH2-terminal residue was identified as serine by the dansyl technique, which established the sequence.

Peptide αT-5 (Residues 36 to 42): Ala-Tyr-Pro-Thr-Pro-Leu-Arg

Peptide αT-5 was isolated from a tryptic hydrolysate of reduced and carboxamidomethylated hCG. The sequence of the first 4 residues was determined by subtractive Edman degradation.
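In subtractive Edman degradation the residue removed at each cycle is inferred from the amino acid whose molar ratio falls between successive analyses. A minimal sketch of that inference follows, using the Step 1 values reported for peptide αT-9 in this paper; the starting composition of 1.00 residue each is a hypothetical simplification introduced for the example, and the helper name is illustrative.

```python
# Subtractive Edman degradation: the residue removed in a cycle is the one
# whose molar ratio drops the most between successive amino acid analyses.


def removed_residue(before, after):
    """Return the amino acid with the largest decrease in residues per mole."""
    drops = {aa: before[aa] - after.get(aa, 0.0) for aa in before}
    return max(drops, key=drops.get)


# HYPOTHETICAL starting composition (1.00 residue each); not given in the text.
start = {"Asp": 1.00, "Ser": 1.00, "Tyr": 1.00}
# Step 1 values as reported for peptide alpha-T-9.
step1 = {"Asp": 1.03, "Ser": 0.30, "Tyr": 0.97}

first = removed_residue(start, step1)
print(first)  # Ser
```

Only serine decreases appreciably after the first cycle, which places it at the NH2 terminus of that peptide.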

Peptide αT-7a (Residues 48 to 51): Leu-Val-Gln-Lys

Peptide αT-7 was separated from peptide αT-10 by subjecting Fractions 2 and 3 (Fig. 1) to high voltage electrophoresis at pH 1.9 followed by paper chromatography with Solvent I. Paper chromatography separated peptide αT-7 into two peptides with identical amino acid compositions, presumably due to partial oxidation of the methionine at position 47.

Peptide αT-7a, resulting from a partial cleavage at methionine (residue 47), was isolated as a minor fraction during the purification of peptide αT-7 and was not studied further.

[...] from the acidic NH2-terminal peptides as well as from the slightly faster migrating basic glycopeptide αT-11. The sequence of peptide αT-8 was determined by Edman degradation with dansylation, but the dansyl derivative of Ser-55 was difficult to identify and appeared as a very faint fluorescent spot on the thin layer plates. In order to confirm the sequence of αT-8, 0.5 µmole of the peptide was hydrolyzed with thermolysin (0.1 mg) in 0.5 ml of 1% NH4HCO3 at 37° for 6 hours. The hydrolysate was [...]

Peptide αT-9a (Residues 64 to 65): Ser-Tyr

Peptide αT-9 was isolated from the tryptic hydrolysate of S-carboxamidomethylated hCG, and its sequence was determined by two steps of subtractive Edman degradation.

Step 1: Asp, 1.03; Ser, 0.30; Tyr, 0.97
Step 2: Asp, 1.00; Ser, 0.32; Tyr, O..@

Arginine was placed at the COOH terminus since αT-9 was a tryptic peptide.

Peptide αT-9a, contaminated with glycine (Table II), was obtained from the tryptic hydrolysate of S-carboxamidomethyl hCG-α by successive high voltage electrophoresis at pH 1.9 and paper chromatography with Solvent I of Fraction 7 (Fig. 1). The isolation of peptide αT-9a indicates that a chymotryptic-like cleavage had occurred at tyrosine (residue 65), although the corresponding peptide Asn-Arg was not isolated.

Peptide αT-10 (Residues 68 to 75): Val-Thr-Val-Met-Gly-Gly-Phe-Lys; Peptide αT-10a (Residues 68 to 71): Val-Thr-Val-Met

Successive high voltage electrophoresis at pH 1.9 and paper chromatography in Solvent I of Fractions 2 and 3 (Fig. 1) separated peptides αT-10 and αT-10a. Peptide αT-10 was obtained by paper chromatography in two fractions with identical amino acid compositions, presumably from partial oxidation of Met-71.

[...] separated glycopeptides αT-8 and αT-11 as previously described. Unlike peptide αT-8, glycopeptide αT-11 smeared on paper electrophoresis giving a broad band. The NH2-terminal sequence was determined to be Val-Glx. Peptide αT-11 (1.0 µmole) was hydrolyzed with 0.3 mg of thermolysin at 37° for 14 hours in 0.7 ml of 0.5% NH4HCO3, which yielded seven peptides whose amino acid compositions are given in Table III. The hydrolysate was fractionated by gel filtration on Sephadex G-25 as shown in Fig. 3 (see miniprint supplement p. 1). Peptide αT-11, Th-1, obtained from Fraction 2, and peptide αT-11, Th-5 from [...] Peptide αT-11, Th-3, which had NH2-terminal serine, stained brown with ninhydrin. Peptide αT-11, Th-4, which also stained brown with ninhydrin, and peptide αT-11, Th-2b were impure and were not studied further. The amino acid compositions of the thermolysin peptides from peptide αT-11 are given in Table III. The sequences of the thermolysin peptides are shown in Table IV and their order was established with the chymotryptic peptides derived from peptide CNBr-4 (Table XIII).

Peptides αT-12a and αT-12b were isolated after tryptic hydrolysis of S-carboxamidomethyl hCG-α, indicating that a chymotryptic-like cleavage at tyrosine (residue 89) had occurred. Peptide αT-12a was isolated from Fraction 4 (Fig. 1) by successive high voltage electrophoresis at pH 1.9 and paper chromatography using Solvent I. In both cases, it migrated in the same position as a tyrosine standard. Peptide αT-12b was isolated by high voltage electrophoresis at pH 1.9 of Fraction 5 (Fig. 1).

Peptide αT-13 (Residue 92): Ser

Peptide αT-13, which was serine, was isolated by high voltage electrophoresis at pH 1.9 of Fraction 5 (Fig. 1).

Isolation of Cyanogen Bromide Peptides from Reduced, S-Aminoethyl hCG-α--Amino acid analysis of the reduced S-aminoethyl hCG-α indicated that all of the cysteine residues were lost, with 84% recovery as S-aminoethylcysteine. Although precise quantitation of S-aminoethylcysteine was hindered due to incomplete separation from lysine on the 8-cm column, the results indicated that complete conversion of cysteine to S-aminoethylcysteine had occurred.

Amino acid analysis of the mixture of cyanogen bromide reaction products showed 0.6 methionine residue remaining, indicating that 20% of the total methionine had not been converted to homoserine. Additional incubations with cyanogen bromide to 200-fold excess over methionine for 24 hours at 4° resulted in no further conversion of methionine to homoserine. The cyanogen bromide reaction products of S-aminoethyl hCG-α were fractionated by Sephadex G-75 column chromatography (Fig. 4). Fractions I to III were found to contain residual methionine when examined by amino acid analysis and most likely represent incompletely cleaved fragments or unreacted material. These fractions, comprising 33% by weight of the starting material, were not analyzed further.

FIG. 4. [...] hCG-α. The lyophilized cyanogen bromide reaction mixture was dissolved in 5 ml of 1% propionic acid and applied to a Sephadex G-75 (fine) column (2.2 × 187 cm). Elution was at room temperature with the same solution. The flow rate was 20 ml per hour and 5-ml fractions were collected. The fractions pooled are indicated with solid bars.
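The figure of 20% unconverted methionine follows directly from 0.6 residue remaining out of 3. The sketch below repeats that arithmetic and then estimates an upper bound on the yield of each fully cleaved fragment. The second step rests on two assumptions that are not made in the paper: that conversion to homoserine is equivalent to chain cleavage, and that the three methionine sites react independently and to the same extent.

```python
MET_PER_CHAIN = 3    # methionine residues in hCG-alpha (from the text)
MET_REMAINING = 0.6  # methionine residue found after cleavage (from the text)

unconverted = MET_REMAINING / MET_PER_CHAIN
converted = 1.0 - unconverted

# Number of methionyl bonds that must be cleaved to release each fragment:
# the two terminal fragments need one cleavage, the two internal ones need two.
FLANKING_MET = {"CNBr-1": 1, "CNBr-2": 2, "CNBr-3": 2, "CNBr-4": 1}

# ASSUMPTION: independent, equally efficient reaction at each methionine.
upper_bound = {name: converted**n for name, n in FLANKING_MET.items()}

print(f"unconverted methionine: {unconverted:.0%}")
for name, bound in upper_bound.items():
    print(f"{name}: at most {bound:.0%} of theory")
```

Isolated yields would be expected to fall below these bounds because of losses during the purification steps.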
Fraction V contained peptide CNBr-2 (Fig. 4). This material was further purified by gel filtration through Sephadex G-25 to remove ultraviolet absorbing material and a small amount of CNBr-1 contamination. The amino acid composition of peptide CNBr-2, obtained in 31% yield from the starting material, is given in Table V.

Peptide CNBr-1 was separated from peptides CNBr-3 and CNBr-4 by countercurrent distribution as shown in Fig. 5. Amino acid analysis indicated that Fraction 1 of the countercurrent distribution system contained an approximately equimolar mixture of the glycopeptides, CNBr-3 and CNBr-4, each present in about 31% yield. The glycopeptides were partially separated by gel filtration on Sephadex G-50 as shown in Fig. 6. CNBr-3 was obtained from Fraction 1 and CNBr-4 from Fraction 3. The middle fraction contained a mixture of the two glycopeptides and was repeatedly rechromatographed to effect further separation of the two glycopeptides. The compositions of CNBr-3 and CNBr-4 thus isolated are given in Table V.

FIG. 6 (center). Gel filtration of glycopeptides CNBr-3 and CNBr-4, obtained from countercurrent distribution. Fraction 1 (Fig. 5) was chromatographed on a column (1.35 × 192 cm) of Sephadex G-50 (fine) and eluted at room temperature with 0.05 N [...]

Fraction 2 from the countercurrent distribution (Fig. 5) contained peptide CNBr-1, which was further purified by gel filtration through Sephadex G-50 as shown in Fig. 7. The amino acid composition of peptide CNBr-1, obtained in 36% yield from Fraction 2 of the G-50 column, is given in Table V. Analysis of Fraction 1 (Fig. 7) revealed the presence of residual methionine and an amino acid composition similar to the sum of peptides CNBr-1 and CNBr-2. This material probably represents a peptide composed of CNBr-1 and CNBr-2 in which cleavage at Met-29 did not occur. It was not analyzed further.
Fraction 3 from the countercurrent distribution (Fig. 5) contained impure CNBr-1 in low yield and was not studied further.
Characterization of Cyanogen Bromide Peptides--hCG-α contains 3 methionine residues and yielded four peptides when cleaved with cyanogen bromide.
Peptide CNBr-4 did not contain homoserine (Table V) and was therefore determined to be COOH-terminal.
The positioning of the other three cyanogen bromide peptides, numbered according to their sequence order, was then established by means of the three methionine-containing tryptic peptides αT-2, αT-7, and αT-10.
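Because cyanogen bromide cleaves on the COOH side of methionine, the residue ranges of the four peptides follow from the methionine positions alone. A short sketch, using the chain length (92) and methionine positions (29, 47, and 71) stated in this paper; the function name is illustrative.

```python
def cnbr_fragments(chain_length, met_positions):
    """Residue ranges (1-based, inclusive) produced by complete cleavage
    on the COOH side of every methionine."""
    ranges = []
    start = 1
    for met in sorted(met_positions):
        ranges.append((start, met))  # each fragment ends at the (former) methionine
        start = met + 1
    if start <= chain_length:
        ranges.append((start, chain_length))  # COOH-terminal fragment, no homoserine
    return ranges


fragments = cnbr_fragments(92, [29, 47, 71])
for number, (first, last) in enumerate(fragments, start=1):
    print(f"CNBr-{number}: residues {first} to {last}")
```

The last range, residues 72 to 92, is the only one that does not end in a former methionine, in keeping with the absence of homoserine from CNBr-4.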
NH2-terminal sequences of the four cyanogen bromide peptides were determined by Edman degradation with dansylation; residues thus sequenced are shown in the text with a superscript arrow (→).
The amide groups, wherever established, are indicated in the peptide sequences.
The complete sequences of the cyanogen bromide peptides were then determined by hydrolyzing the peptides with thermolysin, trypsin, or chymotrypsin and then sequencing the smaller secondary peptides.
One micromole of peptide CNBr-1 was hydrolyzed with thermolysin (200 µg) in 1 ml of 1% NH4HCO3 at 37° for 3 hours. Six thermolysin peptides were separated by high voltage electrophoresis at pH 4.7. Peptides CNBr-1, Th-2 and CNBr-1, Th-5 were further purified by paper chromatography with Solvent I. The compositions of the thermolysin peptides are given in Table VI (see miniprint supplement p. 5). Their sequences are given in Table VII.
Hydrolysis of peptide CNBr-1, Th-6 with leucine aminopeptidase indicated residue 27 to be glutamine. Three steps of Edman degradation with dansylation on peptide CNBr-1 showed the NH2-terminal sequence Ala-Pro-Asx-Val. This information, plus the partial sequences and compositions of the NH2-terminal tryptic peptides, was sufficient to establish the ordering of the thermolysin peptides from peptide CNBr-1 as shown in Table VII (see miniprint supplement [...]
[...] Table VIII (see miniprint supplement p. 7), and their sequences are given in Table IX (see miniprint supplement p. 8). The sequence of the 8 NH2-terminal residues of peptide CNBr-2 was determined to be Gly-Cys(Ae)-Cys(Ae)-Phe-Ser-Arg-Ala-Tyr, and was sufficient to establish the positioning of peptides CNBr-2, Th-1, CNBr-2, Th-2, and CNBr-2, Th-3 as shown in Table IX.
The positioning of peptide CNBr-2, Th-4 was determined with the overlapping tryptic peptide αT-5, which relegated the dipeptide CNBr-2, Th-5, Thr-Hsr, to the COOH-[...]
[...] CNBr-3 was determined by sequencing the tryptic peptides obtained from it. An aliquot of 0.5 µmole of glycopeptide CNBr-3 was hydrolyzed with trypsin (75 µg) in 0.6 ml of 1% NH4HCO3 at 37° for 1 hour.
The initial fractionation of the six peptides thus obtained was carried out by Sephadex G-25 column chromatography as shown in Fig. 8a (see miniprint supplement p. 2). Fraction 1 contained the glycopeptide CNBr-3, T-2 which was further purified by high voltage electrophoresis at pH 4.7 and yielded three fractions due to incomplete hydrolysis at lysine (residue 51) and S-aminoethylcysteine (residue 59). Fraction 3 contained peptides CNBr-3, T-1, CNBr-3, T-3 (stained gray-brown with ninhydrin), CNBr-3, T-4, and CNBr-3, T-6, which were separated by high voltage electrophoresis at pH 1.9. Peptide CNBr-3, T-5, which stained brown with ninhydrin, was present in Fraction 5 and was subjected to high voltage electrophoresis at pH 1.9 to remove serine contamination.
The compositions of the six tryptic peptides obtained from CNBr-3 are given in Table X (see miniprint supplement p. 9), and their sequences are given in Table XI (see miniprint supplement p. 10). The glutamine and asparagine residues at positions 50 and 66, respectively, were determined by hydrolysis of peptides CNBr-3, T-1 and CNBr-3, T-5 with leucine aminopeptidase.
The position of peptide CNBr-3, T-3 was established with the overlapping tryptic peptide αT-8. Peptide CNBr-3, T-6 was placed at the COOH terminus because it contained homoserine.
The only remaining peptide, CNBr-3, T-5, was positioned by difference, and its amino acid composition together with the compositions of the other tryptic peptides obtained from CNBr-3 account for all of the residues of peptide CNBr-3.
The compositions of the six peptides obtained from chymotryptic hydrolysis of peptide CNBr-4 are given in Table XII (see miniprint supplement p. 11), and the sequences of the chymotryptic peptides of CNBr-4 are given in Table XIII (see miniprint supplement p. 12).
Edman degradation with dansylation of intact CNBr-4 established the following NH2-terminal sequence of the peptide: Gly-Gly-Phe-Lys-Val-Glx-Asn(CHO). This information, together with the overlapping thermolysin peptides obtained from the tryptic peptide αT-11, was sufficient to establish the sequence of peptide CNBr-4 as shown in Table XIII.

DISCUSSION

The linear amino acid sequence of hCG-α is shown in Fig. 9. The molecular weight of hCG-α is approximately 14,900 as computed from its chemical composition, 10,200 for the protein and 4,700 for the carbohydrate portion of the molecule. The total carbohydrate of hCG-α is contained in 2 bulky units attached by N-glycosidic bonds to asparagine residues 52 and 78 (Fig. 9). As shown for other glycoproteins (29), hCG-α has the invariant Asn-X-Ser/Thr sequence at the carbohydrate attachment sites. The total number of amino acids in hCG-α varies from 89 to 92 because of NH2-terminal heterogeneity discussed below. There are five disulfide bonds in hCG-α and there is no evidence for the presence of free sulfhydryl groups or tryptophan.
The assignment of the disulfide bonds as well as some of the amide groups still remains to be completed.
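The Asn-X-Ser/Thr rule mentioned above for the carbohydrate attachment sites lends itself to a simple scan of the sequence. The sketch below is hedged in two respects. First, the one-letter sequence is not transcribed from Fig. 9; it is the commonly cited 92-residue sequence of the mature α subunit, supplied here as an assumption, and it includes amide assignments that this paper leaves open. Second, the exclusion of proline at position X is a later refinement of the rule and is likewise an assumption.

```python
import re

# ASSUMED sequence of the mature alpha subunit (one-letter code, ten per row).
# It agrees with the peptide sequences and residue positions quoted in the text.
HCG_ALPHA = "".join([
    "APDVQDCPEC", "TLQENPFFSQ", "PGAPILQCMG", "CCFSRAYPTP", "LRSKKTMLVQ",
    "KNVTSESTCC", "VAKSYNRVTV", "MGGFKVENHT", "ACHCSTCYYH", "KS",
])


def sequon_positions(sequence):
    """1-based positions of Asn in Asn-X-Ser/Thr, with X not proline."""
    # The lookahead keeps matches from consuming residues, so overlapping
    # candidate sites would all be reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", sequence)]


sites = sequon_positions(HCG_ALPHA)
print(sites)  # [52, 78]
```

Of the four asparagine residues in this assumed sequence, only those at positions 52 and 78 satisfy the rule, which matches the two attachment sites reported here.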
Attempts to determine the amide groups by leucine aminopeptidase, aminopeptidase M, or Pronase hydrolysis were not successful in the NH2-terminal region of the molecule because of the presence of clusters of adjacent acidic residues and several proline residues.
Proline residues have been reported to form diketopiperazines (30) which might have resulted in incomplete hydrolysis. The Edman degradation procedure with direct identification of the phenylthiohydantoin derivatives should facilitate the determination of such amide groups. However, possible deamidation in the NH2-terminal region might still be a problem.
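The molecular weight figures quoted at the start of this discussion can be restated as a few derived quantities. This is plain arithmetic on the numbers given in the text; the equal division of the carbohydrate between the two units is only an average and should not be read as a measured value for either unit.

```python
PROTEIN_MW = 10_200       # protein portion (from the text)
CARBOHYDRATE_MW = 4_700   # carbohydrate portion (from the text)
RESIDUES = 92             # longest form of the chain (from the text)
CARBOHYDRATE_UNITS = 2    # units attached at Asn-52 and Asn-78 (from the text)

total_mw = PROTEIN_MW + CARBOHYDRATE_MW
carbohydrate_percent = 100 * CARBOHYDRATE_MW / total_mw
mean_residue_mass = PROTEIN_MW / RESIDUES
mean_unit_mass = CARBOHYDRATE_MW / CARBOHYDRATE_UNITS  # average only

print(f"total: {total_mw}")
print(f"carbohydrate: {carbohydrate_percent:.1f}% by weight")
print(f"mean residue mass: {mean_residue_mass:.1f}")
print(f"mean mass per carbohydrate unit: {mean_unit_mass:.0f}")
```

On these figures carbohydrate accounts for nearly one-third of the subunit by weight.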
Several problems were encountered during the isolation and sequencing of the tryptic and cyanogen bromide peptides of hCG-α. Tryptic hydrolysis of reduced, S-carboxamidomethylated, desialyzed hCG-α resulted in a large number of nonspecific partial cleavages in addition to the expected hydrolysis at arginyl and lysyl residues.
Partial cleavages occurred at all 3 methionyl residues (positions 29, 47, and 71), at 3 of the 4 tyrosyl residues (positions 65, 88, and 89), and at S-carboxamidomethylcysteinyl residues 28 and 31. It appears that the trypsin preparation used in the present studies had some chymotryptic-like activity.
The resulting chymotryptic-like cleavages yielded a multitude of minor peptides which complicated the fractionation of the tryptic hydrolysate. Treatment of S-aminoethyl hCG-α with cyanogen bromide (50- to 200-fold excess over methionine residues) resulted in approximately 80% conversion of methionine to homoserine. The reasons for the incomplete reaction with cyanogen bromide are not clear, although Sairam et al. (13) have reported similar results with human LH-α.
The two cyanogen bromide glycopeptides CNBr-3 and CNBr-4 proved difficult to separate by paper chromatography or electrophoresis. An adequate separation of these peptides in high yield was effected by Sephadex G-50 gel filtration as shown in Fig. 6.
Internally located S-aminoethylcysteine was consistently difficult to identify, presumably due to destruction of its dansyl derivative during acid hydrolysis, as postulated by Gray (31). However, no problems were encountered in the identification of bis-dansyl-S-aminoethylcysteine from the NH2-terminal S-aminoethylcysteinyl residue in peptides CNBr-3, T-3 and CNBr-4, C-3. The