Ovine Submaxillary Mucin PRIMARY STRUCTURE AND PEPTIDE SUBSTRATES OF UDP-N-ACETYLGALACTOSAMINE:MUCIN TRANSFERASE*

Tryptic digests of ovine submaxillary apomucin were fractionated by gel filtration and ion exchange chromatography to give 14 peptide fractions. Three purified tryptic peptides, representing 106 of the 650 residues in apomucin, were submitted to automated sequence analysis. The NH2-terminal 50 of the 74 residues in one peptide and the entire sequence of the other two hexadecapeptides were established. These studies suggest that purified ovine submaxillary mucin is chemically homogeneous, containing a unique primary structure without substantial repeating sequences in its polypeptide chain.
The sequences adjacent to 28 known O-glycosidically substituted seryl and threonyl residues were compared. No homologies were apparent around the glycosylated seryl and threonyl residues which might define the specificity of the UDP-N-acetylgalactosaminyl:mucin polypeptide transferase that incorporates N-acetylgalactosamine into O-glycosidic linkage in glycoproteins.
However, there appears to be a minimum size requirement for glycosylation, because the transferase glycosylated tryptic peptides efficiently, whereas chymotryptic and thermolytic peptides were much poorer substrates.
In the preceding paper (1), evidence is reported that ovine submaxillary mucin is formed by noncovalent aggregation of glycoprotein subunits with a molecular weight of about 154,000. One-third of the residues in the subunit polypeptide chain (molecular weight about 58,300) are threonine or serine, each of which is substituted by the disaccharide group N-acetylneuraminyl-α2→6-N-acetylgalactosamine in O-glycosidic linkage.
In this paper we report the purification and sequence analy-
* These studies were supported by grants from the National Heart and Lung Institute, National Institutes of Health (HL-06400) and from the National Science Foundation (GB-29334
[...] chromatography on Dowex 50 as described in Fig. 2 for T4 and contained only NH2-terminal glycine by the dansyl method. However, analysis on a sequenator through 13 cycles indicated two peptide sequences, in approximately equal amounts, each commencing with NH2-terminal glycine. These two peptides have not been separated and further characterized.
Peptide T3 -This fraction gave several peaks when fractionated on Dowex 50 as described for peptide T4 (Fig. 2), and contained at least 13 different peptides, but none were obtained in pure form as judged by dansyl end group analysis. The peptides in this fraction appear to account for that portion of the mucin structure not represented by the peptides in Fractions T1, T2, and T4.
Peptide T4 -This fraction was further purified on Dowex 50 as shown in Fig. 2 to yield two hexadecapeptides, T4C and T4F, in 39 and 40% yield, respectively, with the amino acid compositions listed in Table I. After 16 cycles of automated sequenator analysis the results shown in Table IV were obtained. From these data and the amino acid compositions, the sequences shown in Table III were deduced.

Comparison of Sequences around Each Seryl and Threonyl Residue in T1, T4C, and T4F
Tables V and VI compare the amino acid sequences adjacent to each of the threonine and serine residues identified in peptides T1, T4C, and T4F. The composite sequence data from these three peptides contain 100 of the 650 residues and about 15% of the hydroxyamino acids in the apomucin subunit chain. An average of one in every 3 residues in apomucin is a serine or threonine, and each of these hydroxyamino acids is glycosylated, judging from the compositions of mucin and asialomucin (1). Inspection of Tables V and VI discloses no common neighboring sequence that might serve as a recognition complex for glycosylation, analogous to the sequence found in N-glycosylated glycoproteins (4, 5), Asn-X-Thr/Ser, where Asn is glycosylated and X has been observed to be any one of a number of amino acids.
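For readers who want to see what a recognition sequence of the Asn-X-Thr/Ser type looks like operationally, the following is a minimal sketch of a scan for that motif in a one-letter-code sequence. The function name `find_n_sequons` and the example sequence are hypothetical and chosen only for illustration; the sequence is not taken from apomucin. X is treated as any residue, as stated above.

```python
import re

# Asn-X-Thr/Ser in one-letter code: N, then any residue, then S or T.
# A lookahead is used so that overlapping motifs are all reported.
SEQUON = re.compile(r"N(?=[A-Z][ST])")


def find_n_sequons(sequence):
    """Return 0-based positions of Asn residues that begin an Asn-X-Thr/Ser motif."""
    return [match.start() for match in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence for illustration only; not an apomucin peptide.
example = "GANATSNGSP"
print(find_n_sequons(example))  # [2, 6]
```

No comparably simple pattern can be written for the O-glycosylated serine and threonine residues compared in Tables V and VI, which is the point made in the text.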

Glycosylation of Apomucin, Proteolytic Digests of Apomucin and Pure Peptides from Apomucin by UDP-N-Acetylgalactosamine:Mucin Polypeptide Transferase
Quantitative ninhydrin determination of the appearance of amino groups during proteolysis of apomucin indicated that the average length of peptides was 5 residues in the thermolytic, 15 residues in the chymotryptic, and 70 residues in the tryptic digests. This latter value is considerably greater than the average length of 31 residues for tryptic peptides (predicted from the 20 arginine residues per 650 residues in the mucin subunit chain), a value consistent with the observed size distribution of the tryptic digests and the lengths of purified tryptic peptides (Fig. 1, Tables I and IV). This discrepancy for the tryptic peptides may reflect a systematic error in the estimation of average peptide length by ninhydrin reaction; however, the relative average lengths of peptides in the three enzymatic digests are expected to remain unchanged.
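The predicted average length of 31 residues can be reproduced with a short calculation. The sketch below assumes that trypsin cleaves completely at each of the 20 arginine residues, that no other bonds are cleaved, and that no arginine is COOH-terminal, so that 20 cleavages give 21 peptides; these assumptions and the function names are ours and are labeled as such in the code.

```python
def expected_fragments(cleavage_sites):
    """Number of peptides from complete cleavage at each site (assumes none is COOH-terminal)."""
    return cleavage_sites + 1


def mean_fragment_length(total_residues, cleavage_sites):
    """Average peptide length when the whole chain is divided among the fragments."""
    return total_residues / expected_fragments(cleavage_sites)


TOTAL_RESIDUES = 650      # residues in the apomucin subunit chain
ARGININE_RESIDUES = 20    # assumed to be the only tryptic cleavage sites
NINHYDRIN_ESTIMATE = 70   # average tryptic peptide length from the ninhydrin assay

n_fragments = expected_fragments(ARGININE_RESIDUES)
predicted_mean = mean_fragment_length(TOTAL_RESIDUES, ARGININE_RESIDUES)

print(n_fragments, round(predicted_mean))             # 21 31
print(round(NINHYDRIN_ESTIMATE / predicted_mean, 1))  # 2.3
```

Under these assumptions the ninhydrin estimate for the tryptic digest is about 2.3 times the predicted value, which is the size of the discrepancy discussed above.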
The proteolytic digests were tested as acceptors for UDP-N-acetylgalactosamine:mucin polypeptide transferase at the concentrations shown in Fig. 3. Apomucin and its tryptic peptides were glycosylated equally efficiently, whereas its chymotrypsin and thermolysin digests were poorer substrates.
Peptides T1, T1A, and T1B, although pure by sequence analysis, were found to be contaminated with an unidentified inhibitor of the glycosylation reaction, which presumably is not of peptide origin, but rather from the acetic acid used for their chromatography.
Therefore, the peptides were further purified by chromatography on columns (0.4 x 10 cm) of Sephadex G-50 (superfine) in 0.01 M sodium cacodylate buffer, pH 6, containing 0.02 M sodium chloride. The peptides emerging near the void volume of the columns proved to be good acceptor substrates, as shown in Fig. 4. Peptide T1A (18 residues) had 50 to 60%, and peptide T1B (56 residues), 60 to 90% of the acceptor activity of the parent peptide T1 (74 residues) over the concentration range tested. Peptide T1 had the same acceptor activity as apomucin. In a separate experiment, peptide T1 was glycosylated with [14C]GalNAc to the extent of about 1% of the potential acceptor sites and cleaved with V8 protease; peptides T1A and T1B were then separated chromatographically by the gel filtration system of Fig. 1, and the 14C radioactivity associated with each was determined. Peptide T1A, containing 22% of the serine and threonine residues found in T1, had 10% of the radioactivity, and peptide T1B had 90%. This difference may reflect an inherent decreased reactivity of the residues near the NH2 terminus toward glycosylation, or it may indicate that the amino terminus is less accessible in the structural state of isolated T1 under assay conditions. It has been established that peptides T4C and T4F are acceptors, but the kinetics of glycosylation has not been examined thoroughly.
The pH of the assay mixture was unaffected by addition of the peptide.
-Ala-Thr-Pro-Gly-SThr-Thr-Gly-Arg-
FIG. 4 (right). Glycosylation of peptides T1, T1A, and T1B from [...]
DISCUSSION
Ovine and bovine submaxillary mucins have similar physical and chemical properties (3). Each is a glycoprotein containing N-acetylneuraminyl-α2→6-N-acetylgalactosamine prosthetic groups in O-glycosidic linkage with serine and threonine residues in the protein backbone. Both contain about 60% carbohydrate and have molecular weights ranging from about 375,000 to well over 1,000,000 (3), the exact value depending primarily on ionic strength. The preceding paper (1) indicates that ovine mucin is formed by noncovalent self-association of subunits with a molecular weight of about 154,000, and that oligomer formation is dependent upon carbohydrate content, ionic strength, and mucin concentration.
In contrast, apomucin does not self-associate and behaves as a monodisperse species on ultracentrifugation with a molecular weight of about 58,300. The studies reported here suggest that the polypeptide chain of ovine mucin (apomucin) has a unique amino acid sequence, and does not contain a regularly repeating sequence of about 28 residues as suggested for bovine mucin (2). This is supported by analysis of tryptic peptides of apomucin. The tryptic peptides of apomucin cannot be analyzed by peptide mapping on paper, since they have similar charges and streak on paper chromatography in the usual chromatographic solvents. By a combination of gel filtration and ion exchange chromatography (Figs. 1 and 2), tryptic digests could be resolved into at least 18 peptide fractions, close to the 21 expected from the arginine content of apomucin. Three peptides (T1, T4C, T4F) were purified, representing a total of 106 residues. Others, e.g. T2 (about 80 residues) and T3, appear to contain the remaining residues in the molecule. A total of 82 residues in the three pure peptides were placed in exact sequence (Table III). There was no indication of extensive sequence homologies among the peptides or of an internally repeating sequence of any significant length. Serine or threonine residues were found on an average of about 1 in every 3 residues, as anticipated from the amino acid composition (1). No evidence was found for sequence microheterogeneity in that part of ovine apomucin whose partial sequence was examined.
Comparison of the sequences adjacent to each serine and threonine residue in known sequences (Tables V and VI) did not suggest a primary structural explanation of why each of these hydroxyamino acids is glycosylated. There must be structural requirements for acceptors of the UDP-N-acetylgalactosamine:mucin polypeptide transferase, which adds N-acetylgalactosamine to each serine and threonine hydroxyl group, but these are not readily evident from the sequences available. There is no common sequence in the 4 adjacent residues either side of the glycosylated threonine and serine residues and no indication of a sequence analogous to Asn-X-Thr/Ser, which is found in all glycoproteins with carbohydrate prosthetic groups N-glycosidically linked to asparagine (4, 5). The O-glycosylated serine and threonine residues appear in clusters of 3 to 9 residues in which at least every other residue is serine or threonine, and the clusters are interspersed by segments of 4 to 7 residues containing neither serine nor threonine. Noteworthy, however, is the sequence of the 50 residues in T1, the longest stretch of sequence established. Based upon the rules of Chou and Fasman (17) for predicting secondary structures, an average of 1 out of every 3 to 4 residues throughout the sequence is likely to break or destabilize either α helices or β structures. This suggests that apomucin as well as mucin may resemble random coils, in accord with earlier physical studies by Gottschalk and McKenzie (18). Thus, O-glycosylation of mucin and perhaps other glycoproteins (5) may well occur if the serine and threonine acceptor residues are in regions of the molecule with little secondary structure and are readily exposed on the surface of the molecule. It may be that accessibility rather than recognition of amino acid sequences is the key to the specificity of glycosylation of ovine mucin.
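The comparison of the 4 residues on either side of each serine and threonine can be sketched as a simple window extraction. This is an illustrative sketch only: the function name `flanking_windows` and the short example sequence are hypothetical and are not the peptide sequences of Table III, and the code says nothing about how Tables V and VI were actually compiled.

```python
def flanking_windows(sequence, flank=4, targets="ST"):
    """List (position, residue, left neighbors, right neighbors) for each target residue.

    Positions are 0-based. Windows are truncated at the ends of the sequence.
    """
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            left = sequence[max(0, i - flank):i]
            right = sequence[i + 1:i + 1 + flank]
            windows.append((i, residue, left, right))
    return windows


# Hypothetical one-letter sequence for illustration only.
for window in flanking_windows("GAPSTAGR"):
    print(window)
```

Aligning such windows on the central serine or threonine is one way to look for a shared neighboring sequence; in the peptides examined here, no such shared sequence was found.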
Studies with apomucin, proteolytic digests of apomucin, and peptides T1, T4C, and T4F support this view, since each was a substrate for UDP-N-acetylgalactosamine:mucin polypeptide transferase. T1 (74 residues) was as good an acceptor as apomucin. T1B (56 residues) was a better acceptor than either T1A (18 residues) or T4C and T4F (each containing 16 residues), but was not as readily glycosylated as T1. In accord with earlier findings that small peptides containing threonine and serine were not acceptors (12), thermolytic and chymotryptic digests were poorer acceptor substrates than peptides in tryptic digests. Since the smallest peptides in a tryptic digest probably contain no more than 16 residues, it is possible that a minimum size is required for a good acceptor, although studies with model peptides will be required to establish this size more exactly.
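The labeling experiment reported in the results, in which T1A held 22% of the serine and threonine residues of T1 but only 10% of the incorporated radioactivity, can be restated as labeling per acceptor site. The sketch below assumes that T1B holds the remaining 78% of the hydroxyamino acids of T1, which follows if T1A and T1B together make up the whole of T1; the function name is hypothetical.

```python
def relative_labeling(label_fraction, site_fraction):
    """Share of incorporated label divided by share of acceptor sites (1.0 = proportional)."""
    return label_fraction / site_fraction


T1A_SITES = 0.22            # fraction of Ser + Thr of T1 found in T1A
T1B_SITES = 1 - T1A_SITES   # assumption: T1B carries all remaining Ser + Thr
T1A_LABEL = 0.10            # fraction of 14C radioactivity recovered in T1A
T1B_LABEL = 0.90            # fraction of 14C radioactivity recovered in T1B

t1a = relative_labeling(T1A_LABEL, T1A_SITES)
t1b = relative_labeling(T1B_LABEL, T1B_SITES)

print(round(t1a, 2), round(t1b, 2))  # 0.45 1.15
print(round(t1a / t1b, 2))           # 0.39
```

On this reckoning a site in T1A was labeled roughly 0.4 times as often as a site in T1B when both were part of intact T1, which is in the same direction as the lower acceptor activity of isolated T1A.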