The primary structure of a salivary calcium-binding proline-rich phosphoprotein (protein C), a possible precursor of a related salivary protein A.

The complete primary structure of a calcium-binding "proline-rich phosphoprotein" named salivary Protein C was determined from peptides obtained by enzymatic and chemical cleavages of the protein. The protein consists of a single polypeptide chain of 150 residues. It contains the entire primary structure of a previously isolated salivary Protein A in its NH2-terminal 106 residues. The COOH-terminal 44 residues consist mostly of glycine, glutamine, and proline, including a hexaproline sequence, but no polyproline structure could be detected by CD spectroscopy. There is extensive repetition of sequences in the protein, suggesting gene multiplication and recurrent folding. Comparison of the primary structure of salivary Proteins A and C with known protein sequences indicates that the salivary proteins constitute a new family. A mouse submaxillary protease will cleave salivary Protein C between residues 106 and 107 only, giving rise to salivary Protein A and a 44-residue COOH-terminal peptide. This cleavage and the sequence data suggest that salivary Protein C may be a precursor of salivary Protein A.

Human parotid and submandibular saliva contain a number of so-called "acidic proline-rich proteins." These proteins have similar unusual amino acid compositions, and present evidence indicates that they aid in maintaining the calcium concentration in saliva and in keeping the exposed mineralized tissue of the teeth intact under physiological conditions (1,2). While these activities are shared by all the acidic proline-rich proteins investigated so far, quantitative differences are observed between individual proteins (3). This could be related to differences in structure. At present it is not clear why several of these proteins are present in saliva. It is possible that they are products of different genes or they could arise by postribosomal modification of a primary gene product.
To solve these problems, a knowledge of the primary structure of the proteins is important. The primary structure of one of the acidic proline-rich proteins, salivary Protein A, has been reported (4). This paper describes the primary structure of a related salivary Protein C and explores the possibility that salivary Protein A may be a postribosomal modification of salivary Protein C.

DISCUSSION

(Footnote: Portions of this paper, including "Experimental Procedures," ...)

The primary structure of salivary Protein C is given in Fig. 1, together with peptides and methods used in sequence determination. As expected, trypsin rapidly cleaves the arginine-glycine bond at positions 106 to 107, and it slowly cleaves the arginine-proline bonds at positions 91 to 92 and 139 to 140; these latter two have identical surrounding sequences. The lysine-proline bond at positions 74 to 75 is slowly hydrolyzed, but the same type of bond at positions 129 to 130 is not cleaved. Surprisingly, the only difference in the surrounding sequences is the presence of glutamine at position 72 and proline at position 127. It is possible that the presence of proline in the second position from the COOH-terminal end of the susceptible bond causes steric hindrance of trypsin. The other enzymatic and chemical cleavages of salivary Protein C are the same as occur in salivary Protein A (4).

The sequence of the NH2-terminal 106 residues is identical to the primary structure of salivary Protein A. As in salivary Protein A, the biological activities, that is, inhibition of hydroxyapatite formation, binding to hydroxyapatite, and calcium binding, are all located in the NH2-terminal 30 residues (5, 6, 7). Salivary Protein A contains six or seven calcium-binding sites, but salivary Protein C contains only three (3, 8). It seems that the longer polypeptide chain in salivary Protein C causes steric hindrance and inhibition of calcium binding.
The positively charged amino acids located between residues 107 and 150 in salivary Protein C may, for example, interact with some of the negatively charged residues which otherwise would be part of a calcium-binding site.
Difference CD failed to demonstrate the presence of polyproline. This is somewhat surprising because of the presence of hexaproline in positions 122 to 127. It is possible that the rest of the polypeptide chain induces conformational restraint in the hexaproline region, preventing the formation of the all-cis or all-trans peptide bonds necessary for helix formation. The techniques used to detect a polyproline helix may not be sensitive enough and, at present, it cannot be ruled out that salivary Protein C contains a polyproline structure.
Previously, it was noted that there were a number of recurrent sequences in salivary Protein A. This pattern of recurrency is even more extensive in salivary Protein C, as illustrated in Fig. 2, and supports the suggestion that gene multiplication has occurred (4). In order to investigate further the relationship of the primary structure of salivary Protein C with other proteins, a computer search of the protein sequences submitted to the National Biomedical Research ...

Glu-Asp-Leu-Asn-Glu-Asp-Val-Ser-Gln-Glu-Asp-Val-Pro-Leu-Val-Ile-Ser-Asp-Gly-Gly-Asp-Ser-Glu-Gln-Phe-Ile-Asp-Glu-Glu-Arg
A search for CTX (residues 1 to 30) demonstrated, apart from the presence of the identical sequence in salivary Protein A, that there are no known sequences closely related to CTX. The search, however, did indicate the possibility of a distant relationship to proteins in the Troponin C superfamily and Factor X. It is, for example, interesting to align the sequence between residues 19 and 30 in salivary Protein C with that of residues 51 to 62 in cod parvalbumin.

[Alignment: salivary Protein C (residues 19 to 30) with cod parvalbumin (residues 51 to 62)]
This sequence of parvalbumin is located in the C and D helical regions, which contain one of the calcium-binding sites (10). The proposed ligands for the octahedral calcium binding have been indicated with the letters X, Y, Z, -Y, -X, and -Z.
Three of the ligands in parvalbumin are also present in salivary Protein C, and one of the remaining ligands is a glutamic acid which is replaced by aspartic acid in salivary Protein C. Several proteins have been found to have sequences homologous to those of the C-D and E-F regions in parvalbumin (10). These homologous regions include α helices preceding and following the calcium-binding site. To explore the similarity between the salivary protein and parvalbumin, it is, therefore, more meaningful to align the complete sequence of the C-D region in parvalbumin with the salivary protein, rather than with that which only contains the calcium-binding ligands. Such a comparison shows that the sequences corresponding to both the C and D regions in the salivary protein contain proline and, therefore, cannot form the α helices found in parvalbumin.
It is likely that some of the calcium-binding sites of the salivary protein are located in the sequence between residues 19 and 30, since dephosphorylation of phosphoserine 22 decreases the amount of bound calcium (3). Even so, it is clear that the calcium-binding sites in the proteins are quite different, since the apparent dissociation constants are approximately 2 × 10^[…] M for salivary Proteins A and C and 10^[…] M for parvalbumin (11).
The sequence of CTX can be aligned with residues 16 to 45 in bovine Factor X as follows:

[Alignment: CTX (residues 1 to 30) with bovine Factor X (residues 16 to 45)]

The glutamic acids which are carboxylated in Factor X and involved in calcium binding have been underlined. Only in the case of two (residues 20 and 25) is a glutamic acid present in the corresponding position in the salivary protein. No γ-carboxyglutamic acid is present in the salivary proteins, and it is not known if they are substrates for carboxylating systems. A total of 12 of the 30 residues are identical or need only replacement of aspartic acid with glutamic acid, arginine by lysine, or threonine by serine. Of the remaining residues, 6 involve replacement of one hydrophilic residue with another and 2 are replacements of hydrophobic residues.
Recently, the partial sequence of the NH2-terminal end of an acidic proline-rich protein isolated from monkey (Macaca fascicularis) parotid saliva has been reported (12). There is a high degree of similarity and it is likely that the protein is homologous to the human proteins described in this paper.
The proline-rich part of salivary Protein C was also investigated for sequence similarities with other proteins. Both salivary Proteins A and C contain tetraproline and salivary Protein C also contains hexaproline. The only other protein with four or more consecutive prolines is a fragment from human immunoglobulin A in the constant region of an α-2 chain which contains pentaproline. Recently, there has also been a report of heptaproline in a peptide isolated from human whole saliva (13). The sequences of two proline-rich peptides from the same source were reported (13). The relationship of these proteins to those described in this paper remains to be established. A search for similarities of the repeating sequence PQGPPQQGG in the salivary proteins was also undertaken. The sequence PAGPPGEAG from collagen α-1(I) chain had the greatest similarity. Of the 9 residues, 5 are identical. On the other hand, the absence of glycine in every third position in the salivary proteins makes it highly unlikely that the proline-rich part of the salivary proteins and collagen are in the same family. At the most, these proteins could be very distantly related to each other. Salivary Proteins A and C, therefore, appear to be members of a new family of proteins. If there are any distant relationships to other proteins, they are apparently different for the proline-poor and the proline-rich regions. If such relationships do exist, they suggest that the gene for the salivary proline-rich proteins may have arisen by combining separate genes for the two regions.
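The comparison of the salivary repeat with the collagen segment can be checked with a few lines of Python. This is only an illustrative sketch: the two nine-residue strings are the ones quoted above, while the function names `count_identities` and `glycine_every_third` are hypothetical, and the glycine check assumes the collagen Gly-X-Y frame places glycine at positions 3, 6, and 9 of the quoted segment.

```python
# One-letter sequences quoted in the text.
SALIVARY_REPEAT = "PQGPPQQGG"   # repeating sequence in salivary Proteins A and C
COLLAGEN_SEGMENT = "PAGPPGEAG"  # segment from the collagen alpha-1(I) chain


def count_identities(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def glycine_every_third(seq: str) -> bool:
    """Return True if positions 3, 6, 9, ... (1-based) are all glycine.

    Assumption: the frame starts so that glycine would fall on position 3.
    """
    return all(residue == "G" for residue in seq[2::3])


identities = count_identities(SALIVARY_REPEAT, COLLAGEN_SEGMENT)
collagen_periodic = glycine_every_third(COLLAGEN_SEGMENT)
salivary_periodic = glycine_every_third(SALIVARY_REPEAT)

print(identities)         # 5
print(collagen_periodic)  # True
print(salivary_periodic)  # False
```

Under these assumptions the sketch reproduces the 5 identities out of 9 and shows that the salivary repeat has glutamine, not glycine, at position 6, which is the break in the glycine periodicity noted above.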
The primary structures of salivary Proteins A and C suggest that salivary Protein A is formed by a postribosomal cleavage of salivary Protein C at residues 106 to 107. A very slow cleavage is observed with thrombin, giving rise to products suggesting cleavage at positions 106 to 107 only.
In contrast to the slow cleavage by thrombin, mouse submaxillary gland protease causes a much more rapid cleavage. Schenkein et al. (14) found that the mouse enzyme cleaved arginine-glycine bonds in lysozyme as well as several other arginine-containing bonds. There have been no reports of the susceptibility of the other types of arginine-containing bonds found in salivary Protein C. It has been suggested that this protease may function to modify glandular proteins (14), and it is possible that this includes postribosomal modification of acidic proline-rich salivary proteins. An attempt was made to determine whether homogenates of human salivary glands could cleave salivary Protein C labeled by reductive methylation. While the homogenate did cause hydrolysis of the salivary proteins, a conversion to salivary Protein A could not be observed. This does not exclude the possibility of such a specific cleavage during the synthesis or secretion in the human salivary gland, since the homogenate may contain proteolytic enzymes to which the proteins would not be exposed in vivo. If such a conversion occurs in the human gland, it is not clear why only part of salivary Protein C would be specifically cleaved.
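The fragment sizes implied by the cleavage positions reported in this paper can be tabulated with a short Python sketch. The chain length (150) and the cleaved bonds (74 to 75, 91 to 92, 106 to 107, 139 to 140) are taken from the text; the helper name `fragments` is hypothetical, and the tryptic case assumes, for illustration only, that every susceptible bond is cleaved to completion.

```python
PROTEIN_C_LENGTH = 150  # residues in salivary Protein C


def fragments(chain_length, cleaved_after):
    """Return (first, last) residue numbers of each fragment.

    `cleaved_after` lists residue numbers N for cleavage of the N to N+1 bond.
    """
    sites = sorted(set(cleaved_after))
    if any(site < 1 or site >= chain_length for site in sites):
        raise ValueError("cleavage site outside the chain")
    bounds = [0] + sites + [chain_length]
    return [(start + 1, end) for start, end in zip(bounds, bounds[1:])]


def lengths(frags):
    """Number of residues in each fragment."""
    return [last - first + 1 for first, last in frags]


# Single cleavage between residues 106 and 107 (mouse submaxillary protease).
protease_fragments = fragments(PROTEIN_C_LENGTH, [106])
print(protease_fragments)           # [(1, 106), (107, 150)]
print(lengths(protease_fragments))  # [106, 44]

# Tryptic cleavages described in the Discussion, assumed complete.
tryptic_fragments = fragments(PROTEIN_C_LENGTH, [74, 91, 106, 139])
print(lengths(tryptic_fragments))   # [74, 17, 15, 33, 11]
```

The first case gives the 106-residue salivary Protein A and the 44-residue COOH-terminal peptide; the second is a bookkeeping aid only, since the lysine-proline and arginine-proline bonds are reported to be hydrolyzed slowly.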