Identification of a human insulinoma cDNA encoding a novel mammalian protein structurally related to the yeast dibasic processing protease Kex2.

We have identified a human insulinoma cDNA (PC2) that encodes a protein homologous to the precursor processing Kex2 endoprotease of yeast by using a polymerase chain reaction to detect and amplify conserved sequences within the catalytic site. The 638-residue amino acid sequence of PC2 begins with a cleavable signal peptide, indicating that it enters the secretory pathway, and contains a 282-residue domain that is homologous to the catalytic modules of both Kex2 and the related bacterial subtilisins. Within this region 49 and 27% of the amino acids are identical to those in the aligned Kex2 and subtilisin BPN' sequences, respectively, and the catalytically essential Asp, His, and Ser residues are all conserved. Northern blot analysis revealed the presence of 2.8- and 5.0-kilobase hybridizing bands in mRNA from the insulinoma. The PC2 protein also shows great similarity to the incomplete NH2-terminal sequence of the human furin gene product, a putative membrane-inserted receptor-like molecule. We propose that PC2 is a member of a family of mammalian Kex2/subtilisin-like proteases that includes members involved in a number of specific proteolytic events within cells, including the processing of prohormones.

Proteolytic processing at dibasic amino acids represents an important step in the maturation of a large number of prohormones, neuropeptides, and other biologically active peptides and proteins (1,2). Despite the widespread occurrence of this mechanism in nature, little is known about the endoproteases involved in this process. Recently, two Ca2+-dependent proteolytic activities have been partially purified from an insulinoma and have been shown to cleave proinsulin appropriately at the B chain-C peptide and C peptide-A chain junctions (3). Designated types I and II, these activities specifically cleave on the carboxyl side of Arg-Arg and Lys-Arg residues, respectively.
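The cleavage rule described above (a cut on the carboxyl side of an Arg-Arg or Lys-Arg pair) can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and are not taken from the paper; the sketch only locates candidate dibasic sites and says nothing about which sites are actually used in vivo.

```python
import re

# Dibasic pairs named in the text, with the activity said to cut after each.
ACTIVITY = {"RR": "type I", "KR": "type II"}

# A lookahead is used so that overlapping pairs (e.g. "KRR") are all reported.
DIBASIC = re.compile(r"(?=(KR|RR))")

def dibasic_sites(protein):
    """Return (cut_position, pair) for each Lys-Arg or Arg-Arg pair.

    cut_position counts the residues on the amino-terminal side of the
    predicted cut, so protein[:cut_position] ends with the dibasic pair.
    """
    return [(m.start() + 2, m.group(1)) for m in DIBASIC.finditer(protein.upper())]

def cleave(protein):
    """Split a sequence after every candidate dibasic site."""
    cuts = [pos for pos, _ in dibasic_sites(protein)]
    bounds = [0] + cuts + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]

if __name__ == "__main__":
    toy = "AAKRGGRRTT"  # hypothetical sequence, for illustration only
    for pos, pair in dibasic_sites(toy):
        print(pos, pair, ACTIVITY[pair])
    print(cleave(toy))
```

The fragments retain the basic pair at their carboxyl terminus, since the sketch models only the endoproteolytic cut.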
A similar Ca2+-dependent proteolytic activity has also recently been described in liver and appears to be involved in the processing of proalbumin (4,5). The yeast Saccharomyces cerevisiae also requires endoproteolytic cleavage at dibasic amino acids in the processing of proteins involved in its life cycle. The α-mating factor peptide is translated in tandem copies which must be cleaved at Lys-Arg residues to be released (6,7). In addition, the yeast prokiller factor requires proteolytic processing at both Lys-Arg and Arg-Arg residues to be activated (8,9). In both of these cases the endoprotease involved has been mapped to the KEX2 locus (10). Biochemical characterization of Kex2 has revealed it to be a Ca2+-dependent serine protease (11, 12). Upon cloning (13), Kex2 was found to be a subtilisin-like protease since it contained an active site domain homologous to the bacterial subtilisins (12, 13). Furthermore, Kex2 has been shown to process proinsulin expressed in yeast (14) and also to process proopiomelanocortin when transfected into proopiomelanocortin-secreting mammalian cells (15). In order to explore the possible existence of processing enzymes related to the yeast Kex2 protease in pancreatic β cells, we have used polymerase chain reaction (PCR) in a strategy we have called amplification of homologous DNA fragments (16). We report here the identification and characterization of a cDNA from a human insulinoma which encodes a protein with a high degree of similarity to Kex2 that may represent a homologous mammalian converting protease.
All other methods were as described (18,19).

RESULTS AND DISCUSSION
Examination of a large number of proteolytic enzymes has shown that both the amino acid sequences surrounding the active site residues and the distance between the catalytic sites are highly conserved within any one family. In an effort to identify a mammalian gene homologous to the yeast KEX2 gene, we therefore designed two degenerate oligonucleotides based upon the consensus amino acid sequences surrounding the aspartate and histidine active site residues of Kex2 and the related bacterial subtilisins ("Experimental Procedures"). These oligonucleotides were used to prime PCR amplification of cDNA synthesized from human insulinoma total RNA. Analysis of the PCR products by polyacrylamide gel electrophoresis revealed a major band of 150 nucleotides (Fig. 1A). Since this was of a length consistent with the distance between the Asp and His catalytic residues encoded by the KEX2 gene, the DNA was subcloned for further analysis. Sequencing of the cloned DNA (designated pPCR1) revealed that one of the two potential open reading frames displayed extensive amino acid sequence similarity to the corresponding region of the KEX2 gene. In addition, Northern blot analysis indicated the presence of both 5- and 2.8-kb transcripts in human insulinoma mRNA (Fig. 1B).
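Two quantities underlie the primer strategy described above: the degeneracy of an oligonucleotide back-translated from a consensus peptide, and the product length expected from the spacing of the two primer sites. The following Python sketch illustrates both. The peptide "HGTH" and the residue numbers are hypothetical placeholders, not the primers used in this study, and the length estimate assumes that no extra nucleotides (such as restriction-site tails) are added to the primers.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def primer_degeneracy(peptide):
    """Count the distinct coding sequences that could encode the peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())

def expected_product_nt(first_residue, last_residue):
    """Length in nucleotides of a product spanning the two residues inclusively.

    first_residue and last_residue are the outermost residues covered by
    the forward and reverse primers, respectively.
    """
    if last_residue < first_residue:
        raise ValueError("last_residue must not precede first_residue")
    return 3 * (last_residue - first_residue + 1)

if __name__ == "__main__":
    print(primer_degeneracy("HGTH"))    # hypothetical consensus peptide
    print(expected_product_nt(1, 50))   # hypothetical residue numbers
```

Under these assumptions, a product of 150 nucleotides corresponds to 50 codons between the outer ends of the two primers.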
To isolate the corresponding full-length cDNA clone, pPCR1 was used as a probe to screen a human insulinoma library. Screening of 10^6 phage produced five positive signals, which were plaque-purified and subcloned. Sequence analysis of the longest insert, designated PC2, showed that it contained a single open reading frame of 1914 base pairs that was predicted to encode a 638-amino acid protein with an NH2-terminal signal peptide-like region (Fig. 2) [...] kDa which was processed to a slightly smaller size in the presence of dog pancreas microsomes (data not shown).
The most salient feature of the predicted amino acid sequence of PC2 was the presence of a domain homologous to the subtilisin serine protease family. As shown in Fig. 3, the amino acid sequences surrounding Asp167, His208, and Ser384 of PC2 are closely related to the catalytic sites of both the bacterial subtilisins and the subtilisin-like yeast proteases Kex2 and protease B. In addition, the distances between these residues are consistent with those observed in proteases of the subtilisin family (Fig. 3). A more thorough comparison of the similarities of PC2 with representative members of the subtilisin proteases indicated that strong homology existed throughout the active site domain. Computer analysis of the complete amino acid sequences of both PC2 and subtilisin BPN' generated only one region of extended overlap (not shown). This corresponded to the active site region in the subtilisin (Gly[...] to Asp[...]) and extended from Gly[...] to Asp[...] in PC2. Within this 246-amino acid segment 27% of the residues were identical while 49% represented conservative substitutions. The strongest and most extensive similarities, however, were evident when PC2 was compared with the catalytic domain of the yeast Kex2 protease. Alignment of these two sequences (Fig. 4) indicated that within a 282-amino acid overlap extending from Asn[...] to Leu[...] in PC2, and Asn[...] to Leu[...] in Kex2, 49% of the amino acids were identical. Moreover, an additional 35% of the residues in this region were similar. In all of these comparisons, the Asp, His, and Ser residues of the catalytic triads aligned exactly. Protease B is a vacuolar protease from yeast that is also related to the subtilisins (20). A comparison of PC2 to protease B, however, revealed little similarity outside of the amino acid sequences directly adjacent to the catalytic residues (Fig. 3). It thus appears that within the active site domain, PC2 is more closely related to Kex2 than Kex2 is to protease B.
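Percentages of identical and similar residues such as those quoted above are computed over the columns of a pairwise alignment. A minimal Python sketch follows. The grouping of amino acids treated as conservative substitutions and the exclusion of gapped columns from the denominator are assumptions made here for illustration; the scoring criteria actually used for the comparisons in this paper are not stated in this section, and the two short aligned strings are hypothetical.

```python
# Assumed conservative-substitution groups (illustrative only).
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")
GROUP = {aa: i for i, grp in enumerate(SIMILAR_GROUPS) for aa in grp}

def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identical, % similar but not identical) for two aligned sequences.

    Both inputs must have the same length, with '-' marking gaps.
    Columns containing a gap are skipped (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
        elif x in GROUP and GROUP[x] == GROUP.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared

if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Fig. 3 or Fig. 4.
    pct_id, pct_sim = identity_and_similarity("DHGTK-L", "DHGSRAL")
    print(round(pct_id, 1), round(pct_sim, 1))
```

Reporting identity and similarity separately mirrors the way the figures are given in the text, where the similar residues are counted in addition to the identical ones.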
A comparison of the overall domain structures of PC2 and Kex2 is displayed in Fig. 4. Both proteins contain a putative signal peptide followed by the subtilisin-like domain. Although this region contains the highest levels of homology between the two proteins (see above), Fig. 4 shows that PC2 shares amino acid sequence similarity with Kex2 throughout its sequence. Of the first 594 amino acids of PC2, 34% are identical and 41% similar to those of the aligned Kex2 sequence. By comparison, 40-50% of the amino acids are identical between related mammalian members of the trypsin family of serine proteases (21) [...] domain in Kex2 is thought to be involved in O-linked glycosylation of the protein (12). Both PC2 (Fig. 2) and Kex2 (13) possess consensus sequences for N-linked glycosylation. While analysis of PC2 indicates that it does not possess a transmembrane-spanning domain and therefore may not be associated with a membrane in vivo, it is of interest that Northern blot analysis of the human insulinoma RNA using either pPCR1 (Fig. 1B) [...]
A search of the NBRF-PIR protein sequence library revealed that PC2 was also related to the human furin gene product. Furin was identified based upon its proximity to the c-fes/fps proto-oncogene and is transcribed as a 4.5-kb mRNA (22). Cloning and sequencing of 3.1 kb of the furin mRNA revealed a cysteine-rich region with homology to the human insulin and growth factor receptors as well as a transmembrane domain resembling those of the class II major histocompatibility complex antigens. Both of these domains are contained within the COOH-terminal half of the protein (22). Since the complete furin mRNA has not been cloned, the nature of the amino-terminal region of this protein remains unknown. Interestingly, the similarity between furin and PC2 extends over the first 280 amino acids of the cloned furin fragment while spanning a 287-amino acid segment near the COOH terminus of PC2 (Asp310 to Ser596). Within this region, 48% of the amino acids are identical. Likewise, the same region of furin shows 36% sequence identity with the corresponding region of Kex2 (Gly315 to Ser597). Given these high levels of similarity and the fact that this overlap includes a subtilisin-like serine active site region within furin that aligns to the putative active site serine residues of both PC2 (Ser383) and Kex2 (Ser385), we infer that the uncharacterized amino-terminal portion of furin may also encode a subtilisin/Kex2-like catalytic domain.
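The segment lengths and residue ranges quoted for these comparisons can be cross-checked with simple inclusive-range arithmetic, as in the Python sketch below. The helper names are hypothetical. The end residue of the PC2 segment is read here as Ser596, which is the reading consistent with a 287-residue segment beginning at Asp310.

```python
def span_length(first_residue, last_residue):
    """Number of residues in an inclusive range of residue numbers."""
    if last_residue < first_residue:
        raise ValueError("last_residue must not precede first_residue")
    return last_residue - first_residue + 1

def residues_encoded(orf_bp):
    """Residues encoded by an open reading frame given without its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3

if __name__ == "__main__":
    print(span_length(310, 596))   # PC2 segment aligned with furin
    print(residues_encoded(1914))  # PC2 open reading frame
```

The second helper applies the same kind of check to the 1914-base pair open reading frame and the 638-residue protein predicted from it.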
The high degree of similarity between PC2 and Kex2 makes it tempting to ascribe a role for this mammalian gene in the endoproteolytic processing of prohormones. However, the functional specificity of PC2 must await biochemical characterization of the encoded protein; for example, the presence of Asp rather than Asn at position 310 might lower the catalytic efficiency of this putative protease (23). Further studies are under way to rule out a possible cloning artifact or point mutation occurring in the insulinoma DNA.