Identification and Molecular Cloning of a Novel Mouse Mucosal Mast Cell Serine Protease

A novel 28,000 Mr serine protease, designated mouse mast cell protease-2 (MMCP-2), that is stored in the secretory granules of Kirsten sarcoma virus-immortalized mouse mast cells (KiSV-MC) has been identified, and its NH2-terminal amino acid sequence has been determined.
Analysis of a 953-base pair cDNA that encodes MMCP-2 revealed that this serine protease is a basically charged protein, possessing the histidine-aspartic acid-serine charge relay system that is characteristic of other serine proteases. DNA blot analysis using the full-length MMCP-2 cDNA indicated the existence of a family of highly related serine protease genes in the mouse genome.
When the same DNA blot was probed with the 149-base pair KpnI→3' fragment of the cDNA, the probe hybridized to a single DNA fragment, thereby demonstrating that this 3' fragment could be used as a gene-specific probe. The presence of high levels of the MMCP-2 mRNA transcript in the intestines of nematode-infected mice, and its absence in mouse bone marrow-derived mast cells and peritoneal cavity-derived connective tissue mast cells, suggest that this member of the mouse mast cell protease family is preferentially expressed late in the differentiation of mucosal mast cells.
In both rats and mice, at least two subclasses of mast cells have been identified based on differences in tissue distribution, histochemical staining properties, and T cell-factor dependence (reviewed in Ref. 1). The connective tissue mast cell (CTMC) is the major type present in the peritoneal cavity, whereas the T cell-dependent mucosal mast cell is the major type present in the intestinal mucosa of nematode-infected animals. In the rat, both types of mast cells contain large amounts of basically charged serine proteases (2-7), which are stored in active form bound to acidically charged proteoglycans within the secretory granules (2). Rat CTMC contain a 28,000 Mr serine protease (2, 6), designated rat mast cell protease I (RMCP-I), whereas mucosal mast cells contain a distinct 25,000 Mr serine protease, designated RMCP-II (3-5). Both enzymes resemble pancreatic chymotrypsin in that they are serine endopeptidases that cleave on the carboxyl side of hydrophobic amino acids with a neutral to basic pH optimum (8). RMCP-I and RMCP-II have substantial amino acid sequence homology with one another (3, 6, 9), but they are distinct gene products that can be distinguished from one another with polyclonal antibodies. A cDNA that encodes RMCP-II (9) has been isolated from a cDNA library prepared from the mucosal mast cell-like (10) rat basophilic leukemia-1 tumor cell line (11). In the mouse, a 25,000 Mr chymotryptic serine protease has recently been purified from the intestines of animals infected with Trichinella spiralis (12), and its amino acid sequence has been determined (13). This mouse mucosal mast cell protease, designated MMCP (13), is not detected in mouse peritoneal CTMC by protein blot analysis (14).
Recently, we described the immortalization of mouse mast cells by co-culture of mouse splenocytes with fibroblasts that were producing the Ki-ras-containing Kirsten sarcoma virus (15). We now report the identification of a novel 28,000 Mr mouse mast cell protease (MMCP-2) that is a major protein constituent of the secretory granules of Kirsten sarcoma virus-immortalized mast cells (KiSV-MC). A cDNA that encodes this mouse mast cell serine protease has been isolated and characterized.
As assessed by RNA blot analysis, the gene that encodes MMCP-2 is expressed in the proximal small intestines of mice that have a local mastocytosis elicited by infection with Nippostrongylus brasiliensis. The MMCP-2 gene is not expressed in mouse peritoneal CTMC or in the relatively immature interleukin 3-dependent bone marrow-derived mast cell (BMMC).
Because MMCP-2 is distinct from MMCP (13), mouse mucosal mast cells appear to possess at least 2 serine proteases that are not present in CTMC. Since the full-length cDNA for MMCP-2, but not its gene-specific portion, hybridized to an mRNA species from mouse CTMC, it is apparent that mast cells can express a family of secretory granule serine proteases with selective distribution of some members to a particular mast cell subclass.

[...] peritoneal lavage and were purified to >97% as previously described (16-18).
After the gel was stained with Coomassie Blue and sectioned, the radioactivity was found in the areas of the gel that corresponded to the 28,000 and 32,000 Mr proteins but not to the 36,000 Mr protein (not shown). Partially purified granules from KiSV-MC5 were subjected to SDS-PAGE and transblotted to a PVDF membrane. When the resolved 28,000 Mr protein (designated MMCP-2) was sequenced, the first 23 amino acids at the NH2 terminus were unambiguously determined (Fig. 1A).
Isolation and Characterization of a cDNA That Encodes MMCP-2-Because the NH2-terminal amino acid sequence of MMCP-2 was 65% homologous to the NH2-terminal sequence of RMCP-II (see below), a blot of total RNA from KiSV-MC1 was probed under conditions of low stringency with the rat RMCP-II cDNA (9). Hybridization to a 1.0-kb species of RNA was detected (data not shown), and therefore the mouse KiSV-MC1 cDNA library was screened under conditions of low stringency with the rat probe to isolate two mouse mast cell cDNAs (cDNA-2 and cDNA-4). Upon EcoRI digestion of phage DNA containing cDNA-2, the insert was recovered from the λgt11 vector as two fragments of ~400 and ~600 bp in length. Identically sized fragments were seen upon EcoRI digestion of cDNA-4. The restriction enzyme map and the sequencing strategies of cDNA-2 and cDNA-4 are shown in Fig. 1B. The lengths of cDNA-2 and cDNA-4 were 953 and 956 bp, respectively.
Upon rescreening of the cDNA library under high stringency conditions with cDNA-2, a third cDNA (cDNA-14) that had a poly(A) tail was cloned and partially sequenced (Fig. 1B). The consensus nucleotide sequence of the three cDNAs and the deduced amino acid sequence of MMCP-2 are shown in Fig. 1C. The initiation site for translation is defined by an ATG codon (nucleotide residues 17-19). Based on the TAG stop codon (nucleotide residues 749-751) in the same reading frame, 16 and 226 untranslated base pairs were located 5' and 3', respectively, of the open reading frame that encodes this mouse mast cell protease. The translated form of MMCP-2 has an Mr of 26,700. It consists of 244 amino acids and contains a typical hydrophobic signal peptide of 18 amino acids. The mature serine protease (as determined by direct amino acid sequencing of the isolated protein) lacks the 2 glutamic acid residues carboxyl-terminal to the predicted cleavage site of the signal peptide, indicating that the enzyme is probably translated in a "pre-pro" form. The Mr of the mature form of the protein is 24,700, and it consists of 224 amino acids. A potential N-linked glycosylation site is present at amino acid residue 24. Based on the number of basic (Arg + Lys = 27) and acidic (Asp + Glu = 20) amino acids, the mature form of the mouse mast cell enzyme is basic in charge at pH 7.4. A comparison of the deduced amino acid sequence of cDNA-2 with the amino acid sequences of other rat and mouse mast cell secretory granule proteases is shown in Fig. 2. The deduced amino acid sequence of the mature form of the mouse mast cell protease MMCP-2 was found to be 65% homologous to MMCP, 63% homologous to RMCP-I, and 60% homologous to RMCP-II. As in the case of other serine proteases, MMCP-2 possesses the charge relay system of histidine, aspartic acid, and serine at amino acid residues 45, 89, and 182, respectively.
The mature protein contains 6 cysteine residues, and thus MMCP-2 can have no more than three intrachain disulfide bonds.
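The coordinates and residue counts reported above can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text; the variable names are illustrative, and the "net charge" is merely a count of basic minus acidic residues (histidine and the terminal groups are ignored), not a true charge calculation.

```python
# Cross-check of figures reported for the MMCP-2 cDNA and protein.
# All numbers are taken from the text; variable names are illustrative only.

ATG_START = 17     # first nucleotide of the initiation codon (residues 17-19)
STOP_START = 749   # first nucleotide of the TAG stop codon (residues 749-751)

# Nucleotides upstream of the ATG form the 5' untranslated region.
five_prime_utr = ATG_START - 1

# Coding nucleotides run from the ATG up to, but not including, the stop codon.
coding_nt = STOP_START - ATG_START
codons = coding_nt // 3

# Crude charge estimate for the mature enzyme: basic minus acidic residues.
BASIC = 27    # Arg + Lys
ACIDIC = 20   # Asp + Glu
net_basic_excess = BASIC - ACIDIC

# Each disulfide bond consumes two cysteines, which bounds the bond count.
CYSTEINES = 6
max_disulfides = CYSTEINES // 2

print(f"5' untranslated nucleotides: {five_prime_utr}")
print(f"coding nucleotides: {coding_nt} ({codons} codons)")
print(f"excess of basic over acidic residues: {net_basic_excess}")
print(f"maximum intrachain disulfide bonds: {max_disulfides}")
```

The codon count agrees with the 244 amino acids reported for the translated form, and the excess of basic residues is consistent with the statement that the mature enzyme is basic at pH 7.4.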
DNA and RNA Blot Analyses-Samples of mouse genomic liver DNA were digested with a variety of restriction endonucleases, and the DNA fragments were separated in agarose gels and blotted onto Zetabind. When the DNA blots were probed with the full-length cDNA-2 under conditions of high stringency, multiple DNA fragments were detected regardless of which restriction enzyme was used (Fig. 3A). When the same DNA blot was reprobed with the KpnI→3' fragment of cDNA-2 (Fig. 3B), hybridization to a single DNA fragment was seen in each lane, indicating that this 3' fragment could be used as a gene-specific probe.
Blots containing total RNA from KiSV-MC5, from the proximal small intestines of N. brasiliensis-infected mice (where mucosal mast cells are increased markedly) (20, 21), and from the proximal small intestines of uninfected mice were probed with the gene-specific KpnI→3' fragment of cDNA-2. As shown in the representative experiment in Fig. 4A, the 1.0-kb mRNA transcript for MMCP-2 was detected in the intestine of the nematode-infected mouse but not in the intestine of the uninfected control mouse. Similar results were obtained for three other pairs of N. brasiliensis-infected and sham-infected mice. RNA blot analysis of total RNA probed with the gene-specific KpnI→3' fragment of cDNA-2 under conditions of high stringency revealed that the gene for MMCP-2 was not expressed in mouse peritoneal CTMC, BMMC, WEHI-3 cells, 3T3 fibroblasts, or KiSV-infected 3T3 fibroblasts (data not shown). Nevertheless, when reprobed with the full-length cDNA-2, a 1.0-kb transcript was detected in RNA from CTMC (Fig. 4B). This latter finding indicates that mouse CTMC express a distinct but homologous serine protease transcript.
No hybridization was seen when a blot containing 40 µg of total RNA from proteose peptone-elicited inflammatory cells (obtained from the peritoneal cavities of mice that were infected with S. mansoni) was probed with the gene-specific KpnI→3' fragment of cDNA-2 (data not shown).
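The reasoning behind the two-probe comparison can be laid out as a small lookup. The Python sketch below records the outcomes stated in the text for several of the samples and applies the same interpretation used above: a signal with the gene-specific probe indicates MMCP-2 itself, whereas a signal with the full-length probe alone indicates a related transcript. The sample labels, the `interpret` function, and its wording are illustrative assumptions, and `None` marks results that the text does not report.

```python
# Hybridization outcomes as (gene_specific_probe, full_length_probe).
# True = transcript detected, False = not detected, None = not reported.
# Sample labels are shorthand chosen for this sketch.
RESULTS = {
    "KiSV-MC5": (True, None),
    "intestine, N. brasiliensis-infected": (True, None),
    "intestine, uninfected": (False, None),
    "peritoneal CTMC": (False, True),
    "BMMC": (False, None),
    "3T3 fibroblasts": (False, None),
    "KiSV-infected 3T3 fibroblasts": (False, None),
    "peritoneal exudate cells (S. mansoni)": (False, None),
}


def interpret(gene_specific, full_length):
    """Return a cautious reading of one sample's two probe results."""
    if gene_specific:
        return "MMCP-2 transcript detected"
    if full_length:
        return "related but distinct transcript detected"
    if full_length is None:
        return "MMCP-2 not detected; related transcripts not assessed"
    return "no related transcript detected"


for sample, (gene_specific, full_length) in RESULTS.items():
    print(f"{sample}: {interpret(gene_specific, full_length)}")
```

Only the CTMC sample falls into the "related but distinct" category, which is the basis for the conclusion that CTMC express a homologous serine protease transcript.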

DISCUSSION
A 28,000 Mr DFP-binding protein was found to be a major constituent of the secretory granules of KiSV-MC. NH2-terminal amino acid analysis (Fig. 1A) revealed this protein to be novel, and thus we have used the designation MMCP-2 to distinguish it from the only other mouse mast cell serine protease, MMCP, for which an amino acid sequence is known (13, 14). Because a cDNA that encodes RMCP-II (9) [...]
Fig. 1C legend (continued): The 23-amino acid portion of the deduced amino acid sequence that is boxed corresponds to the amino acid sequence obtained directly from the protein. The two arrows indicate the putative sites for cleavage of the signal peptide (37) and the pro-peptide. Stop represents the stop codon, and asterisks (***) underline the potential glycosylation site. The three charge-relay amino acids are circled. The polyadenylation signal nucleotide sequence is underlined. The numbers on the right and on the left indicate the amino acid and nucleotide positions in the respective sequences. The most 3' nucleotide before the poly(A) tail is displayed in parentheses because it was present in cDNA-2 and cDNA-4, but not in cDNA-14.
[...] which contained a poly(A) tail, was isolated when the library was screened using cDNA-2 as the probe. The deduced amino acid sequence of these three cDNAs (Fig. 1C) revealed that MMCP-2 contained the same histidine-aspartic acid-serine charge relay system present in the active site of all serine proteases. Although no stop codon was found in the 5' untranslated region of either cDNA, it was concluded that translation begins at the first ATG codon because it is in the same position as the translation-initiation codon of RMCP-II and the subsequent nucleotide sequence encodes a hydrophobic signal peptide with an amino acid sequence identical to that of RMCP-II (Figs. 1C and 2). The deduced amino acid sequence of the mouse protease predicts that the pre-pro form of the enzyme is 26,700 Mr and consists of 244 amino acids. Based on the -3, -1 rule for cleavage of signal peptides (37), the 18-amino acid signal peptide would be predicted to be removed between amino acid residues -2 and -3, resulting in a pro form of the enzyme that consists of 226 amino acids. Since the NH2-terminal sequence of the mature protein begins with an isoleucine (Fig. 1A), the 2 glutamic acids at -1 and -2 (Fig. 1C) are apparently removed from the NH2 terminus of the pro form of the enzyme during its subsequent post-translational processing. During the post-translational processing of bovine pancreatic chymotrypsinogen, the NH2-terminal isoleucine of the mature protease forms an ion pair with aspartic acid 194, which is adjacent to the "charge relay" serine 195, resulting in activation of the chymotrypsin (3). Since an aspartic acid residue (position 181) is also adjacent to the charge relay serine (position 182) in MMCP-2, it is likely that removal of the diglutamic acid pro-peptide activates this mast cell enzyme.
The mature form of the enzyme is predicted to have 224 amino acids and an Mr of 24,700. Since amino acid residue 24 of the mature enzyme is a potential N-linked glycosylation site, it is likely that the mature serine protease has a larger Mr because it contains an oligosaccharide.
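The proposed processing pathway and the size discrepancy it leaves can be followed step by step. The Python sketch below is a worked illustration using only the residue counts and Mr values given in the text; the function and variable names are hypothetical. The gap between the predicted and observed Mr is only consistent with glycosylation and does not demonstrate it, because SDS-PAGE estimates are approximate.

```python
def processing_forms(pre_pro_length, removals):
    """List each form of the enzyme and its length after successive removals.

    `removals` is a sequence of (resulting_form, residues_removed) pairs.
    """
    forms = [("pre-pro", pre_pro_length)]
    remaining = pre_pro_length
    for resulting_form, residues_removed in removals:
        remaining -= residues_removed
        forms.append((resulting_form, remaining))
    return forms


# Removal of the 18-residue signal peptide gives the pro form; removal of
# the two glutamic acids of the pro-peptide gives the mature enzyme.
REMOVALS = [("pro", 18), ("mature", 2)]
forms = processing_forms(244, REMOVALS)

PREDICTED_MATURE_MR = 24_700   # from the deduced amino acid sequence
OBSERVED_MR = 28_000           # granule protein resolved by SDS-PAGE
mr_gap = OBSERVED_MR - PREDICTED_MATURE_MR

for name, length in forms:
    print(f"{name} form: {length} amino acids")
print(f"observed minus predicted Mr: {mr_gap}")
```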
MMCP-2 has a serine residue at position 176 (Fig. 2), which is conserved in RMCP-I (Fig. 2), rat chymotrypsin B (position 189) (38), and bovine chymotrypsins A and B (position 189 in both) (39). This serine residue is located in the substrate-binding region of the rat and bovine enzymes and confers the preference for hydrophobic residues to be positioned at the site of substrate cleavage (6, 8, 40). In bovine trypsin (39) and rat trypsins I and II (41), the corresponding residue in the substrate-binding region is an aspartic acid, which confers the preference for basic residues to be positioned at the site of substrate cleavage. Thus, it is likely that the substrate preference for MMCP-2 is chymotryptic rather than tryptic. Based on its deduced amino acid sequence, MMCP-2 has only 3 intrachain disulfide bonds, and the positions of the cysteines are identical to those in MMCP, RMCP-I, and RMCP-II (Fig. 2). The presence of 3 intrachain disulfide bonds is also characteristic of other secretory granule serine proteases from hematopoietic cells such as mouse cytotoxic T lymphocyte granzymes (42-46) and human neutrophil cathepsin G (47), and thus differs from pancreatic and plasma serine proteases, which contain 4 such bonds (3, 38, 39).
In contrast, pancreatic serine proteases (39) such as rat chymotrypsin B (38), trypsin I (41), and trypsin II (41) have 15-, 8-, and 8-amino acid pro-peptides, respectively, which are cleaved from the mature protease at either a lysine or an arginine residue.
DNA blot analysis revealed that 3 to 4 mouse genomic DNA restriction enzyme fragments hybridized to the full-length cDNA-2 regardless of which nuclease was used to digest the genomic DNA (Fig. 3A). This finding indicates either that there are 3 to 4 genes in the mouse that encode this and homologous proteins or that there are a more limited number of genes which contain introns susceptible to all of these restriction enzymes. Because the KpnI→3' fragment of MMCP-2 cDNA recognized only one mouse genomic DNA fragment (Fig. 3B), this 3' fragment of the cDNA could be used as a specific probe for mRNA encoding MMCP-2. RNA blots probed under conditions of high stringency with the gene-specific portion of cDNA-2 indicated that the gene that encodes the 28,000 Mr serine protease was expressed in KiSV-MC5 and in the small intestines of mice infected with N. brasiliensis (Fig. 4A), where mucosal mast cells are prominent (20, 21). The gene was not expressed in the proximal small intestines of sham-infected mice, or in mouse peritoneal CTMC, BMMC, or WEHI-3 myelomonocytic cells (Fig. 4). The MMCP-2 transcript was also not detected in uninfected or KiSV-infected 3T3 fibroblasts, or in peritoneal exudate cells from S. mansoni-infected mice, which were predominantly eosinophils, lymphocytes, and macrophages (data not shown). Because hybridization was readily seen when the full-length cDNA-2 was used to probe a blot containing mouse CTMC RNA (Fig. 4B), it was concluded that mouse CTMC express a homologous but distinct serine protease.
Nakano and co-workers (48) have demonstrated, based on histochemical criteria, that BMMC can give rise to both CTMC and mucosal mast cells because both populations are reconstituted in mast cell-deficient (W/Wv) mice by the administration of interleukin 3-dependent BMMC. Because the gene-specific portion of cDNA-2 failed to hybridize to RNA from mouse BMMC or from CTMC (data not shown), the gene for MMCP-2 appears to be expressed relatively late and selectively in the differentiation of mast cell progenitors to the mucosal mast cell type.
The recent chemical and immunochemical characterization of a mouse mucosal mast cell protease (MMCP) (12, 14) that possesses an amino acid sequence distinct (13) from the cloned protease reported in this manuscript (Fig. 2) suggests that mouse mucosal mast cells contain at least 2 serine proteases with chymotrypsin-like substrate preferences. The additional finding of a homologous but distinct protease mRNA in CTMC, by hybridization with the full-length cDNA-2 but not its gene-specific portion, argues that mast cells express a family of related proteases, with some members selectively distributed to a particular mast cell subclass. Although subclass-related distribution of two proteases has been previously recognized in the rat (4, 7), the evidence in the mouse for a larger mast cell serine protease family of genes, and the lack of transcription of the MMCP-2 gene in the relatively immature interleukin 3-dependent BMMC, have implications for the differentiation, function, and heterogeneity of mast cells.