Structure, chromosomal assignment, and expression of the gene for proteinase-3. The Wegener's granulomatosis autoantigen.

Proteinase-3 (PR-3) is a neutral serine proteinase present in the azurophil granules of human polymorphonuclear leukocytes. It degrades a variety of extracellular matrix proteins including elastin in vitro and causes emphysema when administered by tracheal insufflation to hamsters. It is identical to the target autoantigen (c-ANCA) associated with Wegener's granulomatosis and to myeloblastin, a serine proteinase first identified in HL-60 leukemia cells. In this study, the gene encoding PR-3 was cloned and sequenced. The gene spans approximately 6.5 kilobase pairs and consists of five exons and four introns. The genomic organization of PR-3 is similar to that of the other serine proteinases expressed in hemopoietic cells. Each residue of the catalytic triad of PR-3 is located on a separate exon, and the positions of the residues within the exons are similar to those in human leukocyte elastase and cathepsin G. The phase and placement of the introns in the PR-3 gene are also similar to those in human leukocyte elastase and cathepsin G. The 400-base pair (bp) 5'-flanking sequence of the PR-3 gene contains a TATA box at position 379. There is no CAAT box promoter element. The 3'-untranslated region is 200 bp, extending from a TGA stop codon to the site of polyadenylation 10 bp after the canonical AATAAA signal. Amplification of PR-3 from a human/hamster hybrid cell line localizes the gene to human chromosome 19. Evidence from Northern analysis suggests that PR-3 expression is primarily confined to the promyelocytic/myelocytic stage of bone marrow development.

of the azurophil granules, present in amounts comparable to those of human leukocyte elastase (Gabay et al., 1989; Ohlsson et al., 1990). PR-3 degrades elastin; at a mildly acid pH, it is more active against insoluble elastin than human leukocyte elastase (Kao et al., 1988). PR-3 degrades a variety of other extracellular matrix proteins, including fibronectin, type IV collagen, and laminin, but not the interstitial collagens type I and III (Rao et al., 1991). This broad proteolytic activity of PR-3 suggests potentially important physiological functions, such as fostering PMNL migration through basement membranes (Henson and Johnston, 1987) or digesting phagocytosed microbes (Janoff, 1985). PR-3 may contribute to the pathogenesis of disease since it is a potent inducer of emphysema in hamsters (Kao et al., 1988).
In addition to the potential roles inferred from its ability to degrade extracellular matrix molecules, PR-3 may participate in other physiological or pathological events. PR-3, via a nonproteolytic mechanism, has potent antimicrobial activity against both bacteria and fungi (Campanelli et al., 1990a; Gabay et al., 1989). PR-3 is identical to myeloblastin, which is down-regulated during induced differentiation of HL-60 promyelocytic/myelocytic leukemia cells. Blocking myeloblastin expression with an antisense oligodeoxynucleotide inhibits proliferation and promotes differentiation of HL-60 cells to monocytes (Bories et al., 1989), implying that this serine proteinase plays a key role in myelopoiesis. PR-3 is also identical to the target autoantigen (c-ANCA) associated with Wegener's granulomatosis, a systemic necrotizing vasculitis (Niles et al., 1989; Jennette et al., 1990; Rao et al., 1991). The presence of autoantibodies against PR-3 is diagnostic for the disease; however, the role of PR-3 in Wegener's granulomatosis is not known (Ewert et al., 1991).
Given its potential importance in human health and disease, we sought to define the structure and function of the PR-3 gene. In this study, we describe the complete sequence of the PR-3 gene and its chromosomal assignment. To examine the presumed genetic locus of its cell-specific expression, we determined the 5'-flanking sequence of the PR-3 gene and compared it to those of human leukocyte elastase, cathepsin G, and myeloperoxidase. Finally, Northern blot hybridization was performed to determine the stage of myelocytic development in which PR-3 is synthesized.

MATERIALS AND METHODS
Isolation of PR-3 Genomic Cosmid-A human genomic library was constructed in a pWE15 (Stratagene) cosmid that had been modified by insertion of an M13 polylinker into the BamHI site. Human genomic DNA from peripheral blood lymphocytes was partially digested with Sau3AI and fractionated in a 10-45% sucrose gradient (Seed et al., 1982). The Sau3AI sites were partially filled in and then ligated to the partially filled-in XhoI site of the modified pWE15 cosmid in the presence of 10% polyethylene glycol 8000. The cosmid was packaged with Gigapack Gold (Stratagene) and transformed into bacterial strain 490A. The average size of the inserts was 40 kb. The library (5 × 10^6 clones) was screened with a 32P-labeled 278-bp fragment from the 5'-region of the PR-3 cDNA. The probe was generated via PCR amplification using specific primers corresponding to bases +79 to +101 of the cDNA (5'-AGATCGTGGGCGGGCACGACGCG) and the complementary bases +329 to +346 (5'-TTGTTTGACTTGCTGTAA). mRNA derived from HL-60 cells was used as the template. Prehybridization and hybridization were performed in 5× SSC, 5× Denhardt's solution, 0.5% SDS, 50% formamide, and 200 µg/ml denatured herring sperm DNA at 42 °C. Prehybridization was for 2 h, and hybridization for 15-18 h. A low stringency wash was carried out in 2× SSC, 0.1% SDS at room temperature for 20 min, followed by three washes at a higher stringency (0.1× SSC, 0.1% SDS at 65 °C for 20 min). Fourteen colonies exhibiting a positive signal were selected and rescreened using a second probe generated by PCR to correspond to a 3'-portion of the cDNA, the primers being 5'-CATAACATTTGCACTTTC (bases +553 to +570) and 5'-TTGGATGATGCCATCACAGAT (bases +622 to +642). One of the 14 clones was positive after the second screen. This clone was grown and purified using two CsCl gradients.
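The primer sequences and cDNA coordinates quoted for the 5' screening probe can be sanity-checked with a few lines of Python. The sketch below is illustrative only: the sequences and coordinates come from the text, whereas the function names are hypothetical and the script is not part of the original protocol.

```python
# Illustrative check of the primer pair used to generate the 5' screening probe.
# Sequences and coordinates are taken from the text; helper names are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


forward = "AGATCGTGGGCGGGCACGACGCG"  # cDNA bases +79 to +101
reverse = "TTGTTTGACTTGCTGTAA"       # complementary to cDNA bases +329 to +346

# Each primer should be as long as the coordinate range it is said to cover.
print(len(forward), span_length(79, 101))
print(len(reverse), span_length(329, 346))

# Sense-strand sequence to which the reverse primer anneals.
print(reverse_complement(reverse))
```

Both primer lengths agree with their stated coordinate ranges, which is a quick way to catch transcription errors in a primer list.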
DNA Sequencing-The cosmid clone containing the PR-3 gene was digested with a panel of restriction enzymes, and the DNA was analyzed by Southern blot hybridization with the 5'- and 3'-probes used in the library screening. Two strongly positive bands were selected for subcloning into Bluescript KS+ (Stratagene). Both bands resulted from digestion of the cosmid with the enzymes NotI and BglII; the sizes were ~3 and 8 kb. The two fragments were electroeluted from a preparative gel and ligated into the NotI and BamHI sites of Bluescript. The subclones were sequenced using synthetic oligonucleotides to M13 and to five AT-rich areas representing different regions of the PR-3 cDNA. The primers were made in the forward and reverse directions.
Sequencing was performed on double-stranded DNA using the Sequenase reagent kit (U. S. Biochemical Corp.) and 35S-dCTP. The sequencing strategy is presented in Fig. 1 and described under "Results." Areas recalcitrant to sequencing with this method were crossed using single-stranded DNA as the template. This generally resulted in longer stretches of readable sequence and clarified areas of compression. Single-stranded DNA was generated from Bluescript KS+ and KS- using the helper phage VCSM13. The final sequence was determined from both strands, and areas of compression were sequenced several times to ensure a correct read.
Chromosomal Assignment of Human PR-3-Purified DNA of a human/hamster somatic hybrid cell mapping panel was obtained from the National Institute of General Medical Sciences (Coriell Institute, Camden, NJ). The human chromosome designation and amount were cytogenetically analyzed for the panel of DNA samples. PCR was used to assign the chromosome containing the PR-3 gene. All 18 DNA samples plus a sample of total human genomic DNA for a positive control and samples of mouse and hamster DNAs for negative controls were included in the screen for the PR-3 gene. Each reaction contained 100 ng of hybrid cell genomic DNA and 25 pmol of a forward and reverse primer corresponding to bases 2848-2868 and 2978-2998 of the PR-3 gene. Amplification was carried out using Gene-Amp reagents (Perkin-Elmer Cetus Instruments) on a Tempcycler (Coy Laboratory Products). The reactions were cycled by denaturing at 94 °C for 1 min, annealing at 48 °C for 1 min, and extending at 72 °C for 2 min. After 30 cycles, aliquots of the reactions were fractionated by electrophoresis on a 1.0% agarose gel.
Tissue Expression-Conventional Northern blot analysis was performed with a probe generated by PCR with the primers used in localizing the gene. The probe is 150 bp and encodes the 5'-end of the mature PR-3 protein from ATCGTGGGCGGG (IVGG) to CTGCGGGACAT (LRDI) of exon 2 (Fig. 2). Total RNA was prepared by the method of Chomczynski and Sacchi (1987) from bone marrow, alveolar macrophages, monocytes, lymphocytes, and lymphokine-activated killer cells in addition to the leukemia cell lines HL-60, U937, and PLB-985 (Tucker et al., 1987), and 10 µg of total RNA was loaded per lane. RNA was transferred to Hybond-N (Amersham Corp.) by capillary action, and hybridization was performed at 42 °C in 50% formamide, 10% dextran sulfate, 5× SSC, 1% SDS, and 1× Denhardt's solution containing 100 µg/ml denatured sonicated salmon sperm DNA. High stringency washes were performed at 65 °C in 0.1× SSC, 0.1% SDS.
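The primer coordinates and the probe end sequences quoted above can be cross-checked computationally. The following is a minimal sketch, assuming the standard genetic code and inclusive gene coordinates; the helper names are hypothetical. It computes the span between the outer primer coordinates and translates the two nucleotide stretches given for the ends of the probe.

```python
# Standard genetic code built in TCAG order (a common compact encoding).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons from the first base; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def amplicon_length(forward_start: int, reverse_end: int) -> int:
    """Inclusive span from the first base of the forward primer to the last base of the reverse primer."""
    return reverse_end - forward_start + 1


print(amplicon_length(2848, 2998))  # span covered by the mapping primers
print(translate("ATCGTGGGCGGG"))    # start of the mature protein
print(translate("CTGCGGGACAT"))     # last codon is incomplete in the quoted stretch
```

The computed span of 151 bases is consistent with the reported product of about 150 bp. The second stretch ends in a partial codon (AT), so only the first three residues of LRDI are translated here.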

RESULTS
Genomic Organization of PR-3-The complete PR-3 gene resided within a single cosmid clone isolated from the pWE15 library after duplicate screening with 5'- and 3'-PR-3 cDNA probes. The sequencing strategy is shown in Fig. 1. Briefly, the 5'-end of the gene and 412 bp of 5'-untranslated DNA were sequenced from overlapping ApuI, SmI, SmaI, and EcoRI subclones. The remainder of the gene including the polyadenylation signal resided in the two BglII/NotI clones. These two fragments were overlapped by employing PCR to generate the joining piece of DNA using the original cosmid clone as a template. The size of the gene from the initiation codon to the polyadenylation site is 6570 bp, making PR-3 the largest of the three serine proteinase genes whose proteins are stored in the azurophil granule of PMNL (Takahashi et al., 1988b; Farley et al., 1989; Hohn et al., 1989).
The PR-3 gene consists of five exons and four introns (Fig. 2). Exon 1 is short, consisting of 80 bp. It contains an 18-20-bp untranslated region from the putative cap site (Labbaye et al., 1991) to the ATG codon. The remaining 60 bp encode the hydrophobic region of the signal peptide, but not the signal peptidase cleavage site or the 2-amino acid propeptide. These are encoded from the 5'-end of exon 2. The presence of a small exon 1 encoding a signal peptide is a common feature within the trypsin-like serine proteinases, including cathepsin G (Hohn et al., 1989), human leukocyte elastase (Takahashi et al., 1988b; Farley et al., 1989), murine adipsin (Min and Spiegelman, 1986), and rat mast cell protease (Benfey et al., 1987). Exon 2 is 165 bp and contains, 15 bp from the 3'-end, the sequence for the catalytic histidine. Exon 3 is 142 bp; 18 bp from the 3'-end, it contains the nucleotide sequence encoding the catalytic aspartic acid residue of the protein. Exon 4 is the largest exon at 230 bp and does not contain any sequence encoding an active-site residue of the enzyme. The final amino acid of the catalytic triad is encoded 7 bp from the 5'-end of exon 5, which is 162 bp long. The catalytic triad of His57, Asp102, and Ser195 (chymotrypsin numbering) is conserved among all enzymatically active serine proteinases (Neurath, 1984).
The five-exon and four-intron organization of PR-3 corresponds to the organization found in the other PMNL azurophil granule serine proteinases, cathepsin G (Hohn et al., 1989) and human leukocyte elastase (Takahashi et al., 1988b; Farley et al., 1989). The four introns of PR-3 are 2360, 259, 1883, and 1159 bp. Although the cDNAs of the three genes are of similar size (1.3 kb), the PR-3 gene is larger than the genes for cathepsin G and human leukocyte elastase due to the considerably larger size of introns I and III. The three genes do not have introns of similar size, but the positions of the introns in the coding sequence show a high degree of conservation.
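The exon and intron lengths reported above can be tallied to show how much of the gene is intronic and how strongly introns I and III dominate. This is a simple arithmetic sketch using only the figures quoted in the text; the variable names are hypothetical. The totals are not expected to reproduce the 6570-bp figure exactly, because that value is measured from the initiation codon to the polyadenylation site and the quoted exon lengths may not cover precisely the same span.

```python
# Exon and intron lengths (bp) as quoted in the text.
exons_bp = {1: 80, 2: 165, 3: 142, 4: 230, 5: 162}
introns_bp = {"I": 2360, "II": 259, "III": 1883, "IV": 1159}

exon_total = sum(exons_bp.values())
intron_total = sum(introns_bp.values())

# Fraction of the summed length that is intronic.
intron_fraction = intron_total / (exon_total + intron_total)

# Share of intronic sequence contributed by the two largest introns.
large_intron_share = (introns_bp["I"] + introns_bp["III"]) / intron_total

print(exon_total, intron_total)
print(round(intron_fraction, 2), round(large_intron_share, 2))
```

On these figures, introns I and III together account for roughly three-quarters of all intronic sequence, in line with the statement that they explain the larger size of the PR-3 gene.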
The four introns of PR-3 have donor and acceptor sites (Table I) that conform to the consensus rules (Breathnach and Chambon, 1981) and contain a putative branch-point consensus sequence (Reed and Maniatis, 1985) within 18-40 bp of the 3'-end of each intron. The intron splice phases for the four PR-3 introns are type I for intron I, type II for intron II, and type 0 for introns III and IV (Table I). The phasing of the four introns is identical to that reported for other serine proteinases expressed in hemopoietic cells (Takahashi et al., 1988b; Farley et al., 1989; Hohn et al., 1989; Min and Spiegelman, 1986; Caputo et al., 1990; Haddad et al., 1990a; Caughey et al., 1991). PR-3 introns contain repetitive sequences. There are several stretches of poly(A) or poly(T) sequences and several short sequences (9-12-mers) that repeat once or twice. In addition, there are two larger areas of unusual sequence; intron I contains an area (positions 790-1026) of repeating GAAT and TGAGT, and intron IV contains a region (positions 6300-6420) of GA repeats.
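The splice-phase convention used above (type 0, I, or II) can be made concrete: an intron's phase is the number of coding bases, modulo 3, that precede it, so a phase 0 intron lies between codons, a phase I intron after the first base of a codon, and a phase II intron after the second. The sketch below is a generic illustration. The coding lengths in the example are hypothetical values chosen to give the phase pattern I, II, 0, 0; they are not the measured PR-3 exon lengths.

```python
from itertools import accumulate


def intron_phases(coding_lengths):
    """Return the phase (0, 1, or 2) of the intron that follows each exon except the last.

    coding_lengths: coding bases contributed by each exon, in 5' to 3' order.
    """
    cumulative = list(accumulate(coding_lengths))
    return [total % 3 for total in cumulative[:-1]]


# Hypothetical coding lengths, chosen only to illustrate the I, II, 0, 0 pattern.
toy_coding_lengths = [61, 166, 142, 231, 165]
print(intron_phases(toy_coding_lengths))
```

Because phase depends only on cumulative coding length, two genes can share an identical phase pattern while their introns differ greatly in size.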
Sequences near the 5'-end of the PR-3 gene contain a TATA box at position 379. There is no CAAT box promoter element. The 5'-flanking sequence of 412 bp does not contain any of the following response elements: AP-1 (Lee et al., 1987), AP-2, cAMP (Comb et al., 1986), serum-response element (Treisman, 1985), SP-1 (Briggs et al., 1986), octamer-binding factors (Parslow et al., 1984), glucocorticoid-response element (Karin et al., 1984), or the retinoic acid receptor-response element (Umesono et al., 1991). The octamer ACCCACAT occurs within 220 bp 5' of the initiation site of human leukocyte elastase and cathepsin G genes (Hohn et al., 1989), but it is not present in the PR-3 5'-flanking sequence. Hohn et al. (1989) reported a second conserved motif in the 5'-promoter region of human leukocyte elastase and cathepsin G (CCCACCC) that is present in the PR-3 gene at position 62. The 3'-untranslated region is 200 bp, extending from a TGA stop codon to the site of polyadenylation, which occurs 10 bp after the canonical AATAAA signal (Fitzgerald and Shenk, 1981).
Chromosomal Assignment-The chromosomal assignment of PR-3 utilized a panel of human/hamster somatic hybrid DNA samples as templates for the amplification of a fragment of PR-3 using PCR. All human chromosomes are represented in the panel. The samples of DNA showing the presence of an amplified product of the correct size of 150 bp are shown in Fig. 3a. The product was purified, subcloned, sequenced, and confirmed as PR-3 (data not shown). A PR-3 fragment was present in lanes 1-4, 9, and 12 of the panel. Lanes 19 and 20 represent two samples of human fetal genomic DNA as positive controls. No band was detectable in lanes 21 and 22, which contained mouse and hamster DNAs, respectively. The cell lines yielding the PR-3 fragment have in common chromosomes 6, 8, 14, 17, and 19.
It was possible to localize PR-3 to chromosome 19 and to exclude chromosomes 6, 8, 14, and 17 by examining the remaining 13 samples that did not result in a PR-3 fragment and by inference do not contain a suitable template. We also repeated the screen using a second set of primers that amplified a larger fragment of the gene (from bases 3343 to 5430) that includes the sequence of intron III. These results also localized PR-3 to chromosome 19 (data not shown). A PR-3-specific fragment was also amplified from a human/hamster hybrid (GM10449) that contains only human chromosome 19 (Fig. 3b). This experiment confirmed the presence of PR-3 on chromosome 19.
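The exclusion logic described above amounts to a set operation: the gene must lie on a chromosome present in every PCR-positive hybrid and absent from every PCR-negative hybrid. The sketch below illustrates this with a small, entirely hypothetical panel. The hybrid names and chromosome contents are invented for the example and do not reproduce the actual mapping panel.

```python
def candidate_chromosomes(panel, positive_lines):
    """Chromosomes shared by all positive hybrids and absent from all negative hybrids.

    panel: dict mapping hybrid name -> set of human chromosomes it retains.
    positive_lines: names of hybrids that yielded the PCR product.
    """
    positives = [chroms for name, chroms in panel.items() if name in positive_lines]
    negatives = [chroms for name, chroms in panel.items() if name not in positive_lines]
    if not positives:
        return set()
    shared = set.intersection(*positives)
    excluded = set().union(*negatives)
    return shared - excluded


# Hypothetical panel: the two positive lines share 6, 8, 14, 17, and 19;
# the negative lines between them carry 6, 8, 14, and 17 but not 19.
toy_panel = {
    "hybrid_A": {3, 6, 8, 14, 17, 19},
    "hybrid_B": {5, 6, 8, 14, 17, 19},
    "hybrid_C": {2, 6, 8},
    "hybrid_D": {11, 14, 17},
}
print(candidate_chromosomes(toy_panel, {"hybrid_A", "hybrid_B"}))
```

In practice such an analysis assumes that the cytogenetic description of each hybrid is accurate, which is why confirmation in a hybrid carrying a single human chromosome is a useful independent check.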
Tissue Expression-To define the pattern of expression of PR-3 in hemopoietic cells, total RNA from these cells was hybridized with a unique 32P-labeled 5'-probe prepared from PR-3 cDNA (Fig. 4). A signal of the appropriate size (~1.3 kb) occurred in the HL-60 and PLB-985 cell lines as well as in bone marrow-derived RNA. No PR-3 transcripts were detected in U937 cells (ATCC CRL1593), lymphocytes, lymphokine-activated killer cells, monocytes, PMNL, or alveolar macrophages. All samples contained similar amounts of hybridizable β-actin mRNA (data not shown). Subsequent analysis showed PR-3 transcripts in U937 cells that were obtained from the laboratory of Dr. R. Senior, Washington University, St. Louis. This disparity in transcripts of U937 cells depends on the supplier and has been previously observed.2 The pattern of expression of PR-3 shown in Fig. 4 suggests that PR-3 gene activity is confined to the early phases of myelocytic differentiation.

DISCUSSION
Gene Structure-The PR-3 gene contains five exons and four introns, which is similar to several other mammalian serine proteinases. It is, however, larger than the genes of other PMNL serine proteinases (Farley et al., 1989; Hohn et al., 1989). The increase in size is mainly due to introns I and III being considerably larger in PR-3 than in the other genes. The four introns of PR-3 (2360, 259, 1883, and 1159 bp) are defined by consensus donor and acceptor sites, and the exons encode a precursor enzyme with signal, proN, and proC peptides.
2 T. J. Ley, personal communication.

TABLE I Splice junctions of the PR-3 gene
The consensus sequence for splice donor and acceptor sites is shown on the top line (Breathnach and Chambon, 1981). Pyr, pyrimidine. The residues surrounding each of the four intron junctions of PR-3 are indicated in upper-case letters. Note that the splice phase for intron I is I, for intron II is II, and for introns III and IV is 0.

FIG. 3. Assignment of human PR-3 gene to chromosome 19.
a, DNA from 18 human/hamster hybrid cell lines was screened for the presence of PR-3 using gene-specific primers to amplify a 150-bp fragment by PCR. The amplified products from the 18 hybrid cell lines plus a positive control containing the whole human genome and two negative controls containing mouse and hamster genomic DNAs were analyzed by agarose electrophoresis. An amplified product of the correct size was seen in lanes 1-4, 9, and 12 and in lanes 19 and 20, which contain human genomic DNA as a positive control. No product was seen in lanes 21 and 22, which contain mouse and hamster DNAs. The only chromosome shared by all the lines that contain a PR-3 product and that is absent from the other samples is chromosome 19. b, amplification of a PR-3 fragment from the GM10449 human/hamster hybrid, which contains only human chromosome 19.

FIG. 4. PR-3 expression in hemopoietic cells.
Shown is a Northern analysis of total cellular RNA (10 µg/lane). The autoradiograph was exposed for 2 days. The probe used was a 32P-labeled PR-3-specific fragment generated from the 5'-end of PR-3 cDNA using PCR. Positive transcripts of the correct size (1.3 kb) were found in the first (PLB-985), third (HL-60), and eighth (human bone marrow) lanes.

The PR-3 and myeloblastin cDNAs have been recently reported (Bories et al., 1989; Campanelli et al., 1990b). There are four amino acid discrepancies between the myeloblastin cDNA (Bories et al., 1989) and the reported PR-3 cDNA (Campanelli et al., 1990b). In three cases, our genomic sequence agrees with the myeloblastin sequence. This suggests that amino acid 92 is isoleucine, amino acid 108 is threonine, and amino acid 109 is serine. The fourth amino acid discrepancy occurs close to the stop codon. There the genomic and PR-3 cDNA sequences read lysine, glycine, and proline, whereas the myeloblastin cDNA sequence implies the placement of an arginine between the glycine and proline (Bories et al., 1989). This is an area of considerable compression, and although we read the sequence from both strands as well as from single-stranded DNA, the confirmation of the presence or absence of the arginine awaits the amino acid sequence at the COOH-terminal end.
and myeloblastin cDNA sequences are silent. The codon for serine 219 is TCC in the genomic and PR-3 cDNA sequences; in myeloblastin, it is reported as TCT. PR-3 has a 200-bp untranslated tail, and the genomic sequence is identical to the PR-3 cDNA sequence (Campanelli et al., 1990b) with the exception of a 2-bp discrepancy at position 6884; the genomic sequence reads CC, whereas the reported cDNA sequence reads T. PR-3 cDNA sequenced in our laboratory reads CC, which is identical to the genomic sequence.
Analysis of PR-3 introns reveals repetitive sequences. As is common to many genes (Sun et al., 1984), the introns of PR-3 possess simple sequence repeats of poly(dA) and poly(dT). These are thought to occur through retroposition of poly(A)-rich regions of RNA molecules (Deininger and Daniels, 1986).
In addition to these simple repeats, the PR-3 gene contains two large tracts of unusual sequence. Intron I contains an area (positions 790-1026) consisting of two repeating motifs: GAAT and TGAGT. Intron IV includes a region of 12 GA repeats and a region characterized by a single G residue followed by 2 or 3 A residues (positions 6300-6420). Studies by Boehm et al. (1989) suggest that these purine-pyrimidine tracts may cause chromosomal instability and perhaps predisposition to genetic rearrangements and deletions similar to those associated with leukemias (Solomon, 1991). There are no other large anomalous stretches in the PR-3 gene, and there are no multiple repeats of shorter elements, but there are several short stretches that repeat two or three times. In intron I, the sequence TGAATGAATGAGTGAAT repeats three times, and the sequences TCCTGGGTTCAAGCGATTCTCCT and GCTGGGATTACA repeat twice. Intron II contains several short repeats: CGCGCCTCT repeats twice; TGACCT repeats twice, as does TGAGCT; and TGAATGAAT repeats three times.
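Tandem tracts of the kind described above (for example, runs of GA or GAAT units) can be located with a regular expression. The sketch below is one simple way to do this and is not the method used in this study; the function name and the toy sequence are hypothetical, and the toy sequence is not taken from the PR-3 gene.

```python
import re


def tandem_runs(seq, unit, min_copies=2):
    """Find uninterrupted tandem runs of `unit` in `seq`.

    Returns a list of (0-based start, copy number) for runs with at least min_copies copies.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [(m.start(), len(m.group()) // len(unit)) for m in pattern.finditer(seq)]


# Toy sequence: 12 GA repeats followed, after a spacer, by 3 GAAT repeats.
toy = "CCT" + "GA" * 12 + "TTC" + "GAAT" * 3 + "C"

print(tandem_runs(toy, "GA", min_copies=3))
print(tandem_runs(toy, "GAAT", min_copies=2))
```

This approach detects only perfect, uninterrupted repeats; tracts that mix two motifs, such as the GAAT/TGAGT region of intron I, would need a more permissive pattern.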
Evolution of PMNL Azurophil Granule Serine Proteinases-Serine proteinases can be classified according to codon usage at the active-site serine, which can be encoded by TCN or AGY. The choice of codon results in two classes with a distinct ancestral lineage (Brenner, 1988). Serine proteinases of the trypsin family have an active-site serine encoded by TCN, whereas those of the thrombin family have an active-site serine encoded by AGY. The catalytic serine of PR-3 is coded by TCN, making it a member of the trypsin family.
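The codon-based classification described above can be stated as a small rule: a serine codon beginning with TC belongs to the TCN class, and AGT or AGC belongs to the AGY class (Y = pyrimidine). The following sketch encodes that rule together with the family assignment given in the text; the function and dictionary names are hypothetical.

```python
# Family assignment as described in the text (after Brenner, 1988).
FAMILY_BY_CODON_CLASS = {"TCN": "trypsin family", "AGY": "thrombin family"}


def serine_codon_class(codon: str) -> str:
    """Classify a serine codon as 'TCN' or 'AGY'; raise ValueError for anything else."""
    codon = codon.upper()
    if len(codon) != 3 or set(codon) - set("ACGT"):
        raise ValueError("expected a 3-base DNA codon, got %r" % codon)
    if codon.startswith("TC"):
        return "TCN"
    if codon in ("AGT", "AGC"):
        return "AGY"
    raise ValueError("%s does not encode serine" % codon)


for codon in ("TCC", "TCT", "AGC"):
    cls = serine_codon_class(codon)
    print(codon, cls, FAMILY_BY_CODON_CLASS[cls])
```

The two codon classes cannot be interconverted by a single base change without passing through a non-serine codon, which is the basis for treating them as markers of separate lineages.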
An alternative classification of mammalian serine proteinases is based on intron position. This subdivides the serine proteinases into five classes (Irwin et al., 1988). In this classification, PR-3 is in group 2 along with trypsin, chymotrypsin, elastase, kallikrein, the α- and γ-subunits of nerve growth factor, tissue plasminogen activator, and factor XII (Irwin et al., 1988). The criterion for membership in group 2 is the exon organization around the catalytic triad. In group 2, each amino acid of the catalytic triad is coded by sequences in separate exons. Moreover, the relative positions of the codons are similar: the histidine and aspartic acid residues fall close to the 3'-end of exons 2 and 3, respectively, and the serine residue occurs close to the 5'-end of exon 5.
Recently, Jenne et al. (1991) subdivided group 2 and created a sixth class of serine proteinases by using the phase and position of the introns as the criteria for the division. In this system, PR-3 would be a member of this sixth class because of conservation of intron phase and distribution between PR-3 and the following hemopoietic serine proteinases: human lymphocyte granzymes B and H (Haddad et al., 1990a, 1990b); murine granzymes B, C, and F (Lobe et al., 1988; Jenne et al., 1991); cathepsin G (Hohn et al., 1989); human leukocyte elastase (Takahashi et al., 1988b; Farley et al., 1989); rat mast cell-associated chymases I and II (Benfey et al., 1987); adipsin (Min and Spiegelman, 1986); and human chymase (Caughey et al., 1991). Thus, despite similarities at the catalytic serine, the gene structures of the hemopoietic serine proteinases, including PR-3, have diverged from those of other trypsin-like enzymes.
Comparison of PR-3 with PMNL Azurophil Granule Constituents-Comparing each intron/exon junction of the PR-3, human leukocyte elastase, and cathepsin G genes shows that the intron placement within the coding sequence is highly conserved. In all three genes, the first intron occurs between the first and second G of a glycine codon, and it is the glycine prior to the beginning of the mature protein that is interrupted. The placement of intron III is also conserved and begins between codons for glutamine and leucine. Intron IV falls between a phenylalanine and a glycine in PR-3 and human leukocyte elastase. Thus, it is likely that the genes for PR-3, human leukocyte elastase, and cathepsin G have recently shared a common progenitor.
Since PR-3 shows the greatest homology to human leukocyte elastase at the cDNA level, we searched PR-3 for the three repetitive elements identified within the human leukocyte elastase gene (Farley et al., 1989), but none were found. A comparison of the two genes for other homologous sequences revealed no large stretches. The longest homologous stretch is 30 bp (TCCTGGGTTCAAGCGATTCTCCTGCCTCAG), which occurs twice in intron I in both genes. The next largest areas of homology are of 15-20 bp. Altogether, 15 homologous stretches occur, all within introns I, III, and IV. In the 5'-flanking sequence, the TATAA promoter element also lies within a 14-bp area of homology, the sequence GGCTATAAGAGGAG occurring in both PR-3 and human leukocyte elastase. There are no other conserved regions within the 5'-flanking sequence.
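A search for identical stretches shared by two genes, such as the 30-bp sequence noted above, can be sketched as a comparison of k-mer sets. This is a minimal illustration and not the procedure used in this study. The function name is hypothetical, and the two toy sequences simply embed the reported 30-bp stretch in arbitrary flanking bases.

```python
def shared_stretches(seq_a, seq_b, k):
    """Return the sorted list of distinct k-base stretches present in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return sorted(kmers_a & kmers_b)


# The 30-bp stretch reported in the text, placed in arbitrary (invented) flanks.
stretch = "TCCTGGGTTCAAGCGATTCTCCTGCCTCAG"
toy_a = "AAAA" + stretch + "CCCC"
toy_b = "GGTT" + stretch + "TTGG"

print(len(stretch))
print(shared_stretches(toy_a, toy_b, 30))
```

Exact k-mer matching finds only perfect identity on the same strand; detecting diverged or reverse-complemented homology would require an alignment-based method.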
Results of the Northern analysis suggest that PR-3 expression is confined to the promyelocytic/myelocytic stage of bone marrow development (Fig. 4). Similarly, expression of human leukocyte elastase is limited to this developmental stage (Takahashi et al., 1988a; Fouret et al., 1989), and cathepsin G transcripts are highest during the promonocytic and promyelocytic stages of differentiation (Hanson et al., 1990a). Expression of the myeloperoxidase gene, another component of the azurophil granule, is also confined to the beginning stages of myeloid differentiation (Fouret et al., 1989). Since transcripts of azurophil granule proteins are tightly linked to a specific stage of myeloid differentiation, analysis of the 5'-flanking regions of PR-3, human leukocyte elastase, cathepsin G, and myeloperoxidase may reveal areas of homology that are involved in the coordinate expression of the four genes. The 14-bp region surrounding the promoter TATAA box, conserved between PR-3 and human leukocyte elastase, does not occur in the promoter region of cathepsin G or myeloperoxidase. None of the well-characterized enhancer elements are present in the 400 bp of 5'-flanking sequence of PR-3. The glucocorticoid-response element is the only transcriptional control element that is present in human leukocyte elastase, cathepsin G, and myeloperoxidase, but it is not present in PR-3. To date, it is not known whether glucocorticoids can affect the expression of these genes. It is known that the expression of human leukocyte elastase and cathepsin G can be modulated by retinoic acid and phorbol esters (Takahashi et al., 1988a; Hanson et al., 1990b), but a retinoic acid-response element (Umesono et al., 1991) has not been detected in the 5'-flanking sequence of either gene, and a phorbol ester-response element was detected 5' only to the human leukocyte elastase gene. Clearly, the regulation of these genes is more complex than noting the presence or absence of a specific regulatory element.
Takahashi et al. (1988b) reported a 19-base pair pyrimidine-rich sequence in the 5'-flanking region of human leukocyte elastase and myeloperoxidase. This sequence was not found in the promoters for PR-3 or cathepsin G. Given the current state of knowledge, any conserved 5'-promoter sequence could act as a regulatory control. One potential sequence is CCCACCC in human leukocyte elastase and cathepsin G (Hohn et al., 1989), which also occurs in PR-3 (base +62) and in the promoter sequence of myeloperoxidase at base -47 (Johnson et al., 1989). Thus, although PMNL serine proteinase expression occurs primarily during the promyelocytic stage of myelopoiesis and the expression of individual proteinases shows coordinate cell and developmental specificity, the mechanisms directing their regulation are unknown.
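Scanning flanking sequence for short candidate motifs such as CCCACCC or the TATAA element is a plain substring search. The sketch below reports every occurrence, including overlapping ones. It is an illustration only: the function name is hypothetical and the toy sequence is invented, not taken from any of the four promoters.

```python
def find_motif(seq, motif):
    """Return the 0-based start positions of every occurrence of motif, overlaps included."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# Invented sequence containing two overlapping CCCACCC sites and one TATAA element.
toy_flank = "GG" + "CCCACCC" + "ACCC" + "TTATAAGG"

print(find_motif(toy_flank, "CCCACCC"))
print(find_motif(toy_flank, "TATAA"))
print(find_motif(toy_flank, "ACCCACAT"))
```

A motif this short is expected to occur by chance in long sequences, so the presence of a match is only a starting point for functional testing.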
After synthesis, PR-3 is transported to azurophil granules, where it is stored. PR-3 has a classic NH2-terminal signal peptide that aids translocation of the protein into the lumen of the endoplasmic reticulum (Blobel and Dobberstein, 1975). The signal peptide comprises a 4-residue polar amino terminus, followed by a 16-amino acid hydrophobic core and a signal peptidase cleavage site (von Heijne, 1984). The signal peptides of PR-3, human leukocyte elastase, and cathepsin G are encoded in exon 1, and the intron/exon boundary occurs between the first and second codon base of the last glycine prior to the mature protein. In PR-3 and human leukocyte elastase, this is prior to the signal peptidase cleavage site, whereas in cathepsin G, the junction splits the propeptide glycine codon. Cleavage of PR-3 by signal peptidase yields a proenzyme with a 2-amino acid propeptide: alanine-glutamic acid. Such a short propeptide is typical of this group of enzymes. Human leukocyte elastase and cathepsin G propeptides are serine-glutamic acid and glycine-glutamic acid, respectively.
Localization of PR-3 to Chromosome 19-Using a human/hamster hybrid panel, we find that the PR-3 gene is on human chromosome 19. All the human chromosomes are represented in the human/hamster hybrid panel, and each chromosome is present in at least two hybrid lines. The data show that there is a single PR-3 gene in the human genome, and Southern blot hybridization supports this result (Campanelli et al., 1990b).3 No other trypsin-like serine proteinase has been shown to reside on chromosome 19. Until the sublocalization of PR-3 is resolved, the linkage of PR-3 to nearby genes cannot be deduced.
Bories et al. (1989) present data suggesting that PR-3 plays an active role in regulating proliferation and differentiation of HL-60 cells. By inhibiting PR-3 protein content, they differentiated HL-60 cells to monocytes, which indicates that a change in PR-3 expression is part of the cause of differentiation rather than the effect of differentiation. If PR-3 is intimately involved in myeloid proliferation and differentiation, it is conceivable that acute myelogenous leukemias reflect an abnormality in the PR-3 gene, which contains purine-pyrimidine tracts that lead to chromosomal instability (Boehm et al., 1989). A simple scenario would be a genetic rearrangement/translocation/deletion of the PR-3 gene in these leukemias. In chronic myelogenous leukemia, 18% of patients who convert to blast crisis and acute myelogenous leukemia have trisomy of chromosome 19 (Ohyashiki et al., 1990). A translocation at p13 of chromosome 19 has also been observed in patients with acute myelogenous leukemia (Mitani et al., 1989).
In summary, we have defined the genomic structure of the human serine proteinase PR-3, which may help regulate myeloid cell differentiation and may contribute to the development of emphysema and Wegener's granulomatosis. The organization of the PR-3 gene is similar to that of the genes for the other myeloid serine proteinases, human leukocyte elastase and cathepsin G. Our results indicate the presence of a single PR-3 gene in the human genome on chromosome 19. PR-3 expression appears to be tightly controlled and confined to the promyelocytic/myelocytic stage of hemopoiesis.