Structure and Expression of the Human Gene for the Matrix Metalloproteinase Matrilysin*

Matrilysin, a member of the matrix metalloproteinase family, is structurally different from the other matrix metalloproteinases by virtue of the absence of a conserved COOH-terminal protein domain. In addition, matrilysin mRNA is regulated in a specific and distinct manner in normal and malignant tissues. Analysis of the genomic structure of the human matrilysin gene revealed that the organization of the first five exons is highly conserved among the different members of the matrix metalloproteinase family, but that matrilysin contains an atypical sixth exon. The promoter region of the matrilysin gene has several features that are conserved among several other matrix metalloproteinase family members, including the presence of TATA, AP-1, and PEA3 elements. Comparison of the expression of the human matrilysin promoter with rat stromelysin promoter/chloramphenicol acetyltransferase constructs in HeLa cells revealed that constructs containing AP-1 and PEA3 elements respond similarly to epidermal growth factor and tumor promoter (12-O-tetradecanoylphorbol-13-acetate) induction, but that the addition of upstream stromelysin sequences results in an increased transcriptional activity not observed with upstream matrilysin sequences. The similarities and differences observed between the promoters of matrilysin and the other metalloproteinases may provide insights into the molecular mechanisms that regulate the expression of this family of enzymes as a whole and the factors that distinguish the expression patterns of individual family members.


* This work was supported by National Institutes of Health Grants CA46843 and DK39776 and by Syntex Research. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) L22519 to L22525.

To whom correspondence should be addressed. Tel.: 615-343-3413; Fax: 615-343-4539.

The metalloproteinases form a family of structurally related enzymes capable of degrading specific components of the extracellular matrix. Remodeling of the extracellular matrix is a critical event in normal processes such as tissue morphogenesis, differentiation, and wound healing, and the matrix metalloproteinases are believed to participate in these events. These enzymes are also involved in pathological conditions such as tumor invasion, metastasis, and arthritis, which reinforces the interest in understanding their regulation. Nine members of the matrix metalloproteinase family have been identified so far and can be divided into at least three subclasses by substrate specificity: the collagenases degrade fibrillar interstitial collagens; the gelatinases (Type IV collagenases) recognize basement membrane and denatured collagens; and the stromelysins hydrolyze proteoglycans and extracellular matrix glycoproteins (see Refs. 1 and 2 for review). In addition, a murine metalloelastase has been recently identified and may represent a new subclass of elastinolytic enzymes (3). Matrilysin (EC 3.4.24.23; also known as MMP-7, pump-1 (4), small uterine metalloproteinase (5), and matrin (6)) is a member of the stromelysin subclass of enzymes and is the smallest member of the matrix metalloproteinase family. In fact, activated matrilysin is distinct in that it contains only the catalytic domain with the zinc-binding region required for proteolytic activity, in contrast to the other members of the family, which comprise additional carboxyl-terminal domains (4). Matrilysin has a wide range of substrates, including proteoglycans, fibronectin, gelatin, elastin, and casein (5-7). The matrilysin protein was first identified as the small uterine metalloproteinase expressed in the postpartum rat uterus (5). The cDNA was isolated from human malignant tissues (4), and the corresponding mRNA has since been detected in human prostate (8), gastric (9), colon (9), breast (10), and rectal (6) adenocarcinomas. Matrilysin expression has also been detected in normal tissues, including the glomerular mesangium (11), normal kidney cells (12), and the glandular epithelium of the human endometrium during the proliferative and menstrual phases of the cycle (13).
The expression of matrilysin in tumors and the cycling human endometrium suggests that this metalloproteinase, like many other matrix metalloproteinase family members, is regulated by oncogenes, growth factors, and hormones (see Ref. 1 for review). Matrilysin mRNA levels are increased by treatment of glomerular mesangial cells with tumor necrosis factor and interleukin-1 (11), similar to the effect of interleukin-1 on stromelysin, stromelysin-2, and collagenase (14-18). In the human endometrium, however, the pattern of matrilysin mRNA expression is temporally similar to, but spatially distinct from, that of other metalloproteinases in that matrilysin expression is restricted to the glandular epithelium, whereas the other metalloproteinases are observed in the stromal component of the tissue (13).¹ Matrilysin mRNA was also observed in tumor cells in a series of human colon adenocarcinomas, whereas the expression of the other metalloproteinases, when detected, was restricted to surrounding normal stroma (9).² Therefore, there are both similarities and differences in matrilysin expression patterns as compared with other members of the matrix metalloproteinase family.

This report describes the structure of the human matrilysin gene and its promoter region. Matrilysin mRNA is stimulated by epidermal growth factor (EGF)³ and the tumor promoter TPA, and this response appears to be at least partially attributable to the conserved AP-1 and PEA3 elements identified in the promoters of matrilysin and several other metalloproteinase genes. Differences in EGF and TPA responsiveness were observed with the addition of upstream promoter sequences, suggesting that this gene may have distinct transcriptional elements in addition to common elements that determine the specific expression pattern of matrilysin as compared with other metalloproteinase family members.

MATERIALS AND METHODS
Isolation of Genomic Clones-The genomic libraries used for these studies were prepared from human placenta and cloned into λDash (a gift from Dr. Michael Dean, National Cancer Institute-Frederick Cancer Research Facility, Frederick, MD) and λFixII (Stratagene 946203). The libraries were screened with 32P-labeled human matrilysin cDNA probes (random-primed DNA labeling kit, Boehringer Mannheim) using the full-length cDNA for the λFixII library and a fragment containing nucleotides 1-790 for the λDash library. Ten positive clones were identified in the λDash library after stringent washing (50 °C; 0.2× SSC, 0.1% SDS). DNA from three selected clones (λ6, λ11, and λ12a) was purified and mapped by EcoRI digestion, followed by Southern hybridization with the matrilysin cDNA probe. The four EcoRI fragments containing exon sequences were subcloned into the EcoRI site of the plasmid pGEM-7Zf(+) (Promega Biotec) to give pGEMHP5.5, pGEMHP3.4, pGEMHP2.9, and pGEMHP2.0, in which the numbers indicate the size of the EcoRI fragment in kilobases. From the λFixII library, eight positive clones were obtained. All information on the matrilysin gene derived from this library was obtained from one phage clone, λ12.2.3, or from an 8.5-kbp BglII fragment from this phage that was subcloned into the BamHI site of pGEM-7Zf(+).
DNA Sequencing-DNA sequencing of the subclones derived from the λDash library was performed by the dideoxy chain termination method (53) using a Sequenase version 2.0 DNA sequencing kit (United States Biochemical Corp.) and [α-35S]dATP. Subclones obtained from the λFixII library were sequenced using Taq DNA polymerase according to the procedure recommended by the manufacturer (Taq Track, Promega Biotec). Either universal primers (T7 and Sp6) or specific oligonucleotide primers synthesized on an Applied Biosystems DNA synthesizer were used in the sequencing reactions.
To sequence the promoter region, a series of nested deletions spanning this region were generated in pGEMHP5.5 using exonuclease III (Erase-a-Base system, Promega Biotec). The DNA sequence was obtained from both strands of the promoter using either commercial or synthesized primers.
Exon/Intron Boundary Mapping-Exon/intron boundaries were identified by direct sequencing using specific primers corresponding to the human matrilysin cDNA sequence. The primers were selected by extrapolating the potential boundaries based on the genomic structure of the rat stromelysin gene (19). With the exception of the exon 1/intron 1 boundary, the sequences were confirmed in subcloned DNA derived from both genomic libraries. The sizes of the introns were determined by restriction enzyme digestion and Southern blot analysis of the four pGEM-7Zf(+) derivatives (pGEMHP5.5, pGEMHP3.4, pGEMHP2.9, and pGEMHP2.0). The size of intron 2 was confirmed by sequencing and that of introns 1, 4, and 5 by the polymerase chain reaction (PCR), in which exon primers were used to amplify the intron regions within phage and/or plasmid clones. The PCR amplification was performed using 0.5-400 ng of DNA template, dNTPs (200-320 µM each), and 1-2.5 units of Taq polymerase (Perkin-Elmer Cetus Instruments). Amplification cycles ranged from 35 to 40. PCR amplification products were separated on agarose gels and stained with ethidium bromide, and sizes were determined by comparison with a standard ladder.
Southern Blot Analysis-Genomic DNA was purified from human newborn foreskin tissue derived from four unrelated individuals. A 16-µg sample of genomic DNA was digested with EcoRI and electrophoresed on a 0.8% agarose gel. The DNA was denatured in 0.5 M NaOH and transferred to Zeta-Probe blotting membrane (Bio-Rad). The hybridization probe was prepared from the full-length gel-purified human matrilysin cDNA and was 32P-radiolabeled to a specific activity of ~1 × 10^9 cpm/µg by random priming.

³ The abbreviations used are: EGF, epidermal growth factor; TPA, 12-O-tetradecanoylphorbol-13-acetate; kbp, kilobase pair(s); bp, base pair(s); PCR, polymerase chain reaction; CAT, chloramphenicol acetyltransferase; TGF-β1, transforming growth factor-β1.
Primer Extension-Poly(A)+ RNA was isolated from MDA-MB-468 human breast carcinoma cells by standard methods (20) and was kindly provided by Dr. Bruce Ennis. The oligonucleotide primer (5'-ACACAGCACGGTGAGTCG-3', complementary to nucleotides +51 to +68 in the human matrilysin cDNA) was 5'-end-labeled with polynucleotide kinase and [γ-32P]ATP and purified on a 20% urea-acrylamide gel. The specific activity was 2.4 × 10^8 cpm/µg. Poly(A)+ RNA (16 µg) was primed and reverse-transcribed with 15 units of reverse transcriptase (Seikagaku America) for 45 min at 37 °C as described (20). The primer extension product was analyzed on a 6% sequencing gel, and the size of the product was determined by comparison with sequencing reactions that were simultaneously loaded.
Northern Blot Analysis-The WiDr cell line (obtained from a primary human rectosigmoid colon adenocarcinoma), the SW480 cell line (derived from a primary human grade III-IV colon carcinoma), and the SW620 cell line (derived from a metastatic nodule in the lymph node from the same patient as the SW480 cell line) were all obtained through the American Type Culture Collection. The cells were cultured in Dulbecco's modified Eagle's medium containing 10% Nu-Serum IV (Collaborative Research Inc., Bedford, MA) in a 95% air, 5% CO2 atmosphere. Cultures were serum-starved for 16 h, followed by the addition of EGF (50 ng/ml) or TPA (100 ng/ml) for 8 h. Poly(A)+ RNA was isolated, separated by formaldehyde gel electrophoresis, blotted onto nitrocellulose paper, and hybridized as previously described (9). The human matrilysin cDNA insert was radiolabeled by random priming using [α-32P]dCTP to a specific activity of ~1 × 10^9 cpm/µg. The matrilysin RNA bands on an appropriately exposed autoradiograph were scanned using a Pharmacia LKB Biotechnology Ultrascan XL densitometer, and the -fold stimulation was determined by dividing the area of the bands obtained following EGF or TPA treatment by those obtained in the control lane.
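The -fold stimulation calculation described for the densitometric scans can be sketched as a simple ratio. This is only an illustration of the arithmetic: the function name and the band areas below are hypothetical and are not values from this study.

```python
def fold_stimulation(treated_area: float, control_area: float) -> float:
    """Ratio of the band area after EGF or TPA treatment to the control band area."""
    if control_area <= 0:
        # A control lane with no detectable band gives an undefined ratio.
        raise ValueError("control band area must be positive")
    return treated_area / control_area


# Hypothetical densitometer areas in arbitrary units, for illustration only.
control = 90.0
egf_treated = 450.0
print(fold_stimulation(egf_treated, control))
```

The guard on the control area reflects that a cell line with no detectable basal message cannot be assigned a -fold value by this method.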
Promoter/Reporter Gene Constructs-The plasmid pA2207, which was generated by exonuclease III digestion of pGEMHP5.5 and contains the 5'-flanking region of the matrilysin promoter and 35 bp of transcribed sequence, was used to create the HP-CAT reporter plasmids.
Three fragments containing different lengths of the promoter were isolated by digestion of pA2207 with AccI (position -95), KpnI (position -295), and SnaBI (position -933) in conjunction with ApaI, which is located in the polylinker sequences. The three fragments were incubated with T4 DNA polymerase to generate blunt ends and inserted into the SmaI site of pGEM-7Zf(+) to create plasmids pGEM-95HPAS, pGEM-295HPAS, and pGEM-933HPAS, respectively. The XhoI/BamHI fragment from each of these plasmids was subcloned into XhoI/BglII-restricted pG7-754TRCAT (21), replacing the rat stromelysin promoter sequences with matrilysin promoter sequences so that the promoter was in the correct orientation relative to the CAT gene. The resulting plasmids were confirmed by sequencing and called p-95HPCAT, p-295HPCAT, and p-933HPCAT, respectively, in which the numbers indicate the most 5'-nucleotide in the human matrilysin (pump-1) promoter contained in the construct.
The plasmid containing 754 bp of the rat stromelysin/transin promoter and the CAT reporter gene, referred to as p-754TRCAT in this report, was previously described as pG7-754TRCAT (21). p-208TRCAT was prepared by PCR amplification of p-754TRCAT using a 5'-primer corresponding to sequences ending at position -208 and containing HindIII sequences (5'-TCAAGCTTGCAGGAAGCATTTCCT-3') and a 3'-primer spanning the BglII site located at the junction of the promoter and CAT sequences (5'-TACCAGATCTCCAGCT-3'). The amplified fragment was restricted with HindIII and BglII and subcloned into the HindIII/BglII fragment of p-754TRCAT that remained following the removal of the stromelysin promoter sequences. p-84TRCAT was created by blunting the ends of BstXI-digested p-754TRCAT, restricting with BamHI, and subcloning the 1735-bp fragment into p-754TRCAT that had been digested with SalI, blunted, and restricted with BamHI. The resulting plasmid contains 84 bases of the rat stromelysin promoter linked to the CAT coding sequences in the same plasmid background as that of p-754TRCAT and p-208TRCAT. All constructs were confirmed by DNA sequencing.
Transfection Analysis-HeLa cells were grown in Dulbecco's modified Eagle's medium supplemented with 10% calf serum and gentamicin (5 µg/ml). The cells were plated at 1.5 × 10^6 cells/10-cm culture dish the day before transfection, and the medium was changed immediately prior to transfection. The cells were transfected by the calcium phosphate coprecipitation method as previously described (21) with the test construct and 2 µg of the pCH110 plasmid (the lacZ gene under control of the SV40 early promoter (22)) to control for variations in transfection efficiencies. All plasmids were isolated by the alkaline lysis method and purified by double banding through a CsCl gradient (20).
Eight hours after the addition of the precipitates, cells were glycerol-shocked for 2 min (10% (v/v) glycerol) and refed with Dulbecco's modified Eagle's medium containing 10% calf serum. After 24 h, the cells were deprived of serum for 16 h and stimulated with EGF (50 ng/ml) or TPA (100 ng/ml). Cell lysates were prepared 8 h after the addition of growth factors and analyzed for β-galactosidase (23) and CAT (24) activities as previously described. The amount of cell lysate assayed for CAT activity was normalized for equivalent β-galactosidase activity. The percent acetylation of chloramphenicol was determined by thin-layer chromatography and scintillation counting. Each experiment was repeated four to eight times with two different plasmid preparations. Values were compared using the Student's t test.
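The quantities in the transfection analysis can be illustrated with a short sketch. It assumes the usual definition of percent acetylation (acetylated counts over total counts) and a simple proportional scaling of lysate volume to equalize β-galactosidase activity. The text does not spell out either formula, so both are assumptions, and all function names and numbers are hypothetical.

```python
def lysate_volume_for_equal_bgal(ref_volume_ul: float, ref_bgal: float,
                                 sample_bgal: float) -> float:
    """Volume of sample lysate carrying the same beta-galactosidase activity
    as ref_volume_ul of a reference lysate (activities per microliter)."""
    return ref_volume_ul * ref_bgal / sample_bgal


def percent_acetylation(acetylated_cpm: float, unacetylated_cpm: float) -> float:
    """Percent of chloramphenicol counts recovered in the acetylated spots."""
    total = acetylated_cpm + unacetylated_cpm
    return 100.0 * acetylated_cpm / total


def fold_induction(treated_pct: float, control_pct: float) -> float:
    """CAT activity after EGF or TPA treatment relative to the untreated control."""
    return treated_pct / control_pct


# Hypothetical scintillation counts, for illustration only.
control_pct = percent_acetylation(150.0, 850.0)   # 15% acetylation
treated_pct = percent_acetylation(300.0, 700.0)   # 30% acetylation
print(fold_induction(treated_pct, control_pct))
```

Normalizing the lysate input before the CAT assay, rather than dividing afterwards, keeps each reaction within a comparable range of substrate conversion.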

RESULTS
To characterize the human matrilysin gene, two different human placenta genomic libraries were screened using the human matrilysin cDNA as a probe. Following initial restriction enzyme analysis and hybridization with the cDNA probe, three clones from the λDash library (λ6, λ11, and λ12a) and one clone from the λFixII library (λ12.2.3) were selected for further characterization (Fig. 1B). To facilitate the sequencing of the exons and the determination of exon/intron boundaries, EcoRI fragments hybridizing with the matrilysin cDNA were subcloned into pGEM-7Zf(+) and named according to their fragment length as indicated in Fig. 1C. Exon/intron boundaries were determined by direct sequencing using specific primers corresponding to the human matrilysin cDNA sequence. The analysis demonstrated that the human matrilysin gene contains six exons distributed between four genomic EcoRI fragments (Fig. 1A). The size of the intervening introns was determined by direct sequencing, by restriction enzyme analysis and Southern blotting, or by PCR amplification using specific primers contained in different exons with determination of the product size by agarose gel electrophoresis. The exons and introns range in size from 129 to 299 bp and from 88 to 2600 bp, respectively (Table I). The exon/intron boundaries conform to the GT/AG rule for splice site junctions as described by Breathnach and Chambon (25). Based on this analysis, the size of the transcribed gene is ~9.65 kbp.
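The GT/AG rule referred to above states that an intron begins with the dinucleotide GT and ends with AG. A minimal check is sketched below. The function name and the example sequences are hypothetical placeholders, not matrilysin intron sequences.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """True if the intron starts with GT (5' splice donor) and ends with AG
    (3' splice acceptor), reading the sense strand 5' to 3'."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical intron sequences, for illustration only.
print(follows_gt_ag_rule("GTAAGTCTTTTCAG"))   # conforms to the rule
print(follows_gt_ag_rule("GCAAGTCTTTTCAG"))   # donor dinucleotide is GC, not GT
```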
The assignment of exons to the genomic EcoRI fragments was confirmed by Southern blot analysis of genomic DNA derived from four individuals and probed with the human matrilysin cDNA (Fig. 1D). Each sample contained a prominent 3.4-kbp band corresponding to the genomic EcoRI fragment containing exons 2-4. A 2.9-kbp fragment representing the EcoRI fragment containing exon 5 is clearly detectable, whereas bands at 5.5 and 3.25 kbp corresponding to fragments encompassing exons 1 and 6, respectively, weakly hybridize with the cDNA probe. These fragments are more clearly detected using shorter 5'- or 3'-specific cDNA probes (data not shown).
Fig. 2. The cDNA sequence reported by Muller et al. (4) was reproduced in this figure and named cDNA-1. The nucleotide changes observed in the human matrilysin genomic subclones derived from the λDash library as indicated in Fig. 1C are shown below the cDNA sequence and are labeled GENE-1. The base change resulting in a coding difference is marked. Nucleotide changes identified in the 3'-untranslated region of the subclone derived from λ12.2.3 (from the λFixII library) are identified as GENE-2. The cDNA-2 sequence is derived from a cDNA isolated from a human mesangial cell library (11). The start site of transcription is labeled nucleotide 1 in the GENE-1 sequence. The nucleotide numbering therefore differs from the published sequence.

Sequence analysis of both strands of the six exons in the genomic subclones from the λDash library revealed several differences from the partial human matrilysin (pump-1) cDNA reported by Muller et al. (4) (Fig. 2). In the coding region, the G residue at position 277 in the cDNA is an A residue in the genomic sequence, resulting in a change in the deduced amino acid sequence at codon 77 from an arginine in the cDNA to a histidine in GENE-1. A cDNA isolated from human mesangial cells (cDNA-2 (11)) and the sequence of this region in genomic clones isolated from the λFixII library (GENE-2) do not show this change. However, a cDNA isolated from a human colon cancer-derived cell line (WiDr) also shows the same G-to-A substitution.⁴ Since this is not a conservative amino acid change, it is not clear if this polymorphism has any effect on the conformation and/or the function of the matrilysin protein.
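The assignment of nucleotide 277 to codon 77 can be checked with a little arithmetic, sketched here under the assumption that the 5'-untranslated region is 47 nucleotides long (the distance between the transcription start site and the ATG given by the primer extension result in this section), so that the A of the ATG is nucleotide 48. The third base of the codon is not stated in the text; the T used below is a hypothetical placeholder, as are the function names.

```python
# Minimal codon lookup covering only the codons used in this example.
CODON_TABLE = {"CGT": "Arg", "CAT": "His"}


def locate_codon(transcript_pos: int, utr5_len: int) -> tuple[int, int]:
    """Map a 1-based transcript position to (codon number, base within codon),
    both 1-based, given the length of the 5'-untranslated region."""
    cds_index = transcript_pos - utr5_len - 1   # 0-based index within the coding sequence
    if cds_index < 0:
        raise ValueError("position lies in the 5'-untranslated region")
    return cds_index // 3 + 1, cds_index % 3 + 1


def substitute(codon: str, base_in_codon: int, new_base: str) -> str:
    """Return the codon with the given 1-based position replaced."""
    i = base_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


codon_number, base = locate_codon(277, utr5_len=47)
print(codon_number, base)                        # codon 77, second base

cdna_codon = "CGT"                               # hypothetical arginine codon
gene_codon = substitute(cdna_codon, base, "A")   # G-to-A change at the second base
print(CODON_TABLE[cdna_codon], "->", CODON_TABLE[gene_codon])
```

A G-to-A change at the second base converts any CGN arginine codon to CAN, which encodes histidine when the third base is T or C, consistent with the change described above.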
In addition, 13 nucleotide changes were observed in the 3'-untranslated region of the matrilysin gene between positions 933 and 1067 in GENE-1 as compared with cDNA-1 (Fig. 2). These changes were not seen in GENE-2, but another nucleotide change in the untranslated region was observed at position 1107. The additional bases on the 3'-end of the matrilysin cDNA derived from human mesangial cells (cDNA-2 (11) to the EcoRI site in the vector) were also observed in both GENE-1 and GENE-2, with a few exceptions. It is not clear if the difference in the lengths of the two cDNAs is a result of incomplete cloning of cDNA-1 or a consequence of alternative polyadenylation sites in the gene. The latter alternative is supported by the presence of two consensus polyadenylation sites in the genomic sequence (underlined in Fig. 2). The nucleotide sequence differences in the 3'-untranslated region of this gene in the various cDNAs and genomic clones that have been characterized suggest that there is little pressure to conserve these sequences.

⁴ M. Smith and M. Navre, personal communication.
The start site of transcription was determined by primer extension of mRNA isolated from a human breast cancer cell line expressing endogenous matrilysin. An oligonucleotide primer complementary to nucleotides 51-68 in the matrilysin sequence (Fig. 2) was radiolabeled and extended, and the product was analyzed by denaturing gel electrophoresis as described under "Materials and Methods" (Fig. 3). A major product was detected as well as a minor product corresponding to a fragment 1 nucleotide shorter in length, which was detected only with a very long exposure of the gel. The major product corresponded to a 68-base extended fragment as deduced from the sequencing reactions that were loaded onto the gel at the same time. This result localizes the start site 47 nucleotides from the ATG translation initiation codon and 20 nucleotides upstream from the 5'-end of the partial cDNA sequence that was previously published (4).
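The primer extension result reduces to simple coordinate arithmetic, sketched here. The primer's 5'-end pairs with nucleotide +68 of the mRNA, so a 68-base product places the mRNA 5'-end at +1. The function name is hypothetical, and the derived positions of the ATG and of the partial cDNA 5'-end are an interpretation of the distances quoted above.

```python
def transcription_start(primer_5prime_pos: int, product_length: int) -> int:
    """Position of the mRNA 5'-end, in the same numbering as primer_5prime_pos.

    The extended product runs from the mRNA 5'-end through the nucleotide
    paired with the primer's 5'-end, so it covers product_length positions.
    """
    return primer_5prime_pos - product_length + 1


start = transcription_start(primer_5prime_pos=68, product_length=68)
minor_start = transcription_start(primer_5prime_pos=68, product_length=67)
print(start, minor_start)          # major and minor start sites

atg_first_base = start + 47        # start site lies 47 nucleotides from the ATG
partial_cdna_5prime = start + 20   # start site lies 20 nucleotides upstream of the cDNA 5'-end
print(atg_first_base, partial_cdna_5prime)
```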
The analysis of the 5'-flanking region revealed a TATA box at positions -32 to -25, centered at position -30 (Fig. 4). This is within the preferred region relative to the transcription initiation site as defined by Bucher (26) and further supports the identification of the nucleotide indicated at position +1 as the cap site. The sequence CA at the transcription start site corresponds to the sequence found in the majority of eukaryotic promoters analyzed (26). The matrilysin promoter contains an AP-1 motif between positions -67 and -61 and two inverted PEA3 elements relative to the consensus sequence ((C/G)AGGAAG(T/C)) at positions -170 to -163 and positions -146 to -139 (Fig. 4). The AP-1 motif is known to confer responsiveness to a variety of oncogenes, growth factors, and tumor promoters and is recognized by a transcriptional complex composed of members of the c-fos and c-jun families (see Ref. 27 for review). The PEA3 motif, first recognized in the polyomavirus enhancer, is also an oncogene-, growth factor-, and phorbol ester-responsive element and can serve as a binding site for the products of the ets gene family (see Ref. 27 for review). The combination of these elements has been referred to as a tumor promoter- and oncogene-responsive unit and has been identified as a functional transcriptional element in several genes, including human interstitial collagenase (28). The positions of the TATA box and AP-1 site in the matrilysin promoter are very similar to those seen in human interstitial collagenase, stromelysin, and stromelysin-2 and rat stromelysin and stromelysin-2 (Fig. 5). The number, orientation, and positions of the PEA3 elements, however, vary between these different family members.
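A search for such elements on both strands can be sketched with regular expressions. The PEA3 pattern follows the consensus quoted above. The AP-1 pattern, TGA(C/G)TCA, is the commonly cited consensus and is an assumption here, since the text does not give the AP-1 sequence. The test sequence is a hypothetical toy string, not the matrilysin promoter, and the reported coordinates are 0-based string indices rather than promoter positions.

```python
import re

# Consensus patterns written 5' to 3'. PEA3 is from the text; AP-1 is assumed.
MOTIFS = {
    "AP-1": re.compile(r"TGA[CG]TCA"),
    "PEA3": re.compile(r"[CG]AGGAAG[TC]"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_elements(seq: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 0-based start on the given strand, strand) for each
    non-overlapping match; '-' marks an inverted element."""
    seq = seq.upper()
    n = len(seq)
    rc = reverse_complement(seq)
    hits = []
    for name, pattern in MOTIFS.items():
        for m in pattern.finditer(seq):
            hits.append((name, m.start(), "+"))
        for m in pattern.finditer(rc):
            # Convert the match position on the reverse complement back to
            # coordinates on the given strand.
            hits.append((name, n - m.end(), "-"))
    return sorted(hits, key=lambda h: (h[1], h[0], h[2]))


# Hypothetical toy sequence with one inverted PEA3 element and one AP-1 site.
toy = "GGACTTCCTGAAATTTGAGTCATT"
for hit in find_elements(toy):
    print(hit)
```

Because the AP-1 consensus is nearly palindromic, the same site is reported on both strands, whereas an inverted PEA3 element is found only on the minus strand.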
The matrilysin promoter also contains sequences at positions -475, -500, and -820 with a high homology to the TGF-β1 inhibitory element originally identified in the rat stromelysin promoter.

Three human colon carcinoma cell lines were examined for matrilysin mRNA expression following stimulation with EGF and the tumor promoter TPA. Two of the three cell lines tested constitutively expressed detectable levels of matrilysin mRNA (Fig. 6). EGF treatment stimulated matrilysin mRNA by ~5-fold in the WiDr cell line and by 2-fold in the SW620 line. TPA was more potent in both cell lines, stimulating matrilysin mRNA by ~9- and 6-fold in the WiDr and SW620 cell lines, respectively. In contrast, the SW480 line did not express matrilysin mRNA and could not be induced to express the message by treatment with either EGF or TPA. The stimulation of matrilysin mRNA by TPA and EGF is qualitatively similar to the response observed for the metalloproteinases stromelysin (15, 29, 30) and collagenase (31, 32). Studies have shown that the AP-1 site is necessary for TPA and growth factor induction of several metalloproteinase genes (14, 18, 29, 33-35). It is not always sufficient, however, for this activity. A combination of AP-1 and PEA3 elements is necessary for maximal TPA stimulation of human (28, 36) and rabbit (37) collagenases, and AP-1 and PEA3 elements cooperate in the murine urokinase plasminogen activator gene to confer responsiveness to TPA and EGF (38). To determine if these elements in the matrilysin promoter are functionally similar to those in other metalloproteinase genes, the transcriptional activities of several human matrilysin and rat stromelysin promoter/reporter gene constructs were compared. These studies were performed in HeLa cells, a human carcinoma-derived cell line in which the transcriptional complex has been well-characterized.
HeLa cells do not express detectable levels of endogenous matrilysin or stromelysin² and therefore present a system in which the expression of these metalloproteinases can be compared without bias. Promoter constructs containing the AP-1 site (p-95HPCAT and p-84TRCAT in human matrilysin and rat stromelysin, respectively), the AP-1 and PEA3 sites (p-295HPCAT and p-208TRCAT, respectively), and these sites plus additional upstream sequences (p-933HPCAT and p-754TRCAT, respectively) linked to the chloramphenicol acetyltransferase reporter gene were prepared (Fig. 7A). The constructs were transiently transfected by CaPO4 coprecipitation into HeLa cells in the presence of EGF or TPA, and CAT activity was assayed (Fig. 7, B and C). Matrilysin and stromelysin promoter constructs containing only the AP-1 site demonstrated no significant increase in CAT activity following EGF stimulation (115 and 95% of control values without EGF stimulation for p-95HPCAT and p-84TRCAT, respectively) (Fig. 7B). The addition of PEA3 elements to the minimal matrilysin or stromelysin promoter constructs resulted in a small increase in CAT activity in response to EGF stimulation (1.9 (p < 0.07) and 1.6 (p < 0.07) times the control value for p-295HPCAT and p-208TRCAT, respectively). Constructs containing the entire 933 nucleotides of the matrilysin promoter demonstrated no significant difference in the response to EGF compared with p-295HPCAT (1.6-fold induction for p-933HPCAT), whereas the addition of upstream stromelysin sequences results in an increase to 3.2 times control levels for p-754TRCAT. The human matrilysin and rat stromelysin promoter constructs containing the AP-1 site or the AP-1 and PEA3 elements therefore responded to EGF stimulation in a similar manner, but the stromelysin promoter contains additional EGF-responsive elements located upstream from position -208 that are not detected in the p-933HPCAT matrilysin promoter construct.
TPA treatment of the human matrilysin and rat stromelysin promoter constructs produced similar results (Fig. 7C). The matrilysin promoter constructs containing PEA3 elements showed an ~2-fold increase in CAT activity (2.55 times control levels for p-295HPCAT and 2.35 times control values for p-933HPCAT). The addition of the PEA3 elements to the minimal stromelysin promoter increased the -fold induction with TPA to an average of 2.7 times the control value for p-208TRCAT and to 6.6 times the control value for p-754TRCAT. The constructs containing an AP-1 element therefore responded slightly to TPA treatment, and this response was enhanced by the addition of sequences containing PEA3 elements. Sequences upstream from the stromelysin promoter, however, appear to cooperate with the stromelysin promoter to respond to TPA treatment, whereas upstream sequences in the human matrilysin promoter have little or no effect on TPA inducibility of this gene.

DISCUSSION
We examined the structural characteristics of the human matrilysin gene to gain insights into the relationship of this metalloproteinase to other members of the matrix metalloproteinase family. Sequence analysis of the gene revealed that it spans ~9.65 kbp and is composed of six exons. Like the other members of the matrix metalloproteinase family, matrilysin has several domains in the protein structure (see Refs. 1 and 2 for review). The pre-domain targets the protein for secretion and is removed at the time of release from the cell. The pro-domain is autocatalytically removed upon activation of enzymatic activity through a "cysteine switch" mechanism (39). The catalytic domain is characterized by a highly conserved sequence (HEXGmGXXHS) that, based on crystallographic analysis of related metalloproteinases (40) and site-directed mutagenesis studies (41), appears to be the zinc-binding site that is required for proteolytic activity. In the matrilysin gene, these sequences are contained within five exons that do not correspond precisely to the functional protein domains. The pre-domain and a portion of the pro-domain are contained in exon 1; exon 2 encodes the remainder of the pro-domain and the NH2-terminal portion of the catalytic domain; and the catalytic domain is spread over exons 2-5, with the zinc-binding region located in exon 5 (Figs. 2 and 8). This precise arrangement is conserved in other members of the matrix metalloproteinase family for which the genomic structure has been determined, including rat stromelysin-1 and -2 and human interstitial collagenase, gelatinase A, and gelatinase B (Fig. 8). Matrilysin has a sixth exon (encoding the final 9 amino acids of the coding sequence and the 3'-untranslated sequences) that is not conserved in any of the other matrix metalloproteinase family members. However, the other metalloproteinases contain a carboxyl-terminal hemopexin-like domain lacking in matrilysin that is encoded in five exons with sizes that are relatively conserved, with the exception of the 3'-untranslated region (see Fig. 8). Gelatinase subfamily members have an additional domain encoded by three exons that divide the catalytic domain, and gelatinase B contains a unique domain encoded by a portion of the first exon in the hemopexin-like domain (42).
Fig. 5. Comparison of matrilysin promoter with other members of metalloproteinase family. Potential transcriptional elements in the matrilysin promoter are compared with sequences in human gelatinase A (50) and B (42), human interstitial collagenase (31, 33), human stromelysin (15, 16), rat stromelysin (30), human stromelysin-2 (E), and rat stromelysin-2 (19) promoters. The rabbit collagenase promoter was not included due to the limited sequence information available. Bent arrows indicate the transcription initiation sites. For gelatinase A, interstitial collagenase, and matrilysin promoters, a second minor transcription start site has been identified (smaller bent arrows). TATA boxes, AP-1 motifs, PEA3 elements, TGF-β1 inhibitory element-like sequences (TIE), AP-2 sites, and GC box/SP-1-binding sites are indicated.
Fig. 6. ... in human colon adenocarcinoma cell lines. The indicated human colon carcinoma-derived cell lines were treated with serum-free medium (CON), 50 ng/ml EGF, or 100 ng/ml TPA for 8 h. Poly(A)+ RNA was isolated, and 5 μg/lane was analyzed for matrilysin expression by Northern blot analysis as previously described (9). The blot was stripped and reprobed with the 1B15 probe, which hybridizes with the mRNA for cyclophilin, a constitutively expressed mRNA (51).
The presence and diversity of protein domains and their corresponding exon structures have led to the suggestion that the matrix metalloproteinase family members arose by the shuffling of exons into duplicated genomic sequences (1, 43). It is tempting to speculate that matrilysin is the present-day representative of the primordial gene since it is the smallest and most fundamental member of the family. The additional domains found in the other family members, which are believed to be responsible for conferring substrate specificity to the various enzymes (44, 45), may have been added evolutionarily as the complexity of the matrix increased.
It is also possible, however, that primitive enzymes may have contained the hemopexin-like sequences, which were lost and replaced by a novel exon in the case of matrilysin. This view is supported by the observation of Murphy et al. (46) that the amino acid sequence of the catalytic domain of stromelysin-3, a matrix metalloproteinase that possesses a hemopexin domain, is most closely related to bacterial metalloproteinases. Resolution of this issue will depend on the isolation of metalloproteinases from lower species.
We also examined the promoter region of the human matrilysin gene for comparison with other matrix metalloproteinase genes. The human matrilysin gene contains a TATA box at position -32 and an AP-1 site at position -67. This arrangement is similar to that seen in the promoter region of all the matrix metalloproteinase family members sequenced thus far, with the exception of human gelatinase A (Fig. 5). In addition, the matrilysin promoter contains two PEA3 elements at positions -146 and -170. The rat and human stromelysin genes also contain two PEA3 elements, although their position relative to the AP-1 site and their orientation differ (see Fig. 5).
Human interstitial collagenase and stromelysin-2 each have one PEA3 element.
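The kind of promoter annotation described above (locating TATA, AP-1, and PEA3 elements and reporting their positions relative to the transcription start site) can be illustrated with a short Python sketch. This is not the method used in this study. The consensus patterns below are commonly cited forms and are assumptions here, the sequence is an invented toy example rather than matrilysin sequence, and the names `MOTIFS`, `reverse_complement`, and `scan_promoter` are hypothetical.

```python
import re

# Commonly cited consensus motifs (assumptions; not taken from this paper).
MOTIFS = {
    "TATA": "TATAAA",
    "AP-1": "TGA[CG]TCA",
    "PEA3": "AGGAAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_promoter(seq, tss_index):
    """Find motif hits on both strands.

    tss_index is the 0-based index of the +1 nucleotide. Each hit is reported
    as (name, position, strand), where position is the leftmost base of the
    site on the given sequence, numbered relative to the start site (-1 is the
    base just upstream of +1; there is no position 0).
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits, seen = [], set()
    for name, pattern in MOTIFS.items():
        for strand, target in (("+", seq), ("-", rc)):
            for m in re.finditer(pattern, target):
                start = m.start() if strand == "+" else len(seq) - m.end()
                offset = start - tss_index
                pos = offset if offset < 0 else offset + 1
                # Palindromic sites (e.g. AP-1) match on both strands;
                # report each (name, position) only once.
                if (name, pos) not in seen:
                    seen.add((name, pos))
                    hits.append((name, pos, strand))
    return sorted(hits, key=lambda h: h[1])


# Toy upstream region: two PEA3 sites in opposite orientations, one AP-1 site,
# and one TATA box, followed by four transcribed bases.
upstream = ("GGC" + "CTTCCT" + "GCGC" + "AGGAAG" + "CCGG"
            + "TGAGTCA" + "GGCCGG" + "TATAAA" + "GCGCGC")
toy_promoter = upstream + "ACGT"
hits = scan_promoter(toy_promoter, tss_index=len(upstream))
for name, pos, strand in hits:
    print(name, pos, strand)
```

Scanning both strands matters for comparisons like the one drawn here between matrilysin and stromelysin, because the same element can be present in either orientation.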
The identification of the AP-1/PEA3 combination of transcriptional elements as a tumor promoter- and oncogene-responsive unit (28) suggests that matrilysin, like these other metalloproteinase genes, may be regulated by TPA, growth factors, and oncogenes. We have shown that matrilysin is expressed in a high percentage of human gastric and colon adenocarcinomas (9) and in several colon adenocarcinoma-derived cell lines (Fig. 6), suggesting that its expression is regulated by oncogenic transformation. Matrilysin, like human interstitial collagenase (31, 33) and human and rat stromelysin-1 and -2 (15, 19, 30), is responsive to the phorbol ester tumor promoter TPA (Figs. 6 and 7). Similarly, EGF also positively regulates matrilysin (Figs. 6 and 7), human interstitial collagenase (32), and human and rat stromelysins (15, 47). The similarities in the response of members of the metalloproteinase family to biologically active agents such as growth factors and oncogenes may therefore be related to the similarities in the AP-1 and PEA3 transcriptional elements in their promoter sequences. The AP-1 site is required for TPA and growth factor induction of several metalloproteinase genes (14, 18, 29, 33-35), but, in many cases, has been shown to be insufficient for full activity. Our results confirm this observation for the matrilysin and rat stromelysin-1 genes in HeLa cells in that promoter/CAT constructs containing the AP-1 and TATA sequences demonstrated no or only very modest (≤1.6-fold) stimulation with EGF or TPA (Fig. 7). The PEA3 element has been shown to cooperate with the AP-1 element in regulating TPA stimulation of human (28, 36) and rabbit (37) collagenases. In the human matrilysin and rat stromelysin-1 promoters, the addition of sequences including the PEA3 elements resulted in a modest but significant increase in TPA and EGF stimulation of transcriptional activity (Fig. 7).
The difference in the orientation and location of the two PEA3 sites in the human matrilysin as compared with the rat stromelysin genes did not appear to significantly affect the responsiveness of this element since the stimulation of the two constructs was very similar (1.9- versus 1.6-fold for EGF stimulation and 2.5- versus 2.7-fold for TPA stimulation). These results are consistent with the suggestion that the PEA3 element cooperates with the AP-1 element in the matrilysin and rat stromelysin promoters to mediate TPA and EGF responsiveness in a manner similar to that shown previously for several other metalloproteinase family members.
The addition of upstream matrilysin promoter sequences to the AP-1- and PEA3-containing construct did not result in any enhancement of EGF or TPA stimulation of transcriptional activity (Fig. 7). This is in contrast to the results observed with rat stromelysin, where the p-754TRCAT construct demonstrated a significant increase in both EGF and TPA inducibility compared with the p-208TRCAT construct (Fig. 7). The rat stromelysin promoter therefore contains additional EGF- and TPA-responsive elements between positions -208 and -754 that are lacking in the corresponding matrilysin sequences. The existence of upstream tumor promoter- and growth factor-responsive elements has been previously reported in the human stromelysin gene. Buttice et al. (48) suggest that upstream elements are required for TPA induction of human stromelysin gene expression, and TPA and interleukin-1 induction of this gene has been shown to require the AP-1 site in cooperation with an upstream regulatory sequence containing at least two distinct factor-binding sequences. Although our data suggest that the matrilysin and rat stromelysin promoters differ in the existence of upstream regulatory element(s), it is likely that additional EGF- and TPA-responsive sequences are located distal to position -933 in the matrilysin promoter since matrilysin mRNA was induced 2-5-fold with EGF and 6-9-fold with TPA (Fig. 6), whereas only 1.6- and 2.3-fold inducibility was observed with the p-933HPCAT construct (Fig. 7). However, we cannot rule out the possibility that either post-transcriptional regulation of the matrilysin mRNA or tissue-specific transcription factors are responsible for the additional stimulation observed in the colon adenocarcinoma cells.
We conclude that, although the AP-1 and PEA3 elements are conserved among the different metalloproteinase family members and contribute to their induction by growth factors and tumor promoters, there are also differences in upstream elements in the matrilysin gene as compared with other family members that may be responsible for "fine-tuning" the expression of this gene.
Matrilysin expression is restricted to specific tissues, and the pattern of matrilysin expression is different from that of stromelysin and some other members of the metalloproteinase family. For example, matrilysin is expressed in the glandular epithelium of the cycling human endometrium, whereas stromelysin-1, -2, and -3 and interstitial collagenase are found in the stromal component of this tissue (13). Human colon cancer samples express matrilysin in epithelium-derived tumor cells, whereas stromelysin and collagenase mRNAs are detected in a subset of these samples in the surrounding normal stromal tissue (9). To our knowledge, the simultaneous expression of matrilysin and stromelysin in the same cell or tissue type has not been observed. The molecular mechanism for the tissue-specific regulation of these genes is not known. We have characterized the human matrilysin gene and promoter region in HeLa cells and described both similarities and differences between this gene and rat stromelysin. However, since neither the endogenous matrilysin nor stromelysin gene is expressed in these cells, tissue-specific elements cannot be addressed with this system. Further characterization of the regulation of the matrilysin gene in systems such as the cycling endometrium or colon cancer-derived cells and comparison with the regulation of other members of the metalloproteinase gene family may result in important insights into the molecular mechanisms that regulate the induction of this enzyme and influence its expression in both normal and pathological conditions.