Sequence specificity of human skin fibroblast collagenase. Evidence for the role of collagen structure in determining the collagenase cleavage site.

The sequence specificity of human skin fibroblast collagenase has been investigated by measuring the rate of hydrolysis of 16 synthetic octapeptides covering the P4 through P4' subsites of the substrate. The choice of peptides was patterned after potential collagenase cleavage sites (those containing either the Gly-Leu-Ala or Gly-Ile-Ala sequences) found in types I, II, and III collagens. The initial rate of hydrolysis of the P1-P1' bond of each peptide has been measured by quantitating the concentration of amino groups produced upon cleavage after reaction with fluorescamine. The reactions have been carried out under first-order conditions ([S] much less than KM) and kcat/KM values have been calculated from the initial rates. The amino acids in subsites P3 (Pro, Ala, Leu, or Asn), P2 (Gln, Leu, Hyp, Arg, Asp, or Val), P1' (Ile or Leu), and P4' (Gln, Thr, His, Ala, or Pro) all influence the hydrolysis rates. However, the differences in the relative rates observed for these octapeptides cannot in themselves explain why fibroblast collagenase hydrolyzes only the Gly-Leu and Gly-Ile bonds found at the cleavage site of native collagens. This supports the notion that the local structure of collagen is important in determining the location of the mammalian collagenase cleavage site.

The triple helical region of interstitial collagens is highly resistant to all proteinases except specific collagenases (1,2). In all higher organisms, these collagens are catabolized by the so-called tissue collagenases, which have a characteristic and highly specific mode of action. They cleave all three α chains of native types I, II, and III collagens at a single locus by hydrolyzing the peptide bond following the Gly residue of the partial sequences Gly-[Ile or Leu]-[Ala or Leu] located approximately three-fourths of the chain length from the NH2 terminus (3-5). It has been pointed out, however, that there are Gly-[Ile or Leu] sequences at other sites within the triple helical domain of these collagens that are not cleaved (3,6). In fact, it is possible to identify a total of 10 different sites in the α1(I) chains of chick, rat, and calf skin (7-11), the α2(I) chain of calf skin (4, 12), and the α1(III) chain of human liver collagens (13) that contain Gly-[Ile or Leu]-Ala sequences that are not cleaved by tissue collagenases in the native collagens.
The basis for the selective hydrolysis of the Gly-[Ile or Leu] bond at the cleavage site must lie in the influence of the surrounding residues in each α chain. On the one hand, it is possible that tissue collagenases have a large active site and a very restrictive sequence specificity so that only the extended sequences found at the cleavage site are recognized by the enzyme. One postulate to explain the cleavage site specificity in terms of collagen sequence has also considered the symmetry in the distribution of imino acids around the scissile bond (14). Alternatively, the sequence surrounding the cleavage site might indirectly endow the scissile Gly-[Ile or Leu] bond with hyper-reactivity by altering the local conformation in this region. For example, a local deficiency of imino acids could destabilize a short segment of the triple helix, allowing the enzyme access to this bond (Refs. 15-18 and references cited therein). Another possibility is that the cleavage site region of collagen adopts a presently unknown, but specific secondary structure distinct from the triple helix that is recognized by the enzyme. It is of fundamental importance to determine whether it is the sequence specificity of tissue collagenases or a local conformational feature of the collagenase cleavage site that is responsible for this unique substrate specificity.
In the present study, the action of human skin fibroblast collagenase on a series of synthetic octapeptides has been investigated. The sequences of these peptides have been specifically chosen with reference to those in the potential, but non-cleavable, collagenase cleavage sites in native rat, calf, and chick type I and human type III collagens. By quantitating the effects of single amino acid substitutions on the rates of hydrolysis of these synthetic octapeptides, definitive information on the sequence specificity of the enzyme has been obtained. The results of these single substitutions on the hydrolysis rates allow us to assess the degree to which the sequence specificity of this enzyme alone determines the location of cleavable sites in native collagens.

EXPERIMENTAL PROCEDURES
Materials-Procollagenase was purified to homogeneity from serum-free cultures of human skin fibroblasts in a three-step procedure involving consecutive chromatography over zinc-chelate-Sepharose, heparin-Sepharose, and Ultrogel AcA 44 columns (H. Birkedal-Hansen, B. Birkedal-Hansen, R. E. Taylor, and H. Y. Lin, manuscript in preparation). The enzyme consisted of a 57/54-kDa doublet and was free from gelatinase activity. Fmoc-Arg(Mtr) and Fmoc-His(Boc)-OPfp were purchased from Cambridge Research Biochemicals. All other Fmoc-amino acids, except for Fmoc-Hyp (see below), were purchased from Bachem. Alkoxybenzyl alcohol resin and 1-hydroxybenzotriazole were obtained from Vega-Fox Biochemicals, Tucson, AZ. Dicyclohexylcarbodiimide and 9-fluorenylmethyl chloroformate were purchased from Aldrich. N-Hydroxysuccinimide, Hyp, dansyl chloride, and fluorescamine were purchased from Sigma. Tricine was obtained from Behring Diagnostics and high pressure liquid chromatography-grade acetonitrile from American Scientific Products. Constant boiling HCl and Sequanal-grade trifluoroacetic acid were purchased from Pierce Chemical Co.
Peptide Synthesis-All peptides except for 7 and 12 were synthesized using the solid-phase method with Fmoc-blocked amino acids according to the procedures described by Stewart and Young (19). Peptide 7 was synthesized by the same procedure, except that the coupling reaction with Fmoc-Arg(Mtr) was allowed to proceed for 8 h and removal of the intact peptide from the resin was carried out using thioanisole as a scavenger (20). Peptide 12 was prepared using Fmoc-His(Boc)-p-alkoxybenzyl alcohol resin, whose synthesis has not been reported previously and is described below.
Fmoc-His(Boc)-OPfp was coupled to the alkoxybenzyl alcohol resin by a variation of the method of Wang (21). Resin (0.591 g, 0.52 mmol) was suspended in CH2Cl2 (20 ml) at 4 °C and Fmoc-His(Boc)-OPfp (1 g, 1.55 mmol) and 4-dimethylaminopyridine (0.194 g, 1.55 mmol) were added. The mixture was stirred for 30 min at 4 °C and 8 h at 25 °C, after which the resin was washed several times with CH2Cl2 and resuspended in 20 ml of CH2Cl2 at 4 °C. It was treated with 0.376 ml of pyridine and 0.54 ml of benzoyl chloride at 4 °C for 15 min. The resin was collected by filtration, washed with CH2Cl2 and dimethyl formamide, and lyophilized to yield 0.624 g. The Fmoc-His(Boc) content was determined to be 0.33 mmol/g of resin by spectrophotometric analysis (22).
Fmoc-Hyp, also synthesized here for the first time, was prepared by a variation of the method of Chang and associates (23). Hyp (15.3 mmol) was dissolved in 10% Na2CO3 (30 ml) and cooled in an ice bath. Dioxane (10 ml) was added, followed by the slow addition of a solution of 9-fluorenylmethyl chloroformate (15.5 mmol in 23 ml of dioxane). The mixture was first stirred for 1 h at 0 °C and then for 20 h at 25 °C. The reaction mixture was poured into ice-water (500 ml) and extracted twice with ether. The aqueous layer was chilled in an ice bath and acidified with 1 N HCl to pH 2.0. The precipitating oil was taken up in ether and washed with 0.1 N HCl and H2O. The organic phase was dried over MgSO4 and evaporated to dryness. The ensuing residue was recrystallized from CH2Cl2/petroleum ether to yield 3.55 g (10.0 mmol, 71%) of Fmoc-Hyp (m.p. 151-154 °C).
Peptide Characterization-All peptides were purified by high pressure liquid chromatography using a Beckman instrument equipped with a semipreparative Altex Ultrasphere ODS 5-µm reverse-phase column (10 × 250 mm). Peptides were eluted isocratically with 15% acetonitrile (Baker, HPLC grade)/H2O containing 0.1% Sequanal-grade trifluoroacetic acid and recovered by lyophilization. Amino acid compositions were determined with a Dionex Model D-300 Analyzer after hydrolysis in constant boiling HCl at 110 °C for 22 h. Prior to hydrolysis, the samples were repeatedly freeze-thawed and degassed under high vacuum. The amino acid compositions of all peptides were within experimental error of the theoretical values.
Kinetic Measurements-Prior to all assays, procollagenase was activated by incubation with assay buffer (50 mM Tricine, 10 mM CaCl2, 0.2 M NaCl, pH 7.5) containing 0.65 mM p-chloromercuribenzoate in a final volume of 100 µl for 1 h at 30 °C. The reactions were carried out at 30 °C in Microfuge tubes by addition of 25 µl of 1.0 mM peptide dissolved in assay buffer to give an enzyme concentration of 0.56 µM and a substrate concentration of 200 µM. The initial rate of hydrolysis (v_i) of all synthetic peptides was determined by measuring the appearance of amino groups. At various time intervals, 12.5-µl aliquots of the incubation solution were withdrawn and added to Microfuge tubes containing 37.5 µl of 1.4 mM 1,10-phenanthroline to quench the reaction. After diluting to 1 ml with assay buffer, a 100-µl aliquot was withdrawn from each tube and added to culture tubes containing 1.8 ml of assay buffer. Each tube was then agitated while 200 µl of 21.6 mM fluorescamine (in acetone) was added.
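The dilution steps in this protocol can be followed with a short arithmetic sketch. It assumes that the 25 µl of peptide is added directly to the 100-µl activation mixture, that "diluting to 1 ml" means a final volume of 1 ml, and that volumes are additive; the helper name `dilute` is hypothetical and introduced only for illustration.

```python
def dilute(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after bringing stock_vol_ul of stock to total_vol_ul."""
    return stock_conc * stock_vol_ul / total_vol_ul

# Reaction mixture: 100 µl activated enzyme + 25 µl of 1.0 mM (1000 µM) peptide.
reaction_vol_ul = 100.0 + 25.0
substrate_uM = dilute(1000.0, 25.0, reaction_vol_ul)

# Enzyme concentration in the 100-µl activation step that would give
# the stated 0.56 µM after the peptide is added (assumption: simple dilution).
enzyme_before_mixing_uM = 0.56 * reaction_vol_ul / 100.0

# Quench: 12.5-µl aliquot + 37.5 µl of 1.4 mM 1,10-phenanthroline (50 µl total).
phenanthroline_mM = dilute(1.4, 37.5, 12.5 + 37.5)

# The quenched sample is diluted to 1 ml (assumed final volume of 1000 µl).
peptide_after_dilution_uM = dilute(substrate_uM, 12.5, 1000.0)

# Assay tube: 100 µl sample + 1.8 ml buffer + 200 µl of 21.6 mM fluorescamine.
assay_vol_ul = 100.0 + 1800.0 + 200.0
peptide_in_assay_uM = dilute(peptide_after_dilution_uM, 100.0, assay_vol_ul)
fluorescamine_mM = dilute(21.6, 200.0, assay_vol_ul)

print(f"substrate in reaction: {substrate_uM:.0f} µM")
print(f"peptide in assay tube: {peptide_in_assay_uM:.3f} µM")
print(f"fluorescamine in assay tube: {fluorescamine_mM:.2f} mM")
```

Under these assumptions the substrate concentration comes out at the stated 200 µM, and fluorescamine remains in large molar excess over the peptide in the assay tube.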
The relative concentration of amino groups was measured fluorimetrically with a Perkin-Elmer LS-5 Fluorometer using excitation at 387 nm and monitoring emission at 480 nm. Initial rates were obtained from plots of fluorescence versus time using only data points corresponding to less than 40% of full hydrolysis by dividing the initial slopes by the fluorescence change corresponding to full hydrolysis and multiplying by 200 µM to give units of micromolar/h. [E0] was measured spectrophotometrically using ε280 = 6.8 × 10^4 M^-1 cm^-1, which assumes a molecular weight of 51,929 for procollagenase (24).
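The conversion from a fluorescence slope to an initial rate, and from there to kcat/KM under first-order conditions, can be sketched as follows. This is a minimal illustration rather than the authors' actual data reduction: the function names are hypothetical, and the approximation v_i ≈ (kcat/KM)[E0][S] is the standard limit of the Michaelis-Menten equation when [S] is much less than KM.

```python
def initial_rate_uM_per_h(slope_per_h, full_hydrolysis_change, s0_uM=200.0):
    """Initial rate in µM/h from the initial fluorescence slope.

    slope_per_h: initial slope of fluorescence versus time (units/h)
    full_hydrolysis_change: fluorescence change for complete hydrolysis (units)
    s0_uM: initial substrate concentration (200 µM in the assay described)
    """
    return slope_per_h / full_hydrolysis_change * s0_uM


def kcat_over_km(v_i_uM_per_h, e0_uM, s0_uM):
    """kcat/KM in µM^-1 h^-1, assuming [S] << KM so that v_i = (kcat/KM)[E0][S]."""
    return v_i_uM_per_h / (e0_uM * s0_uM)


# Hypothetical readings: slope of 12.32 units/h, 100 units for full hydrolysis.
v_i = initial_rate_uM_per_h(12.32, 100.0)      # µM/h
ratio = kcat_over_km(v_i, e0_uM=0.56, s0_uM=200.0)
print(f"v_i = {v_i:.2f} µM/h, kcat/KM = {ratio:.2f} µM^-1 h^-1")
```

The hypothetical slope was chosen so that the result lands near 0.22 µM^-1 h^-1, the value implied for peptide 1 by the kcat and KM reported in RESULTS. Because 200 µM is not negligible next to a KM of a few millimolar, the first-order approximation slightly underestimates kcat/KM, by a factor of 1/(1 + [S]/KM).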
The site of hydrolysis of all peptides was determined by dansylation of the reaction products followed by high pressure liquid chromatographic analysis. Aliquots (100 µl) of the reaction mixtures were removed at 0 and 48 h and reacted with 100 µl of an 8 mM solution of dansyl chloride in acetonitrile. The samples were applied to a Rainin Microsorb C18 5-µm column (4.6 × 250 mm) and eluted with a linear gradient prepared by mixing 50 mM sodium phosphate, pH 6.5, and 50% acetonitrile in water. In all cases, only two peptides were observed. The site of hydrolysis was determined by comparing the retention time of the two peptides with that of a dansylated tetrapeptide of known composition that corresponded to either the first or last 4 residues of the substrate. All substrates were hydrolyzed exclusively at the P1-P1' bond.
The kinetic parameters for the hydrolysis of rat tendon type I collagen by fibroblast collagenase were measured at 30 °C in the same buffer by using a newly developed radioassay (25,26).

RESULTS
The choice and placement of the amino acids in the peptides that have been synthesized have been guided by sequences found in types I, II, and III collagens, all of which are substrates for human skin fibroblast collagenase. The whole or partial sequences surrounding the tissue collagenase cleavage sites in the α1(I), α2(I), α1(II), and α1(III) chains of several collagens have been determined (3-7, 10, 13, 27, 28) and are listed in Table I. For those cases in which the entire chain has been sequenced, the residues that form the scissile bond have been numbered, where the number indicates the position in the chain starting from the Gly residue at the NH2 terminus of the first Gly-X-Y triplet that is believed to be the start of the triple helical region. For the α1(I) and α2(I) chains, this first triplet is taken to be Gly-Pro-Met (4, 7, 10) while for the α1(III) chains it is assumed to be Gly-Tyr-Hyp (13). The labeling of the subsites of the substrate (Pn . . . Pn') shown in Table I is that of Schechter and Berger (29). In the α1(I), α2(I), and α1(III) chains, either a Gly775-Ile776 or Gly775-Leu776 bond is hydrolyzed. It is known that tissue collagenases cleave a Gly-Ile bond in α1(II) chains, but not enough of the sequence has been completed to establish the location in the chain. In all of these cases, the scissile bond is followed by either an Ala or a Leu residue. Thus, the sequences of the chains of types I, II, and III collagens immediately surrounding the tissue collagenase cleavage site are Gly-[Ile or Leu]-[Ala or Leu].
The complete sequences of several collagen chains have been examined in order to locate loci that are not in the collagenase cleavage site, but which contain the partial sequences Gly-[Ile or Leu]-[Ala or Leu]. Examination of the partial sequence of the α1(I) chain from rat (10,11) and of the complete sequences of the α1(I) chains from calf and chick skin (7-10), the α2(I) chain from calf skin (4, 12), and the α1(III) chain from human liver (13) reveals 10 loci that contain either the Gly-Ile-Ala or Gly-Leu-Ala sequences that are not cleaved by tissue collagenases. The sequence of 8 residues centered on each of these sites is listed in Table II, where amino acid substitutions relative to the cleavage site in the α1(I) chain of chick or calf skin are underlined. These sequences span the P4 through P4' subsites of these potential substrates. No loci containing the Gly-Ile-Leu or Gly-Leu-Leu sequences were found. By comparing the 10 sequences listed in Table II to that of the cleavage site in the α1(I) chain, 13 substitutions have been identified: Ala, Asn, or Leu for Pro in subsite P3; Hyp, Asp, Val, Leu, or Arg for Gln in subsite P2; Leu for Ile in subsite P1'; and Ala, Pro, His, or Thr for Gln in subsite P4'. In order to investigate the specificity of fibroblast collagenase toward these sequences, a series of octapeptides (2-14), each constituting a single substitution relative to Gly-Pro-Gln-Gly-Ile-Ala-Gly-Gln (1), have been synthesized. Two peptides (15 and 16) with multiple substitutions were also prepared. Earlier work by Nagai and associates (30,31) indicated that an octapeptide is a suitably sized synthetic substrate for studies with a vertebrate collagenase.

TABLE II
Types I and III collagen sequences containing the Gly-Ile-Ala or Gly-Leu-Ala triplets but not cleaved by tissue collagenases. Amino acid substitutions relative to the α1(I) chain of chick or calf skin collagen are underlined. Column headings: Collagen chain; Ref.

The proper kinetic parameter with which to assess sequence specificity is kcat/KM. In an initial series of kinetic measurements, the initial rates of hydrolysis of several peptides were measured as a function of substrate concentration and kcat and KM values were determined individually from double-reciprocal plots. The maximum substrate concentrations (approximately 2 mM) studied were limited by the solubilities of the peptides. An example of a double-reciprocal plot is shown in Fig. 1A for peptide 1 to demonstrate that the plot is linear with no kinetic anomalies and that Michaelis-Menten kinetics are obeyed. For this substrate, kcat and KM are found to be 730 h^-1 and 3.3 mM, respectively. The parameters for this and six other peptides are summarized in Table III. In all cases, the KM values are high (>2 mM) and the kcat values fall in the 260-1200 h^-1 range. Thus, these parameters were determined from experiments in which the substrate concentration did not exceed KM and the accuracy of the numbers listed in Table III must be viewed in the light of this limitation.
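The double-reciprocal analysis described above can be illustrated with a small, dependency-free sketch. The data are synthetic and noise-free, generated from the kcat and KM reported for peptide 1; the substrate concentrations, the use of 0.56 µM enzyme, and the function names are assumptions made for illustration and are not the measured data points.

```python
def michaelis_menten(s_mM, kcat_per_h, km_mM, e0_mM):
    """Rate in mM/h from the Michaelis-Menten equation."""
    return kcat_per_h * e0_mM * s_mM / (km_mM + s_mM)


def double_reciprocal_fit(s_values, v_values, e0_mM):
    """Least-squares line through (1/[S], 1/v); returns (kcat, KM).

    The line is 1/v = (KM/Vmax)(1/[S]) + 1/Vmax, so Vmax = 1/intercept
    and KM = slope * Vmax.
    """
    xs = [1.0 / s for s in s_values]
    ys = [1.0 / v for v in v_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    return vmax / e0_mM, slope * vmax


E0_MM = 0.56e-3                          # 0.56 µM enzyme, expressed in mM
S_MM = [0.25, 0.5, 1.0, 1.5, 2.0]        # assumed concentrations, all below KM
V = [michaelis_menten(s, 730.0, 3.3, E0_MM) for s in S_MM]  # synthetic rates

kcat_fit, km_fit = double_reciprocal_fit(S_MM, V, E0_MM)
print(f"kcat = {kcat_fit:.0f} h^-1, KM = {km_fit:.2f} mM")
```

With noise-free input the fit returns the generating parameters. With real data confined to [S] below KM, the intercept 1/Vmax is a long extrapolation, which is why the individual kcat and KM values are less reliable than their ratio.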
Since the KM values for the hydrolysis of these octapeptides by fibroblast collagenase are uniformly high, accurate values of kcat/KM for all 16 peptides were determined by measuring the initial rates at a substrate concentration of 0.2 mM at

Kinetic parameters for the hydrolysis of octapeptides by human skin fibroblast collagenase. Amino acid substitutions relative to the α1(I) chain of chick or calf skin collagen are underlined.

Table II, however, differ from the cleavable sequences by two or three substitutions. Only 15 and 16 are directly comparable to these noncleavable sequences. In order to estimate the effects of the multiple substitutions shown in Table II on the rates of hydrolysis of these sequences relative to that of peptide 1, it is of interest to determine whether the effects of the single substitutions on the rates are independent of one another (i.e. the effects on the rates are multiplicative). From transition state theory, this amounts to assuming that the alterations of ΔG‡ for the reaction produced by each amino acid substitution are additive.
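The link between multiplicative rate effects and additive activation free energies can be written out explicitly. The following is a sketch under the usual transition state theory assumption that the pre-exponential factor is the same for all peptides; the symbols r_i (rate of singly substituted peptide i relative to peptide 1) and r_ij (doubly substituted peptide) are introduced here for illustration.

```latex
\[
\frac{k_{\mathrm{cat}}}{K_M} \propto \exp\!\left(-\frac{\Delta G^{\ddagger}}{RT}\right)
\quad\Longrightarrow\quad
r_i \equiv \frac{(k_{\mathrm{cat}}/K_M)_i}{(k_{\mathrm{cat}}/K_M)_1}
    = \exp\!\left(-\frac{\Delta\Delta G^{\ddagger}_i}{RT}\right),
\qquad
\Delta\Delta G^{\ddagger}_i = \Delta G^{\ddagger}_i - \Delta G^{\ddagger}_1 .
\]
\[
\Delta\Delta G^{\ddagger}_{ij} = \Delta\Delta G^{\ddagger}_i + \Delta\Delta G^{\ddagger}_j
\quad\Longleftrightarrow\quad
r_{ij} = r_i \, r_j .
\]
```

Because the exponential turns sums into products, independence of the substitutions can be tested either by comparing relative rates with products of single-substitution rates or by comparing free energy changes with their sums.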

To test this assumption, the rates of hydrolysis of the multiply substituted peptides 15 (Leu for Gln in subsite P2 and Pro for Gln in subsite P4') and 16 (Hyp for Gln in subsite P2, Leu for Ile in subsite P1', and Ala for Gln in subsite P4') have been determined. The relative hydrolysis rates predicted for peptides 15 and 16 based on the assumption that the effects of single substitutions are multiplicative are 120 and 12% of that of 1 and are in reasonably good agreement with the measured values of 120 and 16%, respectively. Thus, this assumption appears to be reasonable and has been used to calculate the relative rates of hydrolysis of the noncleavable sequences in collagen listed in Table II. The results are expressed relative to the cleavable sequence from the α1(I) chain (peptide 1) and are summarized in Table V.
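The multiplicative prediction and its free energy equivalent can be sketched in a few lines. The single-substitution values in `hypothetical_effects` are placeholders, not the measured values for any peptide; only the predicted (12%) and measured (16%) relative rates for peptide 16 are taken from the text, and 30 °C is the assay temperature.

```python
import math

R_J_PER_MOL_K = 8.314462618  # gas constant


def predicted_relative_rate(single_effects):
    """Product of single-substitution relative rates (multiplicative model)."""
    product = 1.0
    for effect in single_effects:
        product *= effect
    return product


def ddg_kj_per_mol(relative_rate, temp_k=303.15):
    """Change in activation free energy implied by a relative rate."""
    return -R_J_PER_MOL_K * temp_k * math.log(relative_rate) / 1000.0


# Placeholder single-substitution effects (hypothetical, for illustration only).
hypothetical_effects = [1.5, 0.5]
combined = predicted_relative_rate(hypothetical_effects)
additive = sum(ddg_kj_per_mol(e) for e in hypothetical_effects)
print(f"combined rate {combined:.2f}, summed ddG {additive:.2f} kJ/mol")

# Peptide 16: predicted 12% versus measured 16% of the rate of peptide 1.
gap_16 = ddg_kj_per_mol(0.16) - ddg_kj_per_mol(0.12)
print(f"peptide 16 deviation from additivity: {gap_16:.2f} kJ/mol")
```

Expressed this way, the difference between 12 and 16% corresponds to well under 1 kJ/mol, which is one way of reading the statement that agreement is reasonably good.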
The calculated rates for these multiply substituted sequences range from 1.2 to 120% of the rate observed for peptide 1. Based on these results, it is apparent that sequence specificity alone cannot explain the substrate specificity of human skin fibroblast collagenase toward native types I and III collagens. If the sequence specificity of the enzyme were the sole factor in determining the location of the cleavage site in collagens, the data in Table V indicate that α chains in at least four of the 10 loci would be hydrolyzed at a rate that is at least 10% of the rate for the cleavable sequence and the other six at more than 1.2% of this rate. In fact, there is no evidence to indicate that the loci listed in Table II are cleaved at all in intact, triple helical collagens. Thus, the sequence specificity of the enzyme is not restrictive enough to account for its hydrolysis of native interstitial collagens at a single site. Instead, it appears that some local conformational feature of collagen at the cleavage site controls the specificity of this enzyme, since potentially susceptible sequences that are cleaved in small synthetic peptides are protected in the native collagens. In order to provide a point of reference for the kinetic parameters measured here for the octapeptides, the values for the hydrolysis of soluble rat tendon type I collagen have also been determined. Using a newly developed soluble radioassay (25,26), the double-reciprocal plot shown in Fig. 1

TABLE VI (octapeptide entries)
Gly-Pro-Gln-Gly-Ile-Ala-Gly-Gln (1): KM 3300 µM; kcat 730 h^-1; kcat/KM 0.22 µM^-1 h^-1; 30 °C; this study
Gly-Pro-Gln-Gly-Leu-Ala-Gly-Gln (10)
Gly-Pro-Leu-Gly-Ile-Ala-Gly-Gln (5): KM 3600 µM; kcat 1200 h^-1; kcat/KM 0.33 µM^-1 h^-1; 30 °C; this study

genase cleavage site. For comparison, the parameters for the hydrolysis of types I and III collagens (32) and the α1(I), α2(I), and α1(III) gelatins (33) taken from the literature are listed in Table VI.
Also shown are the data for the octapeptides with sequences that most closely resemble those of the cleavage sequence in the α1(I), α2(I), and α1(III) chains of types I and III collagens. It is clear from these data that the kcat/KM values for the hydrolysis of collagens and gelatins by fibroblast collagenase are much higher than those for the octapeptides, primarily because KM is much lower. Thus, there is clearly a major difference in the way the enzyme binds to a cleavable sequence when it is embedded in an intact α chain in either gelatin or collagen compared to an octapeptide.

DISCUSSION
This study presents data on the sequence specificity of human skin fibroblast collagenase toward synthetic octapeptides. The 16 peptides that have been synthesized have been carefully chosen with reference to loci found in types I and III collagens that contain the Gly-Ile-Ala or Gly-Leu-Ala sequences in subsites P1-P2', but which are not cleaved by tissue collagenases. Octapeptide 1 has a sequence identical to the portion of the α1(I) chain that is hydrolyzed by tissue collagenases. All of the other peptides differ from 1 by single amino acid substitutions, except for 15 and 16, which differ by 2 and 3 amino acids, respectively. The kcat/KM values are markedly influenced by amino acid changes at all four of the subsites where substitutions occur in collagen (P3, P2, P1', and P4'). This implies that fibroblast collagenase has an extended active site, although the possibility that the active site is small and that these substitutions change the rates by altering the conformation of the substrate cannot be ruled out.
Among the substrates studied here, Pro is the preferred amino acid in subsite P3. Substitution of Ala lowers the rate 2-fold, but when Leu or Asn is at this subsite, the rates are greatly reduced. Substitution of Leu for Gln at subsite P2 creates a more reactive substrate. All other substitutions at P2, whether hydrophobic (Val) or hydrophilic (Asp, Arg), cause a significant decrease in kcat/KM. Hyp causes the largest (nearly 10-fold) decrease. Substitution of Leu for Ile at subsite P1' results in a slightly better substrate. Substitution of Thr or Pro for Gln at subsite P4' also increases the rate. Other substitutions (His, Ala) at this subsite have little effect.
The rates of hydrolysis of the octapeptides that correspond to the noncleavable sequences found in native types I and III collagens vary markedly (Table V). Yet these rates are unable to account for the observation that these loci are not susceptible to hydrolysis in native, triple helical collagens. While it is often stated that these loci are totally impervious to attack in collagen, a reasonable estimate based on experimental observation is that these loci are hydrolyzed at least 100-fold more slowly than the primary cleavage site at temperatures well below denaturation (e.g. below 30 °C). Thus, the data in Table V show that the rates of hydrolysis predicted based solely on the sequence specificity of fibroblast collagenase vary too little to explain why these sites are not cleaved in the native collagens. It is therefore clear that the local structure of collagen near the cleavage site must be a major factor in directing its own proteolysis.
Welgus and associates (33) have provided convincing evidence for this supposition from a different line of experimentation. They showed that human skin fibroblast collagenase makes multiple proteolytic cleavages in types I, II, and III gelatins at Gly-[Ile or Leu]-Y-Gly loci, where Y can be many residue types. These experiments clearly show that at least several loci (some of which are probably those listed in Table II) that are hydrolyzed by the enzyme in gelatin are protected in native collagen. The data presented here confirm that fibroblast collagenase is capable of cleaving many loci containing the Gly-[Ile or Leu]-Ala sequences. More importantly, however, our data show that the identity of the amino acids directly adjacent to this core does not play a dominant role in determining the location of the cleavage site in native collagen through sequence specificity.
The foregoing conclusions suggest that the sequence surrounding the noncleavable loci determines their susceptibility to collagenase in native collagens indirectly by altering the local conformation of collagen itself. Since all of the noncleavable sequences identified have the repeating Gly-X-Y primary structure, all have the potential to adopt the traditional triple helix. The hypothesis has been put forth, however, that there are "locally unstable" regions of the triple helix brought about by a local deficiency of imino acids (15). The presence of a cleavable sequence in an unstable region could expose the scissile bond to the enzyme and account for the observed specificity. However, the local imino acid content of the collagen chains surrounding the noncleavable sequences listed in Table II is not very different from that in the cleavage region (G. B. Fields, K. A. Mookhtiar, and H. E. Van Wart, unpublished data). Thus, this local deficiency per se cannot be the sole basis for the specificity of the enzyme toward native collagens. Apparently, there is a presently unrecognized local conformational feature of collagen that endows the cleavage region with hyper-reactivity toward collagenase.
It is of interest to compare the kinetic parameters for the hydrolysis of the α1(I), α2(I), and α1(III) chains in both collagen and gelatin with those for the octapeptides which have homologous sequences. While not all of the desired data are available, the closest set that can be assembled to date is summarized in Table VI. Several problems arise with respect to interpretation of these data. First, not all of the parameters were obtained under the same conditions or with collagen types from the same source. Second, the parameters for the hydrolysis of the α1(I) and α1(III) chains in gelatin were obtained by measuring the rate of disappearance of these chains as the result of hydrolysis at multiple sites. Thus, the parameters do not reflect a single proteolytic event. Third, the different sequences reported for the α2(I) chains (Table I) make direct comparison with a single octapeptide problematic. In spite of these limitations, one trend is clear. The KM values of fibroblast collagenase toward the disordered α1(I) and α2(I) gelatin chains are at least 40-fold lower than those for peptides 1 and 10, respectively, which contain similar cleavage sequences. The kcat values, however, are similar in magnitude. This makes the kcat/KM values much greater for the gelatin chains. It can also be seen that the KM values toward the native collagens are even lower than for the gelatins.
It is not immediately obvious why the KM values of collagenase are so much lower for the long (~1000 residue) gelatins and collagens. It is possible that fibroblast collagenase binds nonproductively to many sites along these substrates, thus lowering KM compared to the octapeptides. Another possibility is that binding to these substrates is enhanced because of their secondary structures. The long gelatin chains that are generally rich in imino acids might induce the sequences around the cleavage sites to adopt the poly-Pro(II) secondary structure and this might enhance binding. The triple-helical structure of the native collagens might likewise be a specific recognition feature. It will be necessary to examine the action of collagenase on longer substrates with alternate secondary structures to investigate these possibilities.