Proteomics Discovery of Metalloproteinase Substrates in the Cellular Context by iTRAQ™ Labeling Reveals a Diverse MMP-2 Substrate Degradome

Elucidation of protease substrate degradomes is essential for understanding the function of proteolytic pathways in the protease web and how proteases regulate cell function. We identified matrix metalloproteinase-2 (MMP-2) cleaved proteins, solubilized pericellular matrix, and shed cellular ectodomains in the cellular context using a new multiplex proteomics approach. Tryptic peptides of intact and cleaved proteins, collected from conditioned culture medium of Mmp2−/− fibroblasts expressing low levels of transfected active human MMP-2 at different time points, were amine-labeled with iTRAQ™ mass tags. Peptide identification and relative quantitation between active and inactive protease transfectants were achieved following tag fragmentation during tandem MS. Known substrates of MMP-2 were identified, thereby validating the technique, and many novel MMP-2 substrates, including the CX3CL1 chemokine fractalkine, osteopontin, galectin-1, and HSP90α, were also identified and biochemically confirmed. In comparison with ICAT labeling and quantitation, 8–9-fold more proteins and substrates were identified by iTRAQ. “Peptide mapping,” the location of multiple peptides identified within a particular protein by iTRAQ in combination with their relative abundance ratios, enabled the domain shed and general location of the cleavage site to be identified in the native cellular substrate. Hence this advance in degradomics cell-based screens for native protein substrates casts new light on the roles for proteases in cell function.

To understand how protease activity might modify signaling networks and cell function, it is essential to identify protease substrates and to understand the functional consequences of normal and dysregulated proteolysis in pathology on the proteome (1). This will also facilitate the identification of candidate protease drug targets in disease. Despite this importance, native substrate discovery screens are few. With 566 members of the human protease degradome, most of which have only recently been genomically identified (2), elucidating the substrate repertoire, termed the substrate degradome, of each protease requires new high content techniques (3).
Traditionally matrix metalloproteinases (MMPs) were simply thought to degrade all components of the extracellular matrix, an important role nonetheless and one that has implicated this family of 23 proteinases in many pathologies such as arthritis, inflammatory bowel disease, and cancer metastasis (4). For example, the gene for matrix metalloproteinase 2, MMP-2 (also known as gelatinase A), a secreted protease, is one of only four genes in the genomic signature associated with the most highly virulent breast cancer metastases in the lung (5), but its role here is not clear. Recent data have shown the substrate degradome of MMPs to be far more extensive than matrix components alone, and rather than degradation-to-completion, MMPs precisely process a large number of bioactive molecules involved in cell adhesion and migration (6), angiogenic switching (7), cell growth (8,9), and regulation of innate immunity (10,11). In view of these important new roles elucidated through the discovery of new substrates, MMPs have been proposed to be key homeostatic regulators of the extracellular environment both in terms of the extracellular matrix and the signaling milieu controlling cell function (12). By these two key global roles, MMPs can modulate cell function in many normal physiological processes and, in so doing, buffer against pathological changes. Conversely MMPs might effect pathological disruption of tissues through dysregulated expression leading to highly elevated activity and new substrate degradomes accordingly. Hence the roles of MMPs when expressed at normal and highly elevated levels are important to understand and characterize to elucidate the function of MMPs as drug targets and antitargets (1).
Most techniques for the identification of protease substrates involve artificial systems that cannot be adapted to cell-based analysis. Bioinformatics searches for cleavage site sequences of individual proteases determined by phage display and peptide libraries (13-15) lead to the identification of relatively few natural substrates and a large number of false positives. This is because exosite interactions cannot yet be predicted (6,16), preferred cleavage sites are not always used for proteolysis, and most protease substrates are in the native folded state and not denatured. Cleavage patterns of native substrates in vivo can also differ from proteolysis of substrate candidates in vitro (9) due to restricted access of the protease to the substrate in vivo and the presence of ancillary binding proteins, cofactors, and cell receptors that might not be modeled in in vitro screens. This points to the need to screen for native substrates in the cellular context. Toward this difficult goal Guo et al. (17) identified shed proteins on SDS-PAGE gels that were isotopically labeled, and Hwang et al. (18) utilized two-dimensional PAGE to identify plasma protein substrates of membrane type 1 matrix metalloproteinase (MT1-MMP). Gel standardization was improved by Bredemeyer et al. (19) who applied fluorescence two-dimensional DIGE to identify substrates for granzyme B in mouse lymphoma cell lysates. Utilizing cysteine-targeted ICAT labeling of proteins and MS/MS, Tam et al. (9) proteomically identified novel substrates of MT1-MMP in the cellular context using MDA-MB-231 human breast cancer cells. However, ICAT only labels Cys-containing peptides, reducing proteome coverage, and commonly relatively few peptides are labeled and identified per protein, thereby reducing confidence in protein identification and quantitation.
Several techniques are in development for the proteomics identification of proteolytically generated neo-N termini (15,20,21) that can be applied to protease substrate identification, such as for caspases (21).
Here we describe the development and implementation of a new proteomics approach using isobaric tag labeling in a cell-based screen to improve proteome coverage, protein identification, and relative quantitation for the system-wide analysis of the effects of a transfected protease on the cell proteome. The data obtained can be further interrogated as a screen for protease substrates. An amine-targeted iTRAQ™ tag labels tryptic peptides generated from the proteins and protease cleavage products of secreted proteins, protein domains shed from the cell membrane, or pericellular matrix of protease-transfected cells that accumulate in the conditioned medium; a second iTRAQ tag is used for control cells. MS/MS fragmentation enables sequencing of the pooled pairs of differently labeled but identical peptides and generates a low mass signature ion peak unique for each label. This signature ion peak identifies the peptides originating from the protease-transfected or control cells; comparison of the peak areas enables relative quantitation. With four unique iTRAQ tags, up to four experimental conditions, such as time courses, drug doses, or cellular replicates, can be analyzed simultaneously.
Using this strategy we identified multiple changes in the extracellular proteome that were induced by low levels of active MMP-2 expressed in the cellular context. Proteolytic modification of signaling pathways led to the altered expression of many proteins. In addition, we could determine whether proteins were proteolytically shed from the cell membrane or pericellular matrix or whether secreted proteins were degraded by MMP-2. We identified known MMP-2 substrates and candidate bioactive and extracellular matrix substrates that were confirmed in secondary assays thereby validating this proteomics screen for the discovery of native protease substrates in the cellular context. With the same cell transfectants, we compared iTRAQ with ICAT labeling and found a 9-fold increase in the number of proteins identified by iTRAQ, an 8-fold increase in known substrates, and a 5-fold increase in substrate candidates. Furthermore we found that analysis of the relative abundance ratios of iTRAQ-labeled peptides within proteins in relation to their location in the protein structure correlated with the site of cleavage and domain shed from the cell surface or pericellular matrix.
Protein Concentration-The conditioned medium was acidified with TFA (0.1%, v/v) and applied to C4 and C18 solid phase extraction cartridges (VYDAC) equilibrated with 0.1% TFA and connected in tandem. After application of medium (30 ml), the cartridges were separated and washed with 5% ACN, 0.1% TFA to remove bound riboflavin, a yellow culture medium additive (55). The cartridges were re-equilibrated with 0.1% TFA and reconnected in tandem. After application of up to 240 ml of medium in 30-ml aliquots, the cartridges were separated, and the bound proteins were eluted with 1 ml of 75% ACN, 0.1% TFA. The eluate was concentrated to 100 µl by centrifugation under vacuum and diluted to 1 ml with 50 mM Hepes, pH 8.0, and the protein concentration was determined by BCA assay (Pierce).
iTRAQ Labeling-Protein (100 µg) from ΔppMMP-2 and ΔppE375AMMP-2 transfectants was acetone-precipitated overnight at −20°C and resuspended in 30 µl of iTRAQ Dissolution Buffer (Applied Biosystems, Foster City, CA). iTRAQ Denaturant, with 2% SDS, was used to completely dissolve the precipitate. Proteins were first reduced with 3.5 mM tris(2-carboxyethyl)phosphine hydrochloride at 60°C for 1 h, cysteines were then blocked with 6.7 mM methyl methanethiosulfonate at room temperature for 10 min, and then proteins were digested overnight at 37°C with sequence grade modified trypsin (Promega) (1:10) in 0.5 M triethylammonium bicarbonate (80 µg/ml). Digests were dried by centrifugation under vacuum, resuspended in 0.5 M triethylammonium bicarbonate (30 µl), and amine-labeled with one of the four iTRAQ (Applied Biosystems) mass tags at 25°C for 1 h, and then equal amounts were pooled (55).
Multidimensional Liquid Chromatography-iTRAQ- or ICAT-labeled samples were diluted to 2 ml with 10 mM KH2PO4, pH 2.7, 25% ACN before HPLC on a Polysulfoethyl A (Poly LC, Columbia, MD) 100 × 4.6-mm, 5-µm, 300-Å strong cation-exchange column at 0.5 ml/min. The column was allowed to equilibrate for 20 min in 10 mM KH2PO4, pH 2.7, 25% ACN before a 30-min gradient was applied to 35% 10 mM KH2PO4, 25% ACN, 0.5 M KCl with 1-min fractions collected. These were then reduced in volume by centrifugation under vacuum, and each was injected in 95% solvent A (2% ACN, 0.1% TFA) and allowed to equilibrate on the trapping column for 10 min to wash away any contaminants. Upon switching in line with a QStar Pulsar mass spectrometer (Applied Biosystems), a linear gradient from 95 to 40% solvent A was developed over 40 min. In the following 5 min the composition of the mobile phase was increased to 80% solvent B (98% ACN, 0.1% TFA) before decreasing to 95% solvent A for a 15-min equilibration before the next sample injection (55).
Mass Spectrometry-MS data were acquired automatically using Analyst QS 1.0 software Service Pack 8 (Applied Biosystems/MDS Sciex, Concord, Canada). An information-dependent acquisition method was used, consisting of a 1-s TOF MS survey scan of mass range 400-1,200 or 300-1,500 amu and two 2.5-s product ion scans of mass range 100-1,500 amu. The two most intense peaks over 20 counts with a charge state of 2-5 were selected for fragmentation, and a 6 amu window was used to prevent the peaks from the same isotopic cluster from being fragmented again. Once an ion was selected for MS/MS fragmentation it was added to an exclusion list for 180 s. Curtain gas was set at 23, nitrogen was used as the collision gas, and the ionization tip voltage was 2,700 V. If the A215 was greater than 0.1 for any fraction collected during the strong cation-exchange fractionation, a 2.5-h gradient (95-40% solvent A) was used to compensate for the higher peptide concentration in that fraction.
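The precursor selection rules stated above (two most intense peaks over 20 counts, charge state 2-5, a 6 amu window, and a 180-s exclusion list) can be illustrated with a minimal Python sketch. This is not the Analyst QS acquisition code; the `Peak` class and `select_precursors` function are hypothetical names, and treating the 6 amu window as ±3 amu around an already selected m/z is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio from the survey scan
    intensity: float  # detector counts
    charge: int       # assigned charge state


def select_precursors(peaks, now_s, exclusion, top_n=2, min_counts=20.0,
                      charges=(2, 3, 4, 5), window_amu=6.0, exclusion_s=180.0):
    """Pick up to top_n precursors from one survey scan.

    `exclusion` maps an excluded m/z to its expiry time in seconds and is
    updated in place. The +/- window_amu/2 interpretation is an assumption.
    """
    # Drop exclusion entries whose 180-s lifetime has passed.
    for mz in [m for m, expiry in exclusion.items() if expiry <= now_s]:
        del exclusion[mz]

    chosen = []
    for peak in sorted(peaks, key=lambda p: p.intensity, reverse=True):
        if len(chosen) == top_n:
            break
        # Require more than min_counts and an allowed charge state.
        if peak.intensity <= min_counts or peak.charge not in charges:
            continue
        # Skip peaks close to an excluded or already chosen m/z.
        blocked = list(exclusion) + [c.mz for c in chosen]
        if any(abs(peak.mz - mz) <= window_amu / 2 for mz in blocked):
            continue
        chosen.append(peak)

    # Newly fragmented ions go on the exclusion list.
    for peak in chosen:
        exclusion[peak.mz] = now_s + exclusion_s
    return chosen
```

Calling the function once per survey scan with a shared `exclusion` dictionary mimics how a selected ion is skipped in later scans until its exclusion time has elapsed.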
Data Analysis-Ratios of the 114.1, 115.1, 116.1, and 117.1 amu signature mass tags generated upon MS/MS fragmentation from the iTRAQ-labeled tryptic peptides were calculated using ProQuant software (Version 1.0) (Applied Biosystems) in Analyst. The MS and MS/MS tolerances were set to 0.2 Da. The Mass Spectrometry Protein Sequence Database (July 13, 2005) (Imperial College, London, UK) or National Centre for Biotechnology Information non-redundant protein database were used for searching iTRAQ- and ICAT-identified peptides. Methyl methanethiosulfonate modification of cysteines was used as a fixed modification, and one missed tryptic cleavage was allowed. All results were written to a Microsoft Access database. To reduce protein redundancy, experimental software ProGroup viewer (1.0.6, Applied Biosystems) was used to assemble and report the data. All proteins identified at ≥99% confidence were then manually reconfirmed using the Swiss-Prot sequence database.
ICAT ratios between isotopically heavy and light tryptic peptides were calculated using PROICAT (Applied Biosystems) software and averaged if multiple peptides for a single parent protein were found. Peptides that contained an Arg or Lys amino acid within the fragment (incomplete tryptic digest) or had a 'confidence level' below 99% were removed. Protein identification was as described for iTRAQ-labeled peptides.
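The ratio calculations described above were performed with ProQuant and PROICAT; the following Python sketch only illustrates the underlying arithmetic. The function names are hypothetical, the assignment of particular reporter masses to the protease and control samples is an assumption (the text does not state it), and an arithmetic mean is assumed for averaging peptide ratios.

```python
import math
from statistics import mean

# Reporter (signature) ion masses named in the text, used here as labels.
REPORTERS = ("114.1", "115.1", "116.1", "117.1")


def peptide_ratio(areas, protease_tag, control_tag):
    """Ratio of reporter peak areas, protease sample over control sample.

    `areas` maps a reporter label to its MS/MS peak area. A zero control
    area cannot be quantified and is returned as infinity in this sketch.
    """
    control = areas[control_tag]
    if control <= 0:
        return math.inf
    return areas[protease_tag] / control


def protein_mean_ratio(peptides, min_confidence=99.0):
    """Mean ratio over peptides that meet the confidence cut-off.

    `peptides` is a list of dicts with 'ratio' and 'confidence' keys.
    Returns None when no peptide passes the filter.
    """
    kept = [p["ratio"] for p in peptides if p["confidence"] >= min_confidence]
    return mean(kept) if kept else None
```

For example, with hypothetical peak areas of 200 for the protease-sample tag and 50 for the control tag, `peptide_ratio` returns 4.0.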
Substrate Cleavage Assays-The concentration of active MMP-2 after p-aminophenylmercuric acetate activation (1 mM, 15 min) was determined by active site titration against TIMP-2 (24). Active MMP-2 was incubated with the candidate substrates in 50 mM Tris-HCl, 200 mM NaCl, 5 mM CaCl2, and 0.025% NaN3 for 16 h at 37°C. Reaction products were analyzed by Tris-glycine or Tris-Tricine SDS-PAGE and Western blotted or silver-stained. The mass of each cleavage product was determined following MALDI-TOF MS on a Voyager-DE™ STR Biospectrometry Workstation (Applied Biosystems). MS data were deconvoluted to identify the substrate cleavage sites and confirmed by Edman sequencing.

RESULTS
Active MMP-2 Expression-We proteomically examined the native extracellular proteome molded by a secreted protease, MMP-2, in the cellular context. Mmp2−/− murine embryonic fibroblasts were transfected with human MMP-2, an experimental design selected to improve the signal to noise due to the non-MMP-2 exposed proteome in the control cells. MMP-2, like all MMPs, is expressed as an inactive zymogen. Proteolytic removal of the propeptide, leading to active MMP-2, is initiated in vivo by active MT-MMPs, in particular MT1-MMP, in a TIMP-2-dependent manner (22,26,27). Experimentally all pro-MMPs can be activated by organomercurials such as APMA, but APMA is cytotoxic. Concanavalin A induces the cellular activation of MMP-2 but also elevates the expression of multiple MMPs including the MMP-2 activator MT1-MMP (25,26,28). Transfection of MT1-MMP leads to MMP-2 activation (22), but proteomics analysis of such a system will also reveal the effects of MT1-MMP activity on the proteome as we showed previously (9). Therefore, to simplify the system we deleted by protein engineering the MMP-2 propeptide (ΔppMMP-2) and compared the secreted and shed proteome of these cells with that conditioned by the catalytically inactive ΔppE375AMMP-2 mutant. Hence this enables the study of active MMP-2 proteolysis in the absence of active MT1-MMP or artificial agents such as concanavalin A, which also induces apoptosis and hence many changes in the cellular proteome.
Equivalent amounts of ΔppMMP-2 and ΔppE375AMMP-2 protein expressed in the transfectants were shown by Western blotting (Fig. 1A). Zymography confirmed MMP-2 activity and its absence in ΔppE375AMMP-2 or vector controls (Fig. 1B). TIMP-2, a regulator of MMP-2 activity, did not change in expression (Supplemental Table S1). In aggressive breast carcinoma lung metastases highly elevated MMP-2 expression is a hallmark feature (5) as it is for many other cancers and other pathologies (1, 8, 12, 27, 29-31). Nonetheless, to avoid a system that has unnaturally high enzyme:substrate ratios that might lead to cleavage of nonpreferred substrates in physiological processes (1), we selected a clone that expressed very low amounts of active MMP-2 (136 ng/1 × 10⁶ cells/24 h) (Fig. 1D); this level was lower than the level of active MMP-2 in Mmp2+/− cells (4,123 ng/1 × 10⁶ cells/24 h). Further confirmation that the system was in the physiological range of activity was provided by the levels of active enzyme following APMA activation of the conditioned medium from Mmp2+/− cells (Fig. 1D). Notably the levels of activated enzyme in these cells approached the levels naturally expressed by tumors such as fibrosarcoma HT1080 cells (Fig. 1, C and D), and both were ~100-fold higher than the levels expressed by the active MMP-2 transfectants. Indeed the activity of this low expression of the transfected protease could not be detected by conventional quenched fluorescent synthetic MMP peptide substrate cleavage activity in the conditioned medium. This also indicates that other MMPs, which might otherwise have confounded our results, were not active or were expressed at very low levels. Indeed the related gelatinase MMP-9 was present entirely in the latent zymogen form (data not shown).
iTRAQ Analysis-Serum-free conditions were used to increase specific iTRAQ labeling of cellular proteins. Cell morphology at the end of experiments was essentially unchanged from that at the start. Data are reported from three separate iTRAQ experiments covering five time points as follows: two multiplex time course cell culture experiments (3 + 24 h and 3 + 48 h; experiments 1 and 3, respectively) and one comparison after 24-h culture (experiment 2). The relative abundance of iTRAQ-labeled tryptic peptides, generated from the conditioned medium proteins in the protease versus control cells, was determined upon MS/MS fragmentation. Comparison of the areas under the ion peaks of the fragmented 114.1, 115.1, 116.1, and 117.1 amu iTRAQ labels (32, 55) enabled the relative abundance of identical peptides from two to four samples to be determined simultaneously. Total numbers of proteins identified from one or more peptides are presented in Table I; full data sets for every experiment are presented in Supplemental Table S3, A, B, and C. Relative abundance ratios of proteins identified by only one peptide could not be confirmed by proteomics analysis of other peptides, and so only proteins identified by two or more peptides, each with a confidence level ≥99%, were analyzed further.
Abundance ratio trends were very reproducible between experiments and within an experiment. In experiments 1 and 3 only 5 of 347 (1.4%) and 9 of 519 (1.7%) proteins, respectively, showed iTRAQ ratios <1.0 at 3 h but >1.0 at 24 or 48 h, but this was likely due to undersampling (33); indeed most of these were identified by only one peptide. Comparing the total number of proteins identified at 48 h in experiment 3 with those also found at 24 h in experiment 1, only 6 of 215 (2.8%) proteins showed inconsistent changes in abundance trends.
Comparing 48-h data with the single 24-h time point in experiment 2, this was 4 of 172 (2.3%), and for proteins identified in all three experiments only 2 of 84 (2.4%) proteins were inconsistent with high ratios in one experiment but low ratios in the other. The number of inconsistencies is further reduced when only the common peptides identified in both experiments are used to generate the mean ratio. For example, peptide 150-165 of osteonectin was only identified in experiment 1 (ratio of 10.1 at 3 h that increased to 27.6 at 24 h, indicating release of this portion of the molecule to the medium as discussed later), and the other four peptides had a mean of 0.9 at 3 h. This is the same as the mean for these same peptides identified in experiment 3 at 3 h (mean = 0.9).
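The reproducibility percentages quoted above follow directly from the reported counts, as the short Python check below shows. The helper names `percent` and `trend_reverses` are hypothetical, and the labels in the dictionary are paraphrases of the comparisons in the text rather than terms used by the authors.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def trend_reverses(ratio_early, ratio_late):
    """True when a ratio is below 1.0 at 3 h but above 1.0 at the later time."""
    return ratio_early < 1.0 < ratio_late


# (inconsistent proteins, proteins compared), counts as reported in the text.
reported = {
    "experiment 1, 3 h versus 24 h": (5, 347),
    "experiment 3, 3 h versus 48 h": (9, 519),
    "experiment 3 (48 h) versus experiment 1 (24 h)": (6, 215),
    "48 h versus experiment 2 (24 h)": (4, 172),
    "all three experiments": (2, 84),
}

for label, (inconsistent, total) in reported.items():
    print(f"{label}: {inconsistent}/{total} = {percent(inconsistent, total)}%")
```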
We hypothesized that cleaved proteins would accumulate in the culture medium if proteolytically shed from the cell surface or released from the pericellular matrix by MMP-2 proteolysis and so would have an iTRAQ ratio >1.0 for ΔppMMP-2:ΔppE375AMMP-2 (MMP-2:E/A). On the other hand, protein levels would decrease if degraded by MMP-2 or if processed and subsequently cleared by the cell; here the MMP-2:E/A iTRAQ ratio would be <1.0. Because ribosomal proteins exhibited iTRAQ ratios between 0.4 and 1.9, only proteins identified by two or more peptides with iTRAQ ratios ≤0.4 (5-9%) or ≥2.0 (28-44%) were considered to have reliably altered abundance levels. Peptides having an iTRAQ ratio >30 (1-7%) were considered singletons as one peptide of the pair could not be detected above the background noise resulting in an inability to accurately determine the area of such a very low ion peak. A ratio of 30 therefore depicts a large unquantifiable change and indicates a very strong substrate candidate. We analyzed conditioned medium samples from transfectants at 3 h. To improve coverage of proteins present in low amounts, we also compared samples at 24 or 48 h where cleavage products might have further accumulated with the caveat that indirect effects on cells might be more apparent. These include proteolysis by other proteases in the protease web (1) or altered protein synthesis induced by proteolytic switching of signaling circuits.

TABLE I
Comparison of the number of proteins identified by one or more tryptic peptides in iTRAQ and ICAT isotopic mass tagging experiments
Shown is the number of proteins identified by iTRAQ-labeled or ICAT-labeled (shaded) tryptic peptides in the conditioned medium from Mmp2−/− fibroblasts transfected with ΔppMMP-2 or ΔppE375AMMP-2. Conditioned medium proteins were collected at 3 and 24 h (experiment 1), 24 h alone (experiment 2), 3 and 48 h (experiment 3), and 48 h alone (experiment 4). Data are presented from experiments 1 and 2 for the 24-h time points, and experiments 3 (iTRAQ) and 4 (ICAT) for the 48-h time points as indicated. An increased number of protein identifications resulting from peptide ion peak amplification were observed in iTRAQ experiments 1 and 3 due to multiplexing four different samples compared with a single sample pair in experiment 2. A confidence level of ≥99% was the inclusion criterion for identification of tryptic peptides. The number of proteins identified in all multiplex samples by one (1) or by two or more (≥2) isotopically labeled tryptic peptides is shown. Because more peptides from a protein are labeled with amine-based iTRAQ tags than can be labeled by cysteine-based ICAT labeling, more proteins were identified by ≥2 peptides with iTRAQ. Therefore, the protein identification for iTRAQ is made with a greater degree of confidence in comparison with ICAT. The three iTRAQ experiments consistently identified 9-fold more unique proteins, and by multiple peptides more times, than seen by ICAT using the same transfected cell pairs. From the 519 proteins identified at 48 h by iTRAQ, 36 were also identified by ICAT (Supplemental Table S4B). Full data sets of experiments 1-3 (iTRAQ) and experiment 4 (ICAT), which this table summarizes, are presented in Supplemental Tables S3, A-C, and S4A.

TABLE II
Temporal changes in abundance levels of known and candidate MMP-2 substrates identified by iTRAQ proteomics in conditioned medium from active MMP-2-transfected cells
Shown are relative abundance ratios of iTRAQ-labeled tryptic peptides identified in the conditioned medium from Mmp2−/− fibroblasts transfected with ΔppMMP-2 (MMP-2) or ΔppE375AMMP-2 (E/A). Conditioned medium proteins were collected at 3 and 24 h (experiment 1), 24 h (experiment 2), or 3 and 48 h (experiment 3), and the mean of the MMP-2:E/A ratios for each iTRAQ-labeled peptide at these time points was calculated. A ratio of 1.0 indicates no change in abundance of protein detected in the MMP-2 and E/A samples; ratios >1.0 indicate relative accumulation of protein in conditioned medium; ratios <1.0 indicate relative depletion of protein from conditioned medium and were reflected by the smaller numbers of peptides identified per protein due to MMP-2 activity. A confidence level ≥99% was the inclusion criterion for identification of tryptic peptides. The number of unique iTRAQ-labeled tryptic peptides is shown in parentheses with different forms of the same peptide being counted as one peptide; multiple identifications of the same peptide were also counted as a single identity. Peptides having an iTRAQ ratio >30 were considered singletons as one peptide of the pair could not be detected above the background noise resulting in an inability to accurately determine the area of such a very low ion peak. This ratio depicts a large unquantifiable change and indicates a very strong substrate candidate. Full data sets from which these selected proteins were extracted are presented in Supplemental Tables S3, A, B, and C.
e Proteins identified with one unique tryptic peptide at 95% confidence level and one at 99%.
f Proteins having highly dispersed peptide ratios and with one or more peptide with a ratio ≥4.0 were subjected to peptide mapping and are presented in Table III.
g Heat shock protein-90β is included to highlight the specific response of MMP-2 compared with heat shock protein-90α.
h NCAM-1 was designated as a candidate substrate because other family members were known substrates despite being identified by only one peptide.
Known Substrates-To validate the relative quantitation of peptides identified by MS/MS as a bottom-up screen for protease substrates, we first searched for known substrates of MMP-2 in the proteins identified in the conditioned medium of the protease-transfected cells having iTRAQ ratios ≤0.4 or ≥2.0 (Table II). The chemokine monocyte chemotactic protein-3 was identified (10), but most were extracellular matrix substrates including collagens, decorin, and fibronectin. By 24 and 48 h, their cleavage products continued to accumulate from that observed at 3 h in the conditioned medium.
Validation of Substrate Prediction-The known MMP-2 substrates had MMP-2:E/A iTRAQ relative abundance ratios ≤0.5 or ≥3.9 at 24 h (Table II). Therefore, we considered that only proteins identified with a relative abundance ratio ≤0.25 or ≥4.0 were strong MMP-2 substrate candidates. To confirm this hypothesis we first selected such proteins for secondary biochemical validation that were substrates of other MMPs or that have family members that are known MMP substrates (Table II). Cytoplasmic proteins were not considered.
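The ratio cut-offs used in this screen (two or more peptides per protein, ≤0.4 or ≥2.0 for reliably altered abundance, ≤0.25 or ≥4.0 for strong candidates, and >30 for unquantifiable singletons) can be collected into a small Python sketch. The function name and the category labels are hypothetical, and applying the >30 rule to a protein-level ratio rather than to individual peptides is a simplification made here for illustration.

```python
def classify_ratio(ratio, n_peptides):
    """Assign an MMP-2:E/A iTRAQ ratio to one of the screen's categories.

    Category labels are illustrative and are not the authors' terminology.
    """
    if n_peptides < 2:
        # Single-peptide ratios could not be confirmed by other peptides.
        return "unconfirmed"
    if ratio > 30:
        # One member of the pair was at background: large unquantifiable change.
        return "unquantifiable"
    if ratio <= 0.25 or ratio >= 4.0:
        return "strong candidate"
    if ratio <= 0.4 or ratio >= 2.0:
        return "altered"
    return "unchanged"
```

A protein with a mean ratio of 3.0 from two peptides would fall in the "altered" class but below the strong-candidate cut-off; as the secondary assays in this study show, any class assignment still requires biochemical validation.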
The cell surface galactose-binding protein galectin-1, which is involved in the tumor hypoxia response, exhibited iTRAQ peptide ratios >4.0 at 3 and 24 h in experiment 1 and ratios >30 at 3 and 48 h in experiment 3. Galectin-3 is a known substrate of MMP-2 (34) that was also identified here with ratios >4.0 (Table II). We confirmed MMP-2 cleavage of galectin-1 (Fig. 2A) near the N and C termini (¹³PGQ↓CLR and ¹²⁹KCV↓AFD, respectively) by MALDI-TOF MS analysis of the cleavage products (Fig. 2B). This is the first known example of a protease cleavage site with a P1′ cysteine. Notably consensus cleavage sites derived from positional scanning peptide libraries do not include cysteine because it is excluded during peptide synthesis (13-15), thereby underestimating its contribution in substrate recognition. Osteopontin peptide levels were greatly increased in the conditioned medium after 3 h (Table II) when active MMP-2 was expressed compared with the inactive control. Osteopontin is a substrate of MMP-3 and -7 (35), and we found that osteopontin was also cleaved by MMP-2 into two major ~20-kDa fragments (Fig. 2C). The α and β isoforms of HSP90 have been reported to interact with MMP-2 leading to its activation. Both forms were identified in conditioned medium of cells expressing active MMP-2, but only HSP90α had increased iTRAQ ratios at 24 h (Table II). HSP90α was efficiently cleaved by MMP-2, generating two major ~80-kDa fragments and then a ~50-kDa fragment before final clearance (Fig. 2D).
Although our criteria for strong substrate candidates are those proteins with iTRAQ ratios ≤0.25 and ≥4.0, we do not expect that all these proteins will be substrates, but they do reflect the indirect effects of MMP-2 proteolysis on the proteome. Macrophage migration-inhibitory factor was the only protein characterized with MMP-2:E/A ratios >4.0 that was not cleaved in vitro even at enzyme:substrate ratios of 1:10. In the cellular context, other binding molecules might facilitate its cleavage by MMP-2 or by other proteases regulated by MMP-2. However, analysis of the 31 proteases identified (Supplemental Table S2) showed that, in general, protease levels were not markedly altered in the transfectants. Of the extracellular proteases, six were up-regulated, and five were down-regulated, but their activation state is unknown except for MMP-9, which was entirely in the latent zymogen form (data not shown). MMP-2 might regulate other proteases by modulating protease inhibitor activity. Nine protease inhibitors showed changes in iTRAQ ratios when active MMP-2 was present (Supplemental Table S1). We found that cystatin C, an inhibitor of cathepsins B, H, and L, was cleaved by MMP-2 at ⁶PPR↓LVG, reducing its inhibition of cathepsin L 2-fold (data not shown). In a complex cellular environment signaling circuits that are regulated by MMP-2 processing of bioactive molecules might result in altered cellular responses that were proteomically detected. To mitigate indirect effects, the earliest time point where sufficient protein could be purified for proteomics analysis (3 h) was assessed. In the case of macrophage migration-inhibitory factor, its levels were elevated at all time points making it difficult to determine the reason for its altered expression. This highlights the necessity for secondary validation of functional proteomics experiments.
"Peptide Mapping" Predicts Substrate Domain Cleavage and Release-Some extracellular matrix proteins, such as PCPE and perlecan, and cell surface proteins like fractalkine exhibited dispersed relative abundance ratios for the peptides used to identify the protein (Table III). By mapping the location of the multiple peptides identified in a protein, distinct partitioning of their iTRAQ ratios was often observed. We hypothesized that portions of the protein that had peptides with high iTRAQ ratios represented the cleavage product that was released by MMP-2 proteolytic activity, whereas the peptides that did not show such great changes might be in the remnant protein that was mostly retained on the cell membrane or in the pericellular matrix. Mapping of iTRAQ-labeled tryptic peptides from PCPE highlighted four peptides in the C-terminal region with high iTRAQ ratios (~4.0 to >30) compared with five peptides near the N terminus that showed no change (mean ratio of 1.2) (Fig. 3, A and B). Because PCPE binds to procollagen C-propeptide via its N-terminal CUB domain (36), the iTRAQ ratios and peptide mapping are consistent with the shedding of the C-terminal netrin domain of PCPE upon proteolysis with the N-terminal CUB domain remaining bound to the procollagen. We confirmed this using C-terminal FLAG-tagged PCPE in culture medium that was cleaved by MMP-2 to release a 22-kDa C-terminal FLAG-tagged fragment encompassing the netrin domain (Fig. 3C), consistent with a cleavage site between the two sets of mapped peptides. CX3CL1 has an N-terminal chemokine domain linked to a transmembrane domain via an extended mucin-rich stalk (37) (Fig. 4A). All three peptides identified in the chemokine domain showed high iTRAQ ratios at 3 h that further increased over 48 h up to >30, indicating shedding by MMP-2. On the other hand, only one peptide from the stalk was identified, and it had a much lower ratio (1.6 and 3.3 at 3 and 48 h, respectively).
This indicates that the stalk was mostly retained on the cell membrane but could be released to the conditioned medium in low amounts, possibly by cell lysis or by shedding of the stalk by other proteases such as ADAM10 and ADAM17 at a cleavage site in the mucin repeats near the cell membrane (38,39). We biochemically confirmed the cleavage of the chemokine domain using recombinant CX3CL1 ectodomain protein (Fig. 4B) with MALDI-TOF MS data and N-terminal sequencing showing that the major 7.7-kDa cleavage product extended from residues 5 to 71 of the 76-residue chemokine domain (Fig. 4C). A fragment of similar molecular mass (~8 kDa) was immunoreactive with an αCX3CL1 chemokine domain monoclonal antibody in concentrated conditioned medium from ΔppMMP-2 but not from vector or ΔppE375AMMP-2 control cells (Fig. 4D). The detection of four peptides of CX3CL1 highlights the sensitivity of this proteomics approach as the conditioned medium from the cell cultures was concentrated 1000× to detect the shed chemokine domain by Western blotting.
Shown are relative abundance ratios of iTRAQ-labeled tryptic peptides identified in the conditioned medium from Mmp2−/− fibroblasts transfected with ΔppMMP-2 (MMP-2) or ΔppE375AMMP-2 (E/A). Conditioned medium proteins were collected at 3 and 24 h (experiment 1), 24 h (experiment 2), or 3 and 48 h (experiment 3), and the individual MMP-2:E/A ratios for each iTRAQ-labeled peptide identified (confidence level of ≥99%) at these time points are presented. Different forms of the same peptide are counted as one peptide. The number of amino acids in the protein identified is shown in brackets. A ratio of 1.0 indicates no change in abundance of protein detected in the MMP-2 and E/A sample; ratios >1.0 indicate relative accumulation of protein in conditioned medium. Peptides having an iTRAQ ratio >30 were considered singletons as one peptide of the pair could not be detected above the background noise resulting in an inability to accurately determine the area of such a very low ion peak. This ratio depicts a large unquantifiable change. Shaded peptides were markedly increased with iTRAQ ratios >4.0 in comparison with other peptides identified from the same protein. Full data sets from which these selected proteins were extracted are presented in Supplemental Tables S3, A, B, and C. -, no peptides detected.
1 Data collected from experiment 1 using four iTRAQ labels. 2 Data collected from experiment 2 using two iTRAQ labels. 3 Data collected from experiment 3 using four iTRAQ labels. 4 Abbreviations used in protein list: PEDF, pigment epithelium-derived factor; HDGF, hepatoma-derived growth factor. 5 Peptides are shown with N- and C-terminal residues numbered. 6 iTRAQ-labeled tryptic peptides identified at ≥95% confidence level. 7 Predicted to have both N- and C-terminal cleavages.
Improved Protein Coverage by iTRAQ Versus ICAT. 9-fold more proteins were identified by iTRAQ (519) than by ICAT (56) at ≥99% confidence in 48-h conditioned medium (Table I). 33 proteins were identified by both iTRAQ and ICAT in the 48-h samples (Supplemental Table S4B). Of the three that showed significantly increased abundance by ICAT (ratios ≥2.0), two had inconsistent ratios compared with the iTRAQ samples; and of the 12 proteins showing reduced abundance by ICAT (ratios ≤0.4), four were inconsistent with iTRAQ. For procollagen C-proteinase enhancer, two peptides were identified, and these were localized in the N-terminal region of the protein, which is not shed (Fig. 4C), explaining the low ratios (0.2) found for the two peptides identified by ICAT. However, for the other 11 proteins only one peptide was identified per protein by ICAT, highlighting the need for a technique that identifies more than one peptide per protein for reliable analysis.

DISCUSSION
Multiple effects on the extracellular proteome that were wrought by MMP-2 proteolysis were revealed using a new proteomics strategy. Despite the complexity of cell-based screens, the use of stable isotope labeling in conjunction with tandem MS proteomics enables the high content analyses that are required for system-wide analysis of proteolysis. Ours is the first application of iTRAQ labeling with tandem MS as a screen for protease substrates; furthermore it provides the first analysis of iTRAQ-labeled secreted proteins in cell-conditioned medium. Proteins that upon MMP-2 cleavage shed ectodomains or peptides from the cell surface microenvironment to the conditioned medium showed high iTRAQ ratios for their tryptic peptides versus control samples. Secreted proteins that were degraded showed low iTRAQ ratios and were identified by fewer peptides, reflecting proteolytic clearance. By mapping identified peptides to the corresponding protein sequence and comparing their iTRAQ abundance ratios we could also determine the location within a protein of the released cleaved fragment, a technique we term peptide mapping. Other proteins that were altered in abundance were not substrates, and these changes were most likely due to knock-on effects of MMP-2 activity executed by altered signaling circuits. We conclude that perturbing a cellular system by expression of even low amounts of active MMP-2 led to many changes in the levels of proteins in the conditioned medium. This is an important caveat for correctly interpreting all protease transgenic or genetic knock-out cell and animal studies, which often focus on only one or a few substrates or mechanistic pathways.
Peptides having an iTRAQ ratio >30 were considered singletons as one peptide of the pair could not be detected above the background noise resulting in an inability to accurately determine the area of such a very low ion peak. This ratio depicts a large unquantifiable change. 1, data collected from experiment 1 using four iTRAQ labels; 2, data collected from experiment 2 using two iTRAQ labels; 3, data collected from experiment 3 using four iTRAQ labels; 4, iTRAQ-labeled tryptic peptide identified at ≥95% confidence level. C, Western blot analysis of MMP-2 processing of human C-terminal FLAG-tagged PCPE. The 22-kDa fragment was immunoreactive with the α-FLAG antibody confirming the release of the C-terminal netrin domain upon MMP-2 cleavage. Molecular mass markers (×10⁻³ Da) are shown.
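The peptide mapping idea described in this section can be illustrated with a short sketch. The text does not describe an algorithm for locating the boundary between high-ratio and low-ratio peptides, so the approach below (choose the split along the sequence that minimizes the within-group variance of log2 ratios) is an assumption made for illustration, as are the function names and the peptide coordinates. Ratios above 30 are capped at 30, mirroring the treatment of singletons as a large unquantifiable change.

```python
import math


def _sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def map_cleavage_region(peptides, cap=30.0):
    """Estimate where a shed fragment begins.

    peptides: list of (start_residue, end_residue, itraq_ratio).
    Returns (last_residue_before_gap, first_residue_after_gap, shed_side),
    where shed_side names the side with the higher mean ratio.
    """
    if len(peptides) < 2:
        raise ValueError("need at least two peptides to map a boundary")
    ordered = sorted(peptides)
    logs = [math.log2(min(ratio, cap)) for _, _, ratio in ordered]
    # Try every boundary between consecutive peptides and keep the best fit.
    best_i = min(range(1, len(logs)),
                 key=lambda i: _sse(logs[:i]) + _sse(logs[i:]))
    left_mean = sum(logs[:best_i]) / best_i
    right_mean = sum(logs[best_i:]) / (len(logs) - best_i)
    shed_side = "C-terminal" if right_mean > left_mean else "N-terminal"
    return ordered[best_i - 1][1], ordered[best_i][0], shed_side


# Hypothetical peptide positions and ratios (not data from the study):
# five unchanged N-terminal peptides and four elevated C-terminal peptides.
demo = [(40, 52, 1.1), (70, 81, 1.3), (120, 133, 1.2), (160, 171, 1.0),
        (210, 224, 1.4), (330, 341, 8.0), (360, 372, 12.0),
        (400, 412, 45.0), (430, 441, 15.0)]
region = map_cleavage_region(demo)
```

For the hypothetical input, the predicted cleavage region lies between residues 224 and 330 with the C-terminal side shed. Such a prediction only narrows the location; the exact site still needs N-terminal sequencing or MS of the cleavage product.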
Mmp2−/− cells were used to ensure that the control cell proteomes were not exposed to any endogenous MMP-2. In this way, we expected that the signal-to-noise ratio would be highest and so represent an optimal system to develop this proteomics strategy. However, using knock-out cells is not a requirement for further studies of other proteases. Human MMP-2 only differs from murine MMP-2 by 10 amino acids in the catalytic domain, none of which are in the active site, and so this is a good system to identify new substrates of human and murine MMP-2. As a variant screen, recombinant active proteases could also be added to cell cultures or proteomes, but we favor stable low level expressing transfectants, which ensure consistent low levels of protease expression at the natural location in the cellular microenvironment.
MMPs are translated as zymogens requiring proteolytic removal of the ~80-residue propeptide for activation. However, activation of pro-MMP-2 does not occur in unstimulated normal cell cultures, and all known stimuli activate multiple MMPs (25). Consequently proteases secreted as zymogens that are not endogenously activated cannot be compared using wild type and protease-null cells without experimental manipulation to activate the enzyme, a strategy that risks multiple and unpredictable indirect effects that will confound interpretation of the proteomics analyses. Therefore, to avoid experimental activation of MMP-2 in cell culture and to ensure consistent but low levels of active enzyme, we generated stable cell transfectants in which the MMP-2 propeptide was deleted by protein engineering, an expression system not reported previously for any MMP. Nonetheless, it is possible that dominant negative effects might be apparent in the control ΔppE375A MMP-2 transfectants. Although highly elevated MMP activity is a hallmark of many diseases (4), to avoid artificially high enzyme to substrate ratios, we selected a low expressing clone for these studies that expressed ~30-fold less active MMP-2 than Mmp2+/− cells and 100-fold less active enzyme than activated Mmp2+/− medium or than that expressed by HT1080 tumor cells. Therefore, proteins that are not physiological substrates of MMP-2 are unlikely to be targeted for proteolysis in this screen. However, such low expression systems may not detect less preferred substrates that might be important in pathologies characterized by abnormally high enzyme:substrate ratios.
Known substrates of MMP-2 were identified as having iTRAQ ratios increased or decreased by more than 4-fold in the conditioned medium from Mmp2−/− fibroblasts transfected with active protease, validating this proteomics approach for substrate discovery. As for all screens, candidate substrates require experimental validation to confirm their classification as a natural substrate. We selected candidate substrates having iTRAQ ratios >4-fold and biochemically confirmed a number of these including osteopontin, galectin-1, and HSP90α. In addition, connective tissue growth factor (CTGF), follistatin-related protein-1, insulin-like growth factor-binding protein-6, pleiotrophin, and cystatin C were also validated as substrates (data not shown). Although levels of HSP90β were not altered in MMP-2 transfectants, HSP90α iTRAQ ratios were high, indicating a specific response to MMP-2 (Table II). Activation of MMP-2 occurs in a trimolecular complex with MT1-MMP and TIMP-2 to generate an intermediate-activated form of MMP-2 before final activation by autocatalysis (22,26). Extracellular HSP90α has been proposed to increase MMP-2 activation (40). However, we could not confirm this,² consistent with the extensive fragmentation observed of HSP90α by MMP-2.
A, schematic representation of full-length CX3CL1 using Swiss-Pdb-Viewer (Protein Data Bank code 1B2T). For mature CX3CL1 the positions are shown of the iTRAQ-labeled tryptic peptides identified at ≥99% confidence in 3- and 48-h conditioned medium from Mmp2−/− cells transfected with ΔppMMP-2 (MMP-2) or the catalytically inactive MMP-2 mutant ΔppE375AMMP-2 (E/A). Tryptic peptide sequences and their position in the protein are indicated with the corresponding iTRAQ ratio for the peptide. Peptides having an iTRAQ ratio >30 were considered singletons as one peptide of the pair could not be detected above the background noise resulting in an inability to accurately determine the area of such a very low ion peak. This ratio depicts a large unquantifiable change. 1, iTRAQ-labeled tryptic peptide identified at 95% confidence level. B, Western blot analysis of MMP-2 cleavage of recombinant murine CX3CL1 (amino acid residues 1-337). Serum albumin was present as a carrier in the recombinant CX3CL1 preparation and is visualized as a negative stain at 65 kDa. Molecular mass markers (×10⁻³ Da) are shown. C, amino acid sequence from 1 to 76 is the functional mature CX3CL1 chemokine domain. Arrows indicate the site of MMP-2 cleavage. Numbered shaded boxes highlight iTRAQ-labeled tryptic peptides with increased ratios as shown in A. D, 24-h conditioned medium from MMP-2 or control cells was concentrated 1000× and analyzed by Western blotting for CX3CL1 chemokine domain using αCX3CL1 monoclonal antibody. The shed CX3CL1 domain was detected in conditioned medium of ΔppMMP-2 transfectants but not in the vector or ΔppE375A medium samples.
Using peptide mapping we predicted and confirmed the cleaved domains of PCPE and CX3CL1 that were shed upon proteolysis by MMP-2. PCPE stimulates procollagen processing, thereby triggering assembly of collagen fibrils (36). Interestingly the C-terminal netrin domain of PCPE is a weak inhibitor of MMP-2 (41) that might homeostatically regulate the levels of PCPE. A similar fragment has been detected in conditioned medium from invasive breast and brain tumor cells, but the protease was not identified (41). MMP-2 is one of four signature genes of the most highly virulent, lung-specific advanced breast cancer metastases (5) and thus might process PCPE at the breast stromal interface to negate protective fibrotic walling off responses by the stroma. Notably we also identified and confirmed CTGF as a new MMP-2 substrate. CTGF increases angiogenesis (42) and the production of extracellular matrix. It is also in the bone metastasis signature profile expressed by human breast cancer cells (43). Hence the cleavage and inactivation of CTGF by MMP-2 is complementary to the effects of PCPE cleavage and matrix degradative activities of MMP-2. CX3CL1 occurs as two forms: membrane-anchored where it acts as an adhesion molecule or a soluble chemoattractant extracellular form consisting of its mucin stalk and chemokine domain that is shed from the cell membrane by ADAM10 and ADAM17 (38,39). We found a new mechanism for release of the chemokine domain from the cell membrane by MMP-2 cleavage at 69AAA↓LTK. In addition we found an N-terminal tetrapeptide truncation of the chemokine domain that causes loss of chemotactic activity and converts the chemokine to a potent antagonist of the CX3CL1 receptor CX3CR (44). N-terminal truncation of monocyte chemoattractant proteins by MMP-2 also generates antagonists of the CC chemokine receptors that dampen inflammation (10,45). Hence our new data show that MMP-2 regulates multiple chemokine activities by proteolytic processing.
So, in addition to identifying a protein in a proteome, these examples of proteolytic processing of bioactive substrates highlight the need for post-translational modification analyses to understand the functional state of the proteome.
Every screen has its advantages and limitations. Identification of proteolysis cleavage products of native protein substrates is the most direct method of substrate discovery (3,9). Although altered substrate cleavage in protease-null or transgenic animals satisfies these criteria, these models are not necessarily applicable to human proteases, and phenotyping animal models can be slow. The major advantage of cell-based systems over other experimental screens, such as yeast two-hybrid, phage display, and peptide libraries, is that the proteases and native substrates can be assessed in their natural cellular context, and so it is a less artificial screen (9,15). Altered proteolysis in response to changes in growth conditions, such as treatment with growth factors and cytokines or drugs, can also be studied in cellular systems as opposed to other technologies such as phage display and peptide libraries. With the power of proteomics the analysis of proteolysis in complex cell-based systems on a system-wide basis is now possible, and therefore degradomics is a rapidly evolving field.
Whereas the proteomics identification of protease-generated neo-N termini of cleavage fragments in complex samples has the advantage of directly identifying the cleavage site, protein identification can be problematic as it is only based on a single truncated peptide (15,20,21). iTRAQ enables substrates to be identified by more than one peptide, and when peptide mapping can be applied, the general location of the cleavage site can be determined. iTRAQ tagging is still a nascent field, and although relative quantitation of peptide levels between experiments can vary, the reproducibility of the trend in ratios for hundreds of proteins is remarkable with only 2-4% of peptides showing inconsistencies. Notably up to four samples can be analyzed simultaneously by iTRAQ, improving peptide ion peak heights, proteome coverage, and confidence in substrate identification and providing an internal biological replicate for substrate identification through analysis of abundance ratio trends.
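One way to express the idea of "reproducibility of the trend in ratios" is to compare the direction of change for peptides seen in two experiments. The text does not state how inconsistencies were scored, so the rule below is an assumption: a peptide counts as inconsistent only when it is clearly increased in one experiment and clearly decreased in the other, with a hypothetical 2-fold band treated as unchanged. All names and values are illustrative.

```python
def direction(ratio, band=2.0):
    """Return +1 (increased), -1 (decreased), or 0 (within the neutral band)."""
    if ratio >= band:
        return 1
    if ratio <= 1.0 / band:
        return -1
    return 0


def inconsistent_fraction(exp_a, exp_b, band=2.0):
    """Fraction of shared peptides whose direction of change is opposite.

    exp_a, exp_b: dicts mapping a peptide identifier to its iTRAQ ratio.
    """
    shared = exp_a.keys() & exp_b.keys()
    if not shared:
        raise ValueError("no peptides in common")
    opposite = sum(
        1 for pep in shared
        if direction(exp_a[pep], band) * direction(exp_b[pep], band) == -1
    )
    return opposite / len(shared)


# Hypothetical ratios, for illustration only.
experiment_1 = {"pep1": 5.0, "pep2": 0.3, "pep3": 1.1, "pep4": 4.0}
experiment_3 = {"pep1": 8.0, "pep2": 0.2, "pep3": 0.9, "pep4": 0.4, "pep5": 2.0}
fraction = inconsistent_fraction(experiment_1, experiment_3)
```

Comparing directions rather than absolute values reflects the point made above that the magnitude of a ratio can vary between experiments while its trend is preserved.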
ICAT has been used to discover protease substrates (9) but identifies only cysteine-containing peptides and proteins, thereby reducing the coverage possible (7% of proteins have no cysteine, and 35% contain only one). Moreover only two samples can be compared. We found that iTRAQ enables ~9-fold more proteins to be identified with higher confidence and with multiple peptides in comparison with ICAT using the same cell transfectants (Table I). Comparing the peak lists from the two mass tagging procedures, iTRAQ identified more known substrates (8-fold), protease inhibitors (4-fold), and proteases (31-fold). Therefore this validates iTRAQ as an improved high content proteomics technique for substrate degradomics and one that should be generally applicable for other families of proteases and in different cellular and subcellular contexts. By moving beyond in vitro biochemical and peptide or phage library approaches, the application of this new proteomics strategy for native substrate discovery has the potential to greatly increase our understanding of the roles of proteases in complex cellular systems and to thereby identify and validate new drug targets.
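The coverage argument can be made concrete with a toy in silico digest. ICAT tags cysteine residues, so only cysteine-containing tryptic peptides are quantifiable, whereas amine labeling with iTRAQ can in principle tag every tryptic peptide. The sketch below uses a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) and a made-up sequence; both are assumptions for illustration.

```python
import re


def tryptic_peptides(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    # Zero-width split; requires Python 3.7 or later.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def icat_visible(peptides):
    """Peptides that carry at least one cysteine and so could be ICAT-tagged."""
    return [p for p in peptides if "C" in p]


# Hypothetical sequence, not a real protein.
sequence = "MKTAYCAKRPLSDEKGGCR"
peptides = tryptic_peptides(sequence)
taggable = icat_visible(peptides)
```

Here only two of the four peptides contain cysteine, so a cysteine-directed tag would report the protein from fewer peptides than an amine-directed tag, which is the limitation discussed above.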