Quantitative Multiplex Substrate Profiling of Peptidases by Mass Spectrometry

Proteolysis is an integral component of life and has been implicated in many disease processes. To improve our understanding of peptidase function, it is imperative to develop tools to uncover substrate specificity and cleavage efficiency. Here, we combine the quantitative power of tandem mass tags (TMTs) with an established peptide cleavage assay to yield quantitative Multiplex Substrate Profiling by Mass Spectrometry (qMSP-MS). This assay was validated with papain, a well-characterized cysteine peptidase, to generate cleavage efficiency values for hydrolysis of 275 unique peptide bonds in parallel. To demonstrate the breadth of this assay, we show that qMSP-MS can uncover the substrate specificity of minimally characterized intramembrane rhomboid peptidases, as well as define hundreds of proteolytic activities in complex biological samples, including secretions from lung cancer cell lines. Importantly, our qMSP-MS library uses synthetic peptides whose termini are unmodified, allowing us to characterize not only endo- but also exo-peptidase activity. Each cleaved peptide sequence can be ranked by turnover rate, and the amino acid sequence of the best substrates can be used for designing fluorescent reporter substrates. Discovery of peptide substrates that are selectively cleaved by peptidases which are active at the site of disease highlights the potential for qMSP-MS to guide the development of peptidase-activating drugs for cancer and infectious disease.

quantitative and can take advantage of natural and non-natural amino acids. However, these substrates lack extended amino acid sequence on the carboxy-terminal side of the scissile bond, and the location of the fluorophore adjacent to the scissile bond can impede enzyme-substrate binding interactions.
Our group previously developed a peptide degradation assay that uses tandem mass spectrometry to identify cleavage products within an equimolar mixture of tetradecapeptide substrates (19). These peptides were rationally designed such that all possible neighbor and near-neighbor amino acid pairs are present in the library and hydrolysis of any of the 2,964 peptide bonds can be detected by tandem mass spectrometry. In addition, every substrate has a unique dipeptide sequence on the amino and carboxyl termini to facilitate the characterization of amino- and carboxy-peptidase substrate specificity. The amino acid sequences of the synthetic substrates are distinct from endogenous proteins and peptides; therefore, when complex biological samples are assayed with this library, cleaved products are easily discriminated from endogenous peptides. This assay has been used to uncover the substrate specificity profile of enzymes from every peptidase family, and the data generated from these assays have guided the development of optimized fluorescent reporters (20,21), peptide inhibitors (22,23), activity-based probes (24), therapeutic peptides (25) and activity-based biomarkers (26). All peptide substrates are present at equimolar concentration and this assay identifies cleavage products that appear at defined time intervals following addition of enzyme.
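The bond count quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, the library size of 228 peptides is taken from the validation section of this paper, and the example sequence is one of the substrates quoted in the Results.

```python
def cleavage_products(peptide):
    """List (bond_index, n_terminal_fragment, c_terminal_fragment) for every peptide bond.

    bond_index is the number of residues on the N-terminal side of the bond,
    so a 14-mer has bonds 1 through 13.
    """
    return [(i, peptide[:i], peptide[i:]) for i in range(1, len(peptide))]


LIBRARY_SIZE = 228      # peptides in the library (stated in this paper)
PEPTIDE_LENGTH = 14     # tetradecapeptides

bonds_per_peptide = PEPTIDE_LENGTH - 1
total_bonds = LIBRARY_SIZE * bonds_per_peptide

# One library substrate quoted in the Results, used here only as an example.
example_products = cleavage_products("FRIHGFDEAHNAWL")

print(total_bonds)              # total peptide bonds in the library
print(example_products[0])      # products of cleaving the first bond
```

Each of the 13 bonds in a substrate gives one N-terminal and one C-terminal product, which is why every possible cleavage can in principle be traced back to a single bond in a single parent peptide.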
In this study, we utilized isobaric tandem mass tags (TMT) (27,28) to quantify all cleaved and uncleaved peptides within the tetradecapeptide substrate library following addition of a peptidase or a complex sample containing multiple peptidases. The reaction was incubated for up to 20 hours and different TMTs were used to label each of the assay time-points. Changes in peptide abundance were quantified and proteolytic kinetic constants were calculated (Fig. 1). In addition to providing quantification, TMT labeling can also minimize experimental and instrument variation while allowing us to gain crucial temporal information. This combination greatly improves reproducibility and accuracy, while also reducing instrument time by 10-fold.
We have named this approach Quantitative Multiplex Substrate Profiling by Mass Spectrometry (qMSP-MS).
First, we used a well-characterized peptidase, papain, to validate the qMSP-MS assay. In addition, the substrate specificity of two integral membrane peptidases was characterized, and a fluorescent reporter was developed based on the preferred peptide substrate, which can be used for high-throughput biochemical assays. Finally, we analyzed lung cancer cell secretions using our workflow and found that they generally produced exopeptidases that can degrade the peptides at the amino and carboxyl termini. These results highlight the potential of applying quantitative mass spectrometry to proteolytic assays, which can lead to the development of novel therapeutic delivery systems for important human diseases.
triplicate. Replicate lung cancer cell lines were labeled with different TMTs and assayed via mass spectrometry. The Aspergillus phoenicis extract was not labeled with TMT. Significance was assessed by ANOVA and Student's t-test; variance was assessed by an F-test to ensure the correct statistical assumptions were used. q-values of q ≤ 0.05 were considered significant.
Expression and purification of rhomboid proteins -The purification of all rhomboid proteins was similar to the previous report (29). Briefly, rhomboid genes were cloned into pBAD-Myc/HisA plasmid (Invitrogen) with a C-terminal tobacco etch virus (TEV) peptidase cleavage site, Myc-epitope and His6-tag, and expressed in TOP10 E. coli cells. The protein was induced with 0.002% w/v arabinose and expressed at 24 °C for 8 hr in LB media. The cells were harvested, resuspended in 50 mM Tris-HCl pH 8.0, 150 mM NaCl and lysed under high pressure using an EmulsiFlex-C3 (Avestin). The membranes were isolated by ultracentrifugation at 95,800 x g for 2 hr, solubilized in 50 mM Tris-HCl pH 8.0, 300 mM NaCl, 10 mM imidazole, 20% glycerol, 1% (w/v) DDM and applied onto a Ni-NTA column (Qiagen). The proteins were eluted with 250-500 mM of imidazole, 50 mM Tris-HCl pH 8.0, 300 mM NaCl, 20% glycerol, 0.1% DDM. From 1 L of cell culture, purified protein yield was 1-2 mg for PsAarA and 2-3 mg for HiGlpG. The His-tag was removed by TEV peptidase (1 mg per 100 mg of protein, overnight, 16 °C) and a subsequent
pyruvate (HyClone) and 100 U/mL penicillin-streptomycin (HyClone). All cells were maintained at 37 °C in an atmosphere of 5% CO2 and grown to ~80% confluence in triplicate T-175 flasks.
Insoluble debris was pelleted by centrifugation for 5 minutes at 18,000 x g at 22 °C. Supernatants were transferred into fresh tubes, and an equal volume of 8 M urea (Thermo) in 50 mM HEPES, pH 8.5 was added to each sample. Samples were then vortexed briefly and sonicated for 5 minutes in a sonicating water bath to maximize protein denaturation.
Proteins were reduced and alkylated as previously described (31). Proteins were then precipitated via a methanol-chloroform procedure (32). Precipitated proteins were re-solubilized in 300 μL of 1 M urea (Thermo) in 50 mM HEPES (Thermo), pH 8.5. Solubility was aided through vortexing, sonicating, and manual grinding. Proteins were then digested in a two-stage process.
First, 3 μg of LysC (Wako 129-02541) was added to each sample and allowed to incubate overnight at room temperature. Next, 3 μg of trypsin (Promega V5113) was added, and samples were allowed to digest for six hours at 37 °C. Digestion was quenched by addition of trifluoroacetic acid (TFA, Pierce). Peptides were desalted with C18 Sep-Paks (Waters) as previously described (33). Peptide concentration was determined with a PepQuant Assay (Thermo), and peptides were aliquoted into 50 μg portions which were dried under vacuum and stored at -80 °C until they were labeled with TMT reagents.
Peptides were labeled with 10-plex TMT reagents (Thermo) (27,28) as previously described (34). TMTs were reconstituted at a concentration of 20 μg/μL in dry acetonitrile (Sigma). Dried peptides were reconstituted in 30% dry acetonitrile in 200 mM HEPES, pH 8.5, and 8 μL of the appropriate TMT reagent was added to peptides. Reagents 126 and 131 were used to bridge between mass spectrometry runs (35) while the remaining TMT reagents were used to label samples as listed in Supplemental Table S2. Labeling was carried out for one hour at room temperature and was quenched by adding 9 μL of 5% hydroxylamine (Sigma). Samples were acidified with 50 μL of 1% TFA then pooled into appropriate 10-plex TMT samples, with pooled standard samples labeled with 126 and 131. Pooled 10-plex samples were desalted with C18 Sep-Paks.
Samples were separated into eight fractions by basic pH reverse-phase chromatography using spin columns (Pierce). Fractions were dried then reconstituted in 5% formic acid/5% acetonitrile and analyzed on an Orbitrap Fusion Tribrid Mass Spectrometer (Thermo) equipped with an EASY-nLC 1000 (Thermo) for identification and quantitation. MS2/MS3 analysis for identification was carried out with chromatographic and mass spectrometry acquisition settings as previously defined (36,37). Briefly, peptides were separated on a 100 µm ID x 30 cm home-pulled home-packed column (0.5 cm C4 5 µm, 100 Å and 29.5 cm C18 1.8 µm, 120 Å). A linear gradient from 11 to 30% acetonitrile in 0.135% formic acid over 165 min at a flow rate of 300 nL/min with the column heated to 60 °C was used for separation. The mass spectrometer was operated in data-dependent mode, with a survey scan across the mass-to-charge range of 500-1,200 at 120,000 resolution in the Orbitrap. Automatic Gain Control (AGC) target was set to 5 × 10⁵ with a max ion time of 100 ms. The s-lens was set at an RF of 60. Top speed mode was used to select the most abundant ions for MS2/MS3 in a 5 s experimental cycle. For MS2 analysis, precursors were isolated with the quadrupole at a width of 0.5 m/z. CID normalized energy was 30% and fragments were detected in the ion trap at rapid scan rate. AGC was set to 1 × 10⁴ with a max ion time of 35 ms. For MS3 analysis, SPS was used with a maximum of 10 ions isolated for MS3 analysis. These ions were fragmented with HCD at a normalized energy of 50% and detected in the Orbitrap at 60,000 resolution with a low mass of 110 m/z. AGC was set to 5 × 10⁴ with a maximum ion time of 150 ms. MS2 precursors were excluded in a range of 40 m/z below and 15 m/z above the MS1 precursor.
All data (MS1, MS2, MS3) were centroided. Data were processed using Proteome Discoverer 2.1 (Thermo). MS2 data were searched against UniProt Human (05/11/2017) using the Sequest algorithm (38). A decoy search was also conducted with sequences in reverse order (39-41). A precursor tolerance of 50 ppm (42,43) and 0.6 Da for MS2 fragments was defined. Static modifications of TMT 10-plex tags on lysine and peptide N-termini (+229.162932 Da) and carbamidomethylation of cysteines (+57.02146 Da) were specified. Variable oxidation of methionine (+15.99492 Da) was included in the search parameters as well. A maximum of two missed cleavages of trypsin was allowed. Data were filtered to 1% peptide and protein level false discovery rates with the target-decoy strategy through Percolator (44,45). TMT reporter ion intensities were extracted from MS3 spectra for quantitative analysis, and signal-to-noise ratios were used for quantitation. Spectra were filtered, summed, and normalized as previously described (35,37).
Supplemental Table S3. The tube containing a mixture of samples was labeled with TMT-131 and used to generate bridge channels for inter-run normalization as described previously (35). Labeled samples were combined into appropriate 10-plex experiments and desalted with Sep-Paks. 10-plex experiments were dried and stored at -80 °C until mass spectrometry analysis. Samples were reconstituted in 8 μL of 5% formic acid in 5% acetonitrile and analyzed on an Orbitrap Lumos Tribrid Mass Spectrometer (Thermo) equipped with an EASY-nLC 1000 (Thermo) with 3 μL of sample analyzed for each 10-plex. Acquisition parameters were the same as those outlined above for lung cancer secretion proteomics.
Data were processed using Proteome Discoverer 2.1 (Thermo) as outlined above with the exception that MS2 data were searched against the 228-member synthetic peptide library (Supplemental data 4) with no enzyme digestion specified and a decoy search was conducted with peptide sequences in reverse order (39-41). Data were filtered to 1% peptide level false discovery rates with the target-decoy strategy. Cleaved peptides were clustered using Jenks natural breaks optimization and peptides in the group with the highest proteolytic efficiencies were used to generate specificity profiles. Peptides in Group 2 are slowly formed cleavage products and proteolysis did not go to completion within the 20-hour assay time. The catalytic efficiency values for these peptides are less accurate as their progress curves are non-parabolic (linear) and cannot be well fitted with pseudo-first-order enzymatic kinetics. Peptidase activity correlates with peptidase stability and therefore it is likely that enzyme activity decreases during the 20-hour assay. For exopeptidase activities, a Student's t-test was performed on cleavage products only if peptides were detected in at least 2 of the 3 replicates and were significant by ANOVA (q < 0.05). Peptides with a significant increase (t-test q < 0.05) of at least 2-fold compared to controls were considered significantly increased and used to generate specificity profiles. Specificity profiles of peptidases were generated by iceLogo software (47) to visualize amino-acid frequency surrounding the cleavage sites. Amino acids that were most frequently observed (above axis) and least frequently observed (below axis) from P4 to P4ʹ positions were illustrated (p < 0.3). Norleucine (Nle) was represented as 'n' in the reported profiles.
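For readers unfamiliar with Jenks natural breaks, the two-group case used here can be sketched in plain Python: sort the values and pick the split that minimizes the summed within-group squared deviation. The sketch below is not the authors' implementation. The function name and the efficiency values are hypothetical, and clustering on log10-transformed values is an assumption made for this example because the reported efficiencies span many orders of magnitude.

```python
import math


def two_class_natural_breaks(values):
    """Split values into two groups minimizing within-group squared deviation.

    For two classes this exhaustive search over split points is equivalent to
    Jenks natural breaks optimization. Returns (lower_group, upper_group).
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to form two groups")

    def squared_deviation(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs)

    best_split, best_cost = None, math.inf
    for i in range(1, len(data)):
        cost = squared_deviation(data[:i]) + squared_deviation(data[i:])
        if cost < best_cost:
            best_split, best_cost = i, cost
    return data[:best_split], data[best_split:]


# Hypothetical kcat/KM values (M^-1 s^-1), invented for illustration only.
hypothetical_kcat_km = [0.05, 0.3, 2.0, 15.0, 4.0e4, 2.5e5, 1.3e6]

# Assumption: cluster on a log10 scale so the largest values do not dominate.
log_values = [math.log10(v) for v in hypothetical_kcat_km]
slow_group, fast_group = two_class_natural_breaks(log_values)

print(len(slow_group), len(fast_group))
```

In the terminology of this paper, the upper group would correspond to Group 1 (fastest cleaved substrates) and the lower group to Group 2.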

qMSP-MS Data Analysis
Amino acids in opaque black text were statistically significant (p < 0.05).

Results
In our previous use of the substrate profiling assay, the peptide library was incubated with a peptidase and an aliquot of the reaction volume was removed at four time-intervals (T1 to T4), quenched with acid or denaturant, and analyzed by mass spectrometry. Although this assay has been successfully used to characterize many enzymes, the method is largely qualitative, where the time interval at which a cleavage product is first detected is correlated with cleavage efficiency.
For example, a cleavage product discovered at T3 but absent in T1 and T2 was considered to be the result of a slower peptidase cleavage reaction than a product that is first detected at T1. In this study, peptide substrates are again incubated with a peptidase; however, a unique TMT reagent is used to label cleaved and uncleaved peptides at each time-interval. Each labeled sample can then be combined into a single mass spectrometry experiment. This multiplex approach allows for accurate comparison of peptide abundance at each time-point so that kinetic parameters such as kcat/KM can be directly calculated for the time-dependent accumulation of each cleavage product (Fig. 1).
Validation of the method using papain -To validate this method, we incubated the 228-peptide library with papain, a prototypical cysteine peptidase isolated from papaya fruit. This enzyme is homologous to all C1 family enzymes, the predominant cysteine peptidases found in microbes, plants, and animals (49). We detected cleavage at 275 of the 2,964 available peptide bonds within the tetradecapeptide library over the course of a 20-hour incubation. The use of TMT in this assay allowed us to determine cleavage efficiency for the formation of every peptide product.
The proteolytic constant kcat/KM was calculated from progress curves using the first-order kinetics formula: Y = (plateau - Y0) × (1 - exp(-t × kcat/KM × [E0])) + Y0. The kcat/KM values ranged from 0.041 M⁻¹ s⁻¹ to 1.34 × 10⁶ M⁻¹ s⁻¹ and were separated into 2 groups using the Jenks natural breaks algorithm (Fig. 2A). The progress curves for peptides in Group 1, consisting of the fastest cleaved substrates, were parabolic and therefore the reaction was generally complete within 20 hours (Fig. 2B). However, the progress curves for peptides in Group 2 (e.g. THATPGIHFVL*LRP) were linear and therefore these slow reactions were still progressing even after 20 hours incubation (Fig. 2B). All cleaved products were matched to the parent 14-mer substrate in order to determine the cleavage location and identify the amino acids at both sides of the scissile bond. Papain hydrolyzed peptide bonds at each of the 13 sites within the substrates. However, cleavage of bonds between residues 1 to 3, and 12 to 14 occurred at lower frequency than at all other sites, indicating that papain is an endopeptidase and has low preference for hydrolysis of bonds near the termini of peptides (Fig. 2C). For each cleavage site found in this study, the amino acids in the P4 to P4′ positions were identified, and two distinct substrate specificity profiles were generated for papain representing the fastest and slowest substrates (Fig. 2D).
were assayed with papain, Gly was found to be the preferred amino acid in the P1 position (53).
On the prime side, Nle and Trp are preferred at P1′ and P2′ positions, respectively.
We next generated a specificity profile of papain using the amino acids surrounding the cleavage sites found in Group 2 (Fig. 2E). Peptides in Group 2 are cleaved slowly, most likely due to one or more unfavorable amino acids present in the substrate sequence. In the P2 position, bulky hydrophobic amino acids, such as Tyr, Nle, Pro and Phe, are found instead of the small aliphatic and hydroxyl amino acids present in the Group 1 profile. Asp in the P1ʹ position was the most favored amino acid in Group 2 but this residue is strongly disfavored in Group 1. Likewise, Gln in P4ʹ was favored in Group 2 and disfavored in Group 1. At other positions many of the same residues are common between Group 2 and Group 1. These data show that papain has broad specificity and the rate of substrate cleavage is primarily driven by the amino acid in the P2 position. In the absence of small aliphatic and hydroxyl amino acids, peptide hydrolysis does occur, but the rate is considerably slower than when these preferred amino acids are present.
Taken together, the qMSP-MS assay data allowed us to rank the papain cleavage events based on efficiency and generate a substrate specificity profile corresponding to the bonds that are hydrolyzed rapidly compared to bonds that are slowly hydrolyzed. In addition, we show that papain is an endopeptidase and therefore preferentially hydrolyzes peptide bonds distal to the amino and carboxyl termini. These results demonstrate that qMSP-MS provides important kinetic information for peptidase mediated hydrolysis of peptide bonds.
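To make the first-order model used for papain concrete, the sketch below simulates a progress curve with the formula given above and recovers kcat/KM by inverting it at each time point. It is a simplified illustration, not the fitting procedure used in this study: the plateau and Y0 are assumed to be known rather than fitted, the data are noise-free, and the enzyme concentration, time points and rate constant are hypothetical values chosen for the example. In practice a nonlinear least-squares fit would estimate the plateau as well.

```python
import math


def progress_curve(t, kcat_km, e0, plateau, y0=0.0):
    """Y = (plateau - Y0) * (1 - exp(-t * kcat/KM * [E0])) + Y0."""
    return (plateau - y0) * (1.0 - math.exp(-t * kcat_km * e0)) + y0


def estimate_kcat_km(times, signals, e0, plateau, y0=0.0):
    """Estimate kcat/KM by inverting the first-order model at each time point.

    Assumes plateau and y0 are known. Points at t = 0 or at/above the plateau
    carry no rate information and are skipped. Returns the mean estimate.
    """
    estimates = []
    for t, y in zip(times, signals):
        fraction = (y - y0) / (plateau - y0)
        if t > 0 and 0.0 < fraction < 1.0:
            k_observed = -math.log(1.0 - fraction) / t   # s^-1
            estimates.append(k_observed / e0)            # M^-1 s^-1
    if not estimates:
        raise ValueError("no usable time points")
    return sum(estimates) / len(estimates)


# Hypothetical inputs, for illustration only.
TIMES_S = [900.0, 3600.0, 14400.0, 72000.0]   # 0.25, 1, 4 and 20 h in seconds
E0_MOLAR = 1.0e-9                             # assumed enzyme concentration
TRUE_KCAT_KM = 1.0e4                          # assumed M^-1 s^-1
PLATEAU = 100.0                               # arbitrary signal units

signals = [progress_curve(t, TRUE_KCAT_KM, E0_MOLAR, PLATEAU) for t in TIMES_S]
recovered = estimate_kcat_km(TIMES_S, signals, E0_MOLAR, PLATEAU)
print(recovered)
```

When the enzyme concentration is unknown, as for a complex sample, the same inversion without the final division by [E0] yields the lumped constant k = kcat/KM × [E0].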

Uncovering the substrate specificity of membrane-embedded intramembrane peptidases -After validation of the qMSP-MS assay with papain, we studied an intramembrane serine peptidase from Providencia stuartii, PsAarA. This enzyme has been defined as a bacterial rhomboid peptidase, and related enzymes are also found in mammalian cells. Rhomboid peptidases are known to control many cellular functions and play important roles in human diseases (54); however, few studies have been conducted to elucidate their substrate preferences. PsAarA is known to cleave the transmembrane protein TatA between a pair of Ala residues in the following sequence, IATAAFGS. Strisovsky and colleagues expressed more than 130 mutant variants of the TatA protein substrate that have different amino acid sequences surrounding the scissile bond and discovered that the P4, P1 and P2′ positions were most important for substrate specificity (55). We expressed and isolated PsAarA from E. coli (Supplemental Figure S1) as described previously (29) and incubated it with the 228-member peptide library for 0.25, 1, 4 and 20 hours. In total, cleavage products from 29 hydrolyzed peptide bonds were identified and progress curves were generated.
PsAarA frequently cleaved at sites that have small aliphatic amino acids in the P1 position and bulky hydrophobic amino acids such as Phe in the P2ʹ and P4 positions. In addition, Leu was frequently found in the P2 position while Arg and Ser were present in P1ʹ (Fig. 3B). Although we were only able to quantify a low number of cleavage products, the overall substrate preference is similar to the substrate preference previously reported by Strisovsky and colleagues (55).
We next assayed a rhomboid peptidase from Haemophilus influenzae (HiGlpG) that had not previously been characterized for substrate specificity. This enzyme was also expressed and purified from E. coli; however, only two substrates were hydrolyzed following 20 hours of incubation with the peptide mixture. The kcat/KM values for PQnIGHVKLFRFN*W and KWLIHPTF*SYnRWP were calculated to be 8.97 × 10³ M⁻¹ s⁻¹ and 5.61 × 10³ M⁻¹ s⁻¹, respectively, where * indicates the cleavage site. Interestingly, although the substrate preference of HiGlpG is more stringent than that of PsAarA, PQnIGHVKLFRFN*W was the most efficiently hydrolyzed substrate for both enzymes. This substrate likely contains sequence features that promote favorable interaction with the enzyme active site that is buried within the detergent micelle. A peptide corresponding to the P7 to P1ʹ amino acids (VKLFRFN*W) was synthesized, flanked on the N-terminus with a fluorescent reporter molecule, 7-methoxycoumarin-4-acetic acid, and on the C-terminus with a quenching group, 2,4-dinitrophenol (Supplemental Table S4). This soluble fluorescent substrate was then assayed with HiGlpG and PsAarA and the rate of cleavage was determined in a microplate assay. In the qMSP-MS assay, HiGlpG cleaved the 14-mer peptide substrate six times faster than PsAarA (Fig. 3C). This rate difference was also seen for the fluorescent substrate (Fig. 3D). In addition, we assayed a catalytically inactive HiGlpG, consisting of a Ser to Ala mutation at position 116, with the fluorescent substrate and failed to detect hydrolysis. This mutant protein was expressed and purified under identical conditions to the active enzyme, thereby providing a control sample to ensure that proteolytic activity associated with the wild-type enzyme was not due to a contaminating E. coli peptidase.
Finally, we sequenced the cleavage products of the fluorescent substrate and determined that hydrolysis occurred between Asn and Trp confirming that addition of the fluorophore and quencher molecules does not affect the cleavage site specificity of this substrate (Fig. 3E, Supplemental Fig. S2). Taken together, these studies revealed that HiGlpG has narrow specificity but higher catalytic activity than PsAarA.
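The subsite bookkeeping used throughout these results (P4 to P4ʹ windows for specificity profiles, and the P7 to P1ʹ segment chosen for the fluorescent substrate) can be sketched as follows. The function names and the padding character are hypothetical; the two substrate strings, with * marking the scissile bond, are the ones reported above for HiGlpG.

```python
def parse_marked_substrate(marked):
    """Split 'PQnIGHVKLFRFN*W' into the bare sequence and the bond position.

    The bond position is the number of residues N-terminal to the scissile bond.
    """
    bond = marked.index("*")
    return marked.replace("*", ""), bond


def subsite_window(sequence, bond, n_nonprime=4, n_prime=4, pad="-"):
    """Return (non-prime side, prime side) around a scissile bond.

    The non-prime string reads ...P2 P1 and the prime string reads P1' P2'...
    Positions that run past a peptide terminus are filled with `pad`.
    """
    if not 1 <= bond < len(sequence):
        raise ValueError("bond must lie between two residues")
    nonprime = sequence[max(0, bond - n_nonprime):bond].rjust(n_nonprime, pad)
    prime = sequence[bond:bond + n_prime].ljust(n_prime, pad)
    return nonprime, prime


sequence, bond = parse_marked_substrate("PQnIGHVKLFRFN*W")

# P7 to P1' segment, as used for the fluorescent reporter substrate.
p7_to_p1_prime = subsite_window(sequence, bond, n_nonprime=7, n_prime=1)

# Default P4 to P4' window; cleavage next to the C-terminus leaves padding.
p4_to_p4_prime = subsite_window(sequence, bond)

print(p7_to_p1_prime, p4_to_p4_prime)
```

Collecting such windows for every cleaved bond in a group gives the aligned positions from which a frequency-based specificity profile can be built.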

Characterization of proteolytic activities of a complex peptidase sample -We next investigated the utility of the qMSP-MS assay for characterizing complex biological samples containing peptidases from diverse families. Using a commercial extract from Aspergillus phoenicis that is known to contain α-galactosidase and 1,3-β-D-glucanase activities (56,57), we confirmed by SDS-PAGE and proteomics that this sample contained many fungal proteins (Supplemental Figure S3). In total we identified 528 proteins, of which 49 were peptidases that could be classified into 4 families, namely, serine-, cysteine-, aspartyl- and metallopeptidases (Fig. 4A). Serine peptidases were the most frequently found proteolytic enzymes in the extract while cysteine peptidases were the least frequent. However, by spectral counting, aspartyl peptidases were the most abundant proteolytic enzymes found (Fig. 4A). We assayed this sample at pH 5.0 with the synthetic peptide library and quantified 243 cleaved products. As the concentration of each peptidase within this sample is unknown, proteolytic efficiencies of each cleavage were calculated as k values, where k = kcat/KM × [E0], and clustered into two groups (Fig. 4B). Peptidases in this extract rarely cleaved bonds between amino acids 1 and 3, indicating that no aminopeptidases were active under these assay conditions. In addition, cleavage of the peptide bonds between position 12 and 13 was slow and only found in Group 2. These data indicate that acid-acting endopeptidases rapidly hydrolyze peptide bonds between position 3 and 12 while one or more acid-acting carboxypeptidases cleave single amino acids from the carboxyl terminus of these peptides (Fig. 4C).
We generated a specificity profile of the carboxypeptidase activity and found that these enzymes in the fungal extract remove C-terminal Val, Nle and Lys when Pro or Arg are in the penultimate position and Trp is in the P2 position (Fig. 4D). The cleavage profile of the endopeptidases showed a preference for hydrophobic and positively charged amino acids in the P1 position, while hydrophobic amino acids were exclusively found in the P1ʹ position. In addition, positively charged amino acids were frequently present in the P3 position (Fig. 4E). This substrate profile was similar to the cleavage specificity generated by secreted aspartyl peptidases from Malassezia globosa (48) and Candida albicans (58). Therefore, we assayed the A. phoenicis extract with a fluorescent substrate, IQ2 (Supplemental Table S4), used for aspartic acid peptidases (59) and found that it was efficiently hydrolyzed. Cleavage occurred at the Phe-Phe bond (Supplemental Fig. S4) and this activity was completely inhibited by pepstatin, an aspartyl peptidase inhibitor (Fig. 4F).
These studies show that the qMSP-MS assay can be used to distinguish between endo- and exopeptidases within complex biological samples. In addition, knowledge of peptidase class and understanding of the cleavage pattern facilitated the identification of fluorescent reporter substrates that can be used for subsequent biochemical assays.

Application of qMSP-MS in studying lung cancer pericellular peptidase activities -In-depth transcriptional analysis of cell lines and tumors has identified candidate peptidases that may play a role in disease progression. However, such expression-based strategies to study peptidases cannot account for the effects of translational regulation, post-translational modifications, stability and the presence of endogenous inhibitors in the pericellular environment. Therefore, increased expression of a peptidase may not result in a concomitant increase in proteolytic activity. New, function-based technologies with high information content are needed to detect dysregulated proteolysis in the tumor microenvironment. Given that we have demonstrated that qMSP-MS can be used to uncover the proteolytic activity in complex protein mixtures, we sought to apply this method to study the secreted peptidase activity from a panel of cell lines representing four sub-types of lung cancer (Supplemental Table S1). The long-term goal is to identify key peptidase activities that can be targeted in the tumor microenvironment for diagnostics, imaging or prodrug activation.
Using qMSP-MS, we characterized the secreted proteolytic activity in conditioned media from these lung cancer cell lines. However, unlike previous assays with papain and Aspergillus phoenicis extract, we found that most peptides were sequentially hydrolyzed at the amino and carboxyl terminus and the initial cleavage products were subsequently degraded at later time points.
For example, the substrate FRIHGFDEAHNAWL was hydrolyzed to GFDEAHNAWL by sequential removal of F, R, I and H from the amino terminus by peptidases secreted from DMS273 cells (Fig. 5A). Consequently, the progress curve of each cleavage product was not monotonic and did not conform to pseudo-first-order kinetics. To address this issue, we calculated the change in abundance of each cleavage product between two time intervals (Supplemental Fig. S5).
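The sequential trimming described for FRIHGFDEAHNAWL, and the interval-to-interval change used in place of a first-order fit, can be illustrated with a short sketch. The function names are hypothetical and the abundance values are invented for the example; only the substrate sequence and the four removed residues come from the text.

```python
def n_terminal_ladder(substrate, residues_removed):
    """Return the parent peptide followed by each successive N-terminal trim."""
    return [substrate[i:] for i in range(residues_removed + 1)]


def interval_changes(abundances):
    """Change in abundance between each pair of consecutive time points."""
    return [later - earlier for earlier, later in zip(abundances, abundances[1:])]


ladder = n_terminal_ladder("FRIHGFDEAHNAWL", 4)
print(ladder)

# Hypothetical abundance of one intermediate product at four time points.
# It accumulates and is then consumed by further trimming, so the curve is
# not monotonic and a single first-order rate constant does not describe it.
intermediate_abundance = [0, 40, 65, 20]
changes = interval_changes(intermediate_abundance)
print(changes)
```

A positive change marks an interval in which the product is formed faster than it is degraded, and a negative change marks net consumption by a downstream cleavage.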
Our results showed that most cleavage events occurred at the N-terminus of the tetradecapeptides, indicating the presence of strong aminopeptidase activity in lung cancer secretions (Fig. 5B). In addition, BEN, H661, H1944 and SHP77 cells secrete peptidases with a preference for removing single amino acids from the C-terminus of the tetradecapeptides. In parallel, quantitative shotgun proteomics of secretions from each of the cell lines was performed to identify peptidases that were present in the conditioned media. In total, 6,343 unique proteins were identified, 55 of which were peptidases, including 14 aminopeptidases, 12 carboxypeptidases and 29 endopeptidases (Fig. 5C). The qMSP-MS assay data clearly indicated that aminopeptidases were more active than the endopeptidases and carboxypeptidases in these lung cancer secretions and therefore we focused on developing fluorescent substrates to further characterize the aminopeptidase activities.
We generated substrate specificity profiles of proteolysis between residue 1 and 2, corresponding to mono-aminopeptidase activity, for all cell lines (Fig. 6A) and discovered that these human aminopeptidases frequently removed Ala, Trp, Phe, Tyr and Arg from the amino terminus while rarely or never cleaving Asp, Pro and Gln. We subsequently evaluated a selection of mono-aminopeptidase substrates consisting of Ala, Tyr and Arg linked to the fluorescent reporter molecule, 7-amino-4-methylcoumarin (AMC). These substrates were hydrolyzed by aminopeptidases in conditioned media from all cell lines (Fig. 6B). The rate of cleavage of Ala-AMC and Tyr-AMC was similar in all cell lines while hydrolysis of Arg-AMC was generally 1.5 to 3-fold faster. In the substrate profiling assay, tetradecapeptides with amino terminal Asp are rarely hydrolyzed. Therefore, we designed Asp-AMC as a negative control to assay these secretions and found there was no detectable cleavage of this substrate.
In some tetradecapeptide substrates, removal of two amino acids from the amino terminus was commonly observed. In most cases, this appears to be due to di-aminopeptidase activity rather than two sequential mono-aminopeptidase cleavage events because some 12-mer cleavage products were found but not the related 13-mer intermediate product. In addition, we cannot rule out that these substrates are also cleaved by endopeptidases that have little or no specificity beyond the P2 position. We generated a substrate specificity profile of peptide cleavage sites between residue 2 and 3 (Fig. 6C). From this specificity profile, we designed two dipeptide substrates that are unlikely to be hydrolyzed by mono-aminopeptidases. To achieve this, we included a terminal Gly residue because this amino acid is significantly enriched in the P2 position but not preferentially cleaved by mono-aminopeptidases. In the P1 position, we included Nle or Arg, two amino acids that are frequently enriched in the substrate profile. The results showed that both fluorescent substrates were commonly hydrolyzed by lung cancer cell secretions but at different rates (Fig. 6D). In addition, H460, H661 and H1944 cell secretions had reduced specific activity for both mono- and di-amino fluorescent substrates, indicating that the aminopeptidase content of these cell secretions might be lower compared to the others. Taken together, the fluorescent substrate screening assays supported the qMSP-MS specificity profile and demonstrate that this assay is not limited to characterizing purified peptidases but can be applied to studying complex biological samples. The specificity information generated from these lung cancer cell lines will be utilized for developing activity-based diagnostics or prodrugs that will be activated in the tumor microenvironment by one or more peptidases.

Discussion
Several studies have used isobaric labels to quantify peptides that are modified by proteolysis, phosphorylation, or methylation (60)(61)(62)(63)(64)(65)(66). These labels are generally used to directly compare the peptide abundance between multiple biological samples or within a single biological sample that has been subjected to different treatment regimes. However, in this study, we utilize isobaric labels to quantify and rank hundreds of peptide cleavage reactions that take place simultaneously.
Previously, we developed a library of 124 tetradecapeptides composed of all neighbor and near-neighbor amino acid pairs and monitored the time-dependent appearance of cleavage products upon addition of a peptidase (19). This method used tandem mass spectrometry to identify new products but, similar to many other substrate profiling methods (10,14), it was largely qualitative.
We therefore incorporated TMT labels into our MSP-MS workflow to minimize experimental and instrumentation-derived variance while improving the throughput of the assay. More importantly, by labeling samples at multiple time intervals, we were able to perform accurate quantification of peptides, allowing us to calculate the turnover rate of each proteolytic event.
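As an illustration of how a turnover rate can be derived from time-resolved product abundance, the following Python sketch estimates kcat/KM from a single progress curve. It is a minimal example under stated assumptions, not the fitting procedure used in this study: the substrate concentration is assumed to be far below KM, so that the fraction of product follows f(t) = 1 - exp(-k_obs t) with k_obs = (kcat/KM)[E], and the product signal is assumed to be normalized to its plateau. The function name, time points and enzyme concentration are hypothetical.

```python
import math


def estimate_kcat_over_km(times_s, fraction_cleaved, enzyme_conc_m):
    """Estimate kcat/KM (M^-1 s^-1) from one progress curve.

    Assumes pseudo-first-order kinetics ([S] << KM):
        f(t) = 1 - exp(-k_obs * t),  k_obs = (kcat/KM) * [E]
    The curve is linearized as -ln(1 - f) = k_obs * t and k_obs is taken
    as the least-squares slope of a line through the origin.
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times_s, fraction_cleaved):
        if t <= 0 or not (0.0 <= f < 1.0):
            continue  # skip the zero time point and fully saturated points
        num += t * (-math.log(1.0 - f))
        den += t * t
    if den == 0.0:
        raise ValueError("need at least one usable time point")
    k_obs = num / den
    return k_obs / enzyme_conc_m


if __name__ == "__main__":
    # Synthetic curve (hypothetical values) generated with k_obs = 1e-3 s^-1.
    times = [0, 300, 900, 1800, 3600]
    fractions = [1.0 - math.exp(-1e-3 * t) for t in times]
    print(estimate_kcat_over_km(times, fractions, enzyme_conc_m=1e-8))
```

The linearized fit is chosen here only to keep the example dependency-free. With noisy reporter-ion intensities, a nonlinear fit of the exponential with a free plateau would usually be preferable.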
To validate our workflow, we demonstrated that qMSP-MS is able to accurately characterize papain substrate specificity. We highlighted the necessity of calculating the cleavage efficiency of different peptides as it can vary by eight orders of magnitude. This information is crucial when studying peptidases as it allows us to identify individual amino acids within the substrate sequence that promote rapid hydrolysis of the peptide. Previously, papain was found to preferentially cleave many hydrophobic amino acids at the P2 position (51,52). However, we demonstrated that Val, Ile, Ala and Leu are strongly favored over other hydrophobic amino acids such as Tyr, Nle, Phe and Pro.
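Position-specific preferences such as those at P2 are read from the residues flanking each scissile bond. The Python sketch below shows one way to extract the P4 to P4ʹ window from a 14-mer and to tally residues at a chosen position. It is an illustrative example rather than the software used in this study. The function names and the '-' padding character are assumptions; the example peptide and its Asn/Trp cleavage site are taken from the Fig. 3 legend, with lowercase 'n' denoting norleucine.

```python
from collections import Counter


def cleavage_window(peptide, cut_after, flank=4, pad="-"):
    """Return (P4..P1, P1'..P4') around a scissile bond.

    cut_after is the 1-based index of the P1 residue, so the bond lies
    between residues cut_after and cut_after + 1. Positions that fall
    outside the peptide are filled with the pad character.
    """
    if not 1 <= cut_after < len(peptide):
        raise ValueError("cut_after must lie inside the peptide")
    padded = pad * flank + peptide + pad * flank
    centre = cut_after + flank  # index of the P1' residue in the padded string
    return padded[centre - flank:centre], padded[centre:centre + flank]


def tally_position(cleavages, position):
    """Count residues at a non-prime position (1 = P1, 2 = P2, ...)."""
    counts = Counter()
    for peptide, cut_after in cleavages:
        non_prime, _ = cleavage_window(peptide, cut_after)
        residue = non_prime[-position]
        if residue != "-":
            counts[residue] += 1
    return counts


if __name__ == "__main__":
    # Cleavage between Asn13 and Trp14 of the substrate from Fig. 3C.
    print(cleavage_window("PQnIGHVKLFRFNW", 13))  # ('FRFN', 'W---')
    print(tally_position([("PQnIGHVKLFRFNW", 13)], position=2))
```

Raw counts of this kind are only a starting point. A frequency plot additionally requires comparison with the residue frequencies across all possible cleavage sites in the library to assess enrichment.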
To further demonstrate the versatility of this method, we assayed two related intramembrane peptidases whose active sites are buried in the lipid bilayer of cell membranes. It is unknown how many of the 228 peptides are able to access the active site and therefore substrate specificity may be influenced by sequence features of the peptide that promote or prevent accessibility to the active site. Using qMSP-MS, we found that HiGlpG has stricter specificity and higher catalytic activity than PsAarA and a single fluorescent substrate was designed that could monitor activity of both enzymes. Compared to the standard gel-based methods for detecting rhomboid protease activity (55,67), this fluorescent substrate can be used in microplate assays and therefore facilitates rapid and quantitative activity assays for integral membrane peptidases. Five members of this rhomboid peptidase family are found in mammalian cells and these enzymes are associated with tumor metastasis, tumor angiogenesis, type 2 diabetes and neurodegenerative disorders (68). In general, these integral membrane serine peptidases are much less characterized than their soluble serine peptidase counterparts and no substrates have been discovered for RHBDL1 and RHBDL3.
Therefore, quantitative substrate specificity technologies like qMSP-MS can be used to uncover the substrate specificity of these enzymes, facilitating the discovery of their biological substrates.
From these findings, we predicted that qMSP-MS would have an impact on designing substrates for monitoring proteolytic activities and developing peptidase-activated drugs where potency and selectivity are essential. The qMSP-MS assay can pinpoint which sequences are preferred by peptidases, allowing us to design optimal substrates for peptidase screening. Our studies on the Lung cancer represents 25% of all cancer diagnoses, corresponding to 200,000 new cases in the United States each year. Furthermore, it is the second most deadly cancer with only an 18% 5-year survival rate (58). Increased pericellular proteolysis is a hallmark of aggressive cancers (59).
Secreted peptidases can modulate tumor progression and metastasis through highly regulated processes that involve shedding of cell adhesion molecules and processing of growth factors and cytokines (60)(61)(62)(63)(64)(65). Various strategies have been developed to utilize peptidases for drug delivery in cancer chemotherapy (69,70) and surgery (71). Traditionally, the discovery of peptidase targets for prodrug or antibody-drug conjugate activation has been based largely on expression level, and further downstream characterization is usually required. However, our qMSP-MS method allows us to functionally characterize peptidase specificities, which narrows the gap between biological discovery and drug design and offers valuable information for translational studies. In this study, we applied qMSP-MS to a panel of lung cancer cell lines, demonstrating the feasibility of this approach for studying dynamic diseases.
Aminopeptidase activity has previously been detected in blood (63) and inflamed skin (72) using N-terminal enrichment strategies, while carboxypeptidase activity has been detected in mammalian (73) and bacterial (74) cell lysates using C-terminal enrichment strategies. In the lung cancer secretions, we detected both N-terminal and C-terminal trimming of many peptide substrates, which we predict is the result of aminopeptidase and carboxypeptidase activity. In addition, we developed a panel of fluorescent substrates to quantify mono- and di-aminopeptidase activity in microplate assays. Our findings, together with previous studies, indicate that aminopeptidases and carboxypeptidases are associated with tumor invasion and metastasis, but the development of exopeptidase inhibitors for cancer treatment is at a relatively early stage (75)(76)(77)(78).
The logical next step from the outcomes of our study would be to design peptidase-activated prodrugs targeted to global pericellular proteolysis in the tumor microenvironment. As MS technology and multiplexing capabilities continue to advance, driving down costs and instrument time requirements, it is easy to foresee qMSP-MS as a potential first step in the development of personalized medicine geared towards not only tumors but other peptidase-related diseases as well.

DATA AVAILABILITY
All mass spectrometry data have been deposited in MassIVE and ProteomeXchange with accession numbers: MassIVE: MSV000082187; ProteomeXchange: PXD009269.
Quenched samples are labeled with specific TMTs and pooled prior to mass spectrometry analysis. Cleavage products are identified by MS/MS and quantified by TMT labels. Proteolytic progression curves or relative abundance changes of each cleavage product are calculated, and peptides are ranked by their proteolytic efficiency. Cleavage products with the highest proteolytic efficiencies are selected to generate specificity profiles and design fluorescent substrates.
Fig. 2. Quantitative multiplex substrate profiling of papain. A, Proteolytic constant kcat/KM of each cleaved peptide was calculated, ranked, and clustered into Group 1 (red) and Group 2 (black). B, Progress curves for sample cleavage products from Group 1 (brown, green and blue) and Group 2 (pink). The sequence of the full-length substrate is shown, and the cleaved product quantified in this assay is colored. C, Distribution of Group 1 and 2 cleavage sites within 14-mer peptides. D, A frequency plot showing the P4 to P4ʹ specificity profile of substrates that are rapidly cleaved by papain (Group 1). E, A frequency plot showing the P4 to P4ʹ specificity profile of substrates that are slowly cleaved by papain (Group 2). Lowercase 'n' corresponds to norleucine and non-grayed residues have p ≤ 0.05. Cleavage sites are indicated with *.
Fig. 3. Substrate specificities and fluorescent assays of rhomboid peptidases. A, Proteolytic constant kcat/KM of each cleaved peptide was calculated and ranked. B, PsAarA specificity profile was generated using 29 cleavages (p<0.05). Due to the low number of cleavage sites, no amino acid was found to be significantly de-enriched. Residues shown in red were previously described as the specificity of PsAarA (55). C, PQnIGHVKLFRFNW was cleaved by PsAarA and HiGlpG between Asn and Trp and cleavage efficiencies were calculated. D, Fluorescent substrate screening of PsAarA, HiGlpG and HiGlpG S116A with the rationally designed internally quenched substrate IQ1. E, Cleavage of the fluorescent substrate between Asn and Trp was confirmed by mass spectrometry.