Quantitative Profiling of Post-translational Modifications by Immunoaffinity Enrichment and LC-MS/MS in Cancer Serum without Immunodepletion

A robust method was developed and optimized for enrichment and quantitative analysis of posttranslational modifications (PTMs) in serum/plasma samples by combining immunoaffinity purification and LC-MS/MS without depletion of abundant proteins. The method was used to survey serum samples of patients with acute myeloid leukemia (AML), breast cancer (BC), and nonsmall cell lung cancer (NSCLC). Peptides were identified from serum samples containing phosphorylation, acetylation, lysine methylation, and arginine methylation. Of the PTMs identified, lysine acetylation (AcK) and arginine mono-methylation (Rme) were more prevalent than other PTMs. Label-free quantitative analysis of AcK and Rme peptides was performed for sera from AML, BC, and NSCLC patients. Several AcK and Rme sites showed distinct abundance distribution patterns across the three cancer types. The identification and quantification of posttranslationally modified peptides in serum samples reported here can be used for patient profiling and biomarker discovery research.

Biomarker identification is a key step in the elucidation of disease mechanisms, drug development, and diagnostics. Diagnostics research has focused on identifying biomarkers from viable biofluids, including serum/plasma, saliva, cerebrospinal fluid, and urine. Due to ease of collection and richness in proteins and metabolites, serum/plasma has been the preferred choice for diagnostic studies (1-3). Advancement of mass-spectrometry-based proteomic technologies has allowed identification and quantification of thousands of proteins in serum/plasma samples. Typically, these methods combine isotopic labeling, offline fractionation, and LC-MS/MS analysis. Facilitated by high-throughput proteomics analysis, researchers have collected vast amounts of comparative information about protein abundance in the serum/plasma of patients with various diseases, which has accelerated the identification of potential biomarkers (4).
To date, the majority of serum/plasma proteomic work has analyzed total protein abundance, with only a few studies analyzing posttranslational modifications (PTMs), usually glycosylation (5,6). As one of the most important mechanisms for regulating protein function, PTMs, including phosphorylation, acetylation, ubiquitination, and methylation, have been identified and validated as critical for signal transduction, protein degradation, and transcriptional regulation (7,8). Currently, very limited data exist about PTMs in serum/plasma beyond glycosylation. The abundant serum protein albumin has long been known to be acetylated by aspirin, and this reaction can occur in vitro without the presence of any acetyltransferase (9). Fibrinogen, another abundant serum protein, is also acetylated by aspirin both in vivo and in vitro (10,11). These previous findings and the known importance of PTMs in cellular signaling provided the impetus for a large-scale survey of PTMs other than glycosylation by immunoaffinity enrichment of PTM-containing peptides.
One challenge for proteomic analysis of serum/plasma is the broad dynamic range of the serum/plasma proteome (12), in which a high percentage of the total protein content is represented by only 12 proteins. This limitation can be partially overcome by immunodepletion of abundant proteins prior to enzymatic digestion (4,13); however, generating the large quantities of material necessary for PTM enrichment with an immunodepletion workflow could be cost prohibitive. It was therefore of interest to develop a PTM enrichment workflow for serum/plasma that does not require depletion of the abundant proteins. This method allows PTM profiling from a reasonable volume of serum (~250 µl for multiple PTM enrichments) followed by LC-MS/MS analysis. Among the PTMs surveyed, lysine acetylation (AcK) and arginine mono-methylation (Rme) were identified as the more prevalent PTMs in cancer patients' sera. These PTMs were profiled in sera from patients with acute myelogenous leukemia (AML), breast cancer (BC), and nonsmall cell lung cancer (NSCLC). At 1% FDR, we identified 796 unique AcK sites and 808 unique Rme sites in the sera of 12 cancer patients.
The abundant serum protein human albumin was identified acetylated at 59 different sites, while other abundant proteins were also found to be acetylated, including A2M and serotransferrin. About 25% of the identified AcK sites (190 out of 796) were from the 12 most abundant serum proteins. In contrast, the Rme sites identified were from a more diverse complement of proteins, including transcriptional regulators and RNA processing proteins. Quantitative analysis identified a subset of peptides in each enrichment with differential abundance across the three cancer types surveyed. For example, the abundance of a Lys155-containing peptide from the complement component 3 protein was higher in the sera of NSCLC patients compared with AML and BC patients. Conversely, the abundance of an Arg1593 mono-methylated peptide from protein ARID1A was lower in the sera of NSCLC than the other two cancer types. Clustering of the quantitative data for the AcK and Rme enrichments revealed patterns of modification specific to cancer type as well as patient pathology. Together, these data demonstrate the utility of PTM profiling of human serum samples for disease characterization and the potential for biomarker discovery.

EXPERIMENTAL PROCEDURES
Cancer Patient Serum-Serum samples from four patients each with AML, BC, and NSCLC were purchased from Proteogenex (Culver City, CA). Patient information and the total protein concentration of each serum sample are provided in Supplemental Table S1.
Sample Preparation-Serum samples were processed using the PTMScan method as previously described (14). Equal volumes of serum (250 µl per individual sample) were mixed with urea lysis buffer (9 M sequanol-grade urea, 20 mM HEPES, pH 8.0, 1 mM β-glycerophosphate, 1 mM sodium vanadate, 2.5 mM sodium pyrophosphate) to a final concentration of 6 M urea. For the technical triplicate experiment, 150 µl serum from four nonsmall cell lung cancer patients was pooled and split into three aliquots for independent processing. The samples were centrifuged at 16,000 × g for 15 min at 4°C. Supernatants were collected and reduced with 4.5 mM DTT for 30 min at 55°C. Reduced lysates were alkylated with iodoacetamide (0.095 g per 5 ml H2O) for 15 min at room temperature in the dark. Samples were diluted 1:4 with 20 mM HEPES, pH 8.0, and digested overnight with 10 µg/ml trypsin-TPCK (Worthington, #LS003740, Lakewood, NJ). Digested peptide lysates were acidified with 1% TFA, and peptides were desalted over 360 mg SEP PAK Classic C18 columns (Waters, #WAT051910, Milford, MA). Peptides were eluted with 40% acetonitrile in 0.1% TFA, dried under vacuum, and stored at −80°C.
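The buffer volume needed to bring serum to 6 M urea follows from a simple mass balance. The sketch below is illustrative only: it assumes additive volumes and that serum contributes no urea, and the function name is hypothetical rather than part of the published protocol.

```python
def lysis_buffer_volume(serum_ul, stock_molar=9.0, final_molar=6.0):
    """Volume of urea lysis buffer (in µl) needed to reach final_molar urea.

    Assumes volumes are additive and that serum itself contains no urea.
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    # Mass balance: stock * Vb = final * (Vb + Vs)  =>  Vb = Vs * final / (stock - final)
    return serum_ul * final_molar / (stock_molar - final_molar)


buffer_ul = lysis_buffer_volume(250.0)
print(f"{buffer_ul:.0f} µl of 9 M buffer per 250 µl serum")
```

Under these assumptions, each volume of serum requires two volumes of the 9 M buffer.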
Immunoprecipitation-Enrichment of posttranslationally modified peptides was performed using the antibodies listed in Table I, following protocols described previously (14,15). Briefly, saturating amounts of the indicated antibodies were bound to 30 µl packed Protein A agarose beads (Roche, Indianapolis, IN) overnight at 4°C. Lyophilized serum peptides were resuspended in MOPS IAP buffer (50 mM MOPS, pH 7.2, 10 mM KH2PO4, 50 mM NaCl) and centrifuged for 5 min at 10,000 × g. Supernatants were mixed with antibody bead slurries for 2 h at 4°C. Beads were pelleted by centrifugation for 30 s at 2,000 × g at 4°C. Beads were washed three times with 1.5 ml IAP buffer containing 1% Nonidet P-40 and three times with 1 ml water (Burdick and Jackson, Morristown, NJ). Peptides were eluted from beads with 0.15% TFA (sequential elutions of 40 µl followed by 35 µl, 10 min each at room temperature). Eluted peptides were desalted.
IMAC-IMAC enrichment was performed as previously described (16). Nickel-agarose magnetic beads (Qiagen, Valencia, CA) were treated with EDTA to remove the nickel, washed three times with H2O, loaded with aqueous FeCl3 for 30 min, and washed. For phosphopeptide enrichment, 10 µl of Fe3+-agarose slurry was added to peptides digested from 10 µl of serum in 1 ml 0.1% TFA/80% acetonitrile for 30 min at room temperature. Unbound peptides were removed by washing three times with 0.1% TFA/80% acetonitrile. Bound peptides were eluted with 2 × 50 µl of 2.5% ammonia/50% acetonitrile solution for 5 min. The eluent was immediately acidified with 20% TFA and dried in a speed-vac. Samples were resuspended in 50 µl 0.15% TFA, desalted over C18 tips, and redried in a speed-vac.
LC-MS/MS Analysis-Immunoprecipitated peptides were resuspended in 0.125% formic acid and separated on a reversed-phase C18 column (75 µm inner diameter × 10 cm) packed into a PicoTip emitter (~8 µm inner diameter) with Magic C18 AQ (100 Å × 5 µm, New Objective, Woburn, MA). Each sample was split, and analytical replicate injections were run to increase the number of identifications and provide metrics for analytical reproducibility of the method. A standard peptide mix (MassPREP™ Protein Digestion Standard Mix 1, Waters) was spiked into each sample vial in a total quantity of 100 fmol (33 fmol per injection) prior to LC-MS/MS analysis. For antibody enrichment, peptides from 120 µl serum were run per injection; for IMAC, peptides from 5 µl serum were run per injection. Replicate injections were run nonsequentially to reduce artificial changes in peptide abundance due to changes in instrument performance over time: one replicate of each sample was injected, then the second replicate in reverse order. Peptides were eluted using a 120-min or 150-min linear gradient of acetonitrile in 0.125% formic acid delivered at 280 nl/min from 3% to 30% acetonitrile. Tandem mass spectra were collected in a data-dependent manner with an LTQ-Orbitrap ELITE mass spectrometer (Thermo Fisher Scientific, Waltham, MA) running XCalibur 2.0.7 SP1 using a top-20 MS/MS method, a dynamic repeat count of one, and a repeat duration of 30 s. The isolation window was set at 1.0 Da with a normalized collision energy of 35%. Real-time recalibration of mass error was performed using lock mass (17) with a singly charged polysiloxane ion m/z = 371.101237. The data associated with this manuscript, including labeled MS2 spectra in Skyline library format, may be downloaded from the ProteomeXchange Consortium via PRIDE with project accession numbers PXD002931 [username: reviewer42095@ebi.ac.uk; password: wpPsK3wN] and PXD002932 [username: reviewer39066@ebi.ac.uk; password: qT4I4DrS].
MS/MS spectra were evaluated using SEQUEST and the Core platform from Harvard University (18-20). Files were searched against the NCBI Homo sapiens FASTA database updated on June 27, 2011, containing 34,899 forward and 34,899 reverse sequences. A mass accuracy of ±5 ppm was used for precursor ions and 1 Da for product ions. Enzyme specificity was limited to trypsin, with at least one tryptic (K- or R-containing) terminus required per peptide and up to four miscleavages allowed. Cysteine carboxamidomethylation was specified as a static modification; oxidation of methionine residues and the appropriate PTM were allowed as variable modifications for each enrichment sample set. Reverse decoy databases were included in all searches to estimate false discovery rates, and results were filtered to a 1% FDR in the Linear Discriminant module of Core.
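The target-decoy principle behind the 1% FDR filter can be sketched as follows. This is a simplified, single-score illustration and not the Linear Discriminant module of Core: the function name, the (score, is_decoy) input format, and the example scores are all assumptions made for the example.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target matches above the most permissive score cutoff with FDR <= max_fdr.

    psms: iterable of (score, is_decoy) pairs, where a higher score is a better match.
    The FDR at a cutoff is estimated as (# decoy hits) / (# target hits) above it.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cut = 0  # number of top-ranked matches to keep
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best_cut = i
    return [p for p in ranked[:best_cut] if not p[1]]


# Hypothetical scores: 150 forward (target) hits and 2 reverse (decoy) hits.
example = [(float(s), False) for s in range(100, -50, -1)]
example += [(50.5, True), (0.5, True)]
accepted = filter_by_fdr(example)
print(len(accepted), "target matches accepted at 1% FDR")
```

In this toy data set the cutoff falls just above the second decoy hit, where one decoy against 100 targets gives an estimated FDR of exactly 1%.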
All quantitative results were generated using Progenesis V4.1 (Waters Corporation) and Skyline Version 3.1 to extract the integrated peak area of the corresponding peptide assignments according to previously published protocols (21,22). Extracted ion chromatograms for peptide ions that changed in abundance between samples were manually reviewed in Skyline to ensure accurate quantitation. Statistical analysis of the quantitative data was done using a two-tailed t test between two cancer groups. The maximum negative log-p value from the three comparison pairs was used to indicate significance for abundance changes of a given peptide between two cancer groups. The false discovery rate for each binary comparison was further controlled by applying the Benjamini-Hochberg procedure. Heat maps of the quantitative data were generated and clustered in Spotfire DecisionSite (TIBCO Software AB) version 9.1.2.
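The two summary steps described above, taking the maximum negative log-p value across the three pairwise comparisons and applying the Benjamini-Hochberg procedure, can be sketched in a few lines. The function names are hypothetical, the p values are invented for illustration, and the significance level (alpha = 0.05) is an assumption, since the text does not state the level used.

```python
import math


def max_neg_log10_p(p_values):
    """Largest -log10(p) among the pairwise comparisons for one peptide."""
    return max(-math.log10(p) for p in p_values)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p values for eight peptides in one binary comparison.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_vals))
```

Only the two smallest p values fall below their rank-scaled critical values here, so only those two peptides would be called significant.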

RESULTS
Enrichment Workflow for PTM Peptide Identification-To demonstrate an optimal workflow for analysis of posttranslational modifications in serum samples, we first tested enrichment using various PTM antibodies and IMAC-Fe3+ with pooled serum samples as outlined in Table I. Serum samples were directly processed for PTM enrichment without any prior depletion of abundant serum proteins. The enrichments performed included IMAC-Fe3+, phosphotyrosine enrichment, phosphotyrosine enrichment followed by IMAC-Fe3+, acetyllysine enrichment (AcK), arginine mono-methylation (Rme), and lysine pan-methylation. Of these enrichments, acetyllysine and mono-methyl-arginine showed the most promising results, with the highest number of peptides per sample (Table I). The number of identifications for the phospho-enrichment was low, and the enrichment failed to identify proteins/sites known to be important disease drivers or substrates for the cancer types profiled (data not shown).
PTM Profiling from Individual Patient Samples-Having established the ability to profile PTMs directly from serum samples, we then applied the immunoaffinity enrichment method to profile lysine acetylation and arginine mono-methylation in sera from patients with acute myeloid leukemia (AML), breast cancer (BC), and nonsmall cell lung cancer (NSCLC) (n = 4 for each type of cancer, 12 samples total). As different PTMs were being profiled, PTM enrichment was performed sequentially on the same sample as previously described (23). To minimize interference from factors such as sex, age, and ethnic background, the study included only female patients of Caucasian background (Supplemental Table S1). The general workflow of the sequential enrichment is outlined in Fig. 1. Prior to immunoaffinity enrichment, Western blotting using pan-AcK and Rme motif antibodies was performed on the 12 patient samples (Supplemental Fig. S1). Most of the AcK signal in the Western blot fell into a molecular weight range of 58-80 kDa, corresponding to human albumin (MW = 69 kDa), with other lower intensity bands visible throughout the MW range. In the Rme blot, the strong signal from albumin was not observed, and other bands were detected at various molecular weights.
Enrichment for AcK generated a range of 214 to 486 unique AcK peptides from each serum sample, while enrichment for Rme generated a range of 199 to 257 unique Rme peptides from each serum sample (Supplemental Tables S2 and S3, Details Tab). Many AcK and Rme sites identified were represented by multiple peptides due to methionine oxidation, miscleavage, and different charge states. Tables were made nonredundant by unique protein site (Supplemental Tables S2 and S3, Summary Tab) using the peptide with the largest number of MS/MS identifications (Count in Details Column, Summary Tab, Supplemental Tables S2 and S3) as the best representative for a particular PTM site. In total, 796 and 808 unique sites were identified for AcK and Rme, respectively. Of these, 672 AcK sites and 619 Rme sites were previously unidentified and will be curated into the PhosphoSitePlus database as a public resource (24).
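The site-level collapse described above, keeping the peptide with the most MS/MS identifications for each protein site, can be illustrated as below. The record layout, function name, peptide labels, and counts are hypothetical placeholders and are not taken from the supplemental tables.

```python
from collections import defaultdict


def representative_peptides(rows):
    """Pick, for each (protein, site), the peptide with the most MS/MS identifications.

    rows: dicts with keys 'protein', 'site', 'peptide', and 'count'.
    Ties are resolved in favor of the peptide listed first.
    """
    by_site = defaultdict(list)
    for row in rows:
        by_site[(row["protein"], row["site"])].append(row)
    return {site: max(group, key=lambda r: r["count"]) for site, group in by_site.items()}


# Hypothetical redundant entries: two peptide forms cover the same site.
rows = [
    {"protein": "ALB", "site": "K298", "peptide": "PEPTIDEK_A", "count": 3},
    {"protein": "ALB", "site": "K298", "peptide": "PEPTIDEK_B", "count": 7},
    {"protein": "C3", "site": "K155", "peptide": "PEPTIDEK_C", "count": 2},
]
best = representative_peptides(rows)
for (protein, site), row in best.items():
    print(protein, site, row["peptide"], row["count"])
```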
Classification of Proteins with Identified PTMs-This study identified 520 and 688 unique proteins from AcK and Rme enrichment, respectively (Supplemental Table S4). The top 10 protein classes represented in each enrichment are shown in the pie charts in Figs. [...] receptor/channel/transporter/cell surface protein, adhesion/extracellular protein, chromatin/DNA-binding/DNA repair/DNA replication protein, and transcriptional regulator. Peptides from albumin were identified acetylated at a total of 59 unique AcK sites, which is consistent with the Western blotting signal shown in Supplemental Fig. S1A. High numbers of acetylation sites were also identified on other serum-abundant proteins, including alpha-2-macroglobulin and serotransferrin, with 35 and 29 unique AcK sites identified, respectively. Overall, about 25% of all AcK sites identified (190 out of 796) were from the top 12 abundant serum proteins. Conversely, for Rme, only 4 out of 808 unique sites were from the top 12 abundant serum proteins. The top five protein categories for Rme were receptor/channel/transporter/cell surface protein, RNA processing, transcriptional regulator, adhesion/extracellular protein, and adaptor/scaffold. A large number of unique Rme sites was identified from various heterogeneous nuclear ribonucleoprotein (hnRNP) isoforms, most of which were identified in a previous study (14). The presence of these posttranslationally modified hnRNP peptides in serum/plasma has not been previously reported. The data for the two enrichments were largely complementary, with only 35 proteins identified in common (Supplemental Table S4, Supplemental Fig. S2).
There was also a low degree of overlap between these results and a recent large-scale plasma proteome study using iTRAQ labeling and offline fractionation prior to LC-MS/MS analysis, which identified over 5300 proteins with high confidence (4), with 226 out of 520 proteins in common for AcK and 212 out of 688 for Rme (Supplemental Fig. S2).
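The protein counts above can be cross-checked with basic set arithmetic. The short sketch below uses only the numbers reported in the text; the helper name is hypothetical.

```python
def union_size(size_a, size_b, shared):
    """Inclusion-exclusion: number of distinct items across two overlapping sets."""
    return size_a + size_b - shared


ack_proteins, rme_proteins, in_both = 520, 688, 35
total_proteins = union_size(ack_proteins, rme_proteins, in_both)

# Fraction of each enrichment's proteins also found in the plasma proteome study (4).
ack_overlap_pct = 226 / ack_proteins * 100
rme_overlap_pct = 212 / rme_proteins * 100

print(total_proteins)
print(f"AcK: {ack_overlap_pct:.1f}%  Rme: {rme_overlap_pct:.1f}%")
```

The resulting total matches the 1173 unique proteins cited for the combined enrichments in the Discussion.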
Reproducibility Assessment of the Method by Technical Triplicate Analysis-A pooled mixture of NSCLC patient serum was split into three aliquots and subjected to parallel, independent trypsin digestion, sequential immuno-enrichment for AcK and Rme, and LC-MS/MS analysis. We identified 555, 502, and 515 unique AcK peptides and 373, 357, and 377 unique Rme peptides from the three independent samples, respectively. A total of 778 unique AcK peptides and 564 unique Rme peptides were identified across the triplicate runs, of which 361 (46%) AcK peptide identifications and 226 (40%) Rme peptide identifications were shared by all three samples (Supplemental Fig. S3). Of the unique peptides identified, 380 AcK and 356 Rme peptides were quantified using the label-free quantification approach (Supplemental Tables S5 and S6). The distributions of %CV between technical triplicates for the two PTMs are shown in Supplemental Fig. S4. The median %CVs were 23 and 17% for lysine acetylation and arginine mono-methylation, respectively. About 78% of acetyl-lysine peptides and 94% of mono-methyl-arginine peptides had %CVs lower than 40%, indicating good technical reproducibility of the method (Supplemental Fig. S4).
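The reproducibility metrics reported here (per-peptide %CV, the median %CV, and the share of peptides below 40% CV) can be computed as in the following sketch. The peak areas are invented for illustration, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, median, stdev


def percent_cv(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return stdev(values) / mean(values) * 100.0


def summarize_cvs(peak_areas_by_peptide, cutoff=40.0):
    """Return (median %CV, percentage of peptides with %CV below the cutoff)."""
    cvs = [percent_cv(areas) for areas in peak_areas_by_peptide.values()]
    pct_below = sum(cv < cutoff for cv in cvs) / len(cvs) * 100.0
    return median(cvs), pct_below


# Hypothetical MS1 peak areas for three peptides across technical triplicates.
triplicates = {
    "peptide_1": [90.0, 100.0, 110.0],
    "peptide_2": [50.0, 100.0, 150.0],
    "peptide_3": [99.0, 100.0, 101.0],
}
median_cv, pct_below_40 = summarize_cvs(triplicates)
print(f"median %CV = {median_cv:.1f}, below 40% CV = {pct_below_40:.1f}%")
```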
Quantitative Analysis of AcK and Rme Peptides across Three Cancer Types-A label-free quantification strategy was applied to analyze PTM site-specific abundance changes across the three cancer types profiled. Analytical replicates of each enriched peptide mixture were run to allow greater opportunity for identification of unique peptides and to provide data on analytical reproducibility of the method. To monitor the stability of the LC, ionization efficiency, and sensitivity, a standard mix of HPLC-purified peptides from four proteins was added to each sample prior to LC-MS/MS. Sequences of the four standard proteins were added to the human database to allow identification of these peptides in each LC-MS/MS run. Standard peptides with retention times spanning the entire gradient range were chosen to monitor instrument performance. Standard peptide retention times and integrated MS1 peak areas across all 48 LC-MS/MS runs were consistent, indicating good performance of the instrument throughout the analysis (Supplemental Fig. S5). Both Skyline and Progenesis were used for the label-free quantification of posttranslationally modified peptides across all samples. Both programs perform retention time alignment across LC-MS/MS runs by either common MS2 identifications (Skyline) or common ion features (Progenesis) so that a narrow and accurate precursor window can be applied to each unique peptide or ion feature (21). This methodology allows determination of peptide peak intensities across all samples even if a particular peptide was not identified by MS/MS in every sample run (14,21,25).
Other important quality control metrics for label-free quantification include coefficients of variation (%CV) between analytical replicates and the chromatographic shape of the extracted ion chromatograms, both of which were monitored in Skyline. In total, we quantified 612 and 621 unique sites for AcK and Rme, respectively (Summary Tab of Supplemental Tables S2 and S3). The median %CV was 14% for AcK across 6983 analytical replicate measurements and 15% for Rme across 7200 analytical replicate measurements. Histograms plotting the distribution of %CVs for AcK and Rme data are shown in Supplemental Fig. S6.
The fold changes of each unique PTM site across the three cancer types were calculated by comparing the average intensity of the chosen modified peptide across all four patients of each cancer type. Statistical analysis of each binary comparison was performed using a two-tailed t test (Summary Tab of Supplemental Tables S2 and S3). Those PTM sites with any fold change greater than 2.5 or less than −2.5 and a corresponding p value less than 0.01 were manually examined in Skyline and are indicated with bold intensity values in the Summary Tabs of Supplemental Tables S2 and S3. Peptide intensities in the 12 individual patient samples were compared with the median intensity across all samples to generate log2 ratios. Log2 ratios were used to generate heat maps in Spotfire and were hierarchically clustered. For Rme data, samples from the same cancer type clustered together, with higher correlation between breast cancer and acute myeloid leukemia than with nonsmall cell lung cancer (Fig. 3B). For AcK data, not all patients with the same type of cancer clustered in the same group. Specifically, patients #1, #2, and #3 with nonsmall cell lung cancer clustered together, while patient #4 was an obvious outlier that exhibited a significantly different pattern of abundance for most unique AcK sites (Fig. 3A). Consistent with the clustering results for the Rme data, the breast cancer and acute myeloid leukemia groups correlated more closely with each other than with the nonsmall cell lung cancer group. Specific examples of differential relative abundance between cancer types included a complement component 3 peptide acetylated at Lys155 (Fig. 4A). This peptide was 4.4- and 8.1-fold higher in intensity in NSCLC samples than the average intensity in breast cancer and AML samples, respectively (Fig. 4B). An Arg1593-monomethylated peptide from the protein ARID1A was identified (Fig. 4C), and its abundance was 4.3- and 5.4-fold lower in NSCLC samples than the average intensity in breast cancer and AML samples, respectively (Fig. 4D). Some peptides showed an inconsistent intensity pattern even within the same cancer type, such as a peptide acetylated at Lys298 of albumin (Fig. 5A). This peptide was of higher abundance in NSCLC patients #1, #2, and #3, with 2.9- and 9.7-fold increases compared with the other two cancer types, whereas NSCLC patient #4 showed similar intensity to breast cancer and AML samples (Fig. 5B). Although there were differences in relative abundance for several acetylated peptides derived from albumin, the overall protein level of albumin was not changed (Fig. 5C). The histological diagnosis of NSCLC patient #4 (with 35 years of smoking history) was squamous cell carcinoma, while the histological diagnosis of NSCLC patients #1, #2, and #3 (nonsmokers) was adenocarcinoma (Supplemental Table S1).

DISCUSSION
Extensive effort has been expended on method development for analyzing the serum/plasma proteome to identify potential biomarkers (1,3,26,27). In the current study, we have developed a robust workflow combining PTM motif antibody enrichment and LC-MS/MS analysis for profiling different PTMs. The workflow has proven successful for analyzing both plasma (data not shown) and serum samples for many commonly studied PTMs, including phosphorylation, acetylation, arginine methylation, and lysine methylation, with lysine acetylation and arginine mono-methylation yielding the highest number of unique peptide identifications.
Another important feature of the workflow is its compatibility with both label-free and isobaric-labeling quantification strategies. For this study, we employed label-free quantification based on MS1 precursor intensities for each identified PTM peptide (14,22). The key factor for successful label-free quantification is the consistency of instrument performance during the entire data collection period. To validate the reproducibility of the method, we performed a technical triplicate experiment using pooled NSCLC serum. At the MS2 identification level, 46% (361 out of 778 unique AcK peptides) and 40% (226 out of 564 unique Rme peptides) of peptides were identified across all three LC-MS/MS runs (Supplemental Fig. S3). The MS1 integrated peak areas of identified modified peptides showed a tight distribution of %CV across technical triplicate samples (Supplemental Fig. S4), with median %CVs for AcK and Rme peptides of 23 and 17%, respectively (Supplemental Tables S5 and S6). These quantitative data demonstrate the robustness of the method as well as the accuracy of label-free quantification for peptides with PTMs. These data are consistent with previous studies showing the reproducibility of label-free quantification, and affinity purification followed by label-free quantification has been broadly adopted for quantifying changes in various PTMs (14, 21, 28-33). Various software solutions, both academic and commercial, have been developed for label-free quantitative approaches, including MaxLFQ, Skyline, Ideal-Q, and OpenMS. These software packages have been extensively tested and validated in the literature (25, 34-36). Additionally, data from the spiked-in standard peptides showed reproducible retention times and integrated peak areas across all 48 LC-MS/MS runs, which spanned over a week of continuous instrument operation (Supplemental Fig. S5).
The accuracy of the quantitative data was also ensured by manual review of integrated peaks for those peptides with significant abundance changes across different cancer groups. The quantitative data from this study showed a distinct pattern for both AcK and Rme in which NSCLC patient samples were significantly different from AML and BC samples (Fig. 3). The hierarchical clustering based on the quantitative data also identified patient #4 of NSCLC as an outlier in the group for both modifications. Patient #4 was diagnosed with squamous cell carcinoma while the other three NSCLC patients were diagnosed with adenocarcinoma (Supplemental Table S1), raising the possibility that serum PTMs could also serve as biomarkers for histological classification of NSCLC. The small number of samples for each cancer type profiled in this study precludes definitive conclusions on this point, but it is an area worthy of further investigation. It should therefore be noted that this study is largely technical in nature, and further work on a larger cohort of samples would be necessary to more fully validate the candidate disease biomarkers identified herein.
In order to achieve deeper proteome coverage of serum/plasma, recent studies have employed immunodepletion of abundant proteins followed by offline fractionation prior to LC-MS/MS analysis (4,37). Combined with multiplex isotopic labeling, Keshishian et al. confidently identified over 5300 proteins from plasma samples of four patients (4). In the current study, we identified a total of 1173 unique proteins by combining the results of both AcK and Rme enrichments. Of these, 422 proteins overlapped between the two studies (Supplemental Table S4, Supplemental Fig. S2). It should be noted that two unique peptides were required for protein identification in the study by Keshishian et al., while in this study we reported all proteins with at least one unique modified peptide that passed 1% FDR filtering. We also compared our results with identified plasma proteins deposited in the Plasma Proteome Database (38). More than one-third of the proteins identified in our study (428 out of 1173) are absent from the repository database (Supplemental Table S4). We believe the complementarity of the results at the protein level is due to the specific enrichment by PTM motif antibodies, which unveiled a large group of low abundance proteins, including transcriptional regulators, RNA processing proteins, and receptors. Interestingly, in the AcK enrichment samples, we identified several serum-abundant proteins, including albumin, APOA1, serotransferrin, and alpha-2-macroglobulin, and most were acetylated at multiple lysine sites. We observed differential patterns of acetylation for these proteins in NSCLC patients #1, #2, and #3 (histological diagnosis of adenocarcinoma) compared with NSCLC patient #4 (histological diagnosis of squamous cell carcinoma) (Fig. 3). The protein group showing higher abundance of acetylated peptides in NSCLC included albumin, with several lysine acetylation sites showing similar patterns, as exemplified by Lys298 (Fig. 5B). The differential abundance of acetylated albumin peptides was in contrast to the total protein level of albumin, which showed minimal changes across all 12 cancer patients (Fig. 5C). As the most abundant protein in blood, albumin had previously been considered an interfering protein rather than a potential biomarker in serum/plasma proteome profiling studies. However, the findings in this study of differential acetylation sites on albumin, such as Lys160, Lys161, and Lys298, in different cancer types suggest that the potential exists for biomarker discovery even among abundant serum proteins.

FIG. 5. MS/MS spectra and intensity of Lys298 of albumin (A and B). A number of unique AcK sites, such as Lys298 of albumin, showed significantly different intensities in patient #4 of the NSCLC group (squamous cell carcinoma) compared with patients #1-3 (adenocarcinoma). The relative abundance of the Lys298-containing peptide is significantly higher in the BC group than the AML group, while the total protein abundance of human albumin showed no changes by Western blotting (C).
Histone proteins, including H1, H3, and H4, were identified in the study as heavily acetylated at a number of lysine sites. Hyper-acetylation of core histone tails is believed to neutralize the positive charge of these domains and weaken histone-DNA and nucleosome-nucleosome interactions. This weakening destabilizes chromatin structure and increases the accessibility of DNA to various DNA-binding proteins (39-41). Peptides from histone H3.3 acetylated at Lys15 and Lys19 and from H4 acetylated at Lys6, Lys13, and Lys17 showed significant abundance differences between cancer groups that may serve as indicators of the status of nucleosomes.
In contrast to the lysine acetylation results, in which many peptides were identified from abundant serum proteins, the arginine mono-methylated peptides identified were distributed among diverse protein groups with few assignments to abundant serum proteins. The abundance differences of Rme sites between the NSCLC group and the other two cancer groups were also generally less dramatic than those of AcK sites (Fig. 3). Although there were not as many dramatic intensity changes of Rme sites across the three cancer groups, the four patients belonging to each cancer group clustered together, indicating a more direct correlation between cancer type and arginine mono-methylation levels in serum than for lysine acetylation levels. Consistent with the hierarchical clustering showing a closer relationship between AML and BC as cancer groups (Fig. 3), none of the p values derived from the methyl-arginine quantitative data for sites compared between AML and BC samples passed Benjamini-Hochberg critical value analysis (Supplemental Table S3), which indicates similar levels of arginine mono-methylation in AML and BC. There were, however, significant differences in site abundance between NSCLC serum samples and AML or BC samples, including sites on heterogeneous nuclear ribonucleoproteins, well-studied RNA processing proteins involved in posttranscriptional modification of pre-mRNA (42). Methylation of hnRNP family members containing the RGG motif promotes import into the nucleus (43). Therefore, changes in arginine methylation levels in hnRNPs, such as at Arg291 of hnRNP A0, Arg206 of hnRNP A1, Arg203 of hnRNP A2/B1, and Arg272 of hnRNP D0, may help estimate their nucleo-cytoplasmic shuttling as well as posttranscriptional regulation in different cancers.
There have been few studies that detail posttranslational modification of serum/plasma proteins, so it was not clear whether the acetylation and arginine methylation identified in our study were regulated by well-known enzymatic catalysis or in an enzyme-independent manner. For example, we have identified several well-studied lysine acetylation sites on histones, including H3 and H4 (Supplemental Table S2). This finding is supported by the presence of several types of histone acetyltransferases as well as histones in the bloodstream even though they are nuclear proteins (44-46). Acetylation of albumin and hemoglobin by circulating aspirin has been previously shown in blood, providing one possible explanation for enzyme-independent lysine acetylation of serum-abundant proteins (47). The identification of most arginine methylation sites on low abundance proteins such as transcriptional regulators and RNA processing proteins in our study makes it less likely that there was a concentration-dependent, enzyme-independent reaction for methylation in blood. In fact, numerous protein arginine methyltransferases have been identified in plasma in previous studies (44,48). Therefore, arginine methylation could occur either in intracellular compartments followed by release to the bloodstream or be catalyzed in the bloodstream directly.
In summary, we have completed a large-scale PTM profiling study in serum that identified many different types of PTMs in the bloodstream. To our knowledge, this is the first study that systematically identified and quantified PTMs in serum. The workflow we have developed and optimized successfully provided accurate quantitative information for lysine acetylation and arginine mono-methylation in the serum samples of three cancer types. The successful identification of Rme modification of low abundance proteins demonstrated the specificity and sensitivity of PTM motif antibody-based enrichment for serum samples. In addition, the identification of multiple lysine acetylation sites in abundant serum proteins such as albumin, some of which showed differential abundance patterns across cancer types, indicates that immunodepletion of top abundant proteins in serum is unnecessary and potentially undesirable for serum PTM profiling studies. Together, these results provide novel identification data for posttranslational modifications of serum proteins and define a workflow that is suitable for large-scale quantitative proteomic studies of clinical serum/plasma samples for biomarker identification.