iTRAQ Labeling is Superior to mTRAQ for Quantitative Global Proteomics and Phosphoproteomics

Labeling of primary amines on peptides with reagents containing stable isotopes is a commonly used technique in quantitative mass spectrometry. Isobaric labeling techniques such as iTRAQ™ or TMT™ allow for relative quantification of peptides based on ratios of reporter ions in the low m/z region of spectra produced by precursor ion fragmentation. In contrast, nonisobaric labeling with mTRAQ™ yields precursors with different masses that can be directly quantified in MS1 spectra. In this study, we compare iTRAQ- and mTRAQ-based quantification of peptides and phosphopeptides derived from EGF-stimulated HeLa cells. Both labels have identical chemical structures, therefore precursor ion- and fragment ion-based quantification can be directly compared. Our results indicate that iTRAQ labeling has an additive effect on precursor intensities, whereas mTRAQ labeling leads to more redundant MS2 scanning events caused by triggering on the same peptide with different mTRAQ labels. We found that iTRAQ labeling quantified nearly threefold more phosphopeptides (12,129 versus 4,448) and nearly twofold more proteins (2,699 versus 1,597) than mTRAQ labeling. Although most key proteins in the EGFR signaling network were quantified with both techniques, iTRAQ labeling allowed quantification of twice as many kinases. Accuracy of reporter ion quantification by iTRAQ is adversely affected by peptides that are cofragmented in the same precursor isolation window, dampening observed ratios toward unity. However, because of tighter overall iTRAQ ratio distributions, the percentage of statistically significantly regulated phosphopeptides and proteins detected by iTRAQ and mTRAQ was similar. We observed a linear correlation of logarithmic iTRAQ to mTRAQ ratios over two orders of magnitude, indicating a possibility to correct iTRAQ ratios by an average compression factor. 
Spike-in experiments using peptides of defined ratios in a background of nonregulated peptides show that iTRAQ quantification is less accurate but not as variable as mTRAQ quantification.

Stable isotope labeling techniques have become very popular in recent years to perform quantitative mass spectrometry experiments with high precision and accuracy. In contrast to label-free approaches, multiplexed isotopically labeled samples can be simultaneously analyzed, resulting in increased reproducibility and accuracy for quantification of peptides and proteins from different biological states. Isotopic labeling strategies can be grouped into two major categories: isobaric labels and nonisobaric labels. In the former category are iTRAQ (isobaric tags for relative and absolute quantification (1)) and TMT (tandem mass tags (2)) mass tags. In the nonisobaric labeling category are methods such as mTRAQ (mass differential tags for relative and absolute quantification), stable isotope labeling by amino acids in cell culture (SILAC (3)), and reductive dimethylation (4). Isobaric labeling techniques allow relative quantification of peptides based on ratios of low m/z reporter ions produced by fragmentation of the precursor ion, whereas nonisobaric labeling yields precursors with different masses that can be directly quantified from MS1 intensity. iTRAQ and mTRAQ reagents provide a great opportunity to directly compare capabilities of reporter and precursor ion quantification since both labels have identical chemical structures and differ only in their composition and number of ¹³C, ¹⁵N, and ¹⁸O atoms. In fact, iTRAQ-117 and mTRAQ-Δ4 are identical mass tags with a total mass of 145 Da (Fig. 1A). To achieve 4-plex quantification capabilities for iTRAQ labels, the composition of stable isotopes is arranged in a way to obtain the reporter ion/balancing group pairs 114/31, 115/30, 116/29, and 117/28 (1). Three nonisobaric mTRAQ labels were generated by adding four neutrons to, or removing four neutrons from, the mTRAQ-Δ4 label, resulting in mTRAQ-Δ8 and mTRAQ-Δ0, respectively. Both iTRAQ and mTRAQ reagents are available as N-hydroxy-succinimide esters to facilitate primary amine labeling of peptides.
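The nominal-mass bookkeeping behind the tag design can be checked with a few lines of Python. This is only an illustrative sketch: the reporter/balance pairs and the 145 Da total are taken from the text, whereas the function names and the dictionary keys `d0`, `d4`, and `d8` are hypothetical labels introduced here.

```python
# Nominal reporter-ion and balance-group masses (Da) for 4-plex iTRAQ,
# as listed in the text: 114/31, 115/30, 116/29, 117/28.
ITRAQ_4PLEX = {114: 31, 115: 30, 116: 29, 117: 28}


def total_tag_mass(reporter: int, balance: int) -> int:
    """Nominal mass of the intact tag (reporter plus balance group)."""
    return reporter + balance


def is_isobaric(pairs: dict) -> bool:
    """True when every reporter/balance pair sums to the same nominal mass."""
    return len({total_tag_mass(r, b) for r, b in pairs.items()}) == 1


# Nominal mTRAQ tag masses: the delta-4 tag equals iTRAQ-117 (145 Da);
# delta-0 and delta-8 are offset by four mass units in either direction.
MTRAQ = {"d0": 145 - 4, "d4": 145, "d8": 145 + 4}

if __name__ == "__main__":
    print(is_isobaric(ITRAQ_4PLEX))  # all four iTRAQ tags share one mass
    print(MTRAQ)                     # three distinct precursor masses
```

All four iTRAQ tags therefore co-elute and co-isolate as a single precursor, whereas each mTRAQ-labeled peptide appears as up to three separate precursors.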
One potential advantage of an iTRAQ labeling strategy is its additive effect on precursor intensities when samples are multiplexed, resulting in increased sensitivity. However, iTRAQ ratios have been demonstrated to be prone to compression. This occurs when other nonregulated background peptides are co-isolated and cofragmented in the same isolation window of the peptide of interest and contribute fractional intensity to the reporter ions in MS2-scans (5)(6)(7). Because most peptides in an experiment are present at 1:1:1:1 ratios between multiplexed samples, all ratios in the experiment tend to be dampened toward unity when cofragmentation occurs. This inaccuracy led to the development of mTRAQ labels to facilitate accurate precursor-based quantification of proteins initially identified in iTRAQ discovery experiments with targeted assays, such as multiple reaction monitoring (MRM) (8). Although iTRAQ has been widely used in discovery-based proteomics studies, mTRAQ has only appeared in a small number of studies thus far (8).
In this study we investigated the advantages and disadvantages of iTRAQ and mTRAQ labeling for proteome-wide analysis of protein phosphorylation and expression changes. We selected epidermal growth factor (EGF)-stimulated HeLa cells as a model system for our comparative evaluation of iTRAQ and mTRAQ labeling, as both changes in the phosphoproteome (9) as well as the proteome (10) are well described for EGF stimulation. We show that iTRAQ labeling yields superior results to mTRAQ in terms of numbers of quantified phosphopeptides, proteins and regulated components. By means of spike-in experiments with GluC generated peptides of known ratios we find that iTRAQ quantification is more precise but less accurate than mTRAQ due to ratio compression. We identify a linear relationship of observed versus expected logarithmic GluC generated peptide ratios as well as for logarithmic iTRAQ and mTRAQ ratios of the phosphoproteome and proteome analysis. This indicates a uniform degree of ratio compression over two orders of magnitude throughout iTRAQ data sets and explains why iTRAQ ratio compression does not compromise the ability to detect regulated elements in these experiments.
iTRAQ and mTRAQ Labeling of Peptides-Desalted peptides were labeled with iTRAQ and mTRAQ reagents according to the manufacturer's instructions (AB Sciex, Foster City, CA). For 2 mg peptide 20 units of labeling reagent were used. Peptides were dissolved in 600 µl of 0.5 M TEAB pH 8.5 solution and labeling reagent was added in 1.4 ml of ethanol. After 1 h incubation the reaction was stopped with 50 mM Tris/HCl pH 7.5. Differentially labeled peptides were mixed and subsequently desalted on 500 mg SepPak columns.
Preparation of Phosphoproteome and Proteome Samples-Quantitative analysis of serine, threonine, and tyrosine phosphorylated peptides was performed essentially as described by Villen and Gygi with some modifications (11).
Peptides were reconstituted in 500 µl strong cation exchange buffer A (7 mM KH₂PO₄, pH 2.65, 30% MeCN) and separated on a Polysulfoethyl A strong cation exchange (SCX) column from PolyLC (250 × 9.4 mm, 5 µm particle size, 200 Å pore size) using an Akta Purifier 10 system (GE Healthcare). We used a 160 min gradient with a 20 min equilibration phase with buffer A, a linear increase to 30% buffer B (7 mM KH₂PO₄, pH 2.65, 350 mM KCl, 30% MeCN) within 30 min, a second linear increase to 75% buffer B in 80 min, 100% B for 10 min, and a final equilibration with buffer A for 20 min. The flow rate was 3 ml/min and the sample was injected after the initial 20 min equilibration phase. Upon injection, 60 fractions were collected with a P950 fraction collector throughout the run. Pooling of SCX fractions was guided by the UV 214 nm trace and fractions were combined starting where the first peptide peak appeared. For proteome analysis, 10% of each SCX fraction was aliquoted, and fractions were combined into 16 proteome samples that were subsequently desalted using StageTips (12). For phosphoproteome analysis 90% of each SCX fraction was used, and the SCX fractions were combined into 12 phosphopeptide samples that were subsequently desalted with reversed phase tC18 SepPak columns (Waters, 100 mg, WAT036820) and evaporated to dryness in a vacuum concentrator. SCX-separated peptides were subjected to IMAC (immobilized metal affinity chromatography) as described (11). Briefly, peptides were reconstituted in 200 µl IMAC binding buffer (40% MeCN, 0.1% FA) and incubated for 1 h with 10 µl of packed Phos-Select beads (Sigma, P9740) in batch mode. After incubation, samples were loaded on C18 StageTips (12), washed twice with 50 µl IMAC binding buffer and washed once with 50 µl 1% formic acid. Phosphorylated peptides were eluted from the Phos-Select resin to the C18 material by loading 3 times 70 µl of 500 mM K₂HPO₄ (pH 7.0). StageTips were washed with 50 µl of 1% formic acid to remove phosphate salts and eluted with 80 µl of 50% MeCN/0.1% formic acid. Samples were dried down by vacuum centrifugation and reconstituted in 8 µl 3% MeCN/0.1% formic acid.
iTRAQ is Superior to mTRAQ for Discovery Proteomics 10.1074/mcp.M111.014423-2

NanoLC-MS/MS Analysis-All peptide samples were separated on an online nanoflow HPLC system (Agilent 1200) and analyzed on an LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific). Fifty percent of each phosphopeptide sample and 10% of each peptide sample containing approximately 1 µg of sample were injected onto a fused-silica capillary column (New Objective, PicoFrit PF360-75-10-N-5 with 10 µm tip opening and 75 µm inner diameter) packed in-house with 14 cm of reversed phase media (3 µm ReproSil-Pur C18-AQ media, Dr. Maisch GmbH). The HPLC setup was connected via a custom-made nanoelectrospray ion source to the mass spectrometer. After sample injection, peptides were separated at an analytical flow rate of 200 nL/min with a 70 min linear gradient (~0.29% B/min) from 10% solvent A (0.1% formic acid in water) to 30% solvent B (0.1% formic acid/90% acetonitrile). The run time was 140 min for a single sample, including sample loading and column reconditioning. Data-dependent acquisition was performed using the Xcalibur 2.1 software in positive ion mode at a spray voltage of 2.5 kV. Survey spectra were acquired in the Orbitrap with a resolution of 60,000 and a mass range from 300 to 1800 m/z.
For all collision-induced dissociation (CID) scans, a normalized collision energy of 30 was used, the maximum inject time was 100 ms, and maximum ion counts were set to 1 × 10⁴ ions. For all higher-energy collisional dissociation (HCD) scans, collision energy was set to 45, maximum inject time was 250 ms and maximum ion count was 5 × 10⁴ counts. For iTRAQ samples, the 8 most intense ions per cycle were isolated and fragmented back to back with CID and HCD in parallel. CID product ions were analyzed in the linear ion trap, whereas HCD product ions were analyzed with the Orbitrap cell. mTRAQ samples were analyzed with CID-Top16 methods. We used an isolation window of 2.5 Th for isolation of ions prior to CID and HCD. Ions selected for MS/MS were dynamically excluded for 20 s after fragmentation.
Quantification and Identification of Peptides and Proteins-All mass spectra were processed using the Spectrum Mill software package v4.0 beta (Agilent Technologies, Santa Clara, CA), which includes modules developed by us for iTRAQ- and mTRAQ-based quantification and phosphosite localization. Precursor ion quantification was done using extracted ion chromatograms for each precursor ion. The peak area for the extracted ion chromatograms of each precursor ion subjected to MS/MS was calculated automatically by the Spectrum Mill software in the intervening high-resolution MS1 scans of the LC-MS/MS runs using narrow windows around each individual member of the isotope cluster. Peak widths in both the time and m/z domains were dynamically determined based on MS scan resolution, precursor charge and m/z, subject to quality metrics on the relative distribution of the peaks in the isotope cluster versus theoretical. Similar MS/MS spectra acquired on the same precursor m/z in the same dissociation mode within ±60 s were merged. For the iTRAQ datasets where consecutive CID and HCD spectra were triggered on the same precursor, composite spectra were generated by extracting the iTRAQ reporter region (m/z 113-121) from a given HCD spectrum and merging it into the preceding CID spectrum after deleting reporter ion signal, if present, in the original CID spectrum. Precursor ion purity (PIP) was calculated for every precursor ion triggered for MS/MS from its corresponding MS1 spectrum as 100% × the intensity of the precursor ion isotopic envelope/total intensity in the precursor isolation window. MS/MS spectra with precursor charge >7 and poor quality MS/MS spectra, which failed the quality filter by not having a sequence tag length >1 (i.e. minimum of three masses separated by the in-chain mass of an amino acid), were excluded from searching.
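The PIP definition above reduces to a single ratio. The following Python sketch illustrates it under stated assumptions: the function name and the example intensities are hypothetical, and the code does not reproduce how Spectrum Mill assigns peaks to the isotopic envelope.

```python
def precursor_ion_purity(envelope_intensities, window_intensities):
    """Precursor ion purity in percent.

    envelope_intensities: MS1 intensities of the peaks assigned to the
        precursor isotopic envelope (assumed to be already identified).
    window_intensities: MS1 intensities of all peaks inside the precursor
        isolation window, including the envelope peaks themselves.
    """
    total = sum(window_intensities)
    if total <= 0:
        raise ValueError("isolation window contains no signal")
    return 100.0 * sum(envelope_intensities) / total


if __name__ == "__main__":
    # Hypothetical intensities: three envelope peaks plus two interfering peaks.
    envelope = [800.0, 400.0, 100.0]
    window = envelope + [200.0, 100.0]
    print(precursor_ion_purity(envelope, window))
```

A value of 100% would mean that only the selected precursor contributes signal within the isolation window; lower values indicate co-isolated species.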
For peptide identification MS/MS spectra were searched against an International Protein Index protein sequence database (IPI version 3.66, human, containing 86,886 entries) to which a set of common laboratory contaminant proteins was appended. Search parameters included: ESI linear ion-trap or ESI Orbitrap HCD scoring parameters, trypsin enzyme specificity with a maximum of three missed cleavages and KP or RP cleavages allowed, 50% minimum matched peak intensity, ±20 ppm precursor mass tolerance, ±0.7 Da (CID) or ±50 ppm (HCD) product mass tolerance, and carbamidomethylation of cysteines and iTRAQ or mTRAQ labeling of lysines and peptide N-termini as fixed modifications. Allowed variable modifications were oxidation of methionine and phosphorylation of serine, threonine or tyrosine residues, with a precursor MH+ shift range of -18 to 270 Da. Identities interpreted for individual spectra were automatically designated as valid by optimizing score and delta rank1-rank2 score thresholds separately for each precursor charge state in each LC-MS/MS run while allowing a maximum target-decoy-based false-discovery rate (FDR) of 1.0% at the spectrum level. This yielded a final overall FDR at the peptide level of <1%.
In calculating scores at the protein level and reporting the identified proteins, redundancy is addressed in the following manner: the protein score is the sum of the scores of distinct peptides. A distinct peptide is the single highest scoring instance of a peptide detected through an MS/MS spectrum. MS/MS spectra for a particular peptide may have been recorded multiple times (i.e. as different precursor charge states, isolated from adjacent SCX fractions, modified by oxidation of Met) but are still counted as a single distinct peptide. When a peptide sequence >8 residues long is contained in multiple protein entries in the sequence database, the proteins are grouped together and the highest scoring one and its accession number are reported. In some cases when the protein sequences are grouped in this manner there are distinct peptides which uniquely represent a lower scoring member of the group (isoforms or family members). Each of these instances spawns a subgroup and multiple subgroups are reported and counted toward the total number of proteins.
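The scoring rule described above (sum of the best score per distinct peptide) can be sketched in Python as follows. This is a simplified illustration, not the Spectrum Mill implementation: the function name and input layout are hypothetical, peptide keys are assumed to be already collapsed across charge states, SCX fractions, and Met oxidation, and protein grouping and subgrouping are not modeled.

```python
from collections import defaultdict


def protein_scores(psms):
    """Sum the best score of each distinct peptide per protein.

    psms: iterable of (protein, peptide, score) tuples, one per identified
        MS/MS spectrum. Repeated observations of the same peptide count once,
        using the highest score.
    """
    best = defaultdict(dict)
    for protein, peptide, score in psms:
        previous = best[protein].get(peptide)
        if previous is None or score > previous:
            best[protein][peptide] = score
    return {protein: sum(peps.values()) for protein, peps in best.items()}


if __name__ == "__main__":
    # Hypothetical spectrum-level identifications.
    example = [
        ("P1", "AAK", 10.0),
        ("P1", "AAK", 12.0),  # same peptide seen again: only the best counts
        ("P1", "CCR", 8.0),
        ("P2", "DDK", 5.0),
    ]
    print(protein_scores(example))
```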
iTRAQ and mTRAQ ratios were obtained from the protein summary details export table in Spectrum Mill. The median ratios of all nonphosphorylated peptides were used to normalize the ratios of all phosphorylated peptides. Median iTRAQ/mTRAQ ratios of phosphopeptides for each biological replicate were calculated over all versions of the same peptide including different charge states. The highest scoring versions of each distinct peptide were reported in supplemental Table S1 per experiment. To obtain iTRAQ/mTRAQ protein ratios the median was calculated over all distinct peptides assigned to a protein subgroup in each biological replicate. Frequency distribution histograms were obtained from Graphpad Prism 5.0. Log2 phosphopeptide ratios followed a normal distribution that was fitted using least squares regression. Mean and standard deviation values derived from the Gaussian fit were used to calculate p values that were subsequently corrected for multiple testing by the Benjamini-Hochberg (BH) method (13). The purpose of using BH FDR p values was to prioritize peptide and protein ratios and establish a reasonable cutoff with a defined FDR of regulated components. Phosphopeptides and proteins with a BH FDR p < 0.05 in each biological replicate were defined as significantly regulated. Scatterplots with marginal histograms were generated in R with in-house scripts. We defined the iTRAQ/mTRAQ ratio of a given peptide/protein to be reproducible if it was quantified in two biological replicates, and if replicate ratios have variability within the bounded confidence interval. The confidence interval was calculated as the 99% prediction confidence interval based on the linear regression fit for two replicates (14). Robust, weighted regression was used, with weights proportional to the sum of the absolute values of the two ratios.
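The significance procedure described above (Gaussian model of log2 ratios, two-sided p values, BH correction) can be sketched in plain Python. This is a minimal sketch under explicit assumptions: the sample mean and standard deviation stand in for the parameters of the least-squares Gaussian fit performed in Prism, p values are taken as two-sided, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev


def two_sided_p(x, mu, sigma):
    """Two-sided p value of x under a normal distribution N(mu, sigma)."""
    z = abs(x - mu) / sigma
    return math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def regulated(log2_ratios, alpha=0.05):
    """Flag ratios whose BH-adjusted p value falls below alpha.

    Assumption: mean and standard deviation are estimated directly from the
    data rather than from a fitted histogram.
    """
    mu, sigma = mean(log2_ratios), stdev(log2_ratios)
    pvals = [two_sided_p(x, mu, sigma) for x in log2_ratios]
    return [q < alpha for q in benjamini_hochberg(pvals)]
```

Because a few strongly regulated peptides inflate a sample standard deviation, a fit to the bulk of the distribution, as used in the study, would generally give a tighter null model than this simplified estimate.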
Annotated MS/MS spectra for all phosphorylated peptides can be manually inspected with a Spectrum Mill viewer via hyperlinks in supplemental Table S1. For best results use Internet Explorer on Windows.

RESULTS
Our goal in this study was to examine mTRAQ and iTRAQ labeling strategies in an experimental paradigm representative of a typical quantitative proteomics experiment. To this end, we compared three different biological states to fully utilize the three different nonisobaric tags that are available for mTRAQ-labeling (Fig. 1A). The same three biological samples were labeled in parallel with four isobaric iTRAQ labels whereby three labels were used to quantify relative changes among the biological states and a fourth label was used at a 1:5 ratio to test mixing accuracy (Fig. 1A, supplemental Fig. S1). We chose a well-studied model system for our comparison of precursor versus reporter ion quantification. Serum-starved HeLa cells were stimulated with EGF for 10 min and 24 h to monitor short and long term effects of EGF receptor (EGFR) signaling. After lysis and tryptic digestion, peptides were labeled with iTRAQ and mTRAQ reagent as depicted in Fig. 1A. To exclude systematic effects caused by isotopic impurities in any of the individual labeling reagents, the labels were exchanged in the biological replicate experiments (supplemental Fig. S1). We selected optimal data acquisition methods by analyzing these samples by direct LC-MS/MS analysis. Subsequently, we reduced sample complexity via SCX fractionation and used the optimal acquisition methods for deep phosphoproteome and proteome analysis.
Defining Optimal Acquisition Strategies for iTRAQ- and mTRAQ-labeled Phosphopeptide and Whole Proteome Samples on an LTQ Orbitrap Velos-Various data acquisition strategies have been described previously to detect and quantify iTRAQ-labeled peptides on LTQ Orbitrap instruments, such as pulsed-Q dissociation (15) and alternating CID/HCD scans (16)(17). Detection of fragment ions smaller than ~30% of the precursor ion mass is inherently difficult in linear ion traps (18), unless pulsed-Q dissociation is used. Alternatively, iTRAQ reporter ions can also be detected in linear ion traps after CID when they are released from low mass b-ions and lysine y1 ions subjected to MS3 fragmentation, as we have shown previously (19). Sensitivity and precision for detection of iTRAQ reporter ions in the Orbitrap cell after HCD fragmentation have improved significantly since an axial extraction field was implemented in HCD cells (20). We chose to measure iTRAQ-labeled samples with alternating CID/HCD scans on the same precursor as both fragmentation mode scans can be acquired in parallel without any further increase of duty cycle time. For HCD scans, we picked a normalized collision energy of 45 (CE%), which we found ideal to generate iTRAQ reporter ions as well as sequence ions in the same scan. Therefore, peak lists derived from HCD scans were not only used for quantification, but for identification as well. Using Spectrum Mill, the reporter mass region of HCD scans was merged with CID scans triggered on the same precursor to link quantitative data obtained from HCD scans with their preceding CID scans. This strategy is especially helpful when HCD scans contain only reporter ions but no useful sequencing ions, which are then derived in parallel from CID scans of the same precursor. There are no constraints in the choice of MS2 fragmentation modes for precursor ion quantification methods such as mTRAQ-labeling.
HCD fragmentation with detection in the Orbitrap cell has the advantage of richer fragment ion spectra that can yield more identifications than CID scans in the linear ion trap as long as peptide intensities are high enough to reach target values of 5 × 10⁴ ions. On the other hand, CID with detection in the linear ion trap is more sensitive and requires only target values of 1 × 10⁴ ions, which can be an advantage for low abundance precursors such as phosphopeptides. We compared the numbers of identified and quantified peptides in unfractionated proteome and phosphopeptide samples of EGF-stimulated HeLa cells that were labeled with iTRAQ or mTRAQ. iTRAQ samples were measured with a Top8 CID/HCD method, in which the 8 most intense precursors were fragmented back-to-back with CID and HCD (Fig. 1B). mTRAQ samples were measured with a Top16 CID method and results were compared with Top8 HCD data acquisition. In this simple pre-experiment, we observed that the iTRAQ strategy outperformed the mTRAQ strategy by being able to quantify ~three- and 2.5-fold more distinct phosphopeptides and unmodified peptides, respectively. The fraction of quantified iTRAQ-labeled peptides is greater for proteome samples with unmodified peptides than for phosphopeptides. For highly abundant proteome samples, CID scans do not contribute any additional identifications to the ones derived from HCD scans, which in most cases also contain quantitative information. In contrast, CID scans on low abundance phosphopeptide samples more often result in identifications that could not be achieved with HCD scans. However, quantitative information from the low mass region of subsequent HCD scans was not available for all such CID scans. Head-to-head comparison of CID and HCD fragmentation for mTRAQ-labeled samples revealed that both fragmentation techniques are comparable with slightly more identifications in CID scans for both proteome and phosphopeptide samples.
We selected CID acquisition as the method of choice for mTRAQ-labeled samples in this study because samples derived from phosphopeptide enrichments tend to have lower abundance analytes and CID detection in the ion trap provides better sensitivity.
iTRAQ-labeling is Superior to mTRAQ-labeling for Identification and Quantification of Phosphorylated Peptides-To test both labeling strategies on large scale data sets, we separated peptide samples derived from EGF-stimulated HeLa cells using SCX chromatography. We used 10% of the total sample for proteome analysis and split the sample into 16 fraction pools, whereas 90% of the total sample was used for phosphoproteome analysis and split into 12 fraction pools (Fig. 2). Phosphopeptides were further enriched with IMAC beads (11). In total, we identified more than 42,000 distinct phosphopeptides of which more than 39,000 were quantified (supplemental Table S1). Single iTRAQ experiments yielded more than 20,000 phosphopeptides with an overlap of 56% within biological replicates resulting in 12,129 distinct phosphopeptides that were quantified reproducibly (Table I, Figs. 3A, 3B). A typical mTRAQ experiment resulted in more than 11,000 phosphopeptides with a 40% overlap of phosphopeptides between biological replicates and a total of 4,448 reproducibly quantified distinct phosphopeptides. We observed 2.7-fold more reproducibly quantified phosphopeptides from the iTRAQ strategy compared with the mTRAQ strategy. This was true at both the 10 min (Fig. 3A) and 24 h (Fig. 3B) stimulation time points. We ascribe this large difference in identified and quantified peptides to two factors. First, the signal strengths for iTRAQ-labeled peptide precursors were ~twofold greater than for mTRAQ-labeled peptides due to the additive effect of iTRAQ labeling on precursor intensities when samples are multiplexed using the different iTRAQ tags. Second, multiplexing using the mTRAQ-strategy generates samples that are up to three times more complex at the precursor ion level owing to the creation of up to three distinct precursor ions for every labeled peptide. This results in greater redundancy in MS2 scanning events and, therefore, undersampling of the proteome. We found that ~50% of all MS2 scans in mTRAQ data sets are triggered on peptides that have already been sequenced in a different mTRAQ state.

FIG. 2. Experimental workflow for global phosphoproteome and proteome analysis. After EGF-stimulation cells were lysed, digested with trypsin and labeled either with iTRAQ or mTRAQ reagents. Samples were mixed and separated via SCX: 90% of total peptides were combined into 12 phosphopeptide samples and further enriched for phosphorylated peptides with IMAC beads, whereas 10% of total peptides were combined into 16 whole proteome samples. All samples were analyzed on an LTQ Orbitrap Velos instrument with 140 min runs. iTRAQ samples were acquired with CID/HCD-Top8 and mTRAQ samples with CID-Top16 methods. Peptides were identified and quantified with Spectrum Mill.
The scatterplots in Fig. 3 illustrate the reproducibility of ratios that were quantified for distinct phosphopeptides that overlapped in two biological replicates. Peptide ratios in perfect correlation between two replicates would lie on a regression line through the origin with a slope of 1. We used robust weighted linear regression analysis and called peptide ratios reproducible when bounded within a 99% confidence interval. Several features are evident in these data. Ratios appear compressed toward a log2 ratio of 0 for iTRAQ data when compared with the mTRAQ-derived ratios. This is most likely due to the well-described isolation impurity phenomenon (6). We also noticed some ratios in the mTRAQ data in the "off-diagonal" quadrants of the scatterplot (upper left and lower right). The label swapping strategy employed in this study allows identification of these ratios as irreproducible, possibly owing to interference in precursor level quantification. Interference of precursor quantification usually occurs on only one of the different precursor ions of an mTRAQ-labeled peptide, and the reverse labeling strategy allows easy identification of such cases. For the 10 min EGF stimulation data sets, 90% of all data points lie within a linear range between 0.74 and 2.82 for iTRAQ, whereas for mTRAQ they lie between 0.58 and 6.15. However, very similar percentages of regulated phosphopeptides (that is, peptides whose ratios are statistically significantly different from 1:1) were identified using BH FDR p values (p < 0.05) for iTRAQ and mTRAQ labeled samples. In both data sets ~10-12% and 2-3% of all phosphopeptides were up-regulated after 10 min and 24 h EGF stimulation, respectively. Fewer measured ratios are in discordance between two biological replicates in iTRAQ data sets compared with mTRAQ data sets.
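A range enclosing 90% of all ratios, as quoted above, can be obtained from empirical quantiles. The sketch below assumes a central interval (5th to 95th percentile) with linear interpolation between order statistics; the text does not state how the reported ranges were computed, so this is one plausible reading, and the function name is hypothetical.

```python
import math


def central_range(ratios, coverage=0.90):
    """Return (low, high) bounds enclosing the central `coverage` fraction.

    Assumption: equal tails on both sides, with linear interpolation
    between neighboring order statistics.
    """
    xs = sorted(ratios)
    if not xs:
        raise ValueError("no ratios given")
    tail = (1.0 - coverage) / 2.0

    def quantile(q):
        pos = q * (len(xs) - 1)
        lower = int(math.floor(pos))
        upper = min(lower + 1, len(xs) - 1)
        return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)

    return quantile(tail), quantile(1.0 - tail)
```

Applied to the ratio lists of each labeling strategy, a narrower interval reflects the tighter, compressed ratio distribution.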
We mapped all proteins with up-regulated phosphopeptides in the iTRAQ and mTRAQ data sets to the ErbB family KEGG pathway and found that both labeling strategies lead to the identification of the most important known signaling  . S2). However, 80 regulated phosphosites on kinases were observed in the iTRAQ data set compared with only 40 in the mTRAQ data set. Also, about threefold more regulated phosphotyrosine containing peptides were detected in the iTRAQ data set.

Higher Numbers of Identified and Quantified Proteins were Obtained Using iTRAQ Compared with mTRAQ Labeling-Higher numbers of quantified peptides can translate into higher numbers of quantified proteins and better estimates of the relative protein quantity among conditions tested. To robustly quantify changes in protein expression we required a minimum of three quantified peptides for each protein. Using this criterion, 2,699 proteins were reproducibly quantified in the iTRAQ data set, corresponding to 1.7-fold more proteins compared with the mTRAQ data set (Table I, Fig. 4). After 10 min of EGF stimulation no major changes in protein expression were detected (supplemental Table S1), whereas ~1% of all proteins were observed to be regulated at the 24 h time point with both labeling techniques. As most proteins did not change their expression after 10 min or 24 h of EGF-stimulation, we conclude by extension that most of the observed regulated phosphorylation events are not because of changes in protein abundance. Among the five most highly regulated proteins, we observed JunB, a mitogenic transcription factor that was also found to be regulated after long-term EGF stimulation in an earlier proteomics study (10).
Dampening of iTRAQ ratios occurs to a greater extent at the protein level compared with the phosphopeptide level, presumably due to the much higher sample complexity of the former. In the 24 h EGF stimulation data set, 90% of all data points lie within a linear range between 0.82 and 1.23 for iTRAQ, whereas they range between 0.72 and 1.45 for mTRAQ. We reasoned that sample complexity should be reflected in the PIP (21), a measure of the precursor to the total ion intensity in the isolation window used to select it for fragmentation. Higher PIP values correspond to "purer" signal for the selected peptide. Plotting a cumulative distribution of this precursor isolation purity value in proteome samples did reflect that precursor interference was occurring more often than in the phosphoproteome samples; however, the effect was subtle (supplemental Fig. S3). The median value of all PIPs for peptides in proteome samples was 78.8%, while the median value in the phosphopeptide samples was 82.6% (supplemental Fig. S3). In contrast, the median PIP of mTRAQ labeled phosphopeptide and proteome samples is 70%, reflecting the greater sample complexity introduced by the nonisobaric mass tags.
iTRAQ Ratios are Compressed Relative to mTRAQ Ratios but the Percentage of Statistically Significantly Regulated Phosphopeptides and Proteins is Similar-We observed a large overlap of data obtained with both labeling methods, as 92% of proteins and 68% of phosphopeptides quantified with mTRAQ were also quantified with iTRAQ (supplemental Fig. S5). Direct comparison of reproducible phosphopeptide and protein quantifications in the iTRAQ and mTRAQ data sets shows a good correlation between both labeling techniques (Fig. 5). Fitting lines through the plots in Fig. 5 (using robust weighted linear regression) results in slopes of 1.30 for phosphopeptide and 1.63 for protein ratios with Pearson coefficients of 0.88 and 0.92, respectively. In general, we expect more complex samples to suffer a greater degree of compression. This is borne out by the slopes of the lines in the fit. The value of the slope indicates the degree of compression, with a higher number indicating greater compression. The more complex proteome sample is more highly compressed (1.63×) than the relatively less complex phosphopeptide sample (1.30×). Regression analysis of the proteome data set on peptide level resulted in a slope of 1.37 with a Pearson coefficient of 0.84 (supplemental Fig. S6). As the peptide scatter plot shows a sigmoidal trend with lower correlation, we have more confidence in the regression analysis on protein level.
A very similar percentage of regulated phosphopeptides and proteins can be defined with BH FDR p values for iTRAQ and mTRAQ data sets (Table I). This trend is evident from the linear slopes observed in Fig. 5 and occurs over two orders of magnitude. Standard deviations in log2 space for whole iTRAQ and mTRAQ data sets are 0.45 (iTRAQ) versus 0.77 (mTRAQ) for the phosphoproteome and 0.17 (iTRAQ) versus 0.32 (mTRAQ) for the proteome. Smaller standard deviations lead to smaller cut-off values for a given p value, such as p < 0.05. Therefore, despite global ratio compression in iTRAQ data sets, regulated components can be identified to a similar degree as in data sets acquired using precursor-based quantification strategies (e.g. mTRAQ and SILAC) that presumably suffer from little or no compression.
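The link between standard deviation and cut-off can be illustrated with a short Python sketch. It assumes, purely for illustration, that log2 ratios are normally distributed around zero with the standard deviations reported above, so that a two-sided cut-off at p < 0.05 lies at about 1.96 standard deviations. This is not the BH-adjusted procedure behind Table I; the function name and the normality assumption are ours.

```python
from statistics import NormalDist


def log2_cutoff(sd, p=0.05):
    """Two-sided cut-off on |log2 ratio| for a zero-centred normal distribution.

    Assumption for illustration only: ratios are normal in log2 space.
    """
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * sd


# Standard deviations in log2 space as reported in the text.
reported_sd = {
    ("phosphoproteome", "iTRAQ"): 0.45,
    ("phosphoproteome", "mTRAQ"): 0.77,
    ("proteome", "iTRAQ"): 0.17,
    ("proteome", "mTRAQ"): 0.32,
}

cutoffs = {key: log2_cutoff(sd) for key, sd in reported_sd.items()}
for (sample, label), cut in cutoffs.items():
    print(f"{sample:16s} {label}: |log2 ratio| > {cut:.2f} (fold change > {2 ** cut:.2f})")
```

Under this simplified assumption the cut-offs are roughly 0.88 (iTRAQ) versus 1.51 (mTRAQ) log2 units for the phosphoproteome and 0.33 versus 0.63 for the proteome, so a compressed iTRAQ ratio has to clear a correspondingly lower threshold.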
iTRAQ Quantification is More Precise but Less Accurate-To measure quantification variability and accuracy in iTRAQ and mTRAQ data, we spiked GluC-generated peptides with known ratios into SCX fractions 6 and 8 of the whole proteome analysis. These peptides were derived from a GluC digest of HeLa lysate to enable facile distinction from background peptides that had been derived from trypsin digestion (Fig. 6A). The GluC-generated peptides were labeled with iTRAQ or mTRAQ reagents, mixed at defined ratios, and spiked into tryptic whole proteome samples such that ~80% of all peptides were tryptic and ~20% were GluC-derived. We observed that median ratios for the GluC-derived peptides were more accurate for mTRAQ-labeled peptides, but also exhibited much higher variability. After mTRAQ labeling, median ratios deviated by an average of only 4% from the intended mixing ratios, whereas iTRAQ ratios were lower by an average of 32% than the intended mixing ratios. Coefficients of variation (CV) were much higher for mTRAQ-labeled peptides, reaching more than 100%. In contrast, the CVs for the iTRAQ measurements were ~30%.
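The two figures of merit used here can be sketched in a few lines of Python: accuracy as the percent deviation of the median ratio from the intended mixing ratio, and precision as the coefficient of variation. The function name and the ratio lists are made up for illustration, and the CV is computed on linear ratios, which is an assumption; the text does not state whether it was calculated in linear or logarithmic space.

```python
from statistics import mean, median, stdev


def accuracy_and_precision(observed_ratios, intended_ratio):
    """Return (median ratio, % deviation of the median, CV in %)."""
    med = median(observed_ratios)
    deviation_pct = 100.0 * (med - intended_ratio) / intended_ratio
    cv_pct = 100.0 * stdev(observed_ratios) / mean(observed_ratios)
    return med, deviation_pct, cv_pct


# Made-up peptide ratios for one intended 4:1 mix.
compressed_but_tight = [2.6, 2.8, 2.7, 2.9, 2.5]  # iTRAQ-like behaviour
accurate_but_noisy = [1.5, 4.1, 9.0, 3.9, 6.5]    # mTRAQ-like behaviour

tight = accuracy_and_precision(compressed_but_tight, 4.0)
noisy = accuracy_and_precision(accurate_but_noisy, 4.0)
for name, (med, dev, cv) in (("tight", tight), ("noisy", noisy)):
    print(f"{name}: median={med:.2f}, deviation={dev:+.1f}%, CV={cv:.0f}%")
```

The first list has a median well below the intended ratio but a small spread, whereas the second has a median close to the intended ratio with a large spread, mirroring the qualitative pattern described for iTRAQ and mTRAQ.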
To test the dependence of iTRAQ quantification accuracy on PIP, we plotted the observed iTRAQ ratios of the GluC peptides versus precursor isolation purity (supplemental Fig. S4A). While the relatively poor accuracy of iTRAQ ratios has been ascribed to interference caused by co-isolation of peptide precursors (6), we observed no striking correlation in these plots. Plotting the intensity values of precursor isotopic envelopes in the MS1 scans recorded immediately prior to the corresponding MS2 scans versus the summed iTRAQ reporter ion intensities of all four channels shows that precursor ions of similar intensity can generate reporter ions with variable intensity values over at least two orders of magnitude (supplemental Fig. S4B). The lack of correlation makes it impossible to use precursor isolation purity as a metric for quantification accuracy.
Compression of iTRAQ Data can be Modeled by Regression Analysis of Logarithmic Ratios-To better understand how precursor isolation interference affects iTRAQ ratios, we generated a model of iTRAQ ratios affected by 1:1 background ratios at constant PIP values (Fig. 6B). Interference follows a sigmoidal trend with logarithmic ratios that show maximum log2 values of 3.17 and 2.32 at PIP values of 75 and 50%, respectively. At a PIP of 75% (which is similar to the average PIP for our phosphoproteome and proteome data sets), we observe a nearly linear correlation of interfered to expected log2 ratios over two orders of magnitude. Linear regression analysis of values that lie between -3.32 and 3.32, which correspond to 10-fold down- and up-regulation in linear space, shows a slope of 0.7127. This shows that at a PIP value of 75% iTRAQ peptide ratios may be corrected for compression over two orders of magnitude. The linear trend of iTRAQ versus mTRAQ log2 ratios (Fig. 5) from the phosphoproteome and proteome analysis indicates the possibility to define an overall interference-based correction factor. To test this hypothesis, we compared logarithmic iTRAQ ratios of GluC peptides spiked into tryptic background versus no background. We observed a linear correlation of the median spike-in ratios with versus without tryptic background peptides (Fig. 6C). The slope and Y-intercept derived from linear regression analysis were used to calibrate compressed iTRAQ ratios, which decreased the average deviation to 5%. Finally, we also applied the same correction to the iTRAQ phosphoproteome and proteome data sets using mTRAQ-derived ratios as true values for calibration. Logarithmic iTRAQ data points were multiplied by a factor of 1.30 for phosphopeptides and 1.63 for proteins to decompress the data set (supplemental Table S1).
FIG. 6. iTRAQ quantification is more precise but less accurate than mTRAQ.
A, GluC peptide spike-in experiments to test accuracy and variability of quantification. HeLa proteins were digested with endoproteinase GluC, labeled with three different iTRAQ or mTRAQ labels, mixed in defined ratios and spiked into fractions 6 and 8 of the whole proteome analysis. GluC-derived peptides were less than 20% of all identified peptides for iTRAQ as well as mTRAQ samples. Median ratios and coefficients of variation (CV) are shown for all spike-in GluC-derived peptides in a 1:1:1 background of tryptic peptides derived from biological samples. iTRAQ ratios show strong compression, whereas mTRAQ peptides have much higher CV values. B, Simplified precursor isolation purity model of iTRAQ ratio compression. The effect of constant PIP values on iTRAQ ratio compression is shown, with real log2 values on the abscissa and modeled log2 values on the ordinate. Equations (1) and (2) simulate the contribution of 1:1 background reporter ions to the reporter ions of the peptide of interest based on constant PIP values. The extent of the linear trend in the sigmoidal function is dependent on the PIP value, which at PIP = 75% spans two orders of magnitude in linear space. C, Linear relationship of logarithmic GluC peptide ratios measured with and without tryptic background peptides. Scatterplot depicts log2 iTRAQ ratios of GluC peptides with and without background interference. In red, median ratios of all GluC peptides are shown at a given mixing ratio as well as the corresponding linear regression line. Note that after correction via the slope of the linear regression, corrected iTRAQ ratios are quite close to the original ratios without background.
DISCUSSION
In this study, we extensively compared two chemical labeling strategies for MS-based proteomic quantification. The mTRAQ strategy derives quantitative information from precursor ion intensity, whereas the iTRAQ strategy requires peptide fragmentation to produce reporter ions for quantitative measurement.
We compared the ability of each method to detect significant relative changes at both the (phospho)peptide and protein levels. To emulate a representative biological experiment, we analyzed changes in the phosphoproteome and total proteome upon EGF stimulation at very deep coverage. We found that iTRAQ-based quantification provided nearly threefold more reproducibly quantified phosphopeptides (12,129 versus 4,448) and nearly twofold more quantified proteins (2,699 versus 1,597) compared with mTRAQ-based quantification. These results clearly demonstrate that iTRAQ labeling is superior to mTRAQ for proteome-wide discovery studies in complex samples. The differences can be explained by iTRAQ labeling having an additive effect on precursor intensities, whereas mTRAQ labeling leads to a large number of effectively redundant MS2 scanning events caused by triggering on the same peptide bearing different mTRAQ labels. Our results also indicate that further improvements in instrument control of data acquisition are needed to prevent redundant sequencing of the same peptide in its different isotopically labeled states in mTRAQ, SILAC, or reductive dimethylation experiments. In our experiments, up to 50% of all MS2 scans could be considered redundant. If these duty cycles could be channeled into sequencing unrelated precursors, proteome depth and coverage would likely improve for all precursor-level quantification methods.
Most key proteins in the EGFR signaling network were quantified with both techniques, indicating that all of these proteins lie within the limit of quantification for both iTRAQ and mTRAQ quantification. However, the majority of expressed cellular protein kinases found in the study were detected and quantified only in our iTRAQ data set, indicating a potential sensitivity benefit for low-abundance proteins.
Dampening of iTRAQ ratios because of co-isolation interference from nonregulated background peptides has recently been discussed in a number of publications (5-7). Savitski and coworkers defined a signal-to-interference score (s2i) for iTRAQ quantification and described improved reproducibility of quantification results for affinity-enriched protein kinase subproteomes at high s2i values, which correspond to high purity during fragmentation. The s2i score is very similar to the PIP values from the Spectrum Mill software package that we used in this study. We observed only minimal correlation of high PIP values with quantification accuracy of GluC-derived peptides spiked in at known ratios into complex proteome fractions. This is because the intensities of mass-tag reporter ions in iTRAQ are poorly correlated with the abundance of the precursor ions from which they derive.
Precursor ions of similar intensities can produce iTRAQ reporter ions that span more than two orders of magnitude in intensity. This means that very low intensity background ions can contribute significantly to reporter ion generation when they are cofragmented with the precursor ion that triggered the MS2 scan. Therefore, PIP cannot be used as an accurate measure to filter for correctly quantified iTRAQ ratios in whole proteome and phosphoproteome data sets.
It has been noted previously that iTRAQ ratio compression occurs at a consistent proportion within an experiment and thus results in a linear relationship between observed and expected ratios of spike-in proteins (5). We observe the same trend in our GluC peptide spike-in data sets as well as when comparing iTRAQ and mTRAQ ratios for the proteome and phosphoproteome analysis. This also explains why similar percentages of differentially regulated phosphopeptides and proteins can be identified in iTRAQ and mTRAQ data: compression affects every ratio in the data set, not only the regulated ones. Our simplified iTRAQ interference model at constant PIP values indicates that this linear relation of logarithmic ratios comprises the central part of a sigmoidal curve. Linear regression analysis should therefore only be useful for moderate iTRAQ ratios (e.g. within 10-fold up- or down-regulation) because larger changes would be underestimated. It is important to note that correction of iTRAQ ratio compression by an average compression factor can only be applied to data sets where very large numbers of true abundance ratios are known and excellent linearity between compressed and true ratios can be observed.
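The calibration idea discussed above can be sketched in Python as follows. The sketch fits a straight line to observed (compressed) versus reference log2 ratios and then inverts that line to decompress a new observation. It uses ordinary least squares as a stand-in, not the robust weighted regression used for Fig. 5, and the function names and calibration points are hypothetical. The check against log2(10) reflects the caveat that the linear approximation is only expected to hold for moderate ratios.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def decompress(observed_log2, slope, intercept):
    """Invert observed = slope * true + intercept to estimate the true log2 ratio."""
    return (observed_log2 - intercept) / slope


LINEAR_RANGE_LOG2 = math.log2(10)  # about 3.32, i.e. 10-fold up or down

# Hypothetical calibration points: reference log2 ratios and an idealised,
# uniformly compressed observation of each (compression slope of 0.7).
true_log2 = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
observed_log2 = [0.7 * t for t in true_log2]

slope, intercept = fit_line(true_log2, observed_log2)
corrected = decompress(1.0, slope, intercept)
in_linear_range = abs(corrected) <= LINEAR_RANGE_LOG2
print(f"slope={slope:.3f}, corrected log2 ratio={corrected:.3f}, usable={in_linear_range}")
```

Dividing by a fitted slope below 1 is equivalent to multiplying by its reciprocal, which is the orientation in which the factors of 1.30 and 1.63 were applied to the phosphoproteome and proteome data sets.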
As co-isolation interference is dependent on sample complexity and the number of co-eluting peptides, improving chromatographic separation of peptides using higher resolution methods can somewhat improve quantification accuracy (22,23). We see a similar effect after IMAC enrichment of phosphopeptides, as iTRAQ ratio compression is smaller for phosphoproteome samples than for proteome samples because of their overall lower complexity. Very recently, it has been shown that decreasing the width of the isolation window, for example from 3 to 1 Th, has almost no beneficial effect on TMT quantification accuracy (24). However, two recent studies have described alternative peptide fragmentation approaches that decrease co-isolation impurities during iTRAQ reporter ion generation. These approaches make use of MS3 scans that are triggered either on charge-reduced precursors after proton transfer reactions (24) or on high m/z MS2 fragment ions (25). All of these approaches will be helpful to improve iTRAQ quantification accuracy and to overcome iTRAQ ratio compression problems.
In summary, we show here that iTRAQ reagents are superior to mTRAQ reagents for identifying larger numbers of novel regulated elements in whole proteome and phosphoproteome discovery efforts. iTRAQ quantification exhibits much better sensitivity, less variability, and better reproducibility than quantification with mTRAQ. Reduced accuracy