Multiplexed, Quantitative Workflow for Sensitive Biomarker Discovery in Plasma Yields Novel Candidates for Early Myocardial Injury

We have developed a novel plasma protein analysis platform with optimized sample preparation, chromatography, and MS analysis protocols. The workflow, which utilizes chemical isobaric mass tag labeling for relative quantification of plasma proteins, achieves far greater depth of proteome detection and quantification than prior methods while simultaneously increasing sample throughput. We applied the new workflow to a time series of plasma samples from patients undergoing a therapeutic, “planned” myocardial infarction for hypertrophic cardiomyopathy, a unique human model in which each person serves as their own biologic control. Over 5300 proteins were confidently identified in our experiments with an average of 4600 proteins identified per sample (with two or more distinct peptides identified per protein) using iTRAQ four-plex labeling. Nearly 3400 proteins were quantified in common across all 16 patient samples. Compared with a previously published label-free approach, the new method quantified almost fivefold more proteins/sample and provided a six- to nine-fold increase in sample analysis throughput. Moreover, this study provides the largest high-confidence plasma proteome dataset available to date. The reliability of relative quantification was also greatly improved relative to the label-free approach, with measured iTRAQ ratios and temporal trends correlating well with results from a 23-plex immunoMRM (iMRM) assay containing a subset of the candidate proteins applied to the same patient samples. The functional importance of improved detection and quantification was reflected in a markedly expanded list of significantly regulated proteins that provided many new candidate biomarker proteins. Preliminary evaluation of plasma sample labeling with TMT six-plex and ten-plex reagents suggests that even further increases in multiplexing of plasma analysis are practically achievable without significant losses in depth of detection relative to iTRAQ four-plex. These results obtained with our novel platform provide a clear demonstration of the value of using isobaric mass tag reagents in plasma-based biomarker discovery experiments.

The goal of biomarker discovery studies in plasma or other biological matrices is to identify proteins that are truly differential in abundance between a population of cases and suitable controls, or before and after a relevant perturbation. The approach has generally been to use label-free methods to analyze individual samples from each class or condition, often few in number, and to employ a differential cutoff to discriminate real differences from technical artifacts or biological noise. These cutoffs are frequently arbitrarily determined, but may be informed by variables such as the observed CVs of measured peptides and the total number of samples included in the study.
To reconcile the vast dynamic range and complexity of plasma with the hypothesis that disease-specific markers are likely to be of relatively low abundance, plasma samples for biomarker discovery are often extensively processed before liquid chromatography tandem MS (LC-MS/MS) analysis.
Typical approaches to improve depth of detection in plasma include immunoaffinity-based depletion of abundant proteins and offline chromatography at the protein or peptide level using separation techniques that differ from the final RP-HPLC separation into the mass spectrometer (e.g. strong cation exchange chromatography (SCX), size exclusion chromatography, or reversed phase chromatography at basic pH). Increasing the fraction number in plasma discovery experiments has been shown to increase both the total number of peptides/proteins confidently identified and the relative enrichment of low abundance proteins (1), with accumulating data supporting reversed phase at basic pH as a particularly effective fractionation strategy (1)(2)(3). Unfortunately, such intensive processing of individual samples improves depth of coverage at the dual cost of increased pre-analytical variability and decreased throughput. Pre-analytical variability in turn increases the technical CV and hence the minimum difference that can be reliably measured between samples. As an example of the merits and limitations of such a strategy, we have reported previously on plasma-based biomarker discovery for myocardial injury using a label-free workflow consisting of depletion of the most abundant plasma proteins followed by digestion and extensive fractionation of peptides by SCX chromatography prior to LC-MS/MS analysis (4). Detection of up to 900 proteins per sample time point was achieved by fractionation into 80 sub-samples and analyzing each using a 90 min effective gradient (150 min inject-to-inject). Such extensive fractionation markedly constrained the number of samples that could be analyzed, and the attendant pre-analytical variability contributed to the determination that only relatively large fold-changes in abundance (>5×) could be reliably distinguished by this label-free approach.
Though more quantitative and efficient approaches for biomarker discovery in plasma are clearly warranted, surprisingly few plasma studies have attempted to employ contemporary quantitative proteomic methods in deep profiling experiments. Until recently, the use of iTRAQ and similar labeling methods for plasma proteomics had succeeded in quantifying only a few hundred proteins. Somewhat better results are beginning to be reported. A recent interlaboratory study comparing the performance of several high resolution mass spectrometers with respect to detection and iTRAQ quantification identified 1200 to 1700 proteins in plasma in an iTRAQ four-plex format after two rounds of combined IgY14/Supermix (Sigma-Aldrich, St. Louis, MO) immunoaffinity depletion of samples and partitioning of peptides by isoelectric focusing into 30 fractions (5). In a very large-scale study using immunoaffinity depletion of the six most abundant plasma proteins, 8-plex iTRAQ labeling of peptides, and LC-MS/MS analysis of 24 SCX peptide fractions, Cole et al. identified and quantified a core set of 982 proteins in at least 50 out of 500 plasma samples, allowing for protein identification using a single peptide (6). The total aggregate number of proteins quantified using two or more peptides across all 72 iTRAQ experiments in that study was ca. 900, whereas the average number of proteins quantified by two or more peptides per iTRAQ experiment was ca. 300 (personal communication with the author).
We reasoned that the combination of optimized plasma processing protocols, improved chromatography, and latest-generation MS systems could further enhance biomarker discovery, enabling practical use of iTRAQ and similar labeling approaches such as TMT (7,8) for plasma proteomics without sacrificing depth of proteome detection. We deployed and tested these process improvements in the context of ongoing discovery and verification of plasma biomarkers of acute myocardial injury, using the planned myocardial infarction model described previously (4).

EXPERIMENTAL PROCEDURES
Patient Sample Collection-Four patients undergoing planned myocardial infarction (PMI) with alcohol ablation for the treatment of hypertrophic obstructive cardiomyopathy (HOCM) were included in this study (4,9). All protocols for blood collection were approved by the Massachusetts General Hospital Institutional Review Board, and all subjects gave written informed consent.
Peripheral plasma at baseline, as well as 10, 60, and 240 min post-injury for each patient were included in the study.

Sample Preparation for Discovery Proteomics Study in PMI Patient Samples-
Plasma Depletion and Enzymatic Digestion-Four hundred microliters of peripheral plasma from each of four patients, collected at baseline and 10, 60, and 240 min post-alcohol ablation, was immunoaffinity depleted of the 14 most abundant proteins followed by the next ~50 moderately abundant proteins using IgY14 LC20 and Supermix LC10 columns (Sigma-Aldrich, St. Louis, MO). Tandem depletion was performed on an Agilent 1100 HPLC system (Agilent, Santa Clara, CA) using Dilution, Stripping, and Neutralization buffers provided by the manufacturer and following the manufacturer's instructions (Sigma-Aldrich). Flow-through of the Supermix column, representing depleted plasma, was concentrated and buffer exchanged to 50 mM ammonium bicarbonate to the original volume (400 µl) using Amicon 3K concentrators (Millipore, Billerica, MA). Protein concentrations of depleted plasma were determined by BCA protein assay (Thermo Fisher Scientific, Waltham, MA).
Four hundred microliters of IgY14/Supermix depleted peripheral plasma per time point and patient was denatured with 6 M urea, reduced with 20 mM dithiothreitol at 37°C for 30 min, and alkylated with 50 mM iodoacetamide at room temperature in the dark for 30 min. Urea concentration was diluted to 2 M with 50 mM ammonium bicarbonate prior to Lys-C digestion (Wako, Richmond, VA) at a 1:50 (w:w) enzyme to substrate ratio at 30°C for 2 h with mixing on the shaker at 850 rpm. Urea was further diluted to less than 1 M prior to overnight digestion with trypsin (Promega, Madison, WI) at a 1:50 (w:w) enzyme to substrate ratio at 37°C with shaking at 850 rpm. Digestion was terminated with formic acid to a final concentration of 1%. The digests were desalted using Oasis HLB 1cc (30 mg) reversed phase cartridges (Waters, Milford, MA) with 0.1% formic acid and 0.1% formic acid/80% acetonitrile as buffers A and B, respectively, using a vacuum manifold. Cartridges were conditioned with 3 × 500 µl buffer B followed by equilibration with 4 × 500 µl buffer A. After the digests were loaded at a reduced flow rate, the cartridges were washed with 3 × 750 µl buffer A and eluted with 3 × 500 µl buffer B. Eluates were frozen and dried by vacuum centrifugation. Digests were reconstituted in 400 µl of 0.1% formic acid and post-digestion concentrations were determined by BCA. Based on the post-digestion concentration, 80 µg aliquots were prepared, frozen, dried to dryness by vacuum centrifugation, and stored at -80°C.
Spiking in Synthetic Peptide Mixture-A mixture of 97 heavy labeled synthetic peptides was spiked into the different time point samples of three of the four PMI patients. The peptide mixture was spiked in at the time of reconstitution of samples for iTRAQ labeling at a ratio of 1:2:5:10 across the four channels of the iTRAQ four-plex. Total amounts spiked into the samples were as follows: baseline, 1 pmol; 10 min, 2 pmol; 1 h, 5 pmol; 4 h, 10 pmol.
iTRAQ Labeling of Plasma Samples-Eighty microgram dried aliquots of the four time points (baseline, 10, 60, and 240 min post-injury) for each PMI patient were labeled with iTRAQ four-plex reagent following the manufacturer's instructions for labeling plasma (AB Sciex, Framingham, MA), which call for use of two-fold more reagent for plasma than for other samples (cell lysates, tissues, etc.). In an effort to eliminate bias toward any iTRAQ channel, a different iTRAQ channel layout was used for the time points of each of the four PMI patients. After reconstituting samples in 30 µl of 1 M triethylammonium bicarbonate (TEAB), 100 µl of ethanol was added to each sample. Pooled iTRAQ reagent from two vials was added to each sample, mixed, and incubated at room temperature for 1 h. Three microliters of each sample was used to check label incorporation by LC-MS/MS prior to quenching the reaction. Once labeling efficiency was judged satisfactory (>95% label incorporation), the reactions were quenched by adding Tris pH 8 to a final concentration of 100 mM and incubating at room temperature for 15 min. Labeled samples representing the four different time points of a PMI patient were mixed together, dried down, and desalted using Oasis HLB 1cc (30 mg) reversed phase cartridges as described above. Eluates were frozen, dried to dryness, and stored at -80°C.
Evaluation of TMT6 and TMT10 Reagents for Labeling Plasma-For the evaluation of TMT six-plex and TMT ten-plex reagents (Thermo Fisher Scientific), as well as direct comparison with iTRAQ four-plex reagent, pooled normal plasma from commercial sources was used (BioreclamationIVT, Baltimore, MD). Plasma was depleted using the IgY14/Supermix immunodepletion strategy and digested as described above. Following digestion and desalting, aliquots of 4 × 80 µg, 6 × 53.3 µg, and 10 × 32 µg were made for iTRAQ four-plex, TMT six-plex, and TMT ten-plex labeling, respectively. A mixture of the same 97 heavy labeled synthetic peptides was spiked in prior to labeling at the following ratios: for iTRAQ four-plex at 10:5:1:2 corresponding to the 114:115:116:117 channels, for TMT six-plex at 10:5:1:1:2:0.5 corresponding to the 126:127:128:129:130:131 channels, and for TMT ten-plex at 10:5:0.5:2:1:1:2:3:5:0.5 corresponding to the 126:127N:127C:128N:128C:129N:129C:130N:130C:131 channels. iTRAQ labeling of plasma was done as described above. For TMT labeling, samples were reconstituted in 100 mM TEAB for a final concentration of 1 µg/µl protein (53 µl for TMT six-plex, and 32 µl for TMT ten-plex). 21 µl and 12 µl of each TMT six-plex or TMT ten-plex reagent, respectively, was added to the plasma aliquots, mixed, and incubated at room temperature with shaking for 1 h. Three microliters of each sample was used to check label incorporation by LC-MS/MS prior to quenching the reactions. Reactions were quenched by adding 5% hydroxylamine at 0.08 µl/µg, dried down, and desalted using Oasis HLB 1cc (30 mg) cartridges. Further processing of these samples was done as described below.
Fractionation of Peptides by Reversed Phase Chromatography at High pH (Basic pH RP)-The digested and iTRAQ labeled plasma sample for each patient was reconstituted in 540 µl of 20 mM ammonium formate/2% acetonitrile pH 10, loaded on a Zorbax 300 Extend 2.1 × 150 mm column (Agilent Technologies, Santa Clara, CA), and fractionated on an Agilent 1100 Series HPLC instrument by basic reversed-phase chromatography at a flow rate of 200 µl/min. Mobile phase consisted of 20 mM ammonium formate/2% acetonitrile pH 10 (buffer A) and 20 mM ammonium formate/90% acetonitrile pH 10 (buffer B). After loading 500 µl of sample (300 µg) onto the column, the peptides were separated using the following gradient: 5 min isocratic hold at 0% B; 0 to 15% solvent B in 8 min; 15 to 28.5% solvent B in 33 min; 28.5 to 34% solvent B in 5.5 min; 34 to 60% solvent B in 13 min, for a total gradient time of 64.5 min. Using 96 × 2 ml well plates, fractions were collected every 0.6 min for a total of 84 fractions through the main elution profile of the separation. In addition, the extreme early and late portions of the gradient were collected into two additional larger volume fractions. All fractions were acidified to a final concentration of 1% formic acid, and the internal 84 fractions were then recombined by pooling early, mid, and late fractions together, resulting in a total of 28 fractions using a concatenation strategy (3). These 28 fractions, along with the additional 2 fractions representing early and late eluting peptides, made up a total of 30 fractions to be analyzed by LC-MS/MS. All fractions were dried to dryness by vacuum centrifugation and stored at -80°C until mass spectrometric analysis.
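The concatenation of 84 collected fractions into 28 pooled fractions can be illustrated with a minimal sketch. It assumes that pooling "early, mid and late" fractions means combining fraction k with fractions k + 28 and k + 56; this layout is a common concatenation scheme but is not spelled out in the text, and the function name is hypothetical.

```python
def concatenate_fractions(n_collected=84, n_pooled=28):
    """Map each pooled fraction to the collected fractions it combines.

    Assumption (not stated in the text): pooled fraction k (1-based)
    combines collected fractions k, k + n_pooled, k + 2 * n_pooled, ...
    so that each pool holds one early, one mid and one late fraction.
    """
    if n_collected % n_pooled != 0:
        raise ValueError("n_collected must be a multiple of n_pooled")
    return {
        k: list(range(k, n_collected + 1, n_pooled))
        for k in range(1, n_pooled + 1)
    }


pools = concatenate_fractions()
print(pools[1])   # [1, 29, 57]
print(pools[28])  # [28, 56, 84]
```

Under this assumed layout every collected fraction is used exactly once, and peptides of similar hydrophobicity are spread across different pools rather than concentrated in one.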
NanoLC-MS/MS Analysis-For plasma samples from individual patients, each of the 30 fractions was reconstituted in 16 µl of 5% formic acid/3% acetonitrile and 2 µl was analyzed on a Q Exactive mass spectrometer (Thermo Fisher Scientific) equipped with a nanoflow ionization source (James A. Hill Instrument Services, Arlington, MA) and coupled to an EASY-nLC 1000 UHPLC system (Thermo Fisher Scientific). Chromatography was performed on a 75 µm ID PicoFrit column (New Objective, Woburn, MA) packed in-house with Reprosil-Pur C18 AQ 1.9 µm beads (Dr. Maisch, GmbH, Entringen, Germany) to a length of 20 cm. Columns were heated to 50°C using column heater sleeves (Phoenix-ST, Chester, PA) to prevent overpressuring of columns during UHPLC separation. The LC system, column, and platinum wire to deliver electrospray source voltage were connected via a stainless-steel cross (360 µm, IDEX Health & Science, UH-906x). Mobile phases consisted of 0.1% formic acid/3% acetonitrile as solvent A and 0.1% formic acid/90% acetonitrile as solvent B. Peptides were eluted at 200 nL/min with a gradient of 6 to 35% B in 150 min, 35 to 60% B in 8 min, 60 to 90% B in 3 min, hold at 90% B for 10 min, 90% B to 50% B in 1 min, followed by isocratic conditions at 50% B for 10 min. A single Orbitrap MS scan from 300 to 1800 m/z at a resolution of 70,000 with AGC set at 3e6 was followed by up to 12 MS/MS scans at a resolution of 17,500 with AGC set at 5e4. MS/MS spectra were collected with a normalized collision energy of 27 and an isolation width of 2.5 amu. Dynamic exclusion was set to 20 s, and peptide match was set to on.
For samples of plasma pooled from multiple patients and used in the iTRAQ/TMT comparison some of the above parameters were revised. Analyses were performed on a Q Exactive Plus mass spectrometer (Thermo Fisher Scientific) with an isolation width of 2.0 amu. For TMT labeled peptides the normalized collision energy was decreased to 26. For TMT-10 labeled peptides MS/MS spectra were collected at a resolution of 35,000.
Data Analysis-Data analysis was done using the Spectrum Mill MS Proteomics Workbench software package v 4.2 beta (Agilent Technologies). Similar MS/MS spectra acquired on the same precursor m/z within ±60 s were merged. MS/MS spectra were excluded from searching if they failed the quality filter by not having a sequence tag length > 0 (i.e. minimum of two masses separated by the in-chain mass of an amino acid) or did not have a precursor MH+ in the range of 750-4000. All extracted spectra were searched against a UniProt database containing human reference proteome sequences (including isoforms and excluding fragments), 58,929 entries. The sequences were downloaded from the UniProt web site on October 17, 2014, redundant sequences removed, and a set of common laboratory contaminant proteins (150 sequences) appended. Search parameters included: ESI-QEXACTIVE-HCD-v2 scoring, parent and fragment mass tolerance of 20 ppm, 40% minimum matched peak intensity, trypsin allow P enzyme specificity with up to four missed cleavages, and calculate reversed database scores enabled. Fixed modifications were carbamidomethylation at cysteine. iTRAQ/TMT labeling was required at lysine, but peptide N termini were allowed to be either labeled or unlabeled. Allowed variable modifications were acetylation of protein N termini, oxidized methionine, deamidation of asparagine, pyro-glutamic acid at peptide N-terminal glutamine, and pyro-carbamidomethylation at peptide N-terminal cysteine with a precursor MH+ shift range of -18 to 70 Da. Database matches were autovalidated at the peptide and protein level in a two-step process with identification FDR estimated by target-decoy-based searches using reversed sequences.
Peptide autovalidation was done first and separately for each patient directory of 30 LC-MS/MS files using an auto thresholds strategy with a minimum sequence length of six, automatic variable range precursor mass filtering, and score and delta Rank1 - Rank2 score thresholds optimized to yield a spectral level FDR estimate for precursor charges 2 through 4 of <0.8% for each precursor charge state in each LC-MS/MS run. For precursor charge 5, thresholds were optimized to yield a spectral level FDR estimate of <0.4% across all runs per patient (instead of each run), to achieve reasonable statistics because many fewer spectra are generated for the higher charge state.
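As a rough illustration of a spectral level FDR estimate from reversed-sequence scoring, the sketch below counts PSMs with a negative delta forward-reverse score and doubles that count, on the assumption that such PSMs represent about half of all false positives. The exact Spectrum Mill calculation is not given in the text, so the formula and the function names are assumptions.

```python
def estimate_spectral_fdr(delta_forward_reverse_scores):
    """Estimate spectral level FDR from delta forward-reverse scores.

    Assumption: PSMs with a negative delta score are taken to be about
    half of all false positives, so FDR ~ 2 * n_negative / n_total.
    """
    n_total = len(delta_forward_reverse_scores)
    if n_total == 0:
        return 0.0
    n_negative = sum(1 for d in delta_forward_reverse_scores if d < 0)
    return 2.0 * n_negative / n_total


def meets_fdr_target(delta_forward_reverse_scores, max_fdr=0.008):
    """Return True if the estimated FDR is below the target (0.8% here)."""
    return estimate_spectral_fdr(delta_forward_reverse_scores) < max_fdr


# Hypothetical run: 1000 validated PSMs, 3 with a negative delta score.
example_scores = [5.0] * 997 + [-1.0] * 3
print(estimate_spectral_fdr(example_scores))
print(meets_fdr_target(example_scores))
```

In an auto-thresholding scheme, score thresholds would be tightened until a check of this kind passes for each charge state in each run.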
Protein polishing autovalidation, a feature of Spectrum Mill, was then applied using an auto thresholding strategy. Protein polishing determines the maximum protein level score of a protein group that consists entirely of distinct peptides estimated to be false-positive identifications (PSMs with negative delta forward-reverse scores). Then all PSMs contributing to protein groups with a score at or below this protein score threshold, or derived from only a single patient, are removed from the set of PSMs obtained in the initial peptide-level autovalidation step. This step further filters all the peptide-level validated spectra with the primary goal of eliminating peptides identified with low scoring peptide spectrum matches (PSMs) that represent proteins identified by a single peptide from a single patient, so-called one-hit wonders. Proteins were grouped together across the four patient directories with the minimum number of directories set to 2, a minimum protein score of 13, and a maximum protein level FDR estimate of 0%. The protein polishing step filtered the results so that each identified protein is detected in more than one patient and is comprised of multiple peptides unless a single excellent scoring peptide was the sole match. As shown in Fig. 2, these settings yielded a spectrum level FDR estimate of <0.5% and a peptide level FDR estimate of <1.5% for each patient. In aggregate across all 4 patients the estimated FDRs are spectrum level: 0.45%, peptide level: 2.36%, and protein level: <0.02% (1/5304). Because the protein level FDR estimate neither explicitly requires a minimum number of distinct peptides per protein nor adjusts for the number of possible tryptic peptides per protein, it may underestimate false positive protein identifications for large proteins observed only on the basis of multiple low scoring PSMs.
In calculating scores at the protein level and reporting the identified proteins, redundancy is addressed in the following manner: the protein score is the sum of the scores of distinct peptides. A distinct peptide is the single highest scoring instance of a peptide detected through an MS/MS spectrum. MS/MS spectra for a particular peptide may have been recorded multiple times (i.e. as different precursor charge states, in adjacent bRP fractions, or in different modification states) but are still counted as a single distinct peptide. When a peptide sequence >8 residues long is contained in multiple protein entries in the sequence database, the proteins are grouped together and the highest scoring one and its accession number are reported.
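The protein scoring rule described above can be sketched as follows. The input layout (pairs of peptide sequence and PSM score) and the function name are illustrative assumptions; only the rule itself, summing the single best score per distinct peptide, comes from the text.

```python
def protein_score(psms):
    """Sum the best score of each distinct peptide.

    psms: iterable of (peptide_sequence, score) pairs for one protein.
    Repeated observations of a sequence (other charge states, adjacent
    fractions, other modification states) collapse to the best score.
    Returns (protein_score, number_of_distinct_peptides).
    """
    best = {}
    for sequence, score in psms:
        if sequence not in best or score > best[sequence]:
            best[sequence] = score
    return sum(best.values()), len(best)


# Hypothetical PSMs: the first peptide was observed twice.
example_psms = [("PEPTIDEK", 12.0), ("PEPTIDEK", 15.5), ("SAMPLER", 9.0)]
print(protein_score(example_psms))  # (24.5, 2)
```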
In some cases when the protein sequences are grouped in this manner there are distinct peptides which uniquely represent a lower scoring member of the group (isoforms, family members, or different species). Each of these instances spawns a subgroup, and multiple subgroups are reported and counted toward the total number of proteins. Peptides shared between subgroups were counted toward each subgroup's count of distinct peptides and protein level iTRAQ/TMT quantitation. As listed in supplemental Table S2, assembly of confidently identified PSMs from all four patients into proteins yields 5340 total protein subgroups from 4591 protein groups. For further analyses in this study relying solely on identification, the list was filtered to include only the 5304 protein subgroups identified by two or more peptides across all four patient samples. For quantitative analyses the list was filtered to include only the 4819 protein subgroups identified by two or more peptides in at least one patient sample.
Protein quantitation was done using iTRAQ/TMT ratios representing 10 min versus baseline, 1 h versus baseline, and 4 h versus baseline for each protein, or ratios between different channels in the iTRAQ/TMT comparison experiment. Spectrum Mill used the reporter ion intensities to calculate the iTRAQ/TMT ratios for each PSM. A protein level iTRAQ/TMT ratio was calculated as the median of all PSM level ratios contributing to the protein remaining after excluding those PSMs lacking an iTRAQ/TMT label, having a negative delta forward-reverse score (half of all false-positive identifications), or having a precursor ion purity <50% (MS/MS has significant precursor isolation contamination from co-eluting peptides).
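The protein level ratio calculation can be sketched in a few lines. The dictionary keys and the function name are hypothetical; the three exclusion criteria and the use of the median follow the description above.

```python
from statistics import median


def protein_ratio(psms, min_purity=50.0):
    """Median of PSM level ratios after applying the three exclusions.

    Each PSM is a dict with hypothetical keys:
      ratio     reporter ion ratio (e.g. 10 min versus baseline)
      labeled   True if the peptide carries an iTRAQ/TMT label
      delta_fr  delta forward-reverse score
      purity    precursor ion purity in percent
    Returns None when no PSM survives filtering.
    """
    kept = [
        p["ratio"]
        for p in psms
        if p["labeled"] and p["delta_fr"] >= 0 and p["purity"] >= min_purity
    ]
    return median(kept) if kept else None


example_psms = [
    {"ratio": 1.2, "labeled": True, "delta_fr": 4.1, "purity": 92.0},
    {"ratio": 1.5, "labeled": False, "delta_fr": 3.0, "purity": 88.0},  # unlabeled
    {"ratio": 2.0, "labeled": True, "delta_fr": -0.5, "purity": 95.0},  # negative delta
    {"ratio": 0.9, "labeled": True, "delta_fr": 2.2, "purity": 40.0},   # low purity
    {"ratio": 1.4, "labeled": True, "delta_fr": 6.3, "purity": 75.0},
    {"ratio": 1.0, "labeled": True, "delta_fr": 1.8, "purity": 60.0},
]
print(protein_ratio(example_psms))  # 1.2
```

Using the median rather than the mean limits the influence of individual PSMs with distorted reporter ion intensities.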
Data Analysis of Spiked-in Peptides-After the first search and validation, unmatched spectra were searched against a Uniprot subset database containing only the proteins from which the unlabeled versions of the heavy labeled synthetic peptides would derive. Fixed modifications used for the search were iTRAQ, TMT6, or TMT10 labels at peptide N termini. Variable modifications were C13N15 Arginine, and iTRAQ, TMT6 or TMT10 labeled C13N15 Lysine at the C termini. iTRAQ and TMT ratios for the synthetic peptides were exported from Spectrum Mill and used for further analysis of iTRAQ, TMT6 and TMT10 spiked-in data.
Statistical Analysis-The moderated F-test was used to assess statistically significant changes over the time course in four PMI patient samples. The F-test was used to compare log2 transformed iTRAQ protein ratios from each of the three groups representing the 10 min versus baseline, 1 h versus baseline, and 4 h versus baseline comparisons to determine if any of the proteins in the groups have ratios statistically different from zero. Use of the F-test enables detection of arbitrary temporal patterns, and the moderated version of the test borrows information from all the observed proteins to assess variation of the ratios in a more robust manner. The moderated F-test is implemented in R (10) using the limma (11) library. Nominal p values determined by the test are corrected for multiple testing using the false discovery rate [FDR, (12)].
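The multiple testing correction can be illustrated with a minimal sketch, assuming that the cited FDR method (12) is the Benjamini-Hochberg step-up procedure; the text does not name it explicitly. The analysis itself was run in R with limma, so this Python function is only an illustration of the adjustment and its name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    The adjusted value for the p value of rank r (1-based, ascending)
    is the minimum over all ranks s >= r of p_(s) * m / s, capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so the minimum carries over.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical nominal p values for four proteins.
nominal = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(nominal)
print([round(q, 4) for q in adjusted])  # [0.02, 0.04, 0.04, 0.02]
```

Proteins whose adjusted p value falls below the chosen FDR cutoff would be reported as significantly regulated over the time course.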
Scatter plots (Fig. 3) were generated to show iTRAQ ratios of each protein across pairs of patient samples, and time points. Statistically significant proteins (common across all scatter plots because they are determined using all patients and time points) are plotted in red.
The set of statistically significantly regulated proteins was grouped based on temporal profile using fuzzy c-means clustering (Fig. 4) (13). The number of clusters was chosen based on visual inspection of cluster coherence and uniqueness of temporal profiles.

Sample Preparation for Verification Study Using ImmunoMRM (iMRM) Assay-
Plasma Digestion-Three of the four PMI patients were included in the 23-plex immunoMRM assay. Thirty microliters of plasma from baseline, 10, 60, and 240 min post-injury were digested in 3 process replicates on a Bravo Automated Liquid Handling Platform (Agilent Technologies). Briefly, plasma samples were added to a 96-well, 2 ml capacity plate and digested using the same protocol design as described earlier (14). To reduce evaporation, the plate was covered during the overnight period. The reactions were quenched by addition of formic acid to a final concentration of 1%. Prior to desalting, 100 fmol of isotopically labeled synthetic peptide standards were added to each well. Samples were then desalted using an Oasis HLB 30 mg plate (Waters) on a Positive Pressure-96 Processor with the standard protocol as described previously. Samples were eluted in 80% acetonitrile/0.1% formic acid (3 × 500 µl) into a 2 ml capacity 96-well plate, frozen, vacuum centrifuged to dryness, and stored at -80°C.
Antibody Preparation-Anti-peptide antibodies were prepared as rabbit polyclonals by either Epitomics (Burlingame, CA) or New England Peptide (Gardner, MA), except for the antibody against peptide TDPGVFIGVK (IL-33), which was prepared as a rabbit monoclonal. Antibodies were incubated individually with 1 µm Dynabeads® Protein G (Life Technologies, Grand Island, NY) in batch mode, mixing at 4°C overnight. Beads were then transferred to wells in a KingFisher 96 Deepwell plate and processed with the following protocol on the KingFisher 96 magnetic bead processor (Thermo Fisher Scientific Inc.): 30 min incubation with freshly made 20 mM dimethyl pimelimidate/200 mM triethanolamine pH 8.5 followed by a 30 min quenching step in 150 mM ethanolamine. Antibody Protein G conjugated beads were then washed twice for 5 min with 5% acetic acid/0.03% CHAPS followed by 5 min re-equilibration with 1× PBS/0.03% CHAPS. Antibody beads were finally resuspended in 1× PBS/0.03% CHAPS/0.1% NaN3 to 0.5 mg/ml and stored at 4°C until use.
Peptide Immunoaffinity Enrichment-Each digested sample was resuspended in 210 µl of 1× PBS/0.01% CHAPS/100 mM Tris pH 8.0 and transferred to a 250 µl capacity 96-well plate. A mixture of 23 anti-peptide antibodies cross-linked to beads was prepared and added to each well, resulting in 1 µg of each conjugated antibody except for TPM.SID, FGL2.ELE, and ITGB.GEV, which were added at only 0.5 µg because of a limited supply. Samples were tumble-mixed overnight at 4°C. The next day, plates were transferred onto a KingFisher 96 magnetic particle processor and processed as described previously (15). Briefly, after mixing for 5 min on the KingFisher, the beads were washed twice with 1× PBS/0.03% CHAPS for 90 s followed by a third wash with 0.1× PBS/0.03% CHAPS for 90 s. Peptides were eluted from the antibody beads by mixing on the KingFisher with 30 µl of 5% acetic acid/3% acetonitrile for 5 min. To eliminate possible bead transfer to the next phase of analysis (LC-MRM-MS), the elution plate (Bio-Rad Laboratories, Waltham, MA) was placed on a magnetic plate holder on wet ice and the contents of each well were transferred to a fresh plate, sealed with an aluminum foil seal mat, and stored at -80°C until analysis.
LC-MRM-MS Analysis-Samples were reconstituted in 30 µl of 0.1% formic acid/3% acetonitrile and 10 µl was analyzed on a 4000 Q Trap hybrid triple quadrupole, linear ion trap mass spectrometer equipped with an Advanced Captive Spray MS source (Bruker, Auburn, CA) and coupled with an Eksigent Nano LC 2D Plus HPLC system (AB Sciex, Framingham, MA). Liquid chromatography conditions were described previously (15). Samples were analyzed on a 75 µm ID IntegraFrit column packed in-house to 10 cm with Reprosil-Pur C18 AQ 3 µm beads (Dr. Maisch, GmbH, Entringen, Germany) and connected to the spray tip of the Captive Spray source. MS source parameters included source voltage of 1300, curtain and nebulizer gases of 0, interface heater temperature of 110°C, and collision gas set to medium. Three transitions per precursor were monitored with unit resolution for both Q1 and Q3. Declustering potential and collision energy were calculated for each precursor using the equations for the 4000 Q Trap. A scheduled MRM method was used with a 2 min retention time window and a target cycle time of 0.9 s.
Data Analysis-Data were analyzed using the Skyline open source software package (16). For each peptide, the light/heavy peak area ratio was used for further evaluation of results using QuaSAR (http://genepattern.broadinstitute.org/gp/pages/index.jsf?lsid=QuaSAR). Corresponding protein concentrations were calculated as described previously (15). Protein concentration ratios for 10 min versus baseline, 1 h versus baseline, and 4 h versus baseline were plotted for comparison with the corresponding ratios calculated from iTRAQ reporter ions.
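The ratio calculation described above can be sketched in a few lines of Python. This is an illustration only: the peak areas, function names, and dictionary layout are hypothetical and are not Skyline or QuaSAR exports. The sketch also assumes the same amount of heavy standard at every time point, so that the ratio of light/heavy ratios equals the concentration ratio; the conversion to absolute concentration (15) is omitted.

```python
def light_heavy_ratio(light_area, heavy_area):
    """Peak area ratio of endogenous (light) to spiked standard (heavy) peptide."""
    if heavy_area <= 0:
        raise ValueError("heavy peak area must be positive")
    return light_area / heavy_area


def ratios_vs_baseline(areas_by_timepoint, baseline="baseline"):
    """Return {timepoint: (L/H at timepoint) / (L/H at baseline)}."""
    lh = {tp: light_heavy_ratio(l, h) for tp, (l, h) in areas_by_timepoint.items()}
    base = lh[baseline]
    return {tp: r / base for tp, r in lh.items() if tp != baseline}


# Hypothetical (light, heavy) peak areas for one peptide in one patient.
example = {
    "baseline": (1.0e4, 5.0e4),
    "10 min": (3.0e4, 5.0e4),
    "1 h": (2.0e4, 4.0e4),
    "4 h": (1.0e4, 5.0e4),
}
fold_changes = ratios_vs_baseline(example)
```

With these made-up areas the peptide is about threefold above baseline at 10 min and back at baseline by 4 h.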

Improved Workflow Provides Faster, Deeper Discovery With More Precise Quantitation-To evaluate our plasma proteomic analysis workflow in the context of relevant biomarker discovery, four patients undergoing a therapeutic PMI for hypertrophic cardiomyopathy were included in this study. Briefly, this planned injury model to ameliorate excess cardiac muscle in the interventricular septum recapitulates important features of spontaneous myocardial infarction, including the release of standard biomarkers (cardiac troponins) with the expected kinetics. The clinical characteristics of the patients are summarized in supplemental Table S1. Peripheral blood was collected from the patients at baseline (after catheter placement but before administration of alcohol into the coronary sinus) as well as at 10 min, 1 h, and 4 h after the induction of injury. The 16 total patient plasma samples were processed using the strategy shown in Fig. 1. The comprehensively revised workflow included numerous changes relative to our previous label-free approach (4), including changes to depletion, fractionation, on-line reversed phase chromatography, and MS instrumentation, in addition to the incorporation of iTRAQ labeling. In order to increase depth of detection in plasma, immunoaffinity depletion of abundant proteins is commonly employed (17)(18)(19). Here we used the IgY14/Supermix tandem depletion system (Sigma-Aldrich) prior to digestion and iTRAQ labeling of patient samples. The IgY14 antibody column removes the 14 most abundant plasma proteins including albumin, immunoglobulins G, A, and M, transferrin and others. The column flow-through is passed through the Supermix column, targeting 99% depletion of 50 moderate abundance proteins (though a larger number of proteins are captured at lower depletion efficiencies). Together these columns remove 96 to 99% of the total protein mass of plasma, promoting detection of low abundance proteins and allowing sampling of proteins present in plasma at low to sub-ng/ml levels (20,21).
After abundant protein depletion, the samples were digested to peptides and labeled with iTRAQ four-plex reagent using a protocol optimized for plasma (see Methods). This chemical labeling strategy covalently modifies the primary amines of the N termini and lysine side-chains of peptides (22). When iTRAQ-labeled peptides fragment in the mass spectrometer, distinct low m/z reporter ions are generated for each of the four different labels; relative quantification is based on ratios of these low m/z reporter ions. Incorporating iTRAQ labeling into the workflow allows mixing of four samples after the digestion step, reducing both pre-analytical and analytical variability and thereby providing more precise relative quantification of proteins compared with label-free methods (23,24). iTRAQ-labeled peptides were fractionated offline using high pH reversed phase chromatography ("basic RP"). To improve orthogonality to the final on-line low pH RP separation, early, middle and late fractions were combined prior to LC-MS/MS analysis (2,3). Basic reversed phase chromatography with concatenation of fractions has been shown to provide better resolution of iTRAQ-labeled peptides and phosphopeptides than SCX (25,26). Thirty total fractions were analyzed by LC-MS/MS on 75 µm ID columns packed with sub-2 µm beads using a 150 min gradient (210 min inject-to-inject) and a latest generation Q Exactive mass spectrometer (Thermo Fisher Scientific). The combination of parallel time point analysis using iTRAQ and reduction of fraction number from 80 (SCX) to 30 (basic RP) more than offset the increase in effective gradient length (90 to 150 min), leading to more than a six-fold improvement in sample analysis speed and sample throughput compared with prior work (4).
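The throughput comparison can be made concrete with a short calculation of instrument time per sample. The sketch below is a rough estimate, not the authors' own accounting. The fraction counts, gradient lengths, and the 210 min inject-to-inject time come from the text. The inject-to-inject time of the earlier label-free method is not stated, so the sketch assumes the same 60 min of non-gradient overhead as the current method; the function name is hypothetical.

```python
def instrument_hours_per_sample(n_fractions, inject_to_inject_min, samples_per_run):
    """Instrument time attributable to one sample, in hours."""
    return n_fractions * inject_to_inject_min / samples_per_run / 60.0


# Current workflow: 30 fractions, 210 min inject-to-inject, 4 samples per iTRAQ mix.
new_hours = instrument_hours_per_sample(30, 210, 4)

# Prior label-free workflow: 80 SCX fractions, 90 min gradient, one sample per run.
# ASSUMPTION: the same 60 min of non-gradient overhead as the current method
# (210 - 150), giving 150 min inject-to-inject; the source does not state this.
old_hours = instrument_hours_per_sample(80, 90 + 60, 1)

fold_improvement = old_hours / new_hours
```

Under that assumption the current workflow needs about 26 h of instrument time per sample versus 200 h previously, roughly a 7.6-fold improvement, in line with the more than six-fold figure quoted above.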
Data were extracted and searched against the Uniprot Human database using Spectrum Mill software (Agilent). Up to 37,918 distinct peptides from an average of 4641 proteins (range 4280-4836) were identified in individual patient samples with a maximum peptide FDR estimate of <1.5% per patient (Fig. 2; see "Methods" for details). An aggregate total of 5304 proteins were identified from at least two patients with at least two distinct peptides across the four patient samples with a protein FDR estimate of <0.02% (1/5304). An aggregate of 4819 proteins were further quantified by two or more peptides in at least one of the patient samples. Supplemental Table S2 summarizes all proteins identified and quantified in the study. The list of identified peptides from all the proteins is presented in supplemental Table S3. Sixty-four percent of the proteins (a total of 3390) were identified and quantified in all 16 samples from the four patients (Fig. 2). This represents nearly a five-fold increase in the depth of coverage of the plasma proteome relative to our earlier label-free study (4) (supplemental Fig. S1). Importantly, the vast majority (ca. 85%) of proteins detected in our previous study were rediscovered in the present study.
To assess the improvement in depth of coverage that was specifically attributable to the additional depletion of moderately abundant plasma proteins by Supermix, we compared IgY14 depletion to IgY14/Supermix tandem depletion for a subset of 12 samples from three patients. Approximately 2800 total proteins were identified and quantified (with two or more distinct peptides/protein) in aggregate across the twelve samples using IgY14 depletion alone. In contrast, nearly 5100 proteins were detected and quantified in the same twelve samples using IgY14/Supermix depletion (supplemental Table S4, supplemental Fig. S1B). Therefore, the additional depletion of moderately abundant plasma proteins alone increases global proteome coverage by 82%. Further depletion does not, however, create a strict superset: 402 proteins, most of which were identified and quantified by two to eight distinct peptides in at least one patient sample after IgY14 depletion alone, were not detected after tandem IgY14/Supermix depletion (supplemental Table S5). Forty-six out of the 402 proteins were detected in the Supermix bound fraction of the same three patient samples.
FIG. 1. Diagram of improved workflow for discovery proteomics in plasma. Samples from four different time points of planned MI (PMI) patients were depleted of abundant proteins, reduced, alkylated and digested by LysC/Trypsin. Following desalting, samples were labeled with four-plex iTRAQ reagent, and mixed after evaluating label incorporation. The mixed sample was then fractionated using reversed phase chromatography at high pH into 30 pooled, concatenated fractions. Fractions were analyzed by data dependent analysis on a Q Exactive mass spectrometer using 75 µm picofrit columns packed in-house with 1.9 µm beads to 20 cm length. See Methods for details.
The normalized mean intensities of proteins observed and quantified across the 16 plasma samples after IgY14/Supermix depletion span nine orders of magnitude (supplemental Figs. S2 and S3 and supplemental Table S6). It is noteworthy that cardiac troponins I and T, which are known to be at low abundance in plasma at early time points after myocardial injury (typically picogram/mL), are robustly detected and quantified in all patient samples with up to 6-8 peptides observed for each troponin (supplemental Table S7). In contrast, cardiac Troponin T was observed only sporadically and with only a single peptide in our prior label-free study (4), whereas Troponin I was not observed at all. Of note, no peptides from cardiac troponins were detected by the current workflow in samples depleted with IgY14 alone (data not shown).
iTRAQ Detects Novel Candidate Biomarkers of MI-In our experimental design, patients served as their own controls, with data reported as iTRAQ ratios between the 10 min, 1 h and 4 h post-injury time points versus the pre-injury baseline. Quantification was based on the median ratio of all peptides quantified for each protein, as reported by Spectrum Mill (see "Methods"). To facilitate comparison, data for each iTRAQ channel (representing a specific patient and time point) were median centered, that is, ratios for all proteins were normalized to the median ratio for that channel. A moderated F-test was used to assess statistically significant changes in abundance of the 4819 proteins detected and quantified with two or more peptides in at least one of the patient samples (see Methods). Selected examples of the resulting scatter plots are shown in Fig. 3. A total of 333 proteins were significantly regulated over the time course (meaning relative protein abundance differed significantly from baseline at one or more post-injury time points) with a Benjamini-Hochberg corrected p value of less than 0.05 (Table I). The number of confidently regulated proteins is six times larger than in our previous study (4). Levels for 90% of the 333 regulated proteins increased as a result of cardiac injury, whereas the levels for only 38 proteins decreased following the injury. Established markers of myocardial injury including myoglobin (MB), creatine kinase B (CKB), creatine kinase M (CKM), fatty acid binding protein (FABP) and all Troponins (TNNT2, TNNI3, TNNC1, and TNNC2) were in the up-regulated list and showed consistent behavior across the different patients (27).
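Two of the steps above, median centering of each channel and Benjamini-Hochberg correction, can be illustrated with a small standard-library sketch. The ratios and raw p values are invented for illustration, and the function names are hypothetical. The moderated F-test itself is not reproduced here; the raw p values simply stand in for its output. Centering is done on log2 ratios, which is equivalent to dividing each ratio by the channel median.

```python
import math
from statistics import median


def median_center(log2_ratios):
    """Subtract the channel median so the typical protein has log2 ratio 0."""
    m = median(log2_ratios)
    return [x - m for x in log2_ratios]


def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical ratios for five proteins in one iTRAQ channel.
channel = [math.log2(r) for r in (0.8, 1.2, 1.0, 2.4, 1.1)]
centered = median_center(channel)

# Hypothetical raw p values standing in for the per-protein test output.
raw_p = [0.001, 0.05, 0.02, 0.20, 0.012]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

In this toy example three of the five proteins pass the corrected 0.05 threshold; in practice an established statistics package would normally be used for both the test and the correction.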
FIG. 2. Table enumerates the number of spectra collected, distinct peptides and proteins identified along with the FDR values achieved in four PMI patient samples as reported by Spectrum Mill (see "Methods" for details). Venn diagram shows the overlap of proteins quantified in four PMI patient samples. a Proteins identified in at least two patients with two or more peptides. b Subset of identified proteins with two or more distinct peptides observed in at least one patient. c Protein subgroups (groups); that is, 5304 distinct protein subgroups were identified within 4555 protein groups. Proteins that share a detected distinct peptide (length >8) are combined into a group. A protein group is parsimoniously expanded to one or more subgroups to distinguish proteins that also have one or more distinct peptides that are not shared with the rest of the group, typically isoforms and family members.
Fuzzy c-means clustering of the 333 regulated proteins revealed five distinct clusters comprising 323 of the proteins (Fig. 4, Table I). Cluster 1 represents those proteins that peak at 10 min post-injury. Cluster 2 consists of proteins that continue to increase from 10 min to 1 h and then plateau, whereas cluster 3 consists of those proteins that begin to decline after 1 h. Proteins that rise continuously throughout the time course comprise cluster 4, and those proteins that decline after injury are represented by cluster 5. Clustering in the present study recapitulates the temporal behavior of regulated proteins observed in our previous study (4), such as ACLP1 and PF4V1 peaking at 10 min (cluster 1), and FHL1 and MYL3 rising more slowly (clusters 2 and 3). Troponins showed the expected temporal profiles, with TNNT2, TNNI3, and TNNC1 belonging to cluster 4 and showing continuously increasing levels up to the 4 h time point.
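The assignment rule that leaves some regulated proteins unclustered can be sketched as follows. The cutoff of 0.7 is the membership value stated in the clustering legend and the Table I footnote. The membership values, protein labels, and function name are hypothetical, and the fuzzy c-means fit that would produce the memberships is not shown.

```python
def assign_clusters(memberships, threshold=0.7):
    """Map protein -> 1-based cluster index, or None if no membership exceeds threshold."""
    assigned = {}
    for protein, values in memberships.items():
        best = max(range(len(values)), key=lambda k: values[k])
        assigned[protein] = best + 1 if values[best] > threshold else None
    return assigned


# Hypothetical membership values across five clusters (each row sums to 1).
memberships = {
    "protein_A": [0.85, 0.05, 0.04, 0.03, 0.03],
    "protein_B": [0.10, 0.10, 0.05, 0.72, 0.03],
    "protein_C": [0.30, 0.55, 0.05, 0.05, 0.05],
}
clusters = assign_clusters(memberships)
```

Here protein_C has a best membership of 0.55 and is left unassigned, which corresponds to the proteins marked NA in Table I.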
The vast majority of differentially regulated proteins identified are candidate markers for early myocardial injury, showing elevation at 10 min that persists until at least 60 min after injury. Using the list of regulated proteins sorted by cluster number, we generated a heat map to investigate protein changes at the individual patient level (supplemental Fig. S4). Importantly, consistent changes are observed for most of the proteins across the different time points and patients, suggesting (despite small sample numbers) that candidate markers may be uniform across the population.
iTRAQ Can Provide Highly Reproducible Quantification in Plasma Despite Ratio Compression-To assess the reproducibility of iTRAQ quantification as well as the extent of ratio compression, we spiked heavy-labeled synthetic peptides at different concentrations into 4 iTRAQ channels corresponding to the four time point samples of PMI patients prior to iTRAQ labeling. An amount predicted to be at the detection threshold was spiked into the first channel, and the remaining channels were spiked at relative ratios of 2:1, 5:1, and 10:1. Results from 97 synthetic peptides are summarized in supplemental Fig. S5. Median iTRAQ ratios were compressed up to 50% in depleted plasma versus theoretical values, with compression being nonlinear; there was increased relative compression at higher relative ratios, consistent with prior studies in cell lysates (28,29). We also assessed the reproducibility of quantification incorporating iTRAQ labeling using the data from the 97 peptides. This was accomplished by performing the 4-channel spike-in experiment (described above) in replicate using the four different time point samples from each of 3 different patients. The median CVs ranged from 16 to 24% for the 3 different peptide spike level ratios tested.
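For readers who want to reproduce this kind of summary, the sketch below computes a per-peptide CV across replicate experiments and then the median CV across peptides. The observed ratios and peptide names are invented, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, median, stdev


def percent_cv(values):
    """Coefficient of variation (sample SD / mean) as a percentage."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical observed ratios for two peptides at a nominal 2:1 spike level,
# one value per replicate experiment.
replicate_ratios = {
    "peptide_1": [1.6, 1.8, 1.7],
    "peptide_2": [1.5, 1.9, 1.4],
}
cvs = {pep: percent_cv(r) for pep, r in replicate_ratios.items()}
median_cv = median(cvs.values())
```

The same calculation would be repeated for each spike level to give one median CV per ratio tested.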
Further Increases in Multiplexing of Plasma Analysis can be Successfully Achieved Using TMT-Labeling-Next, we tested whether TMT isobaric labeling reagents could further enhance throughput for discovery proteomics in plasma. Pooled normal plasma from a commercial source (BioreclamationIVT, Baltimore, MD) was depleted and digested as described. The total amount of digested plasma protein used in each labeling plex was kept constant at 320 µg; that amount was divided by 4, 6, or 10 (i.e. 80 µg, 53.3 µg, and 32 µg) for each labeling channel of iTRAQ four-plex, TMT six-plex and TMT ten-plex, respectively. All reporter ion channels contained equal amounts of the same plasma sample within an experiment to allow an unbiased assessment of total protein and peptide level coverage. The number of distinct peptides decreased by ~17% in both TMT six-plex and ten-plex experiments relative to the iTRAQ four-plex experiment (supplemental Fig. S6). Only 9% fewer proteins were detected in the TMT six-plex study and 17% fewer were detected in the TMT ten-plex experiment as compared with the iTRAQ four-plex experiment. These results indicate that even ten-plex analyses can be leveraged for higher throughput in biomarker discovery studies in plasma without significant loss in sensitivity. Furthermore, results from spiked-in heavy labeled synthetic peptides indicate that ratio compression of TMT is similar to that of iTRAQ (supplemental Fig. S7).
FIG. 4. [...] (13). Each line represents temporal behavior of a protein over the time course. X-axis represents the time points (baseline, 10 min, 1 h, and 4 h), and Y-axis represents normalized protein abundance. Proteins were assigned to a cluster based on a membership value of >0.7. Proteins with a membership value between 0.5 and 0.7 were not assigned to any cluster. Bar graph in the lower right corner shows the number of proteins in each cluster.
b 323 out of 333 regulated proteins were clustered into 5 distinct clusters using fuzzy c-means clustering. Proteins were assigned to clusters using a membership value of >0.7. Proteins with a membership value between 0.5 and 0.7 were not assigned to any cluster and are marked as NA.

iTRAQ-based Quantification of Proteins Correlates Well With
(iMRM) assay (see "Methods") to quantify both known biomarkers of cardiac disease such as TNNI3 and novel candidate proteins that were prioritized for verification based on results from our earlier study (4) (Table II). Although most of these novel candidate proteins were in the up-regulated list from the current iTRAQ-based discovery study, some did not appear to be regulated based on iTRAQ quantification. To verify the discovery results and determine the correlation of iTRAQ quantification with a "gold standard" targeted approach, we applied the iMRM assay to 12 plasma samples from three patient time series experiments used in iTRAQ discovery. Each of the baseline, 10 min, 1 h, and 4 h time point samples was analyzed in full process triplicate for all three patients (i.e. 36 samples, total). Fig. 5 shows the temporal trends observed in iTRAQ and iMRM experiments for selected proteins. The observed trends in abundance over the time course are the same by both quantification strategies for all three patients analyzed, although, as expected, the measured ratios are compressed in iTRAQ relative to iMRM. For example, levels for ACLP1 rise sharply at the 10 min time point and decline thereafter with both the iTRAQ and iMRM workflows (Fig. 5A), whereas levels of FHL1 protein increase steadily to the 1 h time point and, despite some decline, remain elevated above baseline at the 4 h time point (Fig. 5B). FSTL1 (Fig. 5C) is an example of a protein thought to be regulated based on our earlier label-free discovery experiments, but which appears to be invariant over time after injury based on both iTRAQ and iMRM. Supplemental Table S8 shows iMRM results for all of the proteins across the three patients analyzed.
DISCUSSION
Biomarker discovery approaches using MS-based proteomics have suffered from a lack of throughput, detection sensitivity, and quantitative precision that contributes to both false positive and false negative results. Particularly in plasma proteomics, where the objective of deep profiling for candidate discovery is challenged by the dynamic range and complexity of the matrix, extensive sample fractionation has also increased noise and decreased throughput. We have developed a comprehensive workflow for discovery proteomics in plasma that leverages isobaric labeling strategies, intensive depletion of abundant plasma proteins, optimized fractionation methods, improved reversed-phase column characteristics, and latest generation MS instrumentation to address these collective shortcomings.
The ability to increase sample analysis throughput is critically important for biomarker discovery experiments. In even the most carefully designed experimental and sample-selection strategies, biological variability because of both human and disease heterogeneity introduces noise that has a deleterious impact on candidate discovery. This contributes to lists of differential proteins that do not generalize, hence biomarker candidates that cannot be verified. This is made worse by the mismatch between high data dimensionality and low sample number that is characteristic of most transcriptome- and proteome-scale experiments and increases rates of false positive discovery. These problems are most directly addressed by increasing the number of samples analyzed, a strategy impeded by limitations in cost and throughput. Relative to transcriptional profiling, proteomics is further complicated by the need for sample enrichment and fractionation (often extensive) to enable sensitive detection of lower abundance proteins, which, in turn, further decreases throughput. The improvement in throughput we have achieved relative to "conventional" label-free approaches to deep discovery in plasma is dramatic. The six- to nine-fold improvement in throughput we have demonstrated using iTRAQ four-plex, while not a complete solution, represents substantial progress toward ameliorating the problem. In addition, our preliminary results using six-plex and ten-plex TMT reagents for labeling of plasma demonstrate that even further increases in analysis throughput are possible through higher multiplexing. These improvements in throughput derive from parallel downstream processing of multiplexed samples, use of offline basic RP chromatography with concatenation of fractions prior to LC-MS/MS and use of nano-columns packed with sub-2 µm particles for online LC-MS/MS.
Relative to our prior, label-free study (4), we estimate that 60% of the improvement in efficiency is because of the use of iTRAQ four-plex labeling, which reduced on-instrument time by 75%. The remaining 40% of improvement derives from combined use of optimized off-line, basic reversed phase fractionation at high pH, optimized on-line nano-flow chromatography with heated columns packed with sub-2 µm packing, and much faster instrumentation (Q Exactive). These factors together enabled reduction of the number of fractions requiring analysis from 80 to just 30. As a result, analyses that previously took one month with dedicated instrumentation are now being completed in 5 days using the four-plex reagent, and this time will be further reduced using the higher-plex reagents. The chief cost drivers in most deep proteomics experiments employing fractionation prior to LC-MS/MS are instrument time and human labor for sample processing and data analysis.
Although elements of our revised protocol (such as the use of isobaric labels and conversion to a tandem depletion strategy) add incremental cost to sample preparation, we estimate that total per-sample analysis cost was actually reduced by ~three-fold because of decreases in all of the cost drivers identified above. Although label-free quantification methods continue to evolve and improve as a result of faster instrumentation and improved analysis algorithms (for example, see (30)), these experiments have higher cost because individual samples/fractions need to be analyzed in multiple replicates to ensure adequate statistical power. A disadvantage of the isobaric labeling approaches is that it is more difficult to derive a semi-quantitative estimate of absolute abundance in the sample. However, as the main objective of most proteomics discovery experiments is to assess relative changes in abundance across samples, we think that the positive attributes obtained with isobaric labeling outweigh this factor.
Increased throughput in proteomics experiments often comes at the cost of decreased sensitivity. However, as demonstrated, our approach couples substantial improvements in throughput with an almost five-fold improvement in coverage of the plasma proteome compared with our prior workflow. The methods we used in our prior cardiovascular discovery study (4), consistent with the state of the art at the time, yielded 900 or fewer confident protein identifications per sample (requiring two peptides/protein for identification). The iTRAQ four-plex workflow described here identified an average of 4641 proteins in each of 16 plasma samples with two or more distinct peptides per protein. Nearly 3400 proteins were detected and quantified in all patient samples. This comparison is based on the same type of samples collected by the same institutions using a similar experimental and sample processing paradigm, as well as on head-to-head comparisons using a subset of identical samples.
Recently, Farrah and colleagues reported a set of 3553 high confidence plasma proteins (31) compiled after reanalysis of multiple high quality plasma proteomic datasets from multiple published sources using their centralized data processing workflow. In their analysis, 2568 of those 3553 proteins were detected by two or more peptides. We compared our list of 5304 proteins identified with two or more peptides with their list of proteins meeting the same criterion (supplemental Table S6, supplemental Fig. S1C). More than 85% of the proteins on their list were detected and quantified in our study. Moreover, half of the additional 982 proteins they included based on detection of only a single peptide were observed and quantified in our study by two or more peptides. Importantly, over half of the 5304 proteins confidently identified in our experiments are not included in the set published by Farrah and colleagues. Thus, in four four-plex experiments we recapitulated and substantially surpassed the plasma proteomic characterization achieved by the union of multiple published plasma proteomes (31), or in any individual study published to date (for a recent review, see (32)). Proteins robustly identified in the Farrah dataset but not in ours included a few immunoglobulins and keratins that were either purposefully removed from our samples by immunoaffinity depletion or were not present in the database we used for searching.
Given the general perception that disease-specific markers, shed, secreted or leaked from affected tissue and massively diluted into the systemic circulation, are likely to be at low abundance in plasma (33), such increases in coverage seem very likely to be advantageous in biomarker discovery. Importantly, although in many discovery contexts the specific functional importance of improved depth of detection remains conjectural, here the proteins detected and quantified at the lower abundance levels include cardiac troponins I and T, established markers of preeminent importance in the diagnosis and management of myocardial injury (34). Robust identification and measurement of these critical plasma biomarkers was barely evident in our earlier analyses; the new data provide a concrete demonstration of the utility of deep coverage.
Multiple components of our revised workflow contribute to improved detection and quantification of low abundance plasma proteins. For example, troponins were not identified until a second stage of depletion using Supermix was employed. The impact of tandem depletion was readily apparent in our global proteomic results. Total proteins we identified and quantified across three out of four iTRAQ experiments using IgY14/Supermix depletion represent a nearly two-fold (82%) increase over the number of proteins quantified with a workflow identical except for use of IgY14 depletion alone. This augmentation of protein coverage by Supermix depletion exceeds that reported in the literature (21), likely because of the advantages of faster instrumentation and a globally more sensitive workflow. Analyzing the "depletate" from both depletion methods, although not currently practical given the added cost and time, is likely to further increase the number of proteins detected.
The present study also resulted in the identification of 333 regulated proteins across the four PMI patient samples. The majority of these candidate markers were elevated at 10 min post-ablation and remained elevated for at least 60 min following injury. Sixty-five percent of the candidates rise early and persist up to at least 4 h (clusters 2 and 4). This temporal profile is desirable for markers of myocardial injury. The new list of candidate biomarker proteins we identified to be regulated using iTRAQ includes 32 of the 52 proteins verified in our prior label-free study (4), as well as over 300 new candidates. Fifteen out of the remaining 20 proteins were detected and quantified in the current study, but were not observed to be regulated. Two of those proteins (FSTL1, SCUBE1) were analyzed by iMRM in the present study and shown not to be regulated, consistent with the iTRAQ results.
A central reason for our incorporation of isobaric labeling into the workflow was the potential to improve the precision of relative quantification in biomarker discovery relative to label-free approaches. Here we demonstrated that the reproducibility of quantification using iTRAQ, including all sources of variability introduced by the multidimensional sample processing strategy illustrated in Fig. 1, ranged from 16 to 24% (median CV). This level of reproducibility is only slightly lower than that routinely achieved in MRM-MS and iMRM-MS candidate biomarker verification studies (see (35,36) and references therein). Although ultimately the most important measure of the utility of isobaric labeling will be demonstration of improved ability to identify functional biomarkers, a relevant surrogate is demonstration that results consistently reflect differential trends established by a "gold standard" quantification method. In MS-based proteomics, stable isotope dilution multiple reaction monitoring MS has emerged as the preferred method for targeted, quantitatively precise measurement of analytes in complex matrices (36). The 23 proteins quantified by iTRAQ in the context of the full discovery datasets exhibited good correlation with the iMRM-MS data that were obtained using stable isotope labeled peptides for quantification. Importantly, a subset of the proteins that we had observed to change in abundance in prior label-free experiments in a parallel set of patient samples were not observed to change using either iTRAQ or iMRM. These results further support the view that quantitative differences measured using isobaric labeling in a discovery setting reflect real differences in the samples analyzed, and that they form a more reliable basis on which to decide which candidates merit more quantitative assays, be they MS-based or Ab-based.
Compression of ratios is a well-known phenomenon with isobaric labeling (29,37,38). It is caused by co-isolation and co-fragmentation of iTRAQ-labeled peptide precursors, so that reporter ions from unrelated peptides merge and their respective sources become indistinguishable. In the discovery experiments, the iTRAQ ratios for spiked heavy peptides were compressed up to two-fold in depleted plasma versus theoretical. Interestingly, the larger the actual fold difference, the larger the extent of compression observed. This effect was also observed in the iMRM versus iTRAQ ratio comparisons where, for the most extreme case (Fig. 5, FHL1, patient 3), a ca. sixfold change in abundance measured by iMRM was observed as only a twofold change by iTRAQ. The smallest known differentials evaluated in the present study were two-fold differences for all of the 97 spiked heavy peptides. The median observed difference in the iTRAQ data for this set of peptides was 1.67, close to theoretical. Although false negatives could occur in isobaric labeling experiments, especially for proteins undergoing small changes in abundance, we have no evidence for them in the present study.
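The two numerical examples in this paragraph can be compared directly by expressing compression as the fraction of the expected ratio that is lost. This definition, one minus observed over expected on a linear scale, is an assumption made for illustration; the text does not define the metric formally, and the function name is hypothetical.

```python
def compression(observed_ratio, expected_ratio):
    """Fraction of the expected ratio lost: 1 - observed / expected (linear scale)."""
    return 1.0 - observed_ratio / expected_ratio


# Values quoted in the text: a median of 1.67 observed for a 2:1 spike, and a
# roughly sixfold iMRM change observed as roughly twofold by iTRAQ.
spike_2to1 = compression(1.67, 2.0)
fhl1_case = compression(2.0, 6.0)
```

By this measure the 2:1 spike loses about 17% of its expected ratio, whereas the sixfold change loses about two thirds, which illustrates the greater compression at larger fold differences noted above.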
The results presented here provide clear demonstration of the value of using isobaric mass tag reagents in plasma-based biomarker discovery experiments to increase sample analysis throughput while providing very high sensitivity of analysis, and to derive high confidence lists of regulated proteins. We have leveraged human "perturbational" experiments in which each person serves as their own biologic control across a series of time points. This experimental paradigm enhances our statistical power to look for quantitative differences from baseline. We acknowledge, however, that observed differences between individuals (both biologic and technical) may also be considerable. Ultimately, any biomarker that emerges from our discovery work must then be tested across large heterogeneous groups of individuals (cases, controls, "at risk" individuals, etc.) using higher-throughput methodologies.
The raw mass spectrometry data and the sequence database used for searches may be downloaded from MassIVE (http://massive.ucsd.edu) using the identifier MSV000079033. The dataset can also be downloaded directly from ftp://MSV000079033:a@massive.ucsd.edu.