Enhancing the Identification of Phosphopeptides from Putative Basophilic Kinase Substrates Using Ti (IV) Based IMAC Enrichment*

Metal and metal oxide chelating-based phosphopeptide enrichment technologies provide powerful tools for the in-depth profiling of phosphoproteomes. One weakness inherent to current enrichment strategies is poor binding of phosphopeptides containing multiple basic residues. The problem is exacerbated when strong cation exchange (SCX) is used for pre-fractionation, as under low pH SCX conditions phosphorylated peptides with multiple basic residues elute with the bulk of the tryptic digest and therefore require more stringent enrichment. Here, we report a systematic evaluation of the characteristics of a novel phosphopeptide enrichment approach based on a combination of low pH SCX and Ti4+-immobilized metal ion affinity chromatography (IMAC), comparing it one-to-one with the well-established low pH SCX-TiO2 enrichment method. We also examined the effect of 1,1,1,3,3,3-hexafluoroisopropanol (HFP), trifluoroacetic acid (TFA), or 2,5-dihydroxybenzoic acid (DHB) in the loading buffer, as it has been hypothesized that high levels of TFA and the perfluorinated solvent HFP improve the enrichment of phosphopeptides containing multiple basic residues. We found that Ti4+-IMAC in combination with TFA in the loading buffer outperformed all other methods tested, enabling the identification of around 5000 unique phosphopeptides containing multiple basic residues from 400 μg of a HeLa cell lysate digest. In comparison, ~2000 unique phosphopeptides could be identified by Ti4+-IMAC with HFP and close to 3000 by TiO2. We confirmed, by motif analysis, that the basic phosphopeptides increase the number of putative basophilic kinase substrates identified.
In addition, we performed an experiment using the SCX/Ti4+-IMAC methodology alongside collision-induced dissociation (CID), higher energy collision-induced dissociation (HCD) and electron transfer dissociation with supplementary activation (ETD) on a considerably more complex sample, consisting of a total of 400 μg of triple dimethyl labeled MCF-7 digest. This analysis led to the identification of over 9,000 unique phosphorylation sites. The use of three peptide activation methods confirmed that ETD is best suited to sequencing multiply charged peptides. Collectively, our data show that the combination of SCX and Ti4+-IMAC is particularly advantageous for phosphopeptides with multiple basic residues.

Molecular & Cellular Proteomics 10: 10.1074/mcp.M110.006452, 1-14, 2011.

Reversible protein phosphorylation widely regulates cellular functions through protein kinases and phosphatases (1,2). The determination and quantitative analysis of phosphorylation sites are prerequisites for unraveling regulatory processes and signaling networks (3-6). The analytical methods of choice for characterizing protein phosphorylation have shifted from traditional methods such as radioactive labeling and gel electrophoresis to advanced mass spectrometry, a high-throughput technology (7). It has been estimated that ~30% of cellular proteins are phosphorylated during the life cycle of the cell (8). There has been a continuing intense focus on developing enrichment and phosphopeptide sequencing strategies to facilitate the large-scale profiling of phosphorylation events. Currently, one of the most commonly adopted strategies is the use of two sequential chromatographic separation steps: an initial fractionation step for reducing sample complexity and, subsequently, a more specific enrichment of phosphopeptides.
Typically, low-pH strong cation exchange (SCX) chromatography is used as the first step, where peptides are fractionated based on their solution net charge (9,10) and the orientation of peptides to the negatively charged chromatographic material (11,12). Unlike glutamic and aspartic acid, phosphorylated amino acids are able to retain a negative charge under acidic (pH 2.7) conditions. This property can be exploited in SCX (10) for enrichment of phosphopeptides, which tend to elute earlier and are thus separated from the majority of nonphosphopeptides. Following SCX fractionation, several affinity-based methods have been introduced for improving the level of enrichment, including immobilized metal ion (Fe3+) affinity chromatography (IMAC) (13,14) and various metal oxides, among which TiO2 is the most common (15,16). Additional enrichment strategies have also been developed applying different metal oxides such as ZrO2 and Nb2O5 (17,18) or IMAC using alternative metal ions such as Ga3+, Zr4+, and Ti4+ (19-21). Notably, the IMAC technology using Zr4+/Ti4+ metal ions uses a phosphate group (as opposed to nitrilotriacetic acid or iminodiacetic acid) as the coordinating ligand and has shown the potential for superior specificity compared with traditional metal oxide and Fe3+-IMAC based enrichment strategies (20,21). Recently, alternatives to SCX as a first step have also been demonstrated, including hydrophilic interaction chromatography (HILIC) (22,23), electrostatic repulsion liquid chromatography (ERLIC) (24) and strong anion exchange (SAX) (25-27). Although a great number of phosphorylation sites have been identified, it has also been pointed out that each phosphopeptide enrichment technology has inherent biases toward different physicochemical properties of phosphopeptides. For instance, Fe3+-IMAC has been shown to handle multiply phosphorylated peptides more efficiently than TiO2.
This can be rationalized by the weaker binding of phosphopeptides to IMAC than to TiO2 (28). The specificity and enrichment ability of each method can vary from nearly 100% to a few percent, depending on sample complexity and peptide composition. One weakness common to most chelation strategies is their poor binding to phosphopeptides that contain multiple basic residues (29-32). We argue that this may lead to an underrepresentation of basophilic kinase substrates in current phosphoproteome studies, as these are characterized by high R/K contents in the neighborhood of the phosphorylated residues. The weak representation of phosphopeptides containing multiple basic residues may be caused by an ionic interaction between the basic and phosphorylated residues that hinders the latter's ability to coordinate with the IMAC or metal oxide affinity material. Barnouin et al. demonstrated that issues relating to intramolecular interactions can be alleviated by switching from a water system to a perfluorinated solvent system (31). However, only simple mixtures were used and specificity was not evaluated using a more complex sample. Another possible way of disrupting the ionic interaction is by increasing the strength of the ion pairing reagent. It has been suggested that elevated levels of TFA will preferentially interact with basic residues (33).
Under-representation of phosphopeptides containing multiple basic residues in an SCX-chelation strategy can also be explained by additional factors. Phosphorylated peptides containing multiple basic residues co-elute with the bulk of the digested sample, and thus their enrichment is hampered by the higher complexity and dynamic range of these SCX fractions. These fractions also contain highly acidic peptides, which have been demonstrated in the past to bind to chelation material and reduce specificity. Conversion of carboxylate groups to methyl ester derivatives has been demonstrated to improve the selectivity of Fe3+-IMAC (34), although the required chemistry can be problematic.
Sequencing of phosphorylated peptides by tandem mass spectrometry is another key issue that may hamper comprehensive analysis (35,36). Phosphopeptides are traditionally fragmented by collision-induced dissociation (CID). However, phosphoserine- and phosphothreonine-containing peptides often undergo significant neutral loss of phosphate during CID, giving insufficient backbone fragmentation for phosphopeptide identification (35). Methods such as neutral loss-triggered MS3 (10) and multistage activation (37) have been employed to impose additional activation events on pre-selected neutral loss peaks. However, it is still debated whether neutral loss-driven MS3 and multistage activation scans are really advantageous in phosphoproteomics. Recently, electron capture dissociation (ECD) and electron transfer dissociation (ETD) have been shown to be highly complementary to CID because they do not cleave the labile phosphate group, thereby improving the identification of phosphopeptides and localization of phosphorylation sites (38-45).
In this study, we present a systematic evaluation of the enrichment of phosphopeptides containing multiple basic residues using SCX-Ti4+-IMAC (21), placing it directly in the context of the SCX-TiO2 enrichment method, which we have helped establish (4,16,46). We argue, and show by motif analysis of the phosphopeptide datasets, that our approach is particularly beneficial when there is an interest in the analysis of putative basophilic kinase substrates.

EXPERIMENTAL PROCEDURES
Sample Preparation-HeLa and MCF-7 cells were grown on plates until confluence was reached. After harvesting the cells by scraping, lysis was carried out by resuspending them in lysis buffer containing 50 mM ammonium bicarbonate pH 8.0, 8 M urea, Complete EDTA-free protease inhibitor mixture (Roche) and PhosSTOP phosphatase inhibitor mixture (Roche). The cell suspension was subjected to ultrasonication for 1 min at 60 W using 30 cycles with 50% duty. Subsequently, cell debris was removed by centrifugation at 1000 × g for 10 min at 4°C. Six milligrams of total protein from HeLa cells and 1.2 mg from MCF-7 cells were resuspended in a buffer containing 8 M urea, 50 mM ammonium bicarbonate pH 8.0, reduced with 10 mM dithiothreitol for 30 min at 56°C and alkylated by addition of iodoacetamide to 20 mM. After 30 min incubation in the dark at room temperature, the first digestion was performed by addition of Lys-C at an enzyme/protein ratio of 1/50 and incubation for 4 h at 37°C. Subsequently, the digest was diluted to a final urea concentration of 2 M, and a second digestion with trypsin at an enzyme/protein ratio of 1/50 was performed at 37°C overnight. Finally, the digestion was quenched with 0.1% formic acid. The resulting digest of HeLa cells was desalted using 200cc Sep-Pak C18 cartridges (Waters Corporation), dried in vacuo and stored at -20°C. The tryptic digest of the MCF-7 cell lysate was split into three aliquots, desalted using a 50cc Sep-Pak C18 column, and then washed with 0.1% formic acid. On-column stable isotope dimethyl labeling was sequentially performed, as described before (47). The resulting solution was then dried in vacuo and stored at -20°C. The light, intermediate and heavy dimethyl labeled samples were mixed in a 1:1:1 ratio.
Strong Cation Exchange Chromatography for Peptide Fractionation-The peptides or dimethyl labeled peptides were reconstituted in 10% formic acid and then loaded onto a C18 cartridge (Aqua, Phenomenex) using an Agilent 1100 HPLC system. The flow rate applied was 100 μL/min using 0.05% formic acid (pH 2.7) as solvent. Peptides were eluted from the C18 cartridge with 80% acetonitrile (ACN)/0.05% formic acid (pH 2.7) onto a Polysulfoethyl A column (200 × 2.1 mm) (PolyLC) for 10 min at the same flow rate. Separation of peptides was performed using a nonlinear 65 min gradient: from 0 to 10 min, 100% solvent A (5 mM KH2PO4, 0.05% formic acid, 30% acetonitrile, pH 2.7); from 10 min to 15 min, to 26% solvent B (5 mM KH2PO4, 0.05% formic acid, 30% acetonitrile, 350 mM KCl, pH 2.7); from 15 min to 40 min, to 35% solvent B; and from 40 to 45 min, to 60% solvent B. At 49 min, the concentration of solvent B was 100%. The column was subsequently washed for 6 min with 100% solvent B and finally equilibrated with 100% solvent A for 9 min. The flow rate applied during the SCX gradient was 300 μL/min. Fractions were collected every minute for 40 min. Each fraction was lyophilized and then desalted by resuspension in 2% acetic acid using 50 ml Sep-Pak C18 cartridges (Waters Corporation). The eluted peptides from each fraction were split into five aliquots, lyophilized and stored at -20°C. The dimethyl labeled peptides from each fraction were also desalted, dried in vacuo and stored at -20°C.
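The multi-segment SCX gradient described above can be written as a small lookup, which is convenient for estimating the salt concentration at which a given one-minute fraction was collected. The Python sketch below is illustrative only: the breakpoints are taken from the text, but linear ramps between breakpoints, an instantaneous return to solvent A after the wash, and the names `percent_b` and `kcl_mm` are assumptions.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints taken from the gradient description.
BREAKPOINTS = [(0, 0), (10, 0), (15, 26), (40, 35), (45, 60), (49, 100), (55, 100)]
KCL_IN_SOLVENT_B_MM = 350.0  # solvent B contains 350 mM KCl; solvent A contains none


def percent_b(t_min):
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    times = [t for t, _ in BREAKPOINTS]
    if t_min > times[-1]:
        # Re-equilibration with 100% solvent A (assumed to start immediately).
        return 0.0
    i = bisect_right(times, t_min) - 1
    if i == len(BREAKPOINTS) - 1:
        return float(BREAKPOINTS[-1][1])
    (t0, b0), (t1, b1) = BREAKPOINTS[i], BREAKPOINTS[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def kcl_mm(t_min):
    """Approximate KCl concentration (mM) delivered at time t_min."""
    return percent_b(t_min) * KCL_IN_SOLVENT_B_MM / 100.0


if __name__ == "__main__":
    for t in (5, 12.5, 15, 40, 45, 49):
        print(f"{t:>5} min: {percent_b(t):5.1f}% B, {kcl_mm(t):6.1f} mM KCl")
```

Under these assumptions the shallow segment from 15 to 40 min spans only 26-35% solvent B, which is where most of the separation of the tryptic peptides takes place.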
Phosphopeptide Enrichment-Ti4+-IMAC material was prepared and used essentially as previously described by us (21,48). Titansphere (GL Science) microcolumns were prepared as previously described (16,49). Affinity material (TiO2 and Ti4+-IMAC resin) was loaded onto gel-loader tip microcolumns using a C8 plug and ~1-2 cm length of material. The enrichment procedures for "positive" SCX fractions of the tryptic digest of HeLa cells were as follows. First, columns were pre-equilibrated with 2 × 30 μL of loading buffer: TiO2 loading buffer (Approach TiO2 a: 80% ACN, 0.1% TFA with 50 mg/ml DHB; or Approach TiO2 b: 80% ACN, 6% TFA with 50 mg/ml DHB) for TiO2 columns, and Ti4+-IMAC loading buffer (Approach Ti4+-IMAC a: 80% ACN, 6% TFA; or Approach Ti4+-IMAC b: 60% HFP, 0.1% TFA) for Ti4+-IMAC columns. Next, four aliquots of each SCX fraction were resuspended in 60 μL of loading buffer each and loaded onto the equilibrated gel-loader tip microcolumns. Columns were sequentially washed with 60 μL of loading buffer, followed by washing with 60 μL of 50% ACN/0.1% TFA for TiO2 columns, 60 μL of 50% ACN/0.5% TFA containing 200 mM NaCl for the Ti4+-IMAC a column, and additional washing with 60 μL of 50% ACN/0.1% TFA for the Ti4+-IMAC a and Ti4+-IMAC b columns, respectively. The bound peptides were eluted with 20 μL of 5% ammonia into 20 μL of 10% formic acid and then stored at -20°C for LC-MS/MS analysis. The same procedures were performed for phosphopeptide enrichment from neutral phosphopeptide-enriched SCX fractions and whole unfractionated digests of HeLa lysate. For the SCX fractions of the dimethyl labeled MCF-7 cell lysate, enrichment was performed using the Ti4+-IMAC a strategy.
Mass Spectrometry-The analysis of the peptides was performed on a reversed phase (RP) nano-LC-coupled LTQ Orbitrap XL ETD (Thermo Fisher Scientific) as described previously with minor modifications (50). An Agilent 1200 series HPLC system was equipped with a 20 mm Aqua C18 (Phenomenex, Torrance, CA) trapping column (packed in-house, 100 μm i.d., 5 μm particle size) and a 400 mm ReproSil-Pur 120 C18-AQ (Dr. Maisch GmbH) analytical column (packed in-house, 50 μm i.d., 3 μm particle size). Trapping was performed at 5 μL/min solvent C (0.1 M acetic acid in water) for 10 min, and elution was achieved with a gradient of 10-25% (v/v) solvent D (0.1 M acetic acid in 80% ACN) in 90 min with a total analysis time of 120 min. The flow rate was passively split from 0.60 ml/min to 100 nL/min when performing the elution. Nanospray was achieved using a distally coated fused silica emitter (360 μm o.d., 20 μm i.d., 10 μm tip i.d.; constructed in-house) biased to 1.7 kV. The LTQ Orbitrap ETD was operated in the data dependent mode to automatically switch between MS and MS/MS. Survey full scan MS spectra were acquired from m/z 350 to m/z 1500 in the Orbitrap with a resolution of 60,000 at m/z 400 after accumulation to a target value of 500,000 in the linear ion trap. Supplementary activation was enabled for ETD (51). The five most intense peaks above a threshold of 500 were alternately fragmented in the linear ion trap using CID/ETD at a target value of 30,000. The ETD reagent target value was set to 100,000 and the reaction time to 50 ms.
The enriched phosphopeptides from dimethyl labeled samples were analyzed on an ETD-equipped Orbitrap Velos instrument (Thermo Fisher Scientific, Bremen) connected to an LC system as described above. All instrument methods for the Orbitrap Velos were set up in the data dependent acquisition mode. Each sample was analyzed with either a Top10 HCD or Top10 CID/ETD method. For the low-resolution CID/ETD-MS/MS method, full scan MS spectra (from m/z 350-1500) were acquired in the Orbitrap analyzer after accumulation to a target value of 5e5 in the linear ion trap. Resolution in the Orbitrap system was set to 60,000. The ten most intense peptide ions were sequentially isolated to a target value of 5000 and fragmented in the high-pressure linear ion trap by low-energy CID with normalized collision energy of 35%. The ETD reagent target value was set to 2e5 and the reaction time to 100 ms. For the HCD method, survey full scan MS spectra (from m/z 350-1500) were acquired in the Orbitrap system with resolution 30,000. The ten most intense peptide ions were sequentially isolated to a target value of 3e4 and fragmented in the HCD collision cell with normalized energy of 35%. The resulting fragments were detected in the Orbitrap with resolution 7500.
Data Analysis-From each raw data file recorded by the mass spectrometer, representing the enriched phosphopeptides from each SCX fraction, peak lists containing CID, ETD and HCD fragmentation data were generated using Proteome Discoverer (Version 1.3, Thermo Fisher Scientific) with a signal-to-noise threshold of 3 and the following settings for the ETD nonfragment filter: precursor peak removal within a 4 Da window, charge-reduced precursor removal within a 2 Da window, and removal of known neutral losses from charge-reduced precursors within a 2 Da window (the maximum neutral loss mass was set to 120 Da). Single-fraction peak lists generated from each method were then merged into one larger peak list for database searching using in-house scripts, in which CID peak lists were further filtered with the minimum fragment ions set to 100, the maximum number of fragment ion count set to 100, and the maximum number of fragment ions set to 100. Peak list files from HCD were deisotoped and charge deconvoluted as described (52,53). Peak lists were searched against an IPI human database (version 3.52, 69,164 sequences; 29,064,824 residues) and its decoy database created by Mascot using Mascot software version 2.3.02 (Matrix Science, UK). The database search was performed with the following parameters: a mass tolerance of ±50 ppm for precursor masses, ±0.6 Da for CID/ETD fragment ions and ±0.05 Da for HCD fragments, allowing two missed cleavages, with cysteine carbamidomethylation as fixed modification and methionine oxidation and phosphorylation of serine, threonine, and tyrosine as variable modifications. For triple dimethyl labeled peptides, additional variable modifications relating to the N terminus and lysine residues were also added. The enzyme was specified as trypsin and the fragment ion type was specified as electrospray ionization (ESI)-TRAP, ETD-TRAP, and ESI-quadrupole-time-of-flight (Q-TOF) for the corresponding mass spectra.
The resulting .dat files were exported and filtered for a <1% false discovery rate (FDR) at the peptide level using the in-house developed software "Rockerbox" (Version 1.2.6) (54), utilizing the percolator-based algorithm (55). Note that only PSMs with a Mascot score ≥20 were accepted and then exported using "Rockerbox" (54). Site localization confidence was calculated using the Mascot delta score (43), which has been reported to be a viable method by Savitski and coworkers (56). The frequency of amino acids within identified phosphopeptides was analyzed using the IceLogo program (57), which builds on probability theory to visualize significant conserved sequence patterns in multiple peptide sequence alignments against background sequence sets, and which has a more dynamic nature and is more correct and complete in the analysis of conserved sequence patterns. Note: considering the possibility of missed cleavages, the peptide sequences were extended to a window of 21 amino acid residues centered on the phosphorylated S/T/Y site.
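The window extension described above, 10 residues on either side of the phosphorylated S/T/Y, can be sketched in a few lines of Python. This is a minimal illustration and not the scripts used in the study: the function name `site_window`, the 0-based site index, and padding with "X" where the window runs past a protein terminus are all assumptions, and the example sequence is invented.

```python
def site_window(protein, site, flank=10, pad="X"):
    """Return the (2 * flank + 1)-residue window centered on a 0-based site index.

    Positions that fall outside the protein sequence are filled with `pad`
    (an assumption; the text does not say how termini were handled).
    """
    if not 0 <= site < len(protein):
        raise IndexError("site index outside the protein sequence")
    if protein[site] not in "STY":
        raise ValueError("residue at site is not S, T or Y")
    left = protein[max(0, site - flank):site].rjust(flank, pad)
    right = protein[site + 1:site + 1 + flank].ljust(flank, pad)
    return left + protein[site] + right


if __name__ == "__main__":
    # Hypothetical short sequence with a serine at index 3.
    print(site_window("MRRSGGG", 3))  # XXXXXXXMRRSGGGXXXXXXX
```

Fixed-length windows of this kind are what a sequence-logo tool needs as input, since every sequence must align on the central phosphorylated residue.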
All of the data sets can be accessed as scaffold data files (www.proteomesoftware.com) at the publicly available repository Tranche (https://proteomecommons.org/) using the following hash code: p9mbfyL44qFNOl4ejNDmC+/wwGbPvnSQVsj6yueyNfOElaq84opm09URtLWGtaaqulI+c6pQjMK+5F2zghoogxtOJ/IAAAAAAAADKA==.
Effect of Organic Additives on Phosphopeptides Enrichment by TiO2 and Ti4+-IMAC-Our main objective was to develop a phosphoproteomics method specifically improving enrichment of phosphopeptides containing multiple basic residues, enabling the expansion of global phosphoproteome profiles. Applying a low pH (2.7) SCX separation on a lysate allowed the fractionation of peptides based on charge state, resulting from protonation and deprotonation of basic and acidic groups. It is estimated that ~68% of an in silico tryptic digest of the human IPI protein database generates peptides with a net charge of 1+ at pH 2.7, assuming protonation of both the N terminus and the basic residue and deprotonation of the C terminus (9). Such tryptic peptides are often referred to as "2+" in the literature (ourselves guilty), which ignores the deprotonation of the C terminus that has an important impact on the separation (12). Phosphopeptides containing only one basic residue will have a net charge of zero (protonation of the N terminus and the basic residue combined with deprotonation of the C terminus and phosphorylated residue) and will elute earlier in an SCX based separation. For the sake of simplicity we will refer to these as "neutral phosphopeptides". Phosphopeptides containing more than one basic residue will have a net charge corresponding to the number of basic residues minus one, and for simplicity these positively charged phosphopeptides will be referred to as "positive phosphopeptides". Note that the "positive phosphopeptides" will co-elute with regular tryptic peptides that also possess a net positive charge. The reference sample we tested was a HeLa cell lysate digest (6 mg), which was initially fractionated by SCX. Each of these SCX fractions was divided into five aliquots for further experiments. Initially, single aliquots of the fractions containing positively charged peptides were pooled together and used as an initial test bed for optimizing enrichment protocols. It is noteworthy that all SCX fractions were desalted before enrichment to eliminate any deleterious effects caused by the phosphate present in the SCX solvents. Previously, Cramer and coworkers suggested that fluorinated solvents such as HFP can significantly improve "positive phosphopeptides" enrichment by reducing the level of intramolecular ionic interaction (31). Using this work as a reference, where it was suggested 40% HFP would be beneficial, we tested a range of organic solvents differing in their HFP (40, 60, and 80%) and TFA content (0 and 0.1%). Both TiO2 and Ti4+-IMAC enrichment protocols were tested. It was observed that the optimal loading buffer is 60% HFP with 0.1% TFA for Ti4+-IMAC, whereas the results for TiO2 were much poorer than with the classical solvent system (see supplemental Table S1). These initial findings were then used for a much larger and refined experiment.
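The charge bookkeeping used to define "neutral" and "positive" phosphopeptides at pH 2.7 can be made explicit with a short Python sketch: +1 for the N terminus, +1 per basic residue, -1 for the C terminus and -1 per phosphate group. It is a simplified illustration under stated assumptions: histidine is counted as basic alongside K and R, N-terminal modifications are ignored, the function names are hypothetical, and the example sequences are invented.

```python
BASIC_RESIDUES = frozenset("KRH")  # assumption: H is counted alongside K and R


def scx_net_charge(sequence, n_phospho=0):
    """Approximate solution net charge of a peptide at pH 2.7.

    +1 (N terminus) + number of basic residues - 1 (C terminus) - n_phospho.
    """
    n_basic = sum(residue in BASIC_RESIDUES for residue in sequence)
    return 1 + n_basic - 1 - n_phospho


def scx_class(sequence, n_phospho):
    """Label a peptide with the shorthand used in the text."""
    if n_phospho == 0:
        return "non-phosphorylated"
    charge = scx_net_charge(sequence, n_phospho)
    if charge == 0:
        return "neutral phosphopeptide"
    if charge > 0:
        return "positive phosphopeptide"
    return "net negative (class not named in the text)"


if __name__ == "__main__":
    # Invented sequences, for illustration only.
    print(scx_net_charge("AAGSPLEK"))         # 1 (regular tryptic peptide)
    print(scx_class("AAGSPLEK", 1))           # neutral phosphopeptide
    print(scx_class("RRASVAGK", 1))           # positive phosphopeptide
```

For a singly phosphorylated peptide the function reduces to the number of basic residues minus one, which is the rule stated in the text.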
Comparative Evaluation of TiO2 and Ti4+-IMAC Enrichment of Phosphopeptides-Aliquots of each of the SCX fractions corresponding to positively charged peptides were subjected to enrichment by TiO2 and Ti4+-IMAC using several solvent systems chosen to evaluate the use of TFA, DHB and HFP. To be consistent, we created two enrichment methods for TiO2, one containing DHB (the classic strategy, termed TiO2 a) and one containing 6% TFA (to mimic the classical conditions used for Ti4+-IMAC, termed TiO2 b). Two Ti4+-IMAC methods were also used: Ti4+-IMAC with 6% TFA (termed Ti4+-IMAC a) and Ti4+-IMAC with 60% HFP and 0.1% TFA (termed Ti4+-IMAC b). The SCX fractions were subjected to enrichment by each method and then one-third of the enriched eluent (corresponding to a total lysate level of 400 μg) was analyzed by LC-MS/MS, where each precursor was subjected to both CID and ETD fragmentation events. An overview of the experimental strategy is shown in Fig. 1 (Table I and supplemental Tables S2, S3, S4 and S5). In addition, the specificity of phosphopeptide enrichment was in the order of Ti4+-IMAC a > TiO2 b > Ti4+-IMAC b > TiO2 a (Table I). Considering that the TiO2 b method identified more phosphopeptides than the TiO2 a method, only the phosphopeptides identified using the TiO2 b method were further analyzed and compared with the Ti4+-IMAC approaches. Comparing the overlap of the three methods clearly indicated that 6% TFA in conjunction with Ti4+-IMAC (Ti4+-IMAC a) was superior and most comprehensive (Fig. 2A). A total of 1321 unique phosphopeptides, representing 21.7% of all phosphopeptides, were solely identified by the TiO2 b and Ti4+-IMAC a approaches. It should be pointed out that the use of SCX and TiO2/Fe3+-IMAC for the enrichment of "neutral phosphopeptides" is superb, leading to the generation of nearly pure pools of phosphopeptides (13-15,58). In contrast, TiO2 performs here relatively poorly for "positive phosphopeptides," in line with what has been previously pointed out (29,32). In order to generate a reference to allow a more in-depth evaluation of the positive phosphopeptide pools and to confirm that Ti4+-IMAC a can also generate similar results to TiO2, a portion of the SCX fractions corresponding to where "neutral phosphopeptides" are found were also subjected to enrichment. Four of these SCX fractions were further purified by the Ti4+-IMAC b approach, generating a data set of 3246 unique phosphopeptides from 1743 phosphoproteins (supplemental Table S6).
FIG. 1. Experimental workflow used for the selective enrichment and identification of phosphopeptides. A HeLa lysate tryptic digest was initially fractionated by SCX. After desalting and aliquoting the SCX fractions, the late SCX fractions (containing the bulk of the tryptic peptides and the phosphopeptides with multiple basic residues) were subjected to TiO2 (Approach TiO2 a: 80% ACN/0.1% TFA containing 50 mg/ml DHB used as loading buffer; Approach TiO2 b: 80% ACN/6% TFA containing 50 mg/ml DHB used as loading buffer) or Ti4+-IMAC (Approach Ti4+-IMAC a: 80% ACN/6% TFA used as loading buffer; Approach Ti4+-IMAC b: 60% HFP/0.1% TFA used as loading buffer) enrichment. The eluent was subsequently analyzed by LC-MS/MS utilizing both CID and ETD. The early SCX fractions were also analyzed, either directly or after Ti4+-IMAC a enrichment, albeit using CID only.
We further compared the number of unique phosphopeptides recovered from the early SCX fractions and those originating from later SCX fractions after enrichment with the Ti4+-IMAC b approach. We found, as anticipated, little overlap in phosphopeptide identifications between these early and late SCX fractions because of the separation power of SCX. In addition, direct analysis of these four fractions without enrichment resulted in the identification of 1883 unique phosphopeptides originating from 1145 phosphoproteins (supplemental Table S7). When comparing the data set obtained by direct analysis after SCX enrichment and that obtained after subsequent enrichment by Ti4+-IMAC a, it became apparent that the additional enrichment is also beneficial for the early SCX fractions, nearly doubling the number of detected phosphopeptides (Fig. 2B). The results of the direct analysis are largely made possible by the separation of the N-acetylated peptides from the phosphorylated peptides by SCX (12). Interestingly, the enrichment factor without Ti4+-IMAC was close to 90% and with enrichment was close to 100%, according to the results generated by Mascot.
FIG. 2. Venn diagrams displaying the overlap in detected (unique) phosphopeptides for (A) phosphopeptides detected in the late SCX fractions, corresponding to peptides with a positive charge, using the enrichment methods TiO2 a, Ti4+-IMAC a and Ti4+-IMAC b, respectively, and (B) phosphopeptides detected in the early SCX fractions, containing peptides with a net neutral charge, either through direct analysis or after additional enrichment using Ti4+-IMAC a.
ETD Fragmentation Improves Phosphopeptide Identification, in Particular, of Positive Phosphopeptides-All enriched fractions were analyzed by LC-MS/MS with alternating CID and ETD fragmentation. For the Ti4+-IMAC a experiment, CID and ETD contributed to the identification of 2814 and 3690 unique phosphopeptides, respectively (Fig. 3A).
A total of 1755 unique phosphopeptides were commonly identified by both fragmentation techniques, corresponding to an overlap of only 37%. ETD generated more (41%) identifications than CID. When combined, the two dissociation approaches resulted in the identification of 4749 unique phosphopeptides, corresponding to 4814 unique phosphorylation sites. We further investigated the performance of CID and ETD by breaking down the results obtained per individual SCX fraction (Fig. 3B). Our analysis indicates that, for individual fractions, CID and ETD identified differing numbers of phosphopeptides and generated different 'apparent' phosphopeptide specificities, ranging from 10% to 100%. Not only did ETD contribute more identifications, it also resulted in a higher specificity being calculated for positive phosphopeptide enrichment by the Ti4+-IMAC a method. This observation can be rationalized by the known higher efficiency of ETD in fragmenting phosphopeptides with a higher charge, where the higher charge is a product of the presence of multiple basic residues (38,39,45).
FIG. 3. Comparison of the identified "positive phosphopeptides" using CID and ETD as observed in the HeLa lysate Ti4+-IMAC a data set. A, Venn diagram indicating the overlap between the phosphorylated peptides identified by CID and ETD. B, Bar diagram representing the unique phosphopeptides detected by CID and ETD for each SCX fraction analyzed, coupled to the left y axis. Gray and black bars indicate the number of phosphopeptides identified by CID and ETD, respectively. In addition, the line diagrams represent the specificity of identified phosphopeptides per SCX fraction, by CID (gray) and ETD (black), coupled to the right y axis.
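The CID and ETD counts reported above are mutually consistent, which can be checked with simple inclusion-exclusion. The Python sketch below uses only the numbers given in the text; the variable names are arbitrary, and reading the 41% figure as the ETD-only share of all identifications is an interpretation rather than something the text states explicitly.

```python
cid_total = 2814   # unique phosphopeptides identified by CID
etd_total = 3690   # unique phosphopeptides identified by ETD
shared = 1755      # identified by both

union = cid_total + etd_total - shared   # inclusion-exclusion
cid_only = cid_total - shared
etd_only = etd_total - shared


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    print(union)                                  # 4749
    print(pct(shared, union))                     # 37.0 (overlap)
    print(pct(etd_only, union))                   # 40.7 (ETD only)
    print(pct(cid_only, union))                   # 22.3 (CID only)
    print(pct(etd_total - cid_total, cid_total))  # 31.1 (ETD relative to CID)
```

The union reproduces the 4749 combined identifications and the shared fraction reproduces the 37% overlap; the ETD-only share (40.7%) is the quantity closest to the quoted 41%.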

Comparison of Amino Acid Frequency of the Identified Positive Phosphopeptides and Neutral Phosphopeptides-To investigate the amino acid composition of phosphopeptides obtained for each method, we plotted the frequency of amino acids surrounding the phosphorylation site against a pooled data set consisting of all five data sets as background (Table 1) using the freeware program Ice Logo (57). This background pooled data set contained a total of 8924 unique phosphorylated peptides.
Tryptic phosphopeptides have been reported to have an average length of 16 amino acids and also tend to harbor more missed cleavages. It has been reported that on average 1.1 missed cleavages per phosphopeptide are observed, whereas for regular peptides the average is 0.3 (38). To accommodate these trends, sequences of 21 amino acids surrounding all identified phosphorylated S/T/Y residues were analyzed. In Fig. 4 each panel represents a data set from a single experimental approach, plotted against a background data set containing all data sets. Ice Logo plots represent enrichment of amino acids as positive values, whereas underrepresented amino acids occur as negative values, keeping in mind that for these data sets there will be two basic amino acids present in each sequence. Each data set generated a unique frequency pattern for amino acids surrounding the phosphorylation site. We found overrepresentation of the basic amino acids R, K, and H in proximity to phosphorylated residues for the TiO2 b strategy (Fig. 4A). The frequency of arginine at positions P-2, P-3, and P-4 was increased by more than 5% (p < 0.01), and the basic amino acids K and H were also overrepresented. A possible reason for these observations is that the proximity of phosphorylated residues to K/R can hamper the efficiency of trypsin digestion, as suggested by others (38,59,60). For the Ti4+-IMAC b strategy (Fig. 4B), surprisingly, the acidic amino acids aspartic acid and glutamic acid are more frequently overrepresented (p < 0.05) along the peptide backbone. The basic amino acids are marginally overrepresented. Interestingly, proline, which is a common member of several kinase substrate motifs, is markedly underrepresented at position P+1. If a proline is adjacent to a phosphosite that is not close to a basic residue, then trypsin is unhindered, whereas if a proline is adjacent to a lysine or arginine, then trypsin may be hindered.
Our data suggest that prolines are often close to phosphosites and sufficiently far away from basic residues, leading to the generation of a neutral phosphopeptide. The logo generated by the Ti4+-IMAC a approach has a pattern similar to that of the TiO2 b strategy, in which the basic amino acids H, R, and K and the acidic amino acids D and E are overrepresented (Fig. 4C). On closer inspection, the occurrence of basic amino acids is slightly higher than that of acidic amino acids (5% versus 3%). It was also observed that arginines are more frequently enriched at P-2, P-3, and P-4. However, D and E have a higher occurrence in the proximity on the right side (P+1 to P+5) of phosphorylation sites. Strikingly, there is also a clear enrichment of phosphopeptides containing histidines, a pool of peptides that are not a by-product of an incomplete digestion and thus potentially represent a new set of phosphopeptides. However, the neutral phosphopeptide pool generated with Ti4+-IMAC b enrichment produced a different pattern, with the residue proline overrepresented (Fig. 4D). Proline direction is common in phosphorylation motifs and, as expected, proline is enriched at P+1. Finally, both basic residues and acidic residues showed depletion.
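The positional comparison described in this section can be illustrated with a minimal sketch: 21-residue windows are cut around each phosphosite and the per-position residue frequency in one data set is compared with that in a pooled background. This is not the Ice Logo implementation and omits its significance test; the function names, the padding character "X", and the toy sequences are assumptions made for illustration only.

```python
from collections import Counter


def site_windows(protein_sites, flank=10, pad="X"):
    """Yield windows of 2*flank+1 residues centred on each phosphosite.

    protein_sites: iterable of (sequence, zero_based_site_index) pairs.
    Windows running past a sequence end are padded with `pad`.
    """
    for seq, idx in protein_sites:
        left = seq[max(0, idx - flank):idx].rjust(flank, pad)
        right = seq[idx + 1:idx + 1 + flank].ljust(flank, pad)
        yield left + seq[idx] + right


def positional_frequencies(windows):
    """Return one {residue: frequency} dict per window position."""
    windows = list(windows)
    freqs = []
    for pos in range(len(windows[0])):
        counts = Counter(w[pos] for w in windows)
        freqs.append({aa: n / len(windows) for aa, n in counts.items()})
    return freqs


def frequency_difference(foreground, background, residue, offset, flank=10):
    """Percentage-point difference (foreground minus background) for `residue`
    at position P+offset (negative offsets are N-terminal to the site)."""
    pos = flank + offset
    fg = positional_frequencies(foreground)[pos].get(residue, 0.0)
    bg = positional_frequencies(background)[pos].get(residue, 0.0)
    return 100 * (fg - bg)


# Toy, hypothetical sequences; the phosphosite is the S at index 5.
fg = list(site_windows([("GGRRGSGG", 5)]))
bg = list(site_windows([("GGRRGSGG", 5), ("GGAAGSPG", 5)]))
print(len(fg[0]))                               # 21
print(frequency_difference(fg, bg, "R", -3))    # 50.0
print(frequency_difference(fg, bg, "P", +1))    # -50.0
```

Positive values correspond to overrepresentation relative to the background and negative values to underrepresentation, mirroring the sign convention of the logo plots.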
Specific Enrichment of Phosphopeptides Originating From Putative Basophilic Kinase Substrates-The recognition of target substrates by protein kinases typically depends on the primary amino acid sequence proximal to the site that will be modified (3). The identification of kinase-specific substrates can be evaluated using such sequence motifs. Here we performed such a motif-dependent categorization of the detected phosphopeptides, separately for the "positive" and the "neutral" phosphopeptides enriched by Ti4+-IMAC a, using the motif classification tool implemented in MaxQuant (61). In the positive phosphopeptide and neutral phosphopeptide data sets, the algorithm available within MaxQuant assigned a preferred kinase to 1832 and 985 phosphopeptides, respectively. Fig. 5A summarizes the relative contribution of each kinase to the total data set, ignoring kinases that contributed less than 1% to the full data set. Significantly, substrates of different kinases are overrepresented when comparing these two data sets. Most obviously, there is an ~2-fold higher enrichment of peptides originating from putative basophilic kinase substrates such as PKA, PKA/AKT, and AURORA in the positive phosphopeptide data set (Fig. 5B and 5C). Basophilic kinase substrates contributed a mere 11% to the neutral phosphopeptide data set (Fig. 5B and 5C).
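To make the idea of motif-dependent categorization concrete, the sketch below assigns a sequence window to the first matching motif and reports the share of basophilic assignments. The motif table is a deliberately simplified set of textbook-style consensus patterns chosen for illustration; it is an assumption and not the motif list or matching logic used by MaxQuant.

```python
# Simplified, illustrative consensus motifs (assumption, not MaxQuant's list).
# Keys are offsets relative to the phosphosite (0 = site); values are allowed residues.
MOTIFS = {
    "PKA": {-3: "R", -2: "RK", 0: "ST"},          # R-[R/K]-x-[S/T]
    "AKT": {-5: "R", -3: "R", 0: "ST"},           # R-x-R-x-x-[S/T]
    "AURORA": {-2: "RK", 0: "ST", 1: "ILV"},      # [R/K]-x-[S/T]-[I/L/V]
    "PROLINE_DIRECTED": {0: "ST", 1: "P"},        # [S/T]-P
}
BASOPHILIC = {"PKA", "AKT", "AURORA"}


def classify(window: str, centre: int):
    """Return the name of the first motif matching `window`, or None.

    `centre` is the zero-based index of the phosphosite within `window`.
    """
    for name, rule in MOTIFS.items():
        if all(
            0 <= centre + off < len(window) and window[centre + off] in allowed
            for off, allowed in rule.items()
        ):
            return name
    return None


def basophilic_fraction(windows, centre: int) -> float:
    """Fraction of motif-assigned windows that match a basophilic motif."""
    assigned = [m for m in (classify(w, centre) for w in windows) if m is not None]
    return sum(m in BASOPHILIC for m in assigned) / len(assigned)


# Toy, hypothetical windows with the phosphosite at index 5.
toy = ["GGRRGSGG", "GGAAGSPG", "RGRGGTAA", "GGGGGSGG"]
print([classify(w, 5) for w in toy])   # ['PKA', 'PROLINE_DIRECTED', 'AKT', None]
print(basophilic_fraction(toy, 5))     # 2 of the 3 assigned windows are basophilic
```

Because the first matching motif wins, the order of the table acts as a priority list; a real classifier would need an explicit rule for windows that satisfy several motifs.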
An Evaluation of the SCX/Ti4+-IMAC a Strategy Using a Triple Dimethyl Labeled Lysate-To further evaluate the use of the SCX/Ti4+-IMAC a approach for quantitative phosphoproteomics, where complexity is increased because of the introduction of isotope labels, we performed a phosphorylation analysis on a total of 400 μg of dimethyl labeled tryptic digests of MCF-7 cell lysate. A total of 35 fractions were generated using SCX, each of which was subjected to phosphopeptide enrichment via the Ti4+-IMAC a approach. After enrichment, the eluate of each fraction was analyzed on an LTQ-Orbitrap Velos mass spectrometer using either HCD or alternating CID and ETD. The data were filtered to <1% FDR using "Rockerbox" (54), which contains Percolator (55). The data were further filtered to retain only PSMs with a minimum Mascot score of 20. Next, if a phosphosite was identified more than once, we performed two stages of filtering. Initially, the PSMs corresponding to the earliest fraction were kept. Subsequently, the PSM corresponding to the highest score was kept regardless of the activation technique. Supplemental Table S8 contains the final filtered results, summarized in Fig. 6. A total of 9117 unique phosphorylation sites on 9678 unique phosphopeptides were identified (supplemental Table S8). Strikingly, 5300 unique phosphorylation sites (58% of the total) were identified from the later SCX fractions (28-40), corresponding to the positive phosphopeptides. The filtering applied means that any site observed in an earlier fraction would be the only site reported. Thus, if we observed the same site in two distinct peptides, the peptide observed in the later fraction would be removed regardless of its score. A side effect of this filtering strategy is that it discriminates against phosphopeptides containing missed cleavages.
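The score and redundancy filtering described above can be sketched as follows. The FDR step performed by Rockerbox/Percolator is not reproduced; the `PSM` record, its field names, and the example values are hypothetical and serve only to show the order of the rules (minimum score, then earliest fraction, then highest score).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    site: str         # hypothetical site identifier, e.g. "PROT1_S15"
    peptide: str
    fraction: int     # SCX fraction number
    score: float      # Mascot score
    activation: str   # "CID", "HCD" or "ETD"


def filter_psms(psms, min_score=20.0):
    """Keep one PSM per phosphosite.

    1. Discard PSMs scoring below `min_score`.
    2. Per site, prefer the earliest SCX fraction.
    3. Within that fraction, prefer the highest score, whatever the activation.
    """
    best = {}
    for psm in psms:
        if psm.score < min_score:
            continue
        current = best.get(psm.site)
        key = (psm.fraction, -psm.score)
        if current is None or key < (current.fraction, -current.score):
            best[psm.site] = psm
    return best


psms = [
    PSM("PROT1_S15", "RRASVAK", 30, 50.0, "ETD"),   # later fraction, missed cleavages
    PSM("PROT1_S15", "ASVAK", 12, 25.0, "CID"),
    PSM("PROT1_S15", "ASVAK", 12, 40.0, "HCD"),
    PSM("PROT1_S15", "ASVAK", 5, 15.0, "CID"),      # below the score threshold
    PSM("PROT2_T7", "GGTPEK", 8, 10.0, "CID"),      # below the score threshold
]
kept = filter_psms(psms)
print(kept["PROT1_S15"].fraction, kept["PROT1_S15"].activation)  # 12 HCD
print("PROT2_T7" in kept)                                        # False
```

In the toy input the highest-scoring PSM for the site comes from fraction 30, yet it is dropped in favor of the fraction-12 identification, which illustrates the stated bias against longer, missed-cleavage peptides eluting late in SCX.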
The contribution of each activation technique to the unique phosphopeptide identifications per fraction is shown in Fig. 6B. As we have recently reported for regular peptides (53), HCD and CID were most effective for the earlier SCX fractions, which contain peptides of lower net charge state, such as those carrying a 2+ charge. For the late SCX fractions, particularly fractions 34-40, ETD outperformed HCD and CID, contributing 76% of the phosphopeptide identifications.

DISCUSSION
In recent years the combination of SCX and metal chelating enrichment strategies has enabled the large scale analysis of phosphoproteomes, identifying several thousands of phosphorylation sites (3-5, 38, 39). Notwithstanding this success, several issues hampering comprehensive analysis have remained relatively unaddressed. First of all, different affinity-based enrichment technologies have inherent biases, often dependent on certain physicochemical properties of the diverse pool of phosphorylated peptides. Another issue is the relatively poor enrichment of phosphopeptides containing multiple basic residues. Poor binding of positive phosphopeptides was suggested to be caused by the intra-peptide interaction between the basic and phosphorylated residues (31), which hinders the ability to coordinate with the affinity material. For instance, Klemm and coworkers found drastically reduced recovery for synthetic phosphopeptides with multiple basic amino acids using TiO2 enrichment (29). Comparative evaluation of the recovery of a basic tyrosine-phosphorylated angiotensin II peptide by Iliuk and coworkers yielded only 25% recovery from a Fe3+-IMAC resin and 50% after TiO2 enrichment (32). The problem in enriching these phosphopeptides containing multiple basic residues is further exacerbated because they typically require the most stringent enrichment, as in SCX they co-elute with the bulk of the regular tryptic peptide population. Importantly, many basophilic kinases such as those in the AGC kinase family (e.g. PKA, PKG, Aurora) are to a great extent directed by flanking basic amino acids. The current lack of a solution to enrich these basic phosphopeptides can restrict the profiling of phosphorylation sites, in particular those related to these cellular pathways. Earlier, Barnouin et al. proposed the use of the perfluorinated solvent HFP to improve the enrichment of basic phosphopeptides by Fe3+-IMAC (31). However, a systematic evaluation of the effect of HFP on the enrichment of phosphopeptides containing multiple basic residues at the proteome level was not performed. A recently introduced enrichment technology known as Ti4+-IMAC was shown to have a higher inherent specificity for phosphopeptide enrichment using solely a high level of TFA and without the need for additives, in contrast to the case of TiO2 (21). In this study, we sought to evaluate Ti4+-IMAC and TiO2 in conjunction with the perfluorinated solvent (HFP) and/or TFA for improving the enrichment of positive phosphopeptides eluting in the late SCX fractions. Our results showed that Ti4+-IMAC a (containing 6% TFA) is capable of achieving both the largest number of identifications of positive phosphopeptides and the highest specificity (63%) of enrichment for the late SCX fractions compared with Ti4+-IMAC b (38%) and TiO2 b (46%). Interestingly, TiO2 b (containing 6% TFA) outperformed TiO2 a (the classical DHB centric approach). To investigate the characteristics of the phosphopeptides identified by the different approaches and from different phosphopeptide populations, we initially performed a pI analysis for all data sets; however, each data set revealed a similar pI profile (supplemental Fig. S1), reflecting patterns similar to those of others who also applied SCX/TiO2 for phosphoproteomics (5,27). The lack of insight generated by the pI plots led us to investigate amino acid frequency graphs to distinguish the characteristics of each data set.

FIG. 5. A, The bar chart shows the classification and distribution of substrates of different kinases found in the "positive phosphopeptide" pool (black bars), whereas the gray bars indicate those found in the "neutral" phosphopeptide pool. B and C, The pie charts show the distributions of basophilic (black) and other kinases (gray) in the "positive" and "neutral" phosphopeptide data sets, respectively.
In the TiO2 b approach, the basic amino acids R, K, and H were found to be overrepresented to varying degrees (Fig. 4A). In the Ti4+-IMAC b data set, the acidic amino acids D and E were more frequently enriched than the basic amino acids R, K, and H (Fig. 4B). One possible explanation is that HFP stretches the peptide backbone by breaking the intra-peptide ionic interaction between the negatively charged phosphate group and positively charged basic residues (31), such that the phosphate group can be more easily accessed by Ti4+-IMAC. As a result, the chance of affinity interaction of acidic amino acids with Ti4+-IMAC is similarly increased. Apart from this, acidic residues such as aspartic acid and glutamic acid adjacent to tryptic cleavage sites can reduce the efficiency of trypsin digestion (38,59,60), resulting in missed cleavages that increase the net charge of peptides. This also explains our observation that the Ti4+-IMAC b data set has the highest proportion of phosphopeptides with at least one missed cleavage (62.1%) compared with Ti4+-IMAC a (55.1%) and TiO2 b (52.4%). The basic amino acids R, K, and H as well as the acidic amino acids D and E are overrepresented in the pool after Ti4+-IMAC a enrichment (Fig. 4C). Among the three methods, Ti4+-IMAC a appears to contain to a large degree all populations enriched by the use of HFP and TiO2, thus displaying the least bias. As mentioned earlier, the high content of TFA (6%) itself can reduce the level of intra-peptide bonding between positively charged basic residues and the negative phosphate group (33) and aid coordination of the phosphate group to Ti4+. The high level of TFA also aided the enrichment levels achieved by TiO2, but not to the same extent.
The superior results for positive phosphopeptides by Ti4+-IMAC may be related to the use of a phosphate scaffold to immobilize the Ti4+ metal ion, which is more resilient to the high levels of TFA and may also restrict the type of ligand that can coordinate and allow phosphate groups to be favored.
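The proportion of phosphopeptides with at least one missed cleavage, as compared across methods above, can be computed with a simple counting rule. The sketch below assumes the common convention that trypsin cleaves C-terminal to K or R unless the next residue is proline; it ignores any effect of nearby phosphorylation or acidic residues, and the example peptides are hypothetical.

```python
def missed_cleavages(peptide: str) -> int:
    """Count internal K/R residues that are not followed by P.

    The C-terminal residue is ignored because it is the cleavage site
    that produced the peptide.
    """
    return sum(
        1
        for i, aa in enumerate(peptide[:-1])
        if aa in "KR" and peptide[i + 1] != "P"
    )


def fraction_with_missed_cleavage(peptides) -> float:
    """Fraction of peptides carrying at least one missed cleavage."""
    peptides = list(peptides)
    return sum(missed_cleavages(p) >= 1 for p in peptides) / len(peptides)


# Hypothetical tryptic peptides.
examples = ["AAKSPEPTIDER", "AAKPSR", "RRASVAK", "ASVAEEK"]
print([missed_cleavages(p) for p in examples])    # [1, 0, 2, 0]
print(fraction_with_missed_cleavage(examples))    # 0.5
```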
Identifying specific substrates of kinases is essential for constructing phosphorylation networks and understanding signal transduction in complex biological systems (62). Most interestingly, our data set (Fig. 5) shows that positive phosphopeptides recovered from the late SCX fractions contain relatively higher occurrences of phosphopeptides originating from putative basophilic kinase substrates (63). Overall, the relative contribution of basophilic kinase substrate phosphopeptides was ~25% in the positive phosphopeptide data set compared with only 11% in the neutral phosphopeptide data set. In the positive phosphopeptides we detected the majority of putative PKA, AKT, and AURORA substrates. These findings suggest that in SCX based large scale phosphoproteomics analysis, the late SCX fractions, which require a high level of phosphopeptide enrichment, cannot be ignored, especially when one is interested, for instance, in signaling involving cAMP/PKA (64), cGMP/PKG, AKT, or Aurora. A bias toward analyzing only the early SCX fractions, which is quite common in current phosphoproteomic analysis, will negatively influence the outcome of such analysis.
For the systematic evaluation, all enriched positive phosphopeptides were analyzed by alternating CID and ETD, and we also performed HCD, ETD, and CID for the triple dimethyl labeled sample. Using these data sets we were able to assess whether HCD, CID, or ETD is better suited for phosphopeptides containing multiple basic residues. Our data revealed that, in particular for the late SCX fractions, ETD boosted the number of phosphopeptide identifications significantly (Fig. 3A, Fig. 6), in line with previous reports (38,39,45,65,66).

CONCLUSIONS
The data presented here indicate that a combination of SCX fractionation and Ti4+-IMAC based enrichment provides one of the most comprehensive phosphopeptide enrichment and analysis technologies to date, further boosting the number of phosphopeptide identifications in large scale phosphoproteomics analyses. The Ti4+-IMAC method enhances in particular the recovery of positive phosphopeptides, containing multiple basic residues, but also performed well in neutral phosphopeptide enrichment. We exemplify our strategy by identifying over 9000 phosphorylation sites from a single experiment consisting of a triple dimethyl labeled lysate. Therefore, we conclude that the Ti4+-IMAC based enrichment technology may be an extremely valuable tool in the ever expanding field of phosphoproteomics.