Stable Isotope Labeling of Phosphoproteins for Large-scale Phosphorylation Rate Determination

Signals that control responses to stimuli and cellular function are transmitted through the dynamic phosphorylation of thousands of proteins by protein kinases. Many techniques have been developed to study phosphorylation dynamics, including several mass spectrometry (MS)-based methods. Over the past few decades, substantial developments have been made in MS techniques for the large-scale identification of proteins and their post-translational modifications. Nevertheless, all of the current MS-based techniques for quantifying protein phosphorylation dynamics rely on the measurement of changes in peptide abundance levels, and many methods suffer from low confidence in phosphopeptide identification due to poor fragmentation. Here we have optimized an approach for the stable isotope labeling of amino acids by phosphate using [γ-18O4]ATP in nucleo to determine global site-specific phosphorylation rates. The advantages of this metabolic labeling technique are increased confidence in phosphorylated peptide identification, direct labeling of phosphorylation sites, measurement of phosphorylation rates, and the identification of actively phosphorylated sites in a cell-like environment. In this study we calculated approximate rate constants for over 1,000 phosphorylation sites based on labeling progress curves. We measured a wide range of phosphorylation rate constants from 0.34 min−1 to 0.001 min−1. Finally, we applied stable isotope labeling of amino acids by phosphate to identify sites that have different phosphorylation kinetics during G1/S and M phase. We found that most sites had very similar phosphorylation rates under both conditions; however, a small subset of sites on proteins involved in the mitotic spindle were more actively phosphorylated during M phase, whereas proteins involved in DNA replication and transcription were more actively phosphorylated during G1/S phase. The data have been deposited to the ProteomeXchange with the identifier PXD000680.

Protein phosphorylation is crucial for modulating protein structure, protein localization, and the protein-protein interactions that form the basis of many cell-signaling networks. Phosphorylation-based signaling often takes the form of a cascade in which sequential protein phosphorylations lead to changes in protein stability, function, and localization. Protein kinases, the enzymes that propagate these signals, catalyze the transfer of the γ-phosphate from ATP onto serine, threonine, or tyrosine residues of substrate proteins. The sites of protein phosphorylation and phosphorylation dynamics are important in determining the biological outcome of a signaling event (1). For instance, protein phosphorylation drives many of the changes during the cell cycle (2,3). During mitosis, kinases are activated at precise times to direct the course of chromosome segregation and cell division. For example, CDK1 activation at the beginning of mitosis leads to phosphorylation of NUP98 during prophase, which in turn promotes nuclear envelope disassembly (4). Additionally, an increased protein phosphorylation rate combined with constitutive activation of signaling networks due to hyperactivated kinases is considered a hallmark of cancer (5,6). Because the rate of substrate phosphorylation is a straightforward readout of kinase activity, there is growing interest in measuring phosphorylation rates in order to better understand phosphorylation-based signaling networks and potentially design more effective cancer treatments (7,8).
Many techniques have been developed to study phosphorylation-based signaling dynamics. Some of the most commonly applied techniques for following changes in phosphorylation levels are the use of site-specific antibodies to probe phosphorylated proteins from cell extracts and quantitative mass spectrometry methods such as stable isotope labeling of amino acids in cell culture to identify and quantify phosphorylated peptides (9-11). In comparison with antibody-based methods, quantitative mass spectrometry techniques have the added advantage that thousands of phosphorylation site changes can be measured in a single experiment (9). Both of these techniques provide useful information on whether the total amount of phosphorylated protein is increasing or decreasing over time; however, they do not directly measure new phosphorylation events or phosphorylation rates (12). MS techniques and fluorescence techniques have been developed to measure phosphorylation rates on synthetic peptides with known kinase consensus motifs in cell lysates (13,14). These techniques provide a readout of kinase activity in cell lysates under different biological conditions. Nonetheless, the use of peptides rather than intact endogenous protein might not reflect actual in vivo phosphorylation rates because of the loss of sequence context. In addition, there is often a loss of kinase specificity in peptide-based assays because the intact protein may contain additional kinase docking sites or be part of a larger protein complex (15).
Approaches that directly label protein phosphorylation sites for detection by mass spectrometry, using chemical methods or ATP analogs such as ATPγS, have been reported previously. Thiophosphate approaches have been successfully used in combination with engineered kinases to directly label kinase substrates (16,17). However, most endogenous kinases utilize ATP more efficiently than ATPγS, and thus these reactions do not give an accurate picture of in vivo kinase activity (18,19). Another approach is to use radioactively labeled 32P-ATP or 32Pi to directly label and measure protein phosphorylation rates (20). 32P labeling is highly specific and sensitive. It has been used in the past in combination with mass spectrometry and Edman sequencing (21,22) to identify phosphorylation sites; however, extra precautions need to be taken with radioisotopes, and radiolabeled samples cannot be stored for long because of the short half-life of 32P.
We have developed a quantitative mass spectrometry technique using stable-isotope-labeled [γ-18O4]ATP to directly label phosphorylation sites and quantify phosphorylation changes over time on hundreds of native proteins in nucleo. Stable isotope labeling of phosphate by [γ-18O4]ATP (SILAP) can be used to specifically label phosphorylation sites without a radioisotope, and it has the added advantage that many phosphorylation sites can be identified and quantified at once using mass spectrometry. [γ-18O4]ATP labeling has previously been used in reactions with purified kinases to increase the confidence in phosphorylation site assignments and to characterize different kinase inhibitors (23,24). These studies demonstrated that [γ-18O4]ATP is stable and has properties very similar to those of ATP with natural isotope abundances. In the experiments described here, we demonstrated that [γ-18O4]ATP can be used to label and confidently identify over 1,000 phosphorylation sites in a single experiment in isolated nuclei, to measure the phosphorylation rates for proteins in asynchronous and G1/S and M phase synchronized cell nuclei, and to subsequently determine the most actively phosphorylated sites under these conditions.
EXPERIMENTAL PROCEDURES
HEK293 Cell Culture and Cell Cycle Synchronization-Human embryonic kidney (HEK293) cells were maintained in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% newborn calf serum (Invitrogen) and penicillin-streptomycin solution diluted 1:100 (10,000 units penicillin G and 10 mg streptomycin per milliliter) (Fisher) until they reached 80% confluency.
HeLa cells were maintained in Joklik's modified Eagle's medium supplemented with 10% newborn calf serum (Invitrogen) and penicillin-streptomycin solution diluted 1:100 (10,000 units penicillin G and 10 mg streptomycin per milliliter) (Fisher). Cells at 3 × 10^5 confluency were synchronized by a double thymidine block. 1.5 × 10^8 cells were collected immediately after the block. The remaining cells were released for 4 h and finally transferred into media containing 100 ng/ml nocodazole for 4 h. These cells were mitotic at harvesting. The efficiency of cell synchronization was analyzed via propidium iodide staining of fixed cells using flow cytometry analysis. In addition to the cells kept for cell cycle analysis, the double thymidine blocked cells and nocodazole blocked cells were split into two samples for two technical replicates of the stable isotope labeling time course experiments.
In Nucleo Kinase Assay-HEK293 or cell-cycle-synchronized HeLa cells were washed three times with ice-cold PBS and were then lysed in ice-cold hypotonic lysis buffer (10 mM KCl, 1.5 mM MgCl2, 10 mM HEPES-KOH pH 7.5, 1× HALT protease and phosphatase inhibitor mixture). The extent of lysis was monitored using trypan blue staining. Intact nuclei were collected by centrifugation and re-suspended in 37°C ATP reaction buffer (35 mM NaCl, 10 mM KCl, 5 mM MgCl2, 2 μM CaCl2, 10 mM Tris-HCl pH 7.5) with 1× HALT and 5 mM [γ-18O4]ATP (97% purity; Cambridge Isotope Laboratories, Tewksbury, MA). We used 5 mM [γ-18O4]ATP in our assay to match intracellular ATP concentrations (1-5 mM ATP) (25). The nuclei were incubated in the [γ-18O4]ATP-containing buffer in a 37°C water bath, and aliquots were collected after 5, 15, 30, 60, 120, and 240 min for HEK293 cells or after 0, 5, 10, 20, 40, 80, and 160 min for synchronized HeLa cells. After collection, the samples were immediately denatured using five volumes of urea buffer (9 mM urea, 10 mM Tris-HCl pH 8) and then snap-frozen. After all of the time points were collected, the samples were sonicated to break apart the chromatin, derivatized with dithiothreitol and iodoacetamide, and digested overnight with trypsin as previously described (26).
Phosphopeptide Enrichment-After trypsin digestion, the peptides were desalted using Sep-Pak tC18 columns (Waters, Milford, MA) and then lyophilized. Phosphopeptides were enriched using titanium dioxide (27,28). Lyophilized peptides were re-suspended in loading buffer (2 M lactic acid, 50% acetonitrile (ACN)). Titanium dioxide beads (GL Sciences, Tokyo, Japan) were mixed with the peptides at a ratio of 4 mg of beads to 1 mg of peptide and rotated at room temperature for 30 min. The beads were rinsed once with loading buffer and twice with wash buffer (65% ACN, 0.1% TFA). The phosphopeptides were then eluted in a basic elution buffer (50% ACN, 50 mM KH2PO4, adjusted to pH 10 with ammonium hydroxide), immediately dried, desalted using reversed-phase C18 stop-and-go extraction tips (29), and stored in a −80°C freezer until analysis.
Metabolite Extraction and Analysis-Metabolites were extracted as detailed previously (30). Briefly, nuclei were pelleted at 600 × g and the supernatant was removed. Then 2 ml of 80% methanol (−80°C) was added and the samples were stored on dry ice for 15 min. The supernatant was collected and the metabolites were re-extracted with 2 ml of 80% methanol (−80°C). The samples were dried under nitrogen and analyzed using liquid chromatography coupled to an Exactive Orbitrap (Thermo Scientific) as previously described (31).
Nano Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometry-We analyzed the phosphopeptide-enriched samples and the flow-through samples on an LTQ-Orbitrap XL (HEK293 cells) or an LTQ-Orbitrap Elite and Q-Exactive (synchronized HeLa samples) mass spectrometer (Thermo Scientific) attached to an Eksigent AS2 autosampler and an Eksigent Nano-LC Ultra 2D Plus system run at 250 nL/min. The samples were loaded on a pulled-tip fused silica column with a 100-μm inner diameter packed in-house with 12 cm of 3-μm C18 resin (Reprosil-Pur C18-AQ) that served both as a resolving column and as a nanospray ionization emitter. Phosphopeptides were resolved on a gradient from 0% ACN to 30% ACN in 0.5 mM acetic acid over 115 min. Flow-through peptides were resolved on a two-step gradient from 5% ACN to 40% ACN over 100 min, then from 40% ACN to 60% ACN over 25 min, in 0.5 mM acetic acid.
The mass spectrometers were operated in the data-dependent mode with dynamic exclusion enabled (repeat count, 1; exclusion duration, 0.5 min). For the LTQ-Orbitrap XL MS, every cycle we collected one full MS scan (m/z 350 to 1650) at a resolution of 30,000 at an AGC target value of 7 × 10^6, and then nine MS2 scans of the most intense peptide ions using collisionally activated dissociation (normalized collision energy = 40%, isolation width = 3 m/z) at an AGC target value of 3 × 10^4. For the LTQ-Orbitrap Elite MS, every cycle we collected one full MS scan (m/z 350 to 1650) at a resolution of 60,000 at an AGC target value of 1 × 10^6, and then 15 MS2 scans of the most intense peptide ions using collisionally activated dissociation (normalized collision energy = 35%, isolation width = 2 m/z) at an AGC target value of 1 × 10^4 or higher-energy collisional dissociation (normalized collision energy = 36%, isolation width = 2 m/z, resolution = 15,000) at an AGC target value of 5 × 10^4. Ions with a charge state of 1 and a rejection list of common contaminant ions (exclusion width = 10 ppm) were excluded from the analysis.
Data Analysis and Bioinformatics-The data were searched using Mascot (version 2.2.07, Matrix Science, London, UK) and X!Tandem (version 2013.02.01.1, The GPM, Alberta, Canada) against the human UniProt database (canonical and isoform sequence data retrieved March 16, 2012; 140,795 sequences), using a mass tolerance of 20 ppm for precursor ions and 0.5 Da for fragment ions. Serine, threonine, and tyrosine phosphorylation; methionine oxidation; and N-terminal acetylation were set as variable modifications, and cysteine carbamidomethylation was set as a fixed modification. Up to two missed cleavages were allowed for a trypsin digest search. Scaffold (version 4.0.5, Proteome Software Inc., Portland, OR) was used to validate MS/MS-based peptide and protein identifications. All search results were loaded into Scaffold in a MudPIT-type setup. Peptide identifications were accepted at over 95% probability and protein identifications were accepted at over 99% probability according to the PeptideProphet and ProteinProphet algorithms (32). The filtered peptide identifications from Mascot and X!Tandem were loaded into in-house-developed software that used a naïve Bayes model of coeluting peptides identified in each run to predict retention times of peptides that were not identified in all runs. This is especially important for time course analysis because it fills in time point measurements from runs containing essentially the same peptides but different sets of MS2 identifications. Phosphorylated peptides from the database searches and our in-house-developed software were loaded into Quantitator, which models the isotope distribution for each peptide based on its chemical formula in order to calculate the expected intensity distributions for relevant isotopic labeling states.
Isotopic labeling states were defined based on their shared isotopic perturbation, namely, "light" phosphorylated peptides with no incorporated 18O versus "heavy" phosphorylated peptides with between one and three incorporated 18O atoms. The signal intensities of the raw isotopologue peaks were partitioned between the isotopic labeling states using linear regression. Finally, we determined phosphorylation site localization scores using AScore (33) and used a score cutoff of 20 for assigning confident phosphorylation sites. We used the maximum phosphorylation site localization score to filter peptides identified by more than one MS2 spectrum. Processed and raw data have been made available in the ProteomeXchange Consortium (34) with the dataset identifier PXD000680 and DOI 10.6019/PXD000680.
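The partitioning step can be illustrated with a small least-squares sketch. This is not Quantitator's implementation, which is not described in detail here; it assumes that each labeling state contributes a copy of the peptide's theoretical isotope envelope, offset by a nominal two isotopologue peaks per incorporated 18O, and that the observed peaks are a linear combination of those copies. The template values, the offsets, and all function names are hypothetical.

```python
def shifted(template, shift, length):
    """Place the isotope envelope `template` at peak offset `shift`."""
    column = [0.0] * length
    for i, value in enumerate(template):
        column[i + shift] = value
    return column


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= factor * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def partition_states(observed, template, shifts=(0, 2, 4, 6)):
    """Least-squares split of isotopologue peaks into light and heavy signal.

    shifts: assumed peak offsets for 0, 1, 2, and 3 incorporated 18O atoms.
    Returns (light, heavy), where heavy sums the one- to three-18O states.
    """
    length = len(observed)
    cols = [shifted(template, s, length) for s in shifts]
    # Normal equations (A^T A) x = A^T y for the design matrix of envelopes.
    ata = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    aty = [sum(p * y for p, y in zip(c, observed)) for c in cols]
    coeffs = solve_linear(ata, aty)
    return coeffs[0], sum(coeffs[1:])


# Synthetic example: hypothetical envelope and known state intensities.
template = [0.55, 0.30, 0.11, 0.04]
truth = {0: 100.0, 2: 10.0, 4: 5.0, 6: 40.0}
observed = [0.0] * 10
for shift, amount in truth.items():
    for i, value in enumerate(template):
        observed[i + shift] += amount * value

light, heavy = partition_states(observed, template)
print(round(light, 3), round(heavy, 3))
```

Summing the three heavy coefficients mirrors the definition of the "heavy" state above as any phosphopeptide carrying one to three 18O atoms.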
A signal-to-noise cutoff of 10 (where the signal was the fitted intensity for the light isotopic labeling state and the noise was the standard error of the fitted intensity) was used to filter unlabeled phosphopeptides for analysis. This ensured that the signal was high enough to be used to confidently measure the ratio of unlabeled to labeled phosphopeptide. Phosphorylation progress curves (increase in heavy labeled/light labeled phosphopeptide over time) were fit using the nonlinear least squares "SSasympOrig" function in the R 2.15.0 environment.
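For readers without R, the curve fit can be sketched in plain Python. The sketch assumes the asymptotic-through-the-origin form that R's SSasympOrig parameterizes, y = A(1 − exp(−kt)), with y the heavy/light ratio, A the plateau, and k the rate constant. It is a stand-in for the nls fit, not a reproduction of it: because the model is linear in A for a fixed k, it searches k on a log grid and solves for A in closed form. The function name, grid bounds, and synthetic data are assumptions.

```python
import math


def fit_progress_curve(times, ratios, k_min=1e-4, k_max=1.0, n_grid=2000):
    """Least-squares fit of ratios ~ A * (1 - exp(-k * t)).

    For each candidate k the optimal plateau A is a one-parameter linear
    least-squares solution, so only k is searched (on a logarithmic grid).
    Returns (k, A).
    """
    best = None
    for i in range(n_grid):
        k = k_min * (k_max / k_min) ** (i / (n_grid - 1))
        basis = [1.0 - math.exp(-k * t) for t in times]
        denom = sum(b * b for b in basis)
        if denom == 0.0:
            continue
        a = sum(b * y for b, y in zip(basis, ratios)) / denom
        sse = sum((y - a * b) ** 2 for b, y in zip(basis, ratios))
        if best is None or sse < best[2]:
            best = (k, a, sse)
    if best is None:
        raise ValueError("need at least one time point with t > 0")
    return best[0], best[1]


# Synthetic, noise-free curve sampled at the HeLa time points (minutes).
times = [0, 5, 10, 20, 40, 80, 160]
true_k, true_plateau = 0.036, 2.6
ratios = [true_plateau * (1.0 - math.exp(-true_k * t)) for t in times]

k_hat, plateau_hat = fit_progress_curve(times, ratios)
print(f"k = {k_hat:.4f} per min, plateau = {plateau_hat:.3f}")
```

With real data, the signal-to-noise filter on the light state would be applied before calling the fit.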

Incorporation of 18O into Phosphorylated Proteins in Nucleo-To test the viability of SILAP as a labeling technique, we incubated asynchronous HEK293 nuclei with [γ-18O4]ATP for varying amounts of time prior to protein digestion, phosphopeptide enrichment, and LC-MS/MS analysis (Fig. 1A). We performed the [γ-18O4]ATP labeling experiments in nucleo because isolated nuclei maintain structural integrity and intranuclear protein concentrations while being permeable to molecules that cannot cross the plasma membrane, such as ATP (35). We expected that the endogenous kinases in the isolated nuclei would catalyze the transfer of P18O3 from [γ-18O4]ATP onto the hydroxyl group of substrate residues, producing P18O3-labeled phosphorylation sites that were 6 Da heavier than the unlabeled sites, as well as [β-18O1]ADP (Fig. 1B). Labeling with [γ-18O4]ATP created the expected 6.012-Da shift in the MS1 between new "heavy" P18O3-labeled phosphorylation sites and preexisting "light" phosphorylation sites (Fig. 2A). The heavy and light labeled phosphorylated peptides co-eluted, making the identification and relative quantification of labeled species straightforward (Fig. 2B). We confirmed that P18O3 labeling of phosphorylation sites caused the 6-Da shift by comparing tandem mass spectra for labeled peptides and their unlabeled counterparts. As an example, the phosphorylated b (b3, b7, b8, b9, b15, b16) and y (y16, y17) ions from the MS2 of the heavy-labeled AHNAK S216 peptide had a 6-Da shift in mass relative to an MS2 spectrum for the light species (Fig. 2C). The identification of phosphorylation sites by means of collisionally activated dissociation tandem mass spectrometry is notoriously difficult because of poorer fragmentation, so the further confirmation of heavy phosphorylation at the MS1 level and the residue-specific information at the MS2 level are added advantages of using this method.
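The size of the expected shift follows from the mass difference between 18O and 16O. The short sketch below works through that arithmetic for one, two, or three incorporated heavy oxygens and for multiply charged ions; the monoisotopic masses are standard reference values, and the function names are illustrative.

```python
# Monoisotopic masses in daltons (standard reference values).
MASS_O16 = 15.99491462
MASS_O18 = 17.99915961


def heavy_phosphate_shift(n_heavy_oxygens: int = 3) -> float:
    """Mass added when n oxygens of a transferred phosphoryl group are 18O."""
    if not 0 <= n_heavy_oxygens <= 3:
        raise ValueError("a transferred phosphoryl group carries three oxygens")
    return n_heavy_oxygens * (MASS_O18 - MASS_O16)


def mz_shift(n_heavy_oxygens: int, charge: int) -> float:
    """Spacing between light and heavy peaks in m/z for a given charge state."""
    return heavy_phosphate_shift(n_heavy_oxygens) / charge


for n in (1, 2, 3):
    print(f"{n} x 18O: +{heavy_phosphate_shift(n):.4f} Da")
print(f"3 x 18O at charge 2+: +{mz_shift(3, 2):.4f} m/z")
```

For a doubly charged peptide the fully labeled species therefore appears about 3 m/z units above the light species in the MS1 spectrum.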
In addition to the expected complete P(18O3) labeling, 53% of all phosphorylation sites were also labeled with one heavy isotope of oxygen, P(18O1 16O2), and, to a lesser extent, with two heavy oxygens, P(18O2 16O1). The extent of this intermediate labeling increased over time (supplemental Fig. S1). Partially labeled ATP is one potential source for the P(18 ... S2). Previous in vitro kinase assays using purified kinases and synthetic peptide substrates have shown that [γ-18O4]ATP is stable in reaction buffer over several hours (23,36). However, other experiments have demonstrated that the exchange of heavy oxygens for light on ATP can occur in the presence of certain enzymes and metabolites, so it is likely that other metabolic processes in nucleo caused the exchange of [γ-18O4]ATP oxygens in our experiments (36-38). Regardless of the source of partially labeled ATP, the phosphorylation sites that were labeled with one or two heavy oxygens were still "newly" labeled and could therefore be used in our analyses.
After incubating the isolated asynchronous HEK293 nuclei in [γ-18O4]ATP buffer for four hours, we detected 633 unique phosphorylation sites labeled with heavy phosphate on serine, threonine, and tyrosine residues at a signal-to-noise cutoff for labeled peptides of 5 (detected across at least two time points), representing 21% of identified phosphorylated peptides across three biological replicates (supplemental Tables S1 to S3). Over half of the identified phosphorylated peptides were not labeled, indicating two pools of phosphorylation sites in our assay: those that were actively phosphorylated (heavy labeled) and those that were not (light labeled). These results demonstrate that SILAP can be used to label phosphorylation sites on a large scale and identify phosphorylation sites with active kinases under different experimental conditions, such as the cell cycle.
In order to determine whether a functional subset of proteins were actively being phosphorylated in asynchronous cell nuclei, we used DAVID to determine enriched GO terms in the proteins with labeled phosphorylation sites relative to all identified proteins by means of functional annotation clustering (39). DAVID created five clusters of enriched GO terms sharing similar biological meanings for asynchronous cells and enrichment p values less than 0.05 with the following representative identifiers: chromosome (enrichment score: 3.1), RNA recognition motif (enrichment score: 2.6), transcription (enrichment score: 1.9), and negative regulation of RNA metabolic process (enrichment score: 1.6). These overrepresented compositional and functional classifications represent major structures and processes in the nucleus and reflect identifications from other large-scale analyses of the nuclear phosphoproteome (40). We also used iGPS to identify potential kinase-substrate relationships among the labeled phosphorylation sites (41). Over 30% of the labeled sites had the CDK consensus motif, and 10% were mapped directly to CDC2 (supplemental Fig. S3).
Phosphorylation Kinetics Modeling-We estimated labeling rate constants for phosphorylation sites by fitting the amount of "new" heavy-labeled phosphorylation over time (supplemental Table S4). Before fitting the data we filtered and used only peptides that had measurements across five or more time points. We plotted phosphorylation progress curves for these peptides and used a nonlinear least squares approximation to fit them to the following equation and estimate the rate constant (k) for "new" phosphorylation as well as the extent of "new" phosphorylation (H∞/L):
$$\frac{H}{L} = \frac{H_\infty}{L}\left(1 - e^{-kt}\right) \qquad \text{(Eq. 1)}$$
In Eq. 1, H is the total extracted ion current for heavy-labeled phosphorylated peptides (including species with one 18O, two 18O, and three 18O), H∞ is the maximum amount of phosphopeptide labeling, L is the extracted ion current of unlabeled phosphorylated peptides, and t is the labeling time. We were able to normalize against the amount of light phosphorylation because the addition of protease and phosphatase inhibitors prevented changes in the levels of species that were already phosphorylated before the start of our experiments. Previous [γ-18 ... (24,42). Thus, the rate constant is reflective only of the on rate for new phosphorylation sites.
Using SILAP, we found that there was a wide range of basal protein phosphorylation rates in nucleo. These differences in rate are evident from the progress curves plotted in Fig. 3 (a complete list of identified sites and their rate constants is provided in supplemental Table S4). For example, S1163 on Myb binding protein 1A (k = 0.018 min−1) and cyclin dependent kinase 1 Y15 (k = 0.022 min−1) are phosphorylated at a slower rate than S824 on transcription intermediary factor 1β (k = 0.036 min−1) (Fig. 3). Even phosphorylation sites on the same protein showed distinct phosphorylation rates. For example, S824 on transcription intermediary factor 1β (k = 0.036 min−1) was labeled faster than S473 on the same protein (k = 0.012 min−1). This result is in alignment with recent studies on transcription intermediary factor 1β phosphorylation finding that CHK2 kinase phosphorylates S473 at a slower rate than ataxia telangiectasia mutated phosphorylates S824 in response to DNA double strand breaks (38,39).
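One way to make these rate constants more intuitive is to convert them to labeling half-times. The sketch below assumes the first-order approach to a plateau, H/L = (H∞/L)(1 − exp(−kt)), so that the half-time is ln 2 / k. The rate constants are the ones quoted above; the function names are illustrative.

```python
import math


def half_time(k_per_min: float) -> float:
    """Time (min) to reach half of the plateau under first-order labeling."""
    if k_per_min <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_per_min


def fraction_of_plateau(k_per_min: float, t_min: float) -> float:
    """Fraction of the plateau H_inf/L reached after t_min minutes."""
    return 1.0 - math.exp(-k_per_min * t_min)


# Rate constants (per minute) quoted in the text for asynchronous HEK293 nuclei.
rate_constants = {
    "MYBBP1A S1163": 0.018,
    "CDK1 Y15": 0.022,
    "TIF1-beta S824": 0.036,
    "TIF1-beta S473": 0.012,
}

for site, k in rate_constants.items():
    print(f"{site}: t1/2 = {half_time(k):.1f} min, "
          f"{100 * fraction_of_plateau(k, 60):.0f}% of plateau at 60 min")
```

On this scale the fastest and slowest rate constants reported in the study (0.34 and 0.001 min−1) correspond to half-times of roughly 2 min and 11.5 h, respectively.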
We also observed large differences in the extent of site phosphorylation (H∞/L). Over 60% of phosphorylation sites were labeled at very low levels. The final heavy phosphopeptide intensity for these sites leveled off at less than 25% of the light phosphopeptide intensity (H∞/L < 0.25). For example, the phosphorylation site Y15 on CDK1 was only labeled to H∞/L = 0.12 (Fig. 3). Ten percent of the phosphopeptides were labeled at much higher levels. The final heavy phosphopeptide intensity for these sites was greater than 200% of the light phosphopeptide intensity (H∞/L > 2). For example, S824 on transcription intermediary factor 1β was labeled to H∞/L = 2.6 (Fig. 3). In this experiment, protease inhibitors and phosphatase inhibitors were added to the isolated nuclei, so in theory kinases could have catalyzed the phosphorylation of substrate proteins until the unphosphorylated substrates were depleted. Therefore H∞/L is representative of the pool of unphosphorylated substrates in the isolated nuclei at the time of labeling.
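The plateau ratio can also be restated as a fraction of the total phosphorylated pool. The sketch below assumes, as an idealization of the argument above, that light signal is constant and that labeling proceeds until the available unphosphorylated substrate is used up, so the heavy share of the final phosphopeptide signal is H∞/(H∞ + L). The function name is hypothetical.

```python
def newly_labeled_fraction(h_inf_over_l: float) -> float:
    """Heavy share of the plateau phosphopeptide signal, H_inf / (H_inf + L)."""
    if h_inf_over_l < 0:
        raise ValueError("H_inf/L cannot be negative")
    return h_inf_over_l / (1.0 + h_inf_over_l)


# Plateau ratios quoted in the text.
for site, ratio in {"CDK1 Y15": 0.12, "TIF1-beta S824": 2.6}.items():
    print(f"{site}: {100 * newly_labeled_fraction(ratio):.0f}% of the "
          "plateau phosphosite signal is newly labeled")
```

Under these assumptions the two thresholds used above, H∞/L < 0.25 and H∞/L > 2, correspond to less than 20% and more than two-thirds of the final signal being newly labeled.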
Cell Cycle Analysis-Different kinases are activated across the cell cycle, causing waves of protein phosphorylation that drive many important cell-cycle-related processes (43). We found that many peptides in our analysis of actively phosphorylated sites in asynchronous cells were predicted to be substrates of cyclin dependent kinases, indicating that these kinases were active under our assay conditions. Therefore, SILAP should be well suited for studying large-scale changes in active phosphorylation across the cell cycle. To test this, we synchronized HeLa cells to be at the G1/S phase boundary using a double thymidine block and synchronized cells to prometaphase using nocodazole. Cell synchronization was confirmed by propidium iodide staining followed by flow cytometry (supplemental Fig. S4). We extracted the G1/S phase and M phase cell nuclei, split them into two tubes for technical replicates, and incubated them with [γ-18O4]ATP for varying amounts of time prior to phosphopeptide enrichment and MS analysis. We detected a total of 1,123 labeled phosphorylated sites in synchronized HeLa cells, representing 10% of identified phosphopeptides across the two technical replicates.
Just as with the analysis of asynchronous cells, we filtered sites based on the signal-to-noise level of the light phosphorylation sites (s/n = 10) and restricted progress curve fitting to phosphorylation sites with data across at least five time points. This resulted in 971 phosphorylation site progress curves. For most phosphorylation sites, the labeling kinetics between G1/S and M phase were indistinguishable (supplemental Table S4). For example, LMO7 (S1493) had nearly identical labeling in both G1/S and M phase samples (Fig. 4A). However, 116 phosphorylation sites had distinct labeling profiles in the G1/S and M phases, with 66 sites having faster G1/S labeling rates and 50 sites having faster M phase labeling (supplemental Table S5). We identified a range of differences in phosphorylation activities among these sites. For example, PRKDC (S3204) had much faster labeling in G1/S phase than in M phase, and NUP98 (S774) had much faster labeling in M phase than G1/S phase. MYBBP1A (S1163) had much less and much slower labeling than other sites but was clearly labeled faster in G1/S phase than in M phase (Figs. 4B and 4C).
Several known cell-cycle-regulated phosphorylation sites had different kinetics between G1/S and M phase. For example, NUP98 is heavily phosphorylated as part of the nuclear envelope breakdown process during mitosis (4). NUP98 S595, one of the sites that was phosphorylated more quickly and at higher levels in M phase samples, was recently identified as a CDK1/cyclin B substrate that is phosphorylated at high levels during mitosis relative to interphase (4). Other known cell-cycle-regulated sites that were identified are Rb (S807/S811, T821/826, S788/795), which is hyperphosphorylated in G1 phase and maintained at high phosphorylation levels through mitosis, and NPM S10, which is a regulator of the G2/M transition (44).
Figure legend (fragment): Phosphorylation progress curves were chosen to represent a variety of phosphorylation activities, including phosphorylation sites that were phosphorylated faster in G1/S phase than in M phase. The MS1 spectra are labeled similarly to Fig. 4, with the peaks corresponding to light phosphorylation in blue and the sites corresponding to heavy phosphorylation in red. Included next to the representative MS1 spectra are phosphorylation progress curves for the G1/S phase data (orange) and M phase data (green). Each time point has measurements from two technical replicates of the SILAP labeling time course. We averaged from multiple MS runs, charge states, and oxidized forms of the peptides for each data point. The error bars are the standard deviation across these measurements.
We determined GO term enrichment for proteins with more active phosphorylation in M phase than in G1/S phase using all labeled phosphorylation sites as a background. GO terms that were significantly enriched (p value < 0.05 with Benjamini-Hochberg correction) included microtubule cytoskeleton, spindle, and mitotic cell cycle checkpoint. GO terms that were significantly enriched in proteins with more active phosphorylation in the G1/S phase were positive regulation of sequence-specific DNA binding transcription factor activity and nucleic acid metabolic process. As a comparison to the asynchronous cell data, we also analyzed GO term enrichment in proteins with labeled sites versus all identified sites taking the G1/S and M phase data together. Similar to the asynchronous cell data, there was an enrichment for RNA splicing proteins (enrichment score: 3.5), RNA recognition motif (enrichment score: 3.1), and proteins with the GO term "spindle" (enrichment score: 1.9).
Multisite Phosphorylation-Thirty percent of the peptides we identified in the cell cycle experiments had more than one phosphorylation site. Multiple phosphorylation sites on the same peptide can give additional information about the relative amount of each phosphorylation site, labeling preferences for one site over the other, and the order of labeling. Understanding the order and rates of multisite phosphorylation can lead to more insights into how different phosphoproteins are regulated (45). Most multiply phosphorylated peptides we identified had only one labeled site; for example, only S88 was labeled on the HMGA1 peptide containing S88, S91, and S92. The HMGA1 peptide with three phosphorylation sites was labeled at high levels on S88 (H∞/L = 0.5), whereas the form with two phosphorylation sites was not labeled and the form with one site was not detected. These results suggest that HMGA1 S91 and S92 were phosphorylated at the same time, probably almost completely before the start of our labeling time course, followed by S88. These residues are known to be constitutively phosphorylated by casein kinase 2 to regulate HMGA binding to DNA (46).
A small number of phosphorylated peptides (25 peptides total), such as the peptide containing MISP S394 and S397, had more than one labeled phosphorylation site (Fig. 5A). There are three main peaks in the MISP (S394, S397) MS1 spectrum corresponding to the peptide with two light-labeled phosphorylation sites, one heavy-labeled and one light-labeled phosphorylation site, and two heavy-labeled phosphorylation sites (Fig. 5A). In other words, we are observing different species: one form of the protein that went from the unmodified state to having two phosphoryl groups added, and another form that had a previously phosphorylated group and then was progressively phosphorylated during these experiments. Mass shifts in the b, y, and phosphorylation loss ions in the MS/MS spectra of the species with one old and one new phosphorylation site can be used to determine whether one of the sites is predominantly old or new, or whether there is an equal mix of old and new labeling on each site (Fig. 4B). For example, after 10 min of labeling, the b5 ion, which only contains the MISP S394 phosphorylation site, was predominantly light (m/z = 554) (Fig. 5B, inset). This means that the MISP S394 site was not labeled as quickly as the S397 site. These sites are phosphorylated by Polo-like kinase in vitro (47).
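The three-peak MS1 pattern described above can be illustrated with a short calculation of the expected peak spacing. This sketch assumes that each phosphoryl group transferred from [γ-18O4]ATP carries three 18O atoms (the bridging oxygen staying with ADP), which this section does not state explicitly. The precursor m/z and charge in the example are hypothetical, not values for the MISP peptide.

```python
# Mass difference between 18O and 16O in Da, from standard isotope masses.
DELTA_18O = 17.999161 - 15.994915  # about 2.004246 Da

# Assumption: three 18O atoms per transferred phosphoryl group.
HEAVY_OXYGENS_PER_PHOSPHATE = 3


def heavy_mz_shift(n_heavy_sites, charge):
    """m/z shift of an ion carrying n_heavy_sites heavy phosphoryl groups."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return n_heavy_sites * HEAVY_OXYGENS_PER_PHOSPHATE * DELTA_18O / charge


def ms1_species(light_mz, n_sites, charge):
    """Expected m/z for 0..n_sites heavy-labeled phosphorylation sites."""
    return [light_mz + heavy_mz_shift(n, charge) for n in range(n_sites + 1)]


# Hypothetical doubly phosphorylated peptide: light m/z 800.0, charge 2+.
peaks = ms1_species(800.0, n_sites=2, charge=2)
for n_heavy, mz in enumerate(peaks):
    print(f"{n_heavy} heavy site(s): m/z {mz:.4f}")
```

The same shift applied to a fragment ion indicates whether the phosphorylation site on that fragment is old (light) or new (heavy), which is the logic used for the b5 ion above.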
Histone H1 Phosphorylation-Histone H1 is known to be highly phosphorylated and to have many cell-cycle-regulated phosphorylation sites (48). There are 11 human histone H1 variants with different cellular functions and levels of phosphorylation (49). We identified six phosphorylation sites on the N-terminal tails of three histone H1 variants (H1.2, H1.4, and H1.5). These sites showed phosphorylation-site-specific and variant-specific kinetics across the cell cycle (Fig. 6). Histone H1.2/H1.4 S2 and T4 and histone H1.5 S2 had nearly identical phosphorylation rates. In all of these cases, the phosphorylation sites were labeled more quickly and to higher levels in G1/S than in M phase (Fig. 6A). The phosphorylation activity for these phosphorylation sites appeared to be independent of variant and site identity. The kinase for these phosphorylation sites has not been identified in human cells, but the equivalent peptide (SETAPAAPAAAPPAEK) is phosphorylated at both phosphorylation sites in a cell-cycle-specific manner by p38cdc2 in Chinese hamster ovary cells (50).
In contrast, H1.5 S18 and H1.4 T18 are labeled at very different rates despite both being known CDK1 substrates and having similar sequence contexts (16). H1.4 T18 was phosphorylated fairly quickly in both G1/S phase and M phase, whereas H1.5 S18 was barely labeled after 160 min (Fig. 6B). Interestingly, the peptide containing both H1.5 S18 and T11 phosphorylation was labeled at an even higher rate than H1.4 T18. Using the mass shifts from labeling in the MS1 and MS/MS spectra of this peptide, we determined that only the H1.5 T11 phosphorylation site on the doubly phosphorylated peptide was labeled in both G1/S and M phase (data not shown). In addition, the peptide containing only H1.5 T11 phosphorylation was not detected. This suggests that H1.5 S18 phosphorylation occurred first and outside of the time frame of our labeling experiment, followed by H1.5 T11 phosphorylation, and that H1.5 T11 phosphorylation might be dependent on H1.5 S18 phosphorylation. This result is in line with previous research by Lidner et al. suggesting that H1.5 S18 phosphorylation is one of the earliest phosphorylation events on histone H1 during the cell cycle, occurring during late G1 phase, whereas H1.5 T11 is one of the later phosphorylation events, occurring mainly in M phase (48).
These results exemplify the strengths of the SILAP technique: we were able to distinguish between the dynamics of different H1 proteoforms based on the site-specific labeling of phosphorylation. This in-depth study of multiple phosphorylation sites and their phosphorylation rates, particularly of peptides with two phosphorylation sites, would be difficult using any other technique. In fact, the order of histone H1 phosphorylation during the cell cycle is still controversial because it is difficult to distinguish between the H1 variants and their phosphorylation sites.
DISCUSSION
We demonstrated that [γ-18O4]ATP can be used to label hundreds of endogenous phosphorylation sites in complex cell-like environments with heavy phosphate and to confidently identify phosphorylated peptides. In our experiments, we used SILAP to investigate site-specific rates of phosphorylation and predict kinase activities ex vivo. In addition, we showed that SILAP can be used to distinguish between phosphorylation activities in different phases of the cell cycle and is therefore extendable to other systems in which there are large changes in kinase activities. SILAP is the first large-scale technique used to identify site-specific phosphorylation rates for endogenous proteins. We measured a wide range of phosphorylation rates in nuclei ranging from 0.34 min−1 (fast) to 0.001 min−1 (slow). These rates reflect differences in nuclear kinase activities, local kinase and substrate contexts, and substrate protein concentrations. We also detected several proteins that had two or more phosphorylation sites with different phosphorylation rates, implying differential functions of the phosphorylation sites.
The identification of actively phosphorylated sites using SILAP might allow one to distinguish between sites that are important in dynamic cellular processes and those that are more stable phosphorylation sites. Thousands of phosphorylation sites have been identified via mass spectrometry, but only a small number have known functions (51). It has been postulated that many are static sites involved in protein folding, mediating stable protein interactions, or are nonfunctional and the result of off-target reactions by kinases (52). It is difficult to determine which sites are dynamically phosphorylated using standard mass spectrometry techniques; however, [γ-18O4]ATP labeling can show which sites are labeled more quickly and which sites have slow or no active phosphorylation. We used SILAP to identify differences in the phosphorylation activities between G1/S and M phase. Many of the proteins with higher phosphorylation rates in the M phase samples are known to be involved in mitotic spindle formation and nuclear envelope breakdown, and many of the G1/S proteins with higher phosphorylation rates are involved in DNA replication and transcriptional regulation. Most of the phosphorylation sites identified had G1/S and M phase phosphorylation rates that were indistinguishable. This indicates that the G1/S and M phase nuclei were reproducibly prepared for the SILAP assays and many kinases had similar activities in G1/S and M phase nuclei under the conditions of our experiments. SILAP provides a snapshot of phosphorylation activities at the time when the nuclei are extracted. The different stages of mitosis occur in less than an hour, with the majority of protein phosphorylation occurring during prophase and pro-metaphase in preparation for spindle formation and cell division; thus it is possible that we captured the tail end of this wave of phosphorylation (43).
In addition, a previous large-scale study by the Mann lab reported high phosphorylation site occupancy during mitosis (53). If this is the case, then the proportion of sites that are open for phosphorylation should be small during M phase. This means that the maximum ratio of heavy to light phosphorylation for these sites after labeling (H∞/L) will also be low, because we added phosphatase inhibitors to the reaction.
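To make the relationship between a rate constant and the plateau H∞/L concrete, the sketch below fits a labeling progress curve under the assumption of a simple first-order approach to plateau, H/L(t) = (H∞/L)(1 − e^(−kt)). This is an illustrative model and may not be the exact fitting procedure used in this study. The function name `fit_progress_curve`, the time points, and the data are hypothetical and synthetic.

```python
import math


def fit_progress_curve(times, ratios, k_min=1e-4, k_max=1.0, n_grid=2000):
    """Fit ratio(t) = plateau * (1 - exp(-k * t)) by profile least squares.

    Returns (k, plateau, sse) for the best candidate k on a log-spaced grid.
    """
    best = None
    for j in range(n_grid):
        # Log-spaced candidate rate constants in min^-1.
        k = k_min * (k_max / k_min) ** (j / (n_grid - 1))
        f = [1.0 - math.exp(-k * t) for t in times]
        denom = sum(v * v for v in f)
        if denom == 0.0:
            continue
        # For a fixed k, the least-squares plateau has a closed form.
        plateau = sum(y * v for y, v in zip(ratios, f)) / denom
        sse = sum((y - plateau * v) ** 2 for y, v in zip(ratios, f))
        if best is None or sse < best[2]:
            best = (k, plateau, sse)
    return best


# Synthetic, noise-free data (not from the paper): k = 0.05 min^-1, plateau = 0.5.
times = [0, 5, 10, 20, 40, 80, 160]
ratios = [0.5 * (1 - math.exp(-0.05 * t)) for t in times]

k_fit, plateau_fit, _ = fit_progress_curve(times, ratios)
half_time = math.log(2) / k_fit  # time to reach half of the plateau, in min
print(f"k = {k_fit:.4f} min^-1, plateau = {plateau_fit:.3f}, t1/2 = {half_time:.1f} min")
```

Under this model, high pre-existing site occupancy lowers the plateau without necessarily changing k, so the two parameters carry different information. By the same half-time formula, the reported extremes of 0.34 min−1 and 0.001 min−1 correspond to half-times of roughly 2 min and 690 min.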
Measuring phosphorylation reaction rates in nucleo does come with some caveats. Small changes in nuclear preparations can cause differences in the measured kinase activity. For example, there was variation in the identified sites, labeling rates, and extent of labeling between HEK293 nuclear reactions done on different days. To minimize variation, samples that are going to be compared should be prepared at the same time. Two biological replicates of the G1/S and M phase samples were prepared at the same time and had very similar results. In addition, kinase activities in nucleo might not reflect cellular kinase activities because of the addition of protease and phosphatase inhibitors and because important regulatory proteins and metabolite co-factors might be lost during the preparation. However, in nucleo reactions maintain more of the cellular environment than most other kinase assays, which are also performed for the most part in vitro. Maintaining the cellular environment is important because cellular structures such as scaffolds and localization can affect kinase activity and specificity (54).
Many large-scale mass spectrometry techniques have been developed to measure protein phosphorylation dynamics. These techniques measure global increases or decreases in the total amount of phosphorylated peptide in addition to phosphorylation site occupancy and have contributed significantly to our understanding of phosphorylation-based signaling. SILAP adds an extra dimension to the phosphoproteomics toolbox: direct measurement of phosphorylation rate at a fixed time. SILAP has the added advantages that phosphorylation sites can be distinguished from one another and that some information about phosphorylation order and relative activities on the same protein or peptide can be obtained. The estimated rate constants from these experiments could be useful for systems-wide computational modeling of overall kinase signaling dynamics, cellular response mechanisms, and phenotype observations (55). Additionally, as activated kinase activity has been well studied to induce gene expression responses (56), it could be interesting to pair this methodology with microarray studies in a temporal manner to correlate phosphorylation rate dynamics with transcriptional state changes. Being able to track how phosphorylation gradients make their way into the nucleus to ultimately alter gene expression would answer many questions about how these signals are relayed in such rapid fashion.
FIG. 6. Histone H1 N-terminal phosphorylation progress curves. Phosphorylation progress curves for phosphorylation sites on the N-terminal portions of different histone H1 variants. The data points and fits in orange are from G1/S phase data, and the data in green are from M phase data. A, progress curves for H1.4 S2, T4, and H1.5 S2. B, progress curves for H1.4 (T18), H1.5 (S18), and the doubly phosphorylated peptide containing H1.5 (T11, S18). The H1.5 (T11, S18) data are noisy because of some interfering signals that were picked up by the quantification software, but the G1/S peptide was labeled more quickly and to a larger extent than the M phase peptide. The raw data for this figure are in supplemental Fig. S5.
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository (34) with the dataset identifier PXD000680 and DOI 10.6019/PXD000680.