A Novel Quantitative Mass Spectrometry Platform for Determining Protein O-GlcNAcylation Dynamics*

Over the past decades, protein O-GlcNAcylation has been found to play a fundamental role in cell cycle control, metabolism, transcriptional regulation, and cellular signaling. Nevertheless, quantitative approaches to determine in vivo GlcNAc dynamics at a large scale are still not readily available. Here, we have developed an approach to isotopically label O-GlcNAc modifications on proteins by producing 13C-labeled UDP-GlcNAc from 13C6-glucose via the hexosamine biosynthetic pathway. This metabolic labeling was combined with quantitative mass spectrometry-based proteomics to determine protein O-GlcNAcylation turnover rates. First, an efficient enrichment method for O-GlcNAc peptides was developed with the use of phenylboronic acid solid-phase extraction and anhydrous DMSO. The near-stoichiometric reaction between the diol of GlcNAc and boronic acid dramatically improved the enrichment efficiency. Additionally, our kinetic model for turnover rates integrates both metabolomic and proteomic data, which increases the accuracy of the turnover rate estimation. Other advantages of this metabolic labeling method include in vivo application, direct labeling of the O-GlcNAc sites, and higher confidence for site identification. Concentrating only on nuclear-localized GlcNAc-modified proteins, we were able to identify 105 O-GlcNAc peptides on 42 proteins and determine turnover rates of 20 O-GlcNAc peptides from 14 proteins extracted from HeLa nuclei. In general, we found that O-GlcNAcylation turnover rates are slower than those published for phosphorylation or acetylation. Nevertheless, the rates varied widely depending on both the protein and the residue modified. We believe this methodology can be broadly applied to reveal the turnover and dynamics of protein O-GlcNAcylation in different biological states and will provide more information on the significance of O-GlcNAcylation, enabling us to study the temporal dynamics of this critical modification for the first time.

Glycosylation is one of the most abundant types of protein modification in biology. More than 50% of all proteins have been found to be glycosylated in cells according to the Swiss-Prot database (1). Among the large diversity of protein glycosylation, enzyme-catalyzed O-GlcNAcylation is a type of post-translational modification (PTM) in which a β-D-N-acetylglucosamine moiety is attached to the hydroxyl group of serine or threonine residues. O-GlcNAcylation has been mostly found within the cytoplasm, but is also found in the nucleoplasm (2). So far, more than 500 proteins have been identified as O-GlcNAcylated, and these modifications have fundamental roles in cellular structure, cell cycle control, metabolism, and transcriptional regulation (3,4). Additionally, the mutually exclusive occurrence of O-GlcNAc and protein phosphorylation on the same residues (serine and threonine) indicates competition between these two modifications, which may be responsible for their antagonistic roles in the regulation of cellular signaling (5,6). More importantly, alterations in O-GlcNAcylation have been found to be associated with diseases such as cancer and diabetes (7,8). Therefore, identifying O-GlcNAc-modified proteins and the dynamics of the modification will lead to a better understanding of the functions of protein O-GlcNAcylation in human physiology and disease.
O-GlcNAc modification of proteins is also a dynamic process. The addition and removal of the modification is regulated by O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), respectively (9). OGT activity is dependent on the presence of its cofactor, UDP-GlcNAc, which is synthesized through the hexosamine biosynthetic pathway (HBP). HBP is a branch of glucose metabolism (Fig. 1A), but is also affected by various other metabolic pathways in the cell (10). About 2-5% of total cellular glucose is used to generate UDP-GlcNAc (2,11). Therefore, O-GlcNAcylation is highly regulated by cellular glucose availability. For instance, hyperglycemia was found to induce O-GlcNAc modification of CaMKII at Ser279 and enhance CaMKII autonomous activation in cardiac myocytes (12). Not only is O-GlcNAcylation regulated by levels of HBP metabolites, it can also be dynamically controlled by the inhibition of OGT and OGA (13). Inhibitors of either OGT or OGA have been developed to alter protein GlcNAcylation and might have potential clinical applications (14).
Identifying proteins and sites of O-GlcNAc modification has been challenging. Several enrichment methods have been developed over the past years to tackle this challenge. Lectin proteins have a weak affinity for the O-GlcNAc modification group and have been used to separate and retain O-GlcNAc peptides from unmodified peptides through lectin-agarose packed columns (15,16). With this method, named lectin weak affinity chromatography (LWAC), a recent study detected 142 O-GlcNAc sites on 62 mouse ESC nuclear proteins (17). Besides the lectin-O-GlcNAc interaction, pan-specific antibodies have also been produced to immunoprecipitate O-GlcNAc-containing proteins (18). Two hundred and eighty-four GlcNAc-modified chromatin proteins were identified with the use of the RL2 antibody and mass spectrometry-based proteomics. Chemical methods have also been designed and successfully applied in the enrichment of O-GlcNAc (19). β-elimination of the O-GlcNAc group followed by Michael addition of a dithiothreitol or biotin tag, named BEMAD, was developed and applied in many studies. This strategy was further modified by using d0- or d6-DTT to discriminate the derivatization between O-phosphate and O-GlcNAc under specific conditions (20). For example, multiple sites on histones H2A, H2B, and H4 were identified as O-GlcNAcylated with this method (21). Later, a mutant β-1,4-galactosyltransferase (GalT) enzyme was developed to specifically label O-GlcNAc residues with a keto- or azido-derivative of UDP-galactose. The keto or azido group could then be used as a tag to chemically label the modified site with a biotin or fluorescent probe (22,23). With this chemoenzymatic labeling method coupled with mass spectrometry-based proteomics, 141 O-GlcNAc-modified sites on 64 proteins were identified from less than 15 μg of HeLa spindle and midbody samples (24). Metabolic labeling to introduce a chemical tag on GlcNAc has also been used in enrichment. These tags include diazirine, azide, or alkyne modification groups added to GlcNAc (25,26). Upon click reaction or photo-crosslinking, GlcNAc peptides were enriched with beads bearing chemical groups that react with these tags (27). Another chemical method utilized the reversible reaction between the cis-1,3-diol in O-GlcNAc and boronic acid to form boronate esters (28).
Mass spectrometry (MS) plays an increasingly important role in quantitative proteomics because of improvements in the accuracy and sensitivity of MS-related technologies (29). MS methods have been applied to identify proteins, map modification sites, and quantify changes in peptide abundance. Collision-induced dissociation (CID) fragmentation is the most time-efficient MS method to identify large numbers of peptides in global proteomics experiments. However, just as with phosphorylation (30), O-GlcNAc is a labile PTM group. CID fragmentation of O-GlcNAc-containing peptides causes partial loss of the GlcNAc moiety, and with it the important site-specific localization information. Electron-transfer dissociation (ETD), on the other hand, is a different fragmentation method that does not break the bond between the GlcNAc moiety and the amino acid (31). In general, ETD fragmentation of parent ions with a charge state of 3 or higher provides better spectra than fragmentation of those with a charge state of 2. Therefore, the combination of CID and ETD has been found advantageous for providing complete information for PTM identification (32,33).
Different labeling strategies and quantification methods have been developed to characterize the turnover rates and kinetics of protein modifications, such as methylation (34), acetylation (35,36), and phosphorylation (37). For example, histone methylations associated with gene activation have faster turnover rates than those associated with silent genes (34). Additionally, measurement of PTM turnover rates has become a strategy for determining the efficiency of a potential epigenetic drug (38). For example, in the study of EPZ-5676, a potent and selective aminonucleoside inhibitor of DOT1L histone methyltransferase activity, the inhibitor reduced cellular H3K79me2 levels with a half-time of 1.0 day, which was similar to the previously identified turnover rate of methylated H3K79 (39). For protein O-GlcNAcylation, the dynamic properties and significance have been widely recognized; however, turnover rates of protein O-GlcNAcylation have never been measured by any method before.
Here, we have developed a metabolic labeling method to isotopically label protein O-GlcNAc modifications in vivo by feeding cells with 13C6-glucose, which produces 13C-labeled UDP-GlcNAc through the hexosamine biosynthetic pathway. 13C6-glucose is involved in many cellular pathways, and is metabolized by cells through glycolysis, the Krebs cycle, the pentose phosphate pathway, the pyrimidine biosynthetic pathway, and the hexosamine biosynthetic pathway to incorporate 13C into different portions of the UDP-GlcNAc molecule (40). For example, the GlcNAc moiety in UDP-GlcNAc can be partially labeled as 13C(6)-GlcNAc or 13C(8)-GlcNAc. The six 13C labels on the glucosamine ring are biosynthesized directly from 13C6-glucose. The additional two 13C labels are on the acetyl group in the GlcNAc moiety. The acetyl group is transferred from acetyl-CoA. Cellular acetyl-CoA becomes heavily labeled from glucose metabolism, is shuttled into the cytosol, and reacts with GlcNH2-6-P (Fig. 1A). Glucose is also a major source of amino acids, primarily Ala, Asp, and Glu, via glycolysis and the Krebs cycle.
In our methods, we used this 13C6-glucose labeling approach to determine the time course of 13C incorporation from 13C6-glucose into UDP-GlcNAc, amino acids, and O-GlcNAcylated peptides in HeLa cells. An optimized strategy using a phenylboronic acid solid-phase extraction cartridge was utilized to enrich O-GlcNAc peptides. This enrichment method improves our ability to detect modified peptides and provides increased confidence in the identification of GlcNAc sites. Next, a mathematical model was developed to determine the turnover of GlcNAc modifications on proteins. Overall, our metabolic labeling approach provides a useful quantitative platform for understanding the roles of dynamic protein O-GlcNAcylation in biology.

EXPERIMENTAL PROCEDURES
Cell Culture and Isotope-labeling Experiments-HeLa cells were cultured in DMEM media in which normal glucose was replaced with 25 mM 13C6-glucose and supplemented with 10% dialyzed FBS and penicillin-streptomycin antibiotics. For protein turnover measurement, HeLa cells were cultured in DMEM media in which normal lysine and arginine were replaced with the same amount of lysine (13C6, 15N2) and arginine (13C6, 15N4) and supplemented with 10% dialyzed FBS and penicillin-streptomycin antibiotics. Cells were incubated at 37°C, 5% CO2 and harvested 6, 12, and 24 h after switching to heavy media. For the pulse-labeling experiment, cells were harvested after 0.5, 2, 5, 9, 24, 32, and 48 h of incubation. HeLa cells grown in DMEM media with 25 mM normal glucose were used as the zero time point.
Cell Fragmentation and Protein Digestion-Frozen cell pellets were kept on ice to bring them up to 4°C. Cell pellets were resuspended in 10× volume of nuclear isolation buffer (15 mM Tris-HCl pH 7.5, 60 mM KCl, 15 mM NaCl, 5 mM MgCl2, 1 mM CaCl2, 250 mM sucrose, 1 mM DTT, 1× HALT phosphatase inhibitor, 20 μM Thiamet G, 10 mM sodium butyrate, and 0.3% Nonidet P-40), then gently pipetted up and down and incubated on ice for 5 min to break the cell membranes. Trypan blue showed more than 95% staining, indicating that most cells were lysed to nuclei. Fractionated cells were centrifuged at 228 × g for 5 min at 4°C to pellet nuclei. Whole nuclei were isolated, washed several times with Nonidet P-40-free nuclear isolation buffer, resuspended in 9 M urea buffer, and then sonicated for 10 s, three times. After DTT reduction and alkylation with iodoacetamide, nuclear proteins were digested overnight with endoproteinase Lys-C (1:50 ratio) in 4 M urea, 25 mM Tris pH 8 buffer at room temperature. Proteins for turnover measurement were digested overnight with trypsin (1:50 ratio) instead. Digested peptides were desalted with Oasis HLB cartridges.
Enrichment of O-GlcNAc Peptides-PNGase F was used to remove N-glycosylation under nondenaturing conditions. After Lys-C digestion, samples were desalted and dried using a speed-vac. Dried peptides (1 mg) from the different time points were resuspended in 50 μl of 1× G7 reaction buffer, pH 7.5. Then, 5.0 μl of PNGase F (NEB, 500 U/μl) was added, and the samples were incubated at 37°C overnight. O-HexNAc peptide enrichment was then done by solid-phase extraction with PBA cartridges and nonaqueous DMSO. After washing the cartridge with anhydrous DMSO, digested peptides dissolved in DMSO were loaded and incubated with PBA resin at 37°C for 2 h. Nonbound peptides were then washed away with anhydrous ACN. Lastly, bound peptides were eluted with 0.1% TFA solution after incubation at room temperature for 1 h. For the enrichment of the O-GlcNAc peptide from a BSA peptide mixture, different solvents were used as indicated in the text.
Nano Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometry-Desalted O-GlcNAc peptides were analyzed on an Orbitrap Fusion™ Tribrid™ Mass Spectrometer attached to an EasyLC system run at 300 nL/min. Peptides were eluted with a 135 min gradient from 2% to 30% ACN (120 min) and to 70% ACN over 15 min in 0.1% formic acid. Data-dependent acquisition mode with a dynamic exclusion of 30 s was enabled. In every 3 s cycle, one full MS scan was collected with a scan range of 300 to 1500 m/z, a resolution of 120 K, a maximum injection time of 100 ms, and an AGC of 500,000. Then, a series of MS2 scans were acquired for the most abundant ions from the MS1 scan (max collection cycle time: 3 s). Ions were filtered for charge states 2-7. An isolation window of 2.0 m/z was used with quadrupole isolation mode. Ions were fragmented using collision-induced dissociation (CID) with a collision energy of 27%. Orbitrap detection was used with a resolution of 15 K, a maximum injection time of 120 ms, and an AGC of 50,000. If one or more of the product ions 204.0867, 186.0761, 168.0655, 144.0655, 138.0550, or 126.0550 were detected, an additional ETD fragmentation of the same parent ion was applied. ETD energy was charge dependent (automatically scaled). Orbitrap detection was used with a resolution of 15 K. The maximum injection time was 120 ms and the AGC was set to 50,000. For protein turnover measurement, a similar LC-MS method was used without the product-ion-triggered ETD fragmentation.
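The product-ion trigger described above can be illustrated with a minimal sketch. The function name and the 20 ppm matching tolerance are assumptions made for illustration only; the text does not state the tolerance used by the instrument method.

```python
# Diagnostic HexNAc oxonium ions (m/z) listed in the acquisition method.
OXONIUM_MZ = (204.0867, 186.0761, 168.0655, 144.0655, 138.0550, 126.0550)


def triggers_etd(product_ions_mz, tol_ppm=20.0):
    """Return True if any CID product ion matches a diagnostic oxonium ion.

    tol_ppm is an assumed matching tolerance (not given in the text).
    """
    for target in OXONIUM_MZ:
        window = target * tol_ppm * 1e-6  # absolute tolerance in m/z units
        if any(abs(mz - target) <= window for mz in product_ions_mz):
            return True
    return False
```

A spectrum containing a peak near 204.0867 m/z would request an ETD scan of the same precursor, whereas a spectrum without any of the six ions would not.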
Proteomics MS Data Analysis Including Peptide Identification and Quantification-RAW files were converted to MS1 and MS2 files by xtract 1.0.0.8 in pFind Studio 2.8 (pfind.ict.ac.cn) and converted to MGF files by pParse to calibrate the precursor isotopic mass to the monoisotopic mass and export coeluted precursors, which could increase the identification rate (41). The MS1 and MS2 files were used for quantification and the MGF files were used for database searching and site localization. The Uniprot human database was used for database searching (the version of May 2014 with 88,949 entries). Search parameters were set with the database search engine pFind 2.8 (42), including a precursor mass tolerance of 5 ppm, a fragment mass tolerance of 0.02 Da for HCD and ETD, Lys-C cleavage, and up to 2 miscleavages. Cysteine carbamidomethylation was set as a fixed modification, while methionine oxidation, asparagine and glutamine deamidation, and N-acetylhexosamine (HexNAc) on serine, threonine, and asparagine (Asn) were set as variable modifications. Asn (N) was included considering the low activity of PNGase F in de-N-glycosylation (43). For the protein turnover experiment, the following parameters were used: precursor mass tolerance of 5 ppm, fragment mass tolerance of 0.02 Da, trypsin cleavage, and up to 2 miscleavages. Cysteine carbamidomethylation was set as a fixed modification while methionine oxidation was set as a variable modification. The target-decoy approach was used to filter the search results, in which the false discovery rate was less than 1% at the spectral level (44). Before the site localization, the MGF files were preprocessed, including removing the entire precursor isotopic clusters, removing the isotopic peaks of fragment ions, keeping the monoisotopic peaks, and keeping the top-6 peaks in every 100 m/z window of each MS/MS spectrum. After the preprocessing, Ascores for HCD spectra and their corresponding ETD spectra were calculated, respectively.
Meanwhile, HCD and ETD spectral pairs were combined to calculate Cscore. Annotated MS2 including both HCD and ETD spectra for all identified HexNAc modified peptides are reported in supplemental Table S1 and supporting supplemental Fig. S5.
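The peak-filtering step of the preprocessing (keeping the top-6 peaks in every 100 m/z window) can be sketched as follows. This is an illustration, not the pFind implementation: the function name is hypothetical, and the assumption that windows are fixed bins starting at 0 m/z is ours.

```python
from collections import defaultdict


def keep_top_peaks(peaks, top_n=6, window=100.0):
    """Keep the top_n most intense peaks in each fixed m/z window.

    peaks: list of (mz, intensity) tuples.
    Windows are assumed to be [0, 100), [100, 200), ... (an assumption).
    """
    bins = defaultdict(list)
    for mz, intensity in peaks:
        bins[int(mz // window)].append((mz, intensity))
    kept = []
    for members in bins.values():
        # Sort each window by intensity, highest first, and truncate.
        members.sort(key=lambda peak: peak[1], reverse=True)
        kept.extend(members[:top_n])
    return sorted(kept)  # return peaks in ascending m/z order
```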
Then, the peptide and site identification list with GlcNAc modification was used for quantification. The unique peptides with GlcNAc were obtained with precursor m/z, charge state, retention time, search score, sequence, modification, and protein information. For each GlcNAc peptide, an isotopic cluster was obtained, including the monoisotopic peak (noted as M0), the M+6 peak, and the M+8 peak. At different time points, these three peaks were extracted with a close retention time. The area under the curve (AUC) for each peak was calculated. Both the isotopic influence of M0 on M+6 and the isotopic influence of M+6 on M+8 were subtracted. Finally, the sum of M+6 AUC and M+8 AUC divided by total M AUC at 0 h was calculated as the [G*]/[G] ratios or relative distributions at each time point.
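As a rough illustration of this quantification, the sketch below subtracts the isotopic contribution of a lighter species from the next heavier peak and then forms the labeled fraction. The function name and the two spill-over coefficients are hypothetical inputs (they would come from theoretical isotope distributions of each peptide), and normalizing by the summed corrected AUCs of the same time point is an assumption made here for simplicity.

```python
def labeled_fraction(auc_m0, auc_m6, auc_m8, spill_m0_to_m6, spill_m6_to_m8):
    """Estimate the heavy-labeled fraction of a GlcNAc peptide.

    spill_m0_to_m6: assumed ratio of the natural isotope peak of the light
        peptide that falls on M+6, relative to M0.
    spill_m6_to_m8: assumed ratio of the natural isotope peak of the M+6
        species that falls on M+8, relative to M+6.
    """
    # Remove the isotopic influence of M0 on M+6, then of M+6 on M+8.
    m6 = max(auc_m6 - spill_m0_to_m6 * auc_m0, 0.0)
    m8 = max(auc_m8 - spill_m6_to_m8 * m6, 0.0)
    total = auc_m0 + m6 + m8  # assumed normalization (see lead-in)
    return (m6 + m8) / total if total > 0 else 0.0
```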
Metabolite Extraction and MS Analysis of Metabolites-HeLa cells were labeled in DMEM media containing 13C6-glucose. At various time points after switching to 13C6-glucose, media was rapidly removed and metabolism was quenched by immediate addition of cold extraction solvent (80:20 methanol/water, -80°C). The plate was kept in a -80°C freezer for 15 min before the cells were scraped off and collected. The solution was centrifuged at maximum speed at 4°C for 5 min and the supernatant was collected. The pellet was re-extracted twice with 80:20 methanol/water, and the supernatants were combined. Samples were dried down under N2 flow, resuspended in water (LC-MS grade, Sigma), and analyzed by a Thermo Q Exactive Orbitrap mass spectrometer coupled to a UPLC (Dionex 3000). Metabolites were separated with an ACQUITY UPLC® BEH C18 2.1 × 100 mm column, 1.7 μm particle size, with a gradient of solvent A (95% H2O, 5% methanol, 10 mM tributylamine, 9 mM acetate, pH = 8.
Kinetics Modeling-Relative distributions of the labeled UDP-GlcNAc, unlabeled unmodified peptides, and O-GlcNAc peptides were used to determine the turnover rates of the formation of the labeled GlcNAc peptides. The data were simulated according to the kinetic model shown in Fig. 7 and fitted to differential equations (Eq. 1 and Eq. 2) using MATLAB by least-squares fitting. The k1 and k-1 in the derivative equation Eq. 2 were fitted using the "fmincon" function in MATLAB. For each k1 and k-1 sampled by the "fmincon" function, the Y values at different time points ti were simulated with the following equation: and Y(t2401 = 24) were compared with the measured Y values at 6, 12, and 24 h, and the root mean square difference was calculated to choose the k1 and k-1 that minimize the difference between the simulated Y(t) curves and our measured Y(t) values. Simulation results with k1, k-1, and R2 for all GlcNAc peptides quantified in this study are shown in supporting supplemental Fig. S4.
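The fitting procedure can be sketched in a dependency-free form. Several things here are assumptions and not the authors' method: Eq. 1 and Eq. 2 are not reproduced in this text, so the rate law dY/dt = k1·X(t) − k-1·Y is a hypothetical stand-in; a plain grid search replaces MATLAB's "fmincon"; and the 0.01 h Euler step is inferred from the index t2401 = 24 h. X(t) uses the exponential fit reported in the Results.

```python
import math

DT = 0.01  # h; assumed step so that grid point 2401 corresponds to 24 h


def x_labeled(t):
    """Labeled UDP-GlcNAc fraction, from the exponential fit in the Results."""
    return 0.97 - 0.97 * math.exp(-0.082 * t)


X_GRID = [x_labeled(i * DT) for i in range(2401)]


def simulate(k1, k_minus1):
    """Forward-Euler simulation of the ASSUMED law dY/dt = k1*X(t) - k_minus1*Y."""
    y, ys = 0.0, [0.0]
    for i in range(2400):
        y += DT * (k1 * X_GRID[i] - k_minus1 * y)
        ys.append(y)
    return ys


def rms_error(k1, k_minus1, measured):
    """measured maps time (h) to the observed labeled fraction Y."""
    ys = simulate(k1, k_minus1)
    sq = [(ys[round(t / DT)] - y_obs) ** 2 for t, y_obs in measured.items()]
    return math.sqrt(sum(sq) / len(sq))


def fit(measured, grid=None):
    """Grid search for (k1, k_minus1); a stand-in for a constrained optimizer."""
    grid = grid or [i * 0.01 for i in range(1, 31)]
    candidates = ((k1, km1) for k1 in grid for km1 in grid)
    return min(candidates, key=lambda p: rms_error(p[0], p[1], measured))
```

Calling `fit({6: y6, 12: y12, 24: y24})` with the three measured labeled fractions returns the grid point with the smallest root mean square difference; a finer grid or a proper optimizer would be needed for real data.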

RESULTS
Metabolic O-GlcNAc Labeling-To quantitatively determine the rates of protein O-GlcNAcylation, we utilized the hexosamine biosynthetic pathway (HBP), which metabolically produces UDP-GlcNAc, the cofactor of the O-GlcNAc modification reaction, as shown in Fig. 1A and 1B. About 2-5% of total glucose (the carbon source of UDP-GlcNAc) is utilized in the HBP. By growing cells in 13C6-glucose, heavily labeled UDP-GlcNAc metabolites were produced and accumulated over time. A time course of heavily labeled UDP-GlcNAc production was determined by using two different quantification methods. Supporting supplemental Table S3 shows the relative abundance of all isotopic UDP-GlcNAc species (from M(0) to M(+17)) at the measured time points at the full mass level (MS1). As shown in Fig. 2, the GlcNAc unlabeled fraction, named UDP-12C-GlcNAc-MS1, was calculated as the percentage of unlabeled M(0) plus the labeled forms up to and including M(+5) over total UDP-GlcNAc. M(+5) was used as the cut-off because the six 13C labels on the glucosamine ring are biosynthesized directly from 13C6-glucose. The GlcNAc labeled fraction, named UDP-13C-GlcNAc-MS1, was calculated as the percentage of the labeled forms at or above M(+6) over total UDP-GlcNAc.
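The MS1-level split into unlabeled and labeled fractions amounts to a partition of the isotopologue abundances at M(+6). A minimal sketch, with a hypothetical function name and an input list indexed by the number of 13C atoms:

```python
def glcnac_label_fractions(isotopologue_abundance, cutoff=6):
    """Split UDP-GlcNAc isotopologues into unlabeled and labeled fractions.

    isotopologue_abundance[n] is the abundance of M(+n), n = 0..17.
    Forms below M(+cutoff) count as GlcNAc-unlabeled; forms at or above
    M(+cutoff) count as GlcNAc-labeled.
    """
    total = sum(isotopologue_abundance)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    labeled = sum(isotopologue_abundance[cutoff:])
    return (total - labeled) / total, labeled / total
```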
To distinguish whether the 13C label was on the GlcNAc moiety or on the UDP moiety of the UDP-GlcNAc molecule, we then fragmented every possible isotopic form of UDP-GlcNAc from 606.07 m/z to 623.13 m/z with 13C(0)-13C(17). All the possible combinations of parent masses were scanned, and fragment masses were adjusted for isotopic abundance and quantified as shown in supplemental Table S3. We did not detect any 13C(3), 13C(4), or 13C(5) labeling on the GlcNAc moiety after isotopic correction. The fraction of labeled UDP-GlcNAc was quantified as the sum of all labeled forms of the GlcNAc moiety in the UDP-GlcNAc molecule. As shown in supplemental Fig. S1, UDP-13C(6)-GlcNAc and UDP-13C(8)-GlcNAc are the two major species at all times. As the fraction of cellular 13C(2)-acetyl-CoA increases over time, the 13C(8)-GlcNAc fraction increases and becomes the major metabolite after 3 h.
Finally, we found that MS1 quantification of UDP-GlcNAc and MS2 quantification of the GlcNAc moiety gave very similar results, as shown in Fig. 2. The difference is less than 4%. Fitting the MS2 quantification data to an exponential equation (X(t) = 0.97 - 0.97*exp(-0.082*t)) allowed the relative abundance of heavy UDP-GlcNAc, annotated as X(t), to be determined at any specific time. At 8.9 h, 50% of the total UDP-GlcNAc had become labeled at the GlcNAc moiety. This information was used as an input for the determination of protein O-GlcNAcylation turnover rates.
On the other hand, fragmentation by electron transfer dissociation, the radical-based fragmentation approach, leaves the O-GlcNAc group attached to the amino acid residue during peptide backbone fragmentation. Because CID and ETD are complementary, we implemented an approach for O-GlcNAc site localization by the combined use of MS/MS data obtained using both CID and ETD. All precursors were first fragmented with CID, and ETD fragmentation was triggered only when O-GlcNAc-related product ions were detected in the initial CID spectra. Both CID and ETD MS/MS spectra were used to identify peptides and O-GlcNAc sites by peptide spectral matching. CID spectra in combination with ETD spectra improved the site assignment of the GlcNAc modification. To validate the identified modification sites, we calculated an Ascore, widely used in the phosphoproteomics field, which is based on the O-GlcNAc site-determining ions in both CID and ETD spectra. In addition to Ascore, another scoring algorithm, Cscore, was also used to validate the GlcNAc modification sites. Finally, the GlcNAc site assignments were manually checked, and ambiguous assignments caused by missing diagnostic c/z/b/y ions were also corrected and reported. The increased information obtained by using two complementary fragmentation techniques improved the confidence in the O-GlcNAc site localization.
[Figure caption fragment: Supplemental Table S3 shows all values of the quantification. By fitting the relative abundances of heavily labeled UDP-GlcNAc over time to an exponential equation, a half-time of 8.9 h was determined for the production of 50% GlcNAc-labeled UDP-GlcNAc.]
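The half-time follows directly from inverting the fitted exponential. The sketch below uses the rounded parameters quoted in the text (plateau 0.97, rate 0.082 per hour); the function names are ours. With these rounded values the inversion gives roughly 8.8 h, in line with the reported 8.9 h, the small difference presumably reflecting rounding of the fitted parameters.

```python
import math

PLATEAU = 0.97  # fitted maximum labeled fraction
RATE = 0.082    # fitted rate constant, 1/h


def labeled_udp_glcnac(t_hours):
    """X(t) = 0.97 - 0.97*exp(-0.082*t), the fitted labeled fraction."""
    return PLATEAU * (1.0 - math.exp(-RATE * t_hours))


def time_to_fraction(fraction):
    """Invert X(t): time (h) at which the labeled fraction reaches `fraction`."""
    if not 0.0 <= fraction < PLATEAU:
        raise ValueError("fraction must lie in [0, plateau)")
    return -math.log(1.0 - fraction / PLATEAU) / RATE
```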
We enriched O-GlcNAc-modified peptides before MS analysis to facilitate identification. It is well known that boronic acids react with cis-1,2- or 1,3-diols in the glycans of glycopeptides, saccharides, and many other molecules. In our study, we used PNGase F to remove N-glycosylation, as a large amount of N-glycan might compete with the reaction between O-GlcNAc and phenylboronic acid (Fig. 1C). However, deamidation may occur at this step, so variable modifications have to be considered for downstream peptide identification (46). O-GlcNAc enrichment was done with PBA solid-phase extraction. The condensation between the 1,3-diol and phenylboronic acid generates a water molecule, and removing the water molecule could facilitate condensation and therefore binding, as shown in Fig. 4A. Therefore, instead of using aqueous solution (55 M H2O), we tested nonaqueous solvents. A GlcNAc standard peptide, TAPT(GlcNAc)STIAPG, was mixed with BSA-digested peptides. Four different solvents were used in the enrichment: anhydrous DMSO, anhydrous DMF, anhydrous ACN, and pH 8.5, 200 mM ammonium acetate buffer. Peptide mixtures were incubated with PBA at 35°C for 2 h to allow covalent binding between boronic acid and GlcNAc peptides, and then the columns were washed to remove unbound peptides. GlcNAc peptides were eluted by using 0.1% TFA solution. Results indicated that DMSO gave the highest degree of enrichment among the four solvents we tested. Fig. 4B shows the TIC of the enriched O-GlcNAc peptide from a peptide mixture compared with the TIC of the input mixture. The O-GlcNAc peptide is the dominant peak in the eluted fraction with 46% retention. The O-GlcNAc standard peptide was enriched 39-fold by using DMSO, whereas it was only enriched fourfold by DMF and twofold by using pH 8.5 buffer (Fig. 4C). Using ACN as the solvent gave no obvious enrichment.
We chose seven BSA peptides with different hydrophobicities and retention times on the C18 column to calculate the fold enrichment. Their peptide identifications, charges, m/z, and retention times are listed in supplemental Table S4. The ratios of O-GlcNAc peptide to BSA peptides were calculated for the input, flow-through, and elution fractions for the four different conditions (DMSO, H2O, ACN, and DMF) from the AUC of the extracted ion chromatograms. Fold enrichments were then calculated, normalized to the ratio of the input, and plotted in Fig. 4C. From these three ratios, we calculated the percentage of GlcNAc peptide retained on the column (bound GlcNAc versus total GlcNAc peptide). When using DMSO as the solvent, ~46% of GlcNAc peptides were retained and ~99% of BSA peptides were found in the flow-through. However, when water was used as the solvent, almost all of the GlcNAc peptides were retained, but there was substantial nonspecific binding of BSA peptides (~42%), which caused a lower fold enrichment than with DMSO.
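The two quantities described here reduce to simple ratios of extracted-ion-chromatogram AUCs. A minimal sketch with hypothetical function names; how the seven BSA reference peptides were combined (for example, averaged) is not stated in the text, so the sketch handles one reference peptide at a time.

```python
def fold_enrichment(glcnac_elution, bsa_elution, glcnac_input, bsa_input):
    """Fold enrichment: GlcNAc/BSA ratio in the elution, normalized to the input."""
    elution_ratio = glcnac_elution / bsa_elution
    input_ratio = glcnac_input / bsa_input
    return elution_ratio / input_ratio


def percent_retained(glcnac_bound, glcnac_flowthrough):
    """Percentage of GlcNAc peptide retained on the column (bound / total)."""
    return 100.0 * glcnac_bound / (glcnac_bound + glcnac_flowthrough)
```

All arguments are AUC values in the same arbitrary units; because only ratios are used, the absolute scale cancels.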
It is worth noting that O-GalNAc modification on serine and threonine can also be enriched with the PBA strategy, as the GalNAc structure bears a 1,3-diol as well. Therefore, we distinguished peptides with O-GalNAc modification from those with O-GlcNAc modification by examining the oxonium ion fragmentation profile of every identified HexNAc-modified peptide and comparing it with results from a prior study (47). Oxonium ions of identified O-HexNAc-modified peptides are listed in supplemental Table S1. We calculated the ratios of the 138.055/144.066, 186.076/168.066, and 186.076/204.087 m/z ions, respectively. As shown in supplemental Fig. S2, which is a 3D scatter plot of the log2 of the oxonium ion ratios, HexNAc-modified peptides group into two distinct clusters.
Large-scale Incorporation of 13C-GlcNAc Into Peptides as Assessed by Mass Spectrometry-Next, we performed a PBA enrichment of O-GlcNAc peptides from a much more complex peptide mixture extracted from HeLa cell nuclei. In total, we identified 105 O-GlcNAc peptides on 42 proteins from HeLa nuclei. The sequences of the peptides with the modification sites identified and the descriptions of the proteins are summarized in supplemental Table S1. Among the O-GlcNAc-modified proteins, one of the most remarkable and heavily modified is host cell factor C1 (HCF-1). HCF-1 is a transcriptional coregulator involved in regulating cell cycle progression. HCF-1 undergoes an unusual proteolytic maturation, and the basic region of N-terminal HCF-1 (HCF-1N) is O-GlcNAcylated at multiple sites. O-GlcNAc modification is necessary for HCF-1 maturation and might also be involved in regulating HCF-1 function after its proteolytic cleavage (48,49). Several modification sites have been previously reported with other enrichment methods, such as LWAC and enzymatic-chemical derivatization enrichment methods (48,50). In addition to HCF-1, nuclear pore complex proteins (NUPs) were highly modified, including NUP153, NUP214, and NUP98. Indeed, NUPs are among the most heavily O-GlcNAc-modified proteins and were also some of the first O-GlcNAc-modified proteins identified in prior studies (51)(52)(53). Many transcription regulators were also detected as being O-GlcNAc modified, including protein EMSY, Polyhomeotic-like protein 3 (PHC3), Msx2-interacting protein, transcription factor MafK, transcription initiation factor TFIID, Protein SON, and SAP30-binding protein. O-GlcNAc modifications on transcription-regulating proteins can influence DNA and RNA binding, transcriptional activity, localization, stability, and interactions with other proteins (5). Finally, motif analysis identified no obvious consensus sequence for O-GlcNAc modification; this may be because only one enzyme (OGT) is able to modify serine and threonine with GlcNAc.
Interestingly, about 20% of the identified O-GlcNAc peptides have more than one GlcNAc modification site on the same peptide. Supplemental Fig. S3 shows one example peptide, KGTTTNATSTTTTTASTAVADAQK_5,T(HexNAc[T])+1HexNAc on T3 or T4, from SAP30-binding protein with the incorporation of two labeled O-GlcNAc PTMs. The monoisotopic mass at 914.786 m/z is the peptide containing one light and one heavy labeled GlcNAc. The monoisotopic mass at 917.461 m/z is the peptide containing two heavy labeled O-GlcNAc.
We were able to confirm by nanoLC-MS/MS analysis that O-GlcNAc peptides were metabolically labeled with 13C-GlcNAc. As shown in Fig. 5B, the MS1 signals of the pre-existing light O-GlcNAc peptides and the heavy 13C-labeled O-GlcNAc peptides have a mass difference of 6.020 Da or 8.024 Da, corresponding to the 13C(6)-GlcNAc or 13C(8)-GlcNAc modified peptide, respectively. The cause of 13C8 labeling is the metabolic production of 13C-acetyl-CoA from 13C-glucose. Glucose is converted via glycolysis in the cytosol to pyruvate, and then to acetyl-CoA in the mitochondrion. Citrate is produced from excess acetyl-CoA and exported into the cytosol to give rise to cytosolic and nuclear acetyl-CoA, and acetyl-CoA provides the acetyl group for the GlcNAc moiety, adding two heavy carbons to the heavy hexose group for a total of eight 13C carbons (35) (Fig. 1A). Similar to the UDP-GlcNAc labeling, the O-GlcNAc modification on the peptide can be either 13C(6)-GlcNAc or 13C(8)-GlcNAc. Therefore, when we quantified the fractions of heavily labeled GlcNAc peptide, we used the sum of both the 13C(6)-GlcNAc peptide and the 13C(8)-GlcNAc peptide.
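The expected mass shifts can be checked from the 13C and 12C mass difference (13.0033548 − 12 = 1.0033548 Da). The short sketch below is illustrative; the function names are ours. Six labels give about 6.020 Da and eight labels give about 8.027 Da, close to the shifts quoted above. Assuming a 3+ charge state (our assumption), eight labels shift the m/z by about 2.676, consistent with the 914.786 and 917.461 m/z pair described for the doubly modified SAP30-binding protein peptide.

```python
C13_MINUS_C12 = 13.0033548 - 12.0  # Da, mass difference between 13C and 12C


def label_mass_shift(n_13c):
    """Mass shift (Da) from replacing n_13c carbons with 13C."""
    return n_13c * C13_MINUS_C12


def label_mz_shift(n_13c, charge):
    """Corresponding m/z shift for an ion of the given charge state."""
    return label_mass_shift(n_13c) / charge
```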
Pairs of heavy and light labeled O-GlcNAc peptides coeluted at the same retention time, as shown in Fig. 5A. This coelution helps both the identification of the heavy O-GlcNAc peptides and the quantification of their relative abundance. Additionally, we confirmed the 13C-GlcNAc labeling by comparing tandem mass spectra between the unlabeled peptide and its labeled counterpart. Fig. 5C shows that [...] In addition, oxonium ions also have signature product ions in the low m/z region of the HCD MS2 spectra. In Fig. 5C, the oxonium ion peak at 204.09 m/z in the spectrum of the light peptide, compared with the heavy oxonium ion peak at 212.11 m/z in the spectrum of the labeled peptide, shows an 8-Da mass shift. This clearly indicates that the 13C labeling occurs at the GlcNAc moiety. Fig. 3 shows all GlcNAc signature product ions that were detected in the low m/z region of paired light and heavy labeled MS2 spectra. One fragmentation route of the GlcNAc oxonium ion proceeds by a stepwise elimination of two water molecules, first generating an ion at 186.0761 m/z, then an aromatic ring species at 168.0655 m/z. This is followed by the loss of a formaldehyde molecule (originally from the hydroxymethyl group) to produce an ion at 138.0550 m/z. Another route eliminates a ketene (originating from the acetyl group) from the 186.0761 m/z ion to form a species at 144.0655 m/z, followed by the loss of a water molecule, resulting in another aromatic species at 126.0550 m/z. For the 13C [...]
[...] at Ser/Thr/Asn of the peptide. Before metabolic labeling, no heavy labeled GlcNAc peptide could be detected. After 6 h, about 27% of this peptide became labeled, reaching 40% after 24 h in heavy glucose media. Among the 105 O-GlcNAc peptides, we were able to quantify 20 paired heavy labeled GlcNAc sites in 14 proteins identified across all time points, and for each of these, the abundance of the heavy GlcNAc peptide increased over time.
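The oxonium-ion fragmentation routes described above can be checked with simple neutral-loss arithmetic. The sketch below is illustrative only: it assumes the HexNAc oxonium ion has the composition C8H14NO5+ and uses standard monoisotopic atomic masses, and the function and variable names are hypothetical rather than taken from the study's software.

```python
# Monoisotopic masses (u) of the atoms involved, plus the electron mass.
H, C, N, O = 1.00782503, 12.0, 14.0030740, 15.9949146
ELECTRON = 0.00054858
C13_SHIFT = 1.0033548  # mass difference between 13C and 12C


def mass(c=0, h=0, n=0, o=0):
    """Monoisotopic mass of a CcHhNnOo composition."""
    return c * C + h * H + n * N + o * O


# Assumed composition of the singly charged HexNAc oxonium ion (m/z = mass).
OXONIUM = mass(c=8, h=14, n=1, o=5) - ELECTRON

WATER = mass(h=2, o=1)
FORMALDEHYDE = mass(c=1, h=2, o=1)
KETENE = mass(c=2, h=2, o=1)


def fragment_ladder(parent=OXONIUM):
    """Return the light signature ions of the two routes described in the text."""
    first_water = parent - WATER            # shared first step
    second_water = first_water - WATER      # route 1: second water loss
    after_ch2o = second_water - FORMALDEHYDE
    after_ketene = first_water - KETENE     # route 2: ketene loss
    after_ketene_water = after_ketene - WATER
    return {
        "oxonium": parent,
        "-H2O": first_water,
        "-2H2O": second_water,
        "-2H2O-CH2O": after_ch2o,
        "-H2O-C2H2O": after_ketene,
        "-2H2O-C2H2O": after_ketene_water,
    }


ladder = fragment_ladder()
heavy_oxonium = OXONIUM + 8 * C13_SHIFT  # fully 13C8-labeled oxonium ion

for name, mz in ladder.items():
    print(f"{name:>12s}: {mz:.4f}")
print(f"13C8 oxonium: {heavy_oxonium:.4f}")
```

Under these assumptions the computed values reproduce the m/z values quoted in the text to within rounding, and the fully labeled oxonium ion lands at the 212.11 m/z position reported for the heavy peptide.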
The turnover rates and relative abundances of heavy labeled species are all summarized in supplemental Table S2.
[G*]/[G]_T fractions at different time points were fitted to a model representing isotopic labeling under nonsteady state conditions (Fig. 7). As expressed in Eq. 1, the concentration change of labeled O-GlcNAc peptide [G*] can occur in two [...]
FIG. 5. A, Extracted ion chromatograms showing that light and heavy O-GlcNAc peptides from NUP153 with the sequence FGVSSSSSGPSQTLTSTGNFK_1GlcNAc on Ser/Thr/Asn eluted at the same retention time. The possible GlcNAc site is colored red in the sequence on top of the panels. B, MS1 spectra for light and heavy labeled peptide containing GlcNAc. Heavy labeled peptides had mass shifts of 6.020 Da and 8.024 Da compared with light peptides, corresponding to the theoretical mass differences of 6.018 Da and 8.024 Da for 6 and 8 13C atoms, respectively. The isotopic distribution of heavy peptides suggests that both species exist. C, Tandem mass spectra for the heavy (green) and light (red) labeled peptides with GlcNAc modification. Fragment ions with the GlcNAc loss have the same mass as the corresponding light ones. Depending on the position of the GlcNAc, the b and y ions might also be b/y ions with GlcNAc loss. Therefore, we label them as b′ or y′, representing either b/y ion peaks or b/y ion peaks with GlcNAc neutral loss.
In our labeling experiment, the building blocks of newly synthesized proteins come partially from labeled amino acids that are metabolized from 13C6-glucose. We therefore measured the 13C labeling of amino acids from 13C6-glucose by extracting metabolites from cells at different time points and analyzing them with LC-MS based metabolomics. Data are summarized in supplemental Table S3. We found that the relative abundances of unlabeled amino acids decay exponentially over time. Fitting the relative abundances over time showed that alanine is the most readily labeled amino acid, with a t_1/2 of 0.6 h. Because of the fast turnover and nearly complete labeling of alanine, if there is at least one alanine in the peptide sequence, we assume the newly synthesized peptide is labeled. Because the quantification of turnover rates only considers old peptide, F T OGT is deconvoluted from the synthesis of new peptide with a modification factor of ([U]/[U_T]). This factor was determined from a pulse-labeling experiment. In the experiment, HeLa cells were incubated in DMEM in which lysine and arginine were replaced with lysine (13C6, 15N2) and arginine (13C6, 15N4). Cells were harvested at 0.5, 2, 5, 9, 24, 32, and 48 h. Nuclear proteins were isolated and digested. The relative abundances of old protein were determined at the different time points. The factor ([U]/[U_T]) was then determined by fitting the relative abundances of old protein over time with an exponential equation. All turnover rates are summarized in supplemental Table S2. We did not determine the O-GlcNAc turnover rate if the peptide turnover was not available from the pulse-labeling experiment. Other amino acids were only partially labeled, at much slower rates than Ala; for example, Asp has a t_1/2 of 1.6 h with 15% heavy, Glu has a t_1/2 of 1.2 h with 12% heavy, and Ser has a t_1/2 of 1.3 h with 11% heavy. Additionally, Asn, Pro, and Gly were also labeled, but the labeling rates were even slower. Therefore, if the peptide sequence contains no Ala, we assume [U]/[U_T] = 1.
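As an illustration of how the old-protein fraction from the pulse-labeling experiment could be fitted with an exponential equation, the sketch below assumes a single first-order decay, U(t) = exp(-k t), and estimates k by a least-squares line through the origin on ln U(t). The text does not specify the fitting procedure, so this log-linear fit, the function name, and the synthetic data are all assumptions. Only the harvest time points are taken from the text.

```python
import math


def fit_decay_rate(times_h, old_fraction):
    """Estimate k (1/h) for U(t) = exp(-k * t).

    Uses a least-squares line through the origin on ln(U) versus t,
    which fixes U(0) = 1 (all protein is 'old' at the start of the pulse).
    """
    num = sum(t * math.log(u) for t, u in zip(times_h, old_fraction))
    den = sum(t * t for t in times_h)
    return -num / den


# Harvest time points of the pulse-labeling experiment (h).
times = [0.5, 2, 5, 9, 24, 32, 48]

# Synthetic, noise-free old-protein fractions for a hypothetical k of 0.03 1/h.
true_k = 0.03
fractions = [math.exp(-true_k * t) for t in times]

k_fit = fit_decay_rate(times, fractions)
print(f"fitted k = {k_fit:.4f} 1/h")
```

With real data, noise at late time points (where U(t) is small) is amplified by the logarithm, so a weighted or nonlinear fit may be preferable.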
For example, the peptide VIIVTTSPSSTFVPNILSK_1HexNAc on any Ser/Thr/Asn has no Ala. Although no protein turnover was measured for EMSY, we are still able to determine the GlcNAc turnover rates by considering ( [...] Because what we measured in the experiment are the relative abundance of heavy UDP-GlcNAc, annotated as X(t), the relative abundance of heavy GlcNAc peptide [G*]/[G]_T, annotated as Y(t), and the relative abundance of old peptide, annotated as U(t), Eq. 1 is converted to Eq. 2 by dividing by the total GlcNAc peptide concentration [G]. The simulation yields the O-GlcNAc formation rate k_1 and decay rate k_-1, where [...] Fig. S4 shows the simulation results with turnovers for all GlcNAc peptides quantified in this study, and supporting Eq. 1-7 illustrate the derivation of Eq. 1 and Eq. 2. The simulation fits the observations well for most measured peptides, as shown in Fig. 8 and in supplemental Fig. S4, suggesting that the model is sufficient to explain the turnover of GlcNAc PTMs. The k_1 values determined for the detectable O-GlcNAc peptides ranged from 0.02 h^-1 to 1.6 h^-1. Turnover rates differ between proteins and even between modified sites on the same protein. For example, turnover rates of the dynamic HCF-1 O-GlcNAc sites differ from one peptide to another, as shown in Fig. 8.
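To make the role of the precursor labeling curve X(t) concrete, the following sketch simulates a deliberately simplified precursor-product model. It is not Eq. 1 or Eq. 2 of this study, which are not reproduced here: it assumes a single exchange rate k, ignores the old-peptide correction U(t), and assumes that the heavy UDP-GlcNAc fraction rises exponentially to a plateau. All parameter values and names are hypothetical.

```python
import math


def precursor(t, x_max, k_x):
    """Assumed heavy UDP-GlcNAc fraction X(t): exponential rise to a plateau."""
    return x_max * (1.0 - math.exp(-k_x * t))


def simulate_labeled_fraction(k, x_max, k_x, t_end, steps=2000):
    """Integrate dY/dt = k * (X(t) - Y), Y(0) = 0, with classical RK4."""
    def rhs(t, y):
        return k * (precursor(t, x_max, k_x) - y)

    h = t_end / steps
    t, y = 0.0, 0.0
    for _ in range(steps):
        s1 = rhs(t, y)
        s2 = rhs(t + h / 2, y + h * s1 / 2)
        s3 = rhs(t + h / 2, y + h * s2 / 2)
        s4 = rhs(t + h, y + h * s3)
        y += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        t += h
    return y


def analytic_labeled_fraction(k, x_max, k_x, t):
    """Closed-form solution of the same ODE (valid when k != k_x)."""
    decay = (k * math.exp(-k_x * t) - k_x * math.exp(-k * t)) / (k - k_x)
    return x_max * (1.0 - decay)


def instant_precursor_fraction(k, x_max, t):
    """Labeled fraction if the precursor were at its plateau from t = 0."""
    return x_max * (1.0 - math.exp(-k * t))


# Hypothetical parameters: k = 0.1 1/h, plateau 0.8, precursor rate 0.3 1/h.
for t in (6, 24):
    y_num = simulate_labeled_fraction(0.1, 0.8, 0.3, t)
    y_inst = instant_precursor_fraction(0.1, 0.8, t)
    print(f"t = {t:2d} h: Y = {y_num:.3f}, instant-precursor Y = {y_inst:.3f}")
```

In this simplified setting, a slowly labeled precursor delays the appearance of heavy peptide, so fitting the data while treating the precursor as instantly labeled would underestimate k.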

DISCUSSION
In this study, we first optimized the enrichment of GlcNAc peptides using boronic acid solid phase extraction with nonaqueous DMSO as the solvent. Boronic acids are known to bind diol groups through boronate ester formation. Two forms of boronic acid exist and react with diols: the trigonal neutral form and the tetrahedral negatively charged form. Both forms have affinity toward GlcNAc peptides. According to the literature, the tetrahedral form has a larger rate constant of ester formation than the neutral form (54). However, the tetrahedral negatively charged form also shows strong nonspecific retention of basic peptides that contain positively charged amines. This limitation is similar to that of the immobilized-metal affinity chromatography (IMAC) method, in which there is nonspecific affinity between negatively charged carboxylates in acidic peptides and the positively charged metal ions in the matrix (55). Additionally, both the generation of the tetrahedral negatively charged boronic acid and the incubation with samples require basic conditions, under which GlcNAc modifications are not stable. Therefore, we optimized the PBA enrichment method using a nonaqueous solvent. The optimized PBA enrichment method keeps the HexNAc PTM intact on the peptide, allowing us to study HexNAc isomerization from the ratios of oxonium ions. The optimized PBA enrichment using neutral DMSO as the solvent is also more efficient than PBA enrichment using basic aqueous solution. This result could be explained by the trigonal neutral boronate causing less nonspecific binding than the tetrahedral negatively charged boronate anions within a mixture of peptides.
Next, we demonstrated that our novel 13C6-glucose metabolic labeling method can be used to label endogenous O-GlcNAc sites on proteins in complex cell lysates. To our knowledge, this is the first large-scale attempt to specifically label O-GlcNAc modifications on proteins and to determine protein O-GlcNAcylation rates. It is interesting to note that the largest fraction of the proteins identified with dynamic O-GlcNAc sites is involved in biological processes that regulate transcription.
We compared the turnover rates of protein O-GlcNAc modification with those of other types of modifications, such as serine/threonine phosphorylation and lysine acetylation. In general, turnovers of GlcNAc on proteins are slow (k_1 from 0.02 h^-1 to 1.6 h^-1) compared with turnover rates obtained from a large-scale quantification experiment on protein phosphorylation (0.06 h^-1 to 2040 h^-1) (37) and histone acetylation (0.3-1 h^-1) (35,36). We also observed that the production of heavy UDP-GlcNAc is much slower than the production of heavy acetyl-CoA, for which the half-time is less than 1 h. Because the substrate for O-GlcNAc modification is the metabolite (UDP-GlcNAc) produced directly from the 13C-glucose source, the rates of protein GlcNAc labeling can depend heavily on the rates of metabolite labeling. Therefore, to accurately quantify the rate of protein PTM turnover from metabolic labeling, it is important to consider the labeling kinetics of the direct metabolite precursor. This is especially critical when the metabolic labeling process is slow, as with UDP-GlcNAc. Sophisticated compartment kinetic models based on metabolic isotopic labeling for protein turnover studies have been reported before and led to a more satisfactory fitting of experimental data (56). To our knowledge, our study is the first to determine protein PTM turnover rates with an additional integration of metabolite kinetics.
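For readers who prefer half-lives to rate constants, the ranges quoted above can be converted under the assumption of simple first-order kinetics, where t_1/2 = ln(2)/k. The sketch below only restates the quoted ranges in those terms. The helper name is hypothetical.

```python
import math


def half_life_h(k_per_h):
    """Half-life in hours for a first-order rate constant in 1/h."""
    return math.log(2) / k_per_h


# Rate-constant ranges (1/h) quoted in the text.
ranges = {
    "O-GlcNAcylation": (0.02, 1.6),
    "phosphorylation": (0.06, 2040.0),
    "histone acetylation": (0.3, 1.0),
}

for name, (k_slow, k_fast) in ranges.items():
    fastest = half_life_h(k_fast)
    slowest = half_life_h(k_slow)
    print(f"{name}: t1/2 from {fastest:.3g} h to {slowest:.3g} h")
```

On this scale the slowest O-GlcNAc sites have half-lives of more than a day, whereas the fastest phosphorylation sites turn over within seconds.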
It is also worth noting that the factor ([U]/[U_T]) in the model for the 13C6-glucose labeling experiment is theoretically not the same as the fraction of old peptide determined from the pulse-labeling experiment, in which lysine and arginine were labeled. However, our metabolomics measurements showed that the labeling of Ala from 13C6-glucose is fast (t_1/2 = 0.6 h). At our first experimental time point, 6 h, 70% of Ala was already heavy labeled. We therefore assumed that if there is at least one Ala in the peptide sequence, newly synthesized peptides are labeled. A similar analysis was used in a study that determined histone acetyl PTM turnovers labeled from 13C6-glucose (35). With this assumption, we used the turnover rates determined from the lysine and arginine pulse-labeling experiment to calculate the factor ([U]/[U_T]).
Additionally, for the term [G*]/[G]_T, which represents the fraction of old peptide labeled at the GlcNAc modification, the kinetic model was not further adjusted for amino acid labeling. We can statistically estimate the chance of M+6 or M+8 caused by amino acid labeling. The simplest scenario producing M+6 labeling in a given peptide is the labeling of two Ala. With an average peptide length of 15 residues and 5 amino acids that can be labeled from glucose, M+6 labeling results from full labeling of two Ala and nonlabeling of the remaining 13 residues. From the metabolomics results on amino acids, at steady state of amino acid labeling, labeled Ala had a percentage of ~80%. The averaged percentage of nonlabeled nonessential amino acids was 80%. Therefore, the chance of M+6 labeling in newly synthesized peptides was 0.8^2 × 0.8^5, which is around 20%. Because our method enriches for O-GlcNAc peptides and the identified O-GlcNAc proteins have slow turnovers, the contribution to the total M+6 signal of peptides labeled on amino acids instead of the GlcNAc moiety is presumably even lower. If a sequence contains more than two Ala, only two of them can be labeled for an M+6 shift, and the requirement that the remaining Ala stay unlabeled decreases the chance of M+6 labeling. Last, Lys-C digestion usually results in long peptides; a longer peptide with more nonessential amino acids in the sequence decreases the chance further. Despite the small error caused by Ala labeling, this LC-MS/MS metabolomics and proteomics-based 13C6-glucose metabolic labeling method remains suitable for most of the O-GlcNAc peptides and provides novel information on O-GlcNAc protein turnover rates.
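The probability estimate above can be written out as a short calculation. The sketch below treats labeling of each residue as independent, which is the assumption implicit in multiplying the percentages. The function and parameter names are hypothetical, and the default values are the ones quoted in the text.

```python
def m6_chance_from_amino_acids(p_ala_labeled=0.8, p_other_unlabeled=0.8,
                               n_ala=2, n_other=5):
    """Chance that a new peptide shows M+6 purely from amino acid labeling.

    Requires n_ala alanines to be labeled and n_other glucose-labelable
    residues to remain unlabeled, all treated as independent events.
    """
    return p_ala_labeled ** n_ala * p_other_unlabeled ** n_other


base = m6_chance_from_amino_acids()
longer = m6_chance_from_amino_acids(n_other=8)  # more labelable residues

print(f"2 Ala labeled, 5 others unlabeled: {base:.3f}")
print(f"2 Ala labeled, 8 others unlabeled: {longer:.3f}")
```

The second call illustrates the point made in the text that a longer peptide with more nonessential amino acids lowers the chance of an M+6 shift arising from amino acid labeling alone.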
In summary, we have developed an approach to isotopically label O-GlcNAc on proteins by producing 13C-labeled UDP-GlcNAc through the hexosamine biosynthetic pathway from 13C6-glucose. This metabolic labeling was combined with quantitative mass spectrometry-based proteomics to determine protein O-GlcNAcylation turnover rates. By integrating metabolite isotopic labeling with protein O-GlcNAc labeling, we determined protein O-GlcNAcylation kinetics in vivo for nuclear proteins. In total, we determined the kinetics of 20 O-GlcNAc peptides from 14 proteins extracted from HeLa nuclei, with turnover rates varying from 0.02 h^-1 to 1.6 h^-1. We expect that this labeling methodology can be broadly applied to reveal O-GlcNAcylation dynamics in different biological states, adding important temporal aspects to these types of biological studies.