Mapping and Quantification of Over 2000 O-linked Glycopeptides in Activated Human T Cells with Isotope-Targeted Glycoproteomics (IsoTaG)

Post-translational modifications (PTMs) on proteins often function to regulate signaling cascades, with the activation of T cells during an adaptive immune response being a classic example. Mounting evidence indicates that the modification of proteins by O-linked N-acetylglucosamine (O-GlcNAc), the only mammalian glycan found on nuclear and cytoplasmic proteins, helps regulate T cell activation. Yet, a mechanistic understanding of how O-GlcNAc functions in T cell activation remains elusive, partly because of the difficulties in mapping and quantifying O-GlcNAc sites. Thus, to advance insight into the role of O-GlcNAc in T cell activation, we performed glycosite mapping studies via direct glycopeptide measurement on resting and activated primary human T cells with a technique termed Isotope Targeted Glycoproteomics. This approach led to the identification of 2219 intact O-linked glycopeptides across 1045 glycoproteins. A significant proportion (>45%) of the identified O-GlcNAc sites lie near or coincide with a known phosphorylation site, supporting the potential for PTM crosstalk. Consistent with other studies, we find that O-GlcNAc sites in T cells lack a strict consensus sequence. To validate our results, we employed gel shift assays based on conjugating mass tags to O-GlcNAc groups. Notably, we observed that the transcription factors c-JUN and JUNB show higher levels of O-GlcNAc glycosylation and higher levels of expression in activated T cells. Overall, our findings provide a quantitative characterization of O-GlcNAc glycoproteins and their corresponding modification sites in primary human T cells, which will facilitate mechanistic studies into the function of O-GlcNAc in T cell activation.

whereas deletion of the OGA gene leads to perinatal death (11), demonstrating the biological importance of O-GlcNAc cycling. Furthermore, conditional deletion of OGT in numerous cell types leads to senescence and apoptosis (12).
O-GlcNAc is generally viewed as a mechanism of cell signaling that is linked to cellular metabolism through the hexosamine biosynthesis pathway, which generates the nucleotide sugar donor, UDP-GlcNAc, for OGT. Because O-GlcNAc modifies serine and threonine residues, it has the potential to directly compete with phosphorylation at these amino acids (13,14). However, the relationship between O-GlcNAc and phosphorylation is not simply reciprocal antagonism, as the two modifications may also regulate each other even when they target different sites (15-20). Like other types of PTMs (e.g. phosphorylation), O-GlcNAc has been demonstrated to influence the activity of many proteins, including transcription factors (15), kinases (21), and nuclear pore proteins (22). Owing to its regulatory importance, it is not surprising that abnormal levels of O-GlcNAc have been linked to several diseases, including diabetes (23), cancer (24,25), and neurodegeneration (26,27). Several studies have correlated these global observations to the functional relevance of O-GlcNAc (23,26,28).
A general role for OGT and O-GlcNAc in T cell activation has been established based on observations that T cell activation induces fluctuations in O-GlcNAc patterns and is more robust in the presence of OGT activity (29-32). However, the function of O-GlcNAc on most glycoproteins in cellulo remains unknown, in large part because of the difficulty in characterizing O-GlcNAc glycosylation. Although the list of O-GlcNAcylated proteins numbers in the thousands (33), relatively few O-GlcNAc sites have been precisely mapped on those proteins, and predicting O-GlcNAc sites based on amino acid sequence is complicated by the lack of a strong consensus motif. Knowledge of specific O-GlcNAc modification sites is necessary to quantify changes in glycosite occupancy between different cellular states, such as resting and activated T cells, and to gain insight into the possible function of a given glycosite.
Methods to characterize O-GlcNAc sites from complex proteomes commonly involve enrichment of the glycoproteome followed by mass spectrometry (MS) analysis. Enrichment of O-GlcNAc is achievable by lectin weak affinity chromatography (LWAC) (34) or affinity enrichment after metabolic (35,36) or enzymatic labeling (37,38). However, subsequent characterization by MS is challenging because of the fragmentation of glycopeptides across glycosidic bonds in addition to peptide bonds, as well as the lower ionization efficiencies of glycopeptides (39). Thus, specialized fragmentation methods to characterize glycopeptides have emerged, including electron transfer dissociation (ETD), higher-energy collision dissociation product dependent ETD (HCDpdETD), and electron transfer/higher-energy collision dissociation (EThcD) (40,41). Recently, a method to enrich metabolically labeled glycopeptides and confidently characterize the intact glycopeptides by MS was developed, termed Isotope Targeted Glycoproteomics (IsoTaG) (42). IsoTaG uses mass-independent MS to improve selection and identification of low abundance glycopeptides. This confluence of an efficient enrichment strategy and a sensitive mass spectrometry method enables analysis of samples with limited quantities (e.g. primary human T cells) and establishes a workflow for other biological systems as well.
Although a few hundred proteins have been identified as O-GlcNAc-modified in human T cells (32), details regarding modification sites and more importantly, which sites undergo quantitative changes during activation and thus may be functionally significant, remain unresolved. To this end, we employed IsoTaG to map and quantify O-GlcNAc sites in resting and activated human T cells. We report the identification of 2219 O-linked glycopeptides across 1045 glycoproteins. In line with previous studies, we find no evidence of a strong consensus motif surrounding O-GlcNAc sites from human T cells, though a large fraction of O-GlcNAc sites lie near known phosphorylation sites. Quantitative analysis revealed higher amounts of O-GlcNAc glycosylation on several glycoproteins, such as the transcription factors c-JUN and JUNB, which were also up-regulated at the protein level, in activated T cells. Our results provide valuable information for future studies aimed at a mechanistic understanding of the function of O-GlcNAc on specific proteins, such as c-JUN, through targeted mutagenesis.
Instrumentation-LC-MS experiments were performed on an UltiMate 3000 Rapid Separation LC (Dionex) coupled to an LTQ-Orbitrap Elite mass spectrometer with ETD.
Experimental Design and Statistical Rationale-T cells were purified from five different human donors, representing five biological replicates, to obtain statistical power for detection and label free quantification. After metabolic labeling with Ac4GalNAz for 50 h, T cells from each replicate were divided among three stimulation conditions (isotype control, anti-CD3/CD28, or PMA/I; see Cell Culture Procedures) and cultured for an additional 18 h prior to collection. One biological isotype control replicate was separately treated with the DMSO vehicle control for 18 h. One of the five replicates was used in a time course experiment with time points at 15 min, 6 h, and 18 h. MS data was separately collected on the enriched proteome and the glycopeptides. At least two technical replicates were collected for each sample. Technical replicates were collected as a "scouting run" (DDA) and an "inclusion list" (IsoStamp directed) run within 24 h of each sampling. In total, 101 MS files representing 77 glycoproteomics, 20 proteomics, and 4 DMSO control data sets were collected. Database searching was performed with SEQUEST HT and Byonic. Confident glycopeptide assignments (FDR = 1%) were used for label free quantification (see Data Analysis Procedures). Glycopeptide abundances possessed a normal distribution and were evaluated using the t test (p value ≤ 0.05).
Cell Culture Procedures-Leukocyte reduction shuttles from healthy human donors were obtained from the Stanford Blood Bank under an approved institutional review board protocol. Total T cells were purified using RosetteSep (Stem Cell Tech, Cambridge, MA, #15061) and cultured in RPMI 1640 (Life Tech, Waltham, MA) containing GlutaMAX, 10% fetal bovine serum, and 1× penicillin/streptomycin at 37°C in a humidified incubator with 5% CO2. Cells were metabolically labeled by incubation with 40 µM Ac4GalNAz (Pierce/Thermo) for 50 h at a concentration of 10 million cells/ml. At the time of stimulation, labeled cells were kept in the original labeling medium and divided into three aliquots. One aliquot was treated with 150 ng/ml phorbol 12-myristate 13-acetate and 1 µM ionomycin (PMA/I, Sigma) to induce receptor-independent activation. The second aliquot was treated with anti-CD3/CD28-coated beads (see below) at a ratio of 1 bead per 2 cells to induce receptor-dependent activation. The third aliquot was treated with isotype-coated control beads to serve as an unstimulated control. Stimulation proceeded for 18 h, after which cells were collected by centrifugation, washed in cold PBS, flash frozen in liquid nitrogen, and stored at −80°C. A total of four biological replicates were obtained in this manner. A fifth biological replicate was treated analogously except that the labeled cells were divided into six aliquots at the time of stimulation and stimulation was allowed to proceed for 15 min, 6 h, and 18 h, after which cells were collected.
Washed beads were resuspended in 5 mM DTT/PBS (200 µl) and incubated for 30 min at 24°C with rotation. Ten mM iodoacetamide (4.0 µl, 500 mM stock solution) was added to the reduced proteins, and allowed to react for 30 min at 24°C with rotation, in the dark. Beads were pelleted by centrifugation (3000 × g, 3 min) and resuspended in 0.5 M urea/PBS (200 µl). For tryptic digests, trypsin (1.5 µg) was added to the resuspended beads, and digestion proceeded for 12 h at 37°C. For chymotryptic digests, chymotrypsin (1.5 µg) was added to the resuspended beads, and digestion proceeded for 12 h at 24°C. Beads were pelleted by centrifugation (3000 × g, 3 min), and the supernatant digest was collected. The beads were washed with PBS (1 × 200 µl) and H2O (2 × 200 µl). Washes were combined with the supernatant digest to form the trypsin (or chymotrypsin) digest. The IsoTaG silane probe was cleaved with two treatments of 2% formic acid/H2O (200 µl) for 30 min at 24°C with rotation and the eluent was collected. The beads were washed with 50% acetonitrile-water + 1% formic acid (2 × 400 µl), and the washes were combined with the eluent to form the cleavage fraction. The trypsin digest and cleavage fraction were concentrated using a vacuum centrifuge (i.e. a speedvac, 40°C) to 50-100 µl. Samples were desalted with a ZipTip P10 and stored at −20°C until analysis.
Mass Spectrometry Procedures-Samples were reconstituted in 22 µl of 0.1% formic acid in water, and 4 µl of sample was injected onto a C18 trap column (Thermo Scientific Acclaim PepMap 100, 5 µm particle size, 5 mm length, 300 µm ID) at 5 µl/min using 0.1% formic acid in water for 10 min. Samples were then loaded onto a 25 cm length C18 analytical column (Picofrit 75 µm internal diameter, New Objective) packed in-house with Magic C18AQ resin (Michrom Bioresources). Peptides were eluted using a multi-step gradient at a flow rate of 0.6 µl/min from 0.1% formic acid in water to 85% acetonitrile-water + 0.1% formic acid over 120 min. The electrospray ionization voltage was set to 2.25 kV and the capillary temperature was set to 200°C. Dynamic exclusion was enabled with a repeat count of 2, repeat duration of 30 s, exclusion list size of 400, and exclusion duration of 30 s.
For scouting runs, peptides were fragmented using HCDpdETD. MS1 scans were performed over m/z 400-1800 at resolution 30,000 and the top five most intense ions (+2 or higher charge states) were subjected to HCD with 27 eV, default charge state +4, for 0.1 s, at resolution 15,000. If oxonium product ions (m/z 204.0867, 345.1400, 347.1530, 366.1396, 507.1930, and 509.2060) were observed in the HCD spectra, ETD (200 ms) with supplemental activation (35 eV) was performed in a subsequent scan on the same precursor ion selected for HCD. Data from the scouting runs was submitted to IsoStamp v. 2.0 to generate inclusion lists for subsequent LC-MS runs.
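The product-dependent trigger described above can be sketched as a simple peak-matching check. This is only an illustration of the decision logic, not the instrument method itself: the function names are hypothetical, and the 20 ppm matching window is an assumed value, since the text does not state the tolerance used to recognize oxonium ions.

```python
# Oxonium product ions listed in the text (m/z values).
OXONIUM_MZ = (204.0867, 345.1400, 347.1530, 366.1396, 507.1930, 509.2060)


def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def should_trigger_etd(hcd_peaks_mz, tolerance_ppm=20.0):
    """Return True if any HCD peak matches a listed oxonium ion.

    tolerance_ppm is an assumption made for this sketch; the source
    does not give the matching window used by the acquisition method.
    """
    return any(
        abs(ppm_error(peak, ref)) <= tolerance_ppm
        for peak in hcd_peaks_mz
        for ref in OXONIUM_MZ
    )


if __name__ == "__main__":
    # A spectrum containing a peak about 1.5 ppm from m/z 204.0867.
    print(should_trigger_etd([150.0000, 204.0870, 512.3000]))
```

In this reading, a precursor is sent to ETD only when its HCD spectrum shows glycan-derived reporter ions, which concentrates the slower ETD scans on likely glycopeptides.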
For inclusion list runs, global parent mass lists were imported from IsoStamp v. 2.0. MS1 scans were performed over m/z 400-1800 and the top three most intense ions on the global parent mass list (+2 or higher charge states) were subjected to three subsequent fragmentation methods including HCD with 27 eV, default charge state +4, for 0.1 s; ETD for 200 ms with supplemental activation of 35 eV; and CID at 35 eV for 10 ms. The scouting run and the inclusion list run were used to acquire at least two technical replicates for each sample.
Data Analysis Procedures-The raw data was processed using Proteome Discoverer 1.4 software (Thermo Fisher Scientific) and searched against 20,172 proteins within the human-specific SwissProt-reviewed database downloaded on July 18, 2014. Indexed databases for tryptic digests were created with full cleavage specificity at K and R. Indexed databases for chymotryptic digests were created with full cleavage specificity at F, L, W, and Y. Both databases allowed up to three missed cleavages, one fixed modification (carbamidomethylcysteine, +57.021 Da), and variable modifications (methionine oxidation, +15.995 Da; N-terminal acetylation, +42.011 Da; N-terminal (Gln→pyro-Glu), −17.0265 Da; N-terminal (Glu→pyro-Glu), −18.011 Da; serine/threonine phosphorylation, +79.966 Da; and others as described below). Precursor ion mass tolerances for spectra acquired using the Orbitrap were set to 10 ppm. The fragment ion mass tolerances for spectra acquired using the Orbitrap and ion trap were set to 20 ppm and 0.6 Da, respectively. The SEQUEST HT search engine was used to identify tryptic and chymotryptic peptides from whole protein and non-conjugated peptides. The Byonic search algorithm v2.0 was used as a node in Proteome Discoverer 1.4 for glycopeptide searches. Searches allowed for tagged O-glycan variable modifications (see input file below). A modified HexNAc, termed "HexNAz2Si" (C13H18D2N4O7, +346.1458 Da) or "HexNAz0Si" (C13H20N4O7, +344.1332 Da), with variable attachment to serine or threonine residues was used as a variable modification. The Byonic glycan input file OGlycan modification was used as below. A search was additionally performed allowing for NGlycan and OGlycan modification types. Glycan assignments to "HexNAz0Si" were substituted with "HexNAz2Si" in supplemental Table S1 for subsequent label free quantification.
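The two tag masses quoted above can be checked from their elemental compositions. The short calculation below is a worked arithmetic sketch rather than part of the search configuration; the isotope masses are standard monoisotopic values rounded to eight decimals, and the helper name `monoisotopic_mass` is hypothetical.

```python
# Standard monoisotopic masses in Da (rounded); "D" denotes deuterium (2H).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "D": 2.01410178,
    "N": 14.00307401,
    "O": 15.99491462,
}

# Elemental compositions as given in the text.
HEXNAZ0SI = {"C": 13, "H": 20, "N": 4, "O": 7}
HEXNAZ2SI = {"C": 13, "H": 18, "D": 2, "N": 4, "O": 7}


def monoisotopic_mass(composition):
    """Sum isotope masses weighted by atom counts."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


if __name__ == "__main__":
    light = monoisotopic_mass(HEXNAZ0SI)
    heavy = monoisotopic_mass(HEXNAZ2SI)
    print(f"HexNAz0Si: {light:.4f} Da")
    print(f"HexNAz2Si: {heavy:.4f} Da")
    print(f"Isotopic spacing: {heavy - light:.4f} Da")
```

The difference between the two forms is the mass of two deuterium atoms minus two hydrogen atoms, about 2.0126 Da, which is the spacing that makes tagged glycopeptides recognizable in the full scan.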
Glycan custom modification list for Byonic v2.0:
HexNAz2Si(1) @ OGlycan | common2
HexNAc(1) @ OGlycan | common2
HexNAz0Si(1) @ OGlycan | common2
HexNAz(1) @ OGlycan | common2
Peptide and glycopeptide spectral assignments passing a false discovery rate (FDR) of 1% based on a target decoy database were used for label free quantification. All label free data analysis was performed using in house scripts written in Python 3.5 and scientific packages numpy, pandas, and pymzml. RAW MS data from a Thermo Orbitrap Elite was converted to mzML using msConvert and the time frame in each data file was normalized to a common species. Glycopeptide assignments across all MS files were extracted from Proteome Discoverer 1.4 and collated. For each spectral assignment, the precursor abundance within the same m/z range (tolerance = 0.01 Da) and time frame (± 1.5 min) was extracted and summed across all MS files. For glycopeptide assignments across multiple PSMs within the same time frame, the extracted precursor abundances were averaged. These values were used as the raw input for label free quantification.
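The precursor extraction step can be sketched as a window filter over MS1 peaks. The in-house scripts are not shown in the text, so the following is a minimal illustration under stated assumptions: peaks are represented as hypothetical `(retention_time_min, mz, intensity)` tuples already read from the mzML files, and the function names are invented for this example. The 0.01 Da and 1.5 min windows are the values given above.

```python
def extract_precursor_abundance(peaks, target_mz, target_rt,
                                mz_tol=0.01, rt_tol=1.5):
    """Sum intensities of peaks inside the m/z and retention-time window.

    peaks: iterable of (retention_time_min, mz, intensity) tuples; this
    layout is an assumption made for the sketch.
    """
    return sum(
        intensity
        for rt, mz, intensity in peaks
        if abs(mz - target_mz) <= mz_tol and abs(rt - target_rt) <= rt_tol
    )


def average_over_psms(abundances):
    """Average the extracted abundances of PSMs for one glycopeptide."""
    abundances = list(abundances)
    if not abundances:
        raise ValueError("no abundances to average")
    return sum(abundances) / len(abundances)


if __name__ == "__main__":
    peaks = [
        (10.0, 800.400, 100.0),  # inside both windows
        (10.5, 800.405, 50.0),   # inside both windows
        (13.0, 800.400, 999.0),  # outside the retention-time window
        (10.2, 800.500, 999.0),  # outside the m/z window
    ]
    print(extract_precursor_abundance(peaks, target_mz=800.40, target_rt=10.0))
```

Summing within a fixed window keeps the measurement independent of whether a given run actually triggered MS/MS on the precursor, which matters when scouting and inclusion list runs are compared.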
Precursor abundances were normalized within each technical replicate. Normalization was performed according to the following process: (1) the raw abundances were converted to their natural logarithm values, (2) the log value for each glycopeptide was subtracted from the mean log value for the technical replicate, and (3) the subtracted value was divided by the standard deviation within the technical replicate. Following normalization, a standard distribution of glycopeptide abundances was obtained. Quantitative comparison between the treatment (e.g. anti-CD3/CD28, PMA/I) and the isotype control consisted of subtraction of normalized values for the isotype control from the normalized values for the treatment for each glycopeptide. Statistically enriched glycopeptides were evaluated using the t test (p value ≤ 0.05).
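The three normalization steps amount to a z-score of log abundances within each technical replicate. The sketch below is one possible reading, not the authors' script: it centers each log value on the replicate mean (value minus mean), which is an assumption about the intended sign in step (2), and it uses the sample standard deviation, which the text does not specify. Function and variable names are hypothetical.

```python
import math
import statistics


def normalize_replicate(raw_abundances):
    """z-score natural-log abundances within one technical replicate.

    raw_abundances: dict mapping glycopeptide id -> raw abundance (> 0).
    The sign convention and the use of the sample SD are assumptions.
    """
    logs = {pep: math.log(value) for pep, value in raw_abundances.items()}
    mean = statistics.mean(logs.values())
    sd = statistics.stdev(logs.values())
    return {pep: (x - mean) / sd for pep, x in logs.items()}


def treatment_minus_control(treated, control):
    """Difference of normalized values for glycopeptides seen in both."""
    shared = treated.keys() & control.keys()
    return {pep: treated[pep] - control[pep] for pep in shared}


if __name__ == "__main__":
    control = normalize_replicate({"pepA": 1.0e5, "pepB": 2.0e6, "pepC": 4.0e4})
    treated = normalize_replicate({"pepA": 8.0e5, "pepB": 1.5e6, "pepC": 5.0e4})
    for pep, delta in sorted(treatment_minus_control(treated, control).items()):
        print(pep, round(delta, 3))
```

The per-glycopeptide differences produced this way, collected across biological replicates, are the kind of values a t test against zero would then evaluate.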

Identification of Over 2000 O-GlcNAz Containing Peptides from Human T Cells-Human T cells isolated from healthy blood bank donors were cultured with peracetylated N-azidoacetyl galactosamine (Ac4GalNAz, 40 µM) for 50 h to metabolically label O-GlcNAc. Ac4GalNAz is converted intracellularly to UDP-GlcNAz via the UDP-galactose 4-epimerase (GALE) pathway to yield O-GlcNAzylated proteins (35). Labeled T cells from each biological replicate were then divided into three aliquots and cultured for an additional 18 h in one of three conditions (Fig. 1A). One aliquot was incubated with control beads whereas the other two were incubated with either anti-CD3/CD28-coated beads or PMA/ionomycin to induce polyclonal T cell activation. Cells were collected and lysed.
Azide-labeled cell lysates were then tagged with a cleavable and isotopically-encoded biotin probe via click chemistry, affinity enriched, and digested on-bead with trypsin or chymotrypsin to release nonconjugated (i.e. non-glycosylated) peptides from the captured O-GlcNAz glycoproteins (Fig. 1B) (43). Glycopeptides were recovered via cleavage of the IsoTaG biotin probe (2% formic acid) and analyzed on a Thermo LTQ-Orbitrap Elite mass spectrometer. IsoTaG uses mass-independent, targeted glycoproteomics to (1) direct tandem MS time to isotopically-recoded glycopeptides that are immediately identifiable by full scan MS, and (2) produce high confidence assignments of glycopeptide spectra. Thus, for each glycopeptide sample an HCDpdETD scouting run was collected, followed by an inclusion list-driven run with an HCD/ETD/CID duty cycle (Fig. 1C). In previous work on an LTQ-Orbitrap XL, a 4-fold improvement in glycopeptide selection using the inclusion list was found (42). Using an LTQ-Orbitrap Elite, an HCDpdETD method yielded the most glycopeptide assignments, with the majority of glycopeptide assignments derived from HCD alone (50% for trypsin digests, 70% for chymotrypsin digests). Application of inclusion list driven runs with an HCD/ETD/CID duty cycle improved rates of glycopeptide selection (>50% of total spectral assignments were glycopeptides, supplemental Fig. S1). MS data was also acquired on the non-glycosylated peptides from the enriched glycoproteins. For four biological replicates, MS data derived from two protease digests and three stimulation conditions was collected on the glycoproteome and glycopeptides. For one biological replicate, MS data from tryptic digests was collected. Approximately 250 O-glycopeptides were typically assigned in a single 2 h MS run. A Byonic search performed with a glycan input file allowing for N-linked and O-linked glycosylation showed negligible N-linked assignment and localization primarily to known O-linked glycopeptides.
IsoTaG enabled the identification of 1000 glycopeptides from a single donor on average, which in aggregate generated a data set containing 2219 unique O-linked glycopeptides from 1045 glycoproteins from human T cells (supplemental Tables S1, S2).

T Cell O-GlcNAc Is Found on Intracellular Proteins at Sites Modified By Other PTMs-Using the UniProt database as a reference, the identified glycoproteins and glycopeptides were analyzed for their subcellular localization, molecular function, and co-localization with previously annotated PTMs. In sum, 73% of identified glycoproteins were localized to the nucleus and cytoplasm, which is expected for O-GlcNAc proteins (Fig. 2A). Another 23% localized to the secretory pathway, including plasma membrane proteins, secreted proteins, and proteins from the endoplasmic reticulum and Golgi apparatus. Glycosites from secretory proteins were found in both intracellular and extracellular domains. Though the specific localization of the protein at the time of glycosylation is unknown, glycosites localized to the extracellular domain may represent extracellular O-GlcNAc or the Tn antigen (47). Mitochondrial proteins represented 4% of the identified glycoproteins.

FIG. 1. Schematic of workflow for O-GlcNAc metabolic labeling, T cell activation, glycopeptide enrichment, and mass spectrometry analysis.
A, After incubation with Ac4GalNAz for 50 h to metabolically label O-GlcNAc as O-GlcNAz, primary human T cells were stimulated with control beads, anti-CD3/CD28 beads, or PMA/ionomycin for an additional 18 h. B, Proteins from whole cell lysates were tagged with the IsoTaG silane probe, an acid-cleavable biotin reagent containing an isotopic label and a terminal alkyne for "click" chemistry with O-GlcNAz groups. Biotin-labeled glycoproteins were then affinity purified and digested on-bead with trypsin or chymotrypsin to release non-conjugated peptides. Finally, glycopeptides were recovered by acid-mediated cleavage of the biotin tag. C, Isotope-labeled glycopeptides were sequenced by LC-MS/MS. First, a scouting run with a HCDpdETD fragmentation method was obtained to create an inclusion list containing peptides with altered isotopic distributions due to the isotope label (red peaks). In the second MS run, the inclusion list was used to direct precursor selection and tandem MS (HCD/ETD/CID fragmentation) to isotope-recoded glycopeptides (inset).
The list of identified glycoproteins covers an array of molecular functions and features many transcription factors and epigenetic regulators, such as SP1 and CBP (supplemental Tables S3, S4). Accordingly, a basic bioinformatics analysis with the DAVID web tool (48, 49) revealed a statistically significant enrichment of proteins with functions related to the regulation of transcription (supplemental Table S4) as previously noted (32). The enrichment of this class of proteins was stronger in primary T cells compared with transformed cell lines, possibly because of more labeling of complex glycans as opposed to O-GlcNAc in immortalized cell lines (43).
Based on 851 unique unambiguous site assignments across 381 glycoproteins, derived from ETD spectra with a DeltaMod score >10 or glycopeptides that possess a single S/T residue, the median number of glycosites mapped per protein was two. A single unambiguous glycosite was found on 244 (29%) of the glycoproteins (Fig. 2B). T cell glycosites were further analyzed for co-localization within ten amino acids to annotated functional sites within the protein (UniProt, April 29, 2016). Evaluated sites included active sites, protein binding sites, and sites that bind to metal, nucleotides, and calcium. Of the nearly 1300 documented sites for these proteins, a 0-4% co-localization rate with the observed O-GlcNAc sites was found (supplemental Fig. S2). Of all annotated PTMs on the identified glycoproteins, 6% mapped to within 10 amino acids of an O-GlcNAc site (supplemental Fig. S2). Selection of a random Ser/Thr from the same glycoprotein revealed that these rates are slightly lower than the background co-localization rates except in the case of metal binding (supplemental Fig. S2). O-GlcNAc sites were found at a 0% frequency with annotated metal binding sites (count = 343), whereas random Ser/Thr were distributed at a 17% co-localization rate with these proteins. With the caveat that several functionally relevant sites may not be annotated in the present database, the low rate of O-GlcNAc co-localization with metal binding sites and other types of modifications indicates O-GlcNAc may not regulate these interactions directly in T cells.
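The window-based co-localization analysis described above can be sketched for a single protein as follows. This is an illustrative outline only: the function names are hypothetical, residue positions are assumed to be 1-based integers taken from the same protein sequence, and the background here simply uses every Ser/Thr in the sequence rather than the random sampling mentioned in the text.

```python
def ser_thr_positions(sequence):
    """Return 1-based positions of all Ser/Thr residues in a sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa in "ST"]


def colocalization_rate(query_sites, annotated_sites, window=10):
    """Fraction of query sites within `window` residues of any annotated site."""
    if not query_sites:
        return 0.0
    hits = sum(
        1
        for q in query_sites
        if any(abs(q - a) <= window for a in annotated_sites)
    )
    return hits / len(query_sites)


if __name__ == "__main__":
    # Toy sequence and site lists, invented purely for illustration.
    sequence = "MSATKPLLSGGTEEVASPQRTLLKDSGAAPTW"
    glycosites = [9, 26]
    annotated = [12, 31]
    observed = colocalization_rate(glycosites, annotated)
    background = colocalization_rate(ser_thr_positions(sequence), annotated)
    print(f"observed {observed:.2f} vs background {background:.2f}")
```

Comparing the observed rate with the background from other Ser/Thr residues on the same proteins is what allows a statement such as "slightly lower than background" to be made.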
To explore further the potential for phosphorylation and glycosylation cross-regulation, we next correlated the unambiguous glycosite assignments found herein to all reported phosphosites. This analysis uncovered a high degree of colocalization between the two modifications with 45% of unique glycosites being within ten amino acids of a known phosphosite (Fig. 2C) and nearly 15% overlapping with a known phosphosite (Fig. 2C, inset). In contrast, comparison of all Ser/Thr sites within these same glycoproteins revealed lower co-localization with the nearest phosphosite (35% within ten amino acids; 4% on the same amino acid, supplemental Fig. S3). Notably, global levels of both O-GlcNAc and phospho-Ser/Thr increase in the hours following T cell activation as assayed by immunoblotting and flow cytometry (2). Database searches with Byonic for glycophosphopeptides revealed 50 peptide sequences containing both modifications across nearly 400 PSMs (supplemental Table S5). Future site-specific profiling of both modifications will distinguish between the two modifications co-occurring on the same protein in a cooperative manner or on different proteins in an antagonistic manner.

O-GlcNAc Sites From T Cells Lack a Strict Consensus Sequence-Comparison of the mapped O-GlcNAc glycosites revealed a weak homology sequence with a minor preference for proline at the −2 and −3 positions and a high frequency of serine, threonine, alanine, and proline at other positions, similar to previous reports (Fig. 3) (50,51). Homology sequences were similar regardless of the protease used to generate peptides (trypsin or chymotrypsin) or the MS fragmentation method (ETD or HCD, supplemental Fig. S4). These data suggest that OGT lacks a specific consensus motif, consistent with the notion that OGT binds to its substrates mainly by contacting the peptide backbone as opposed to specific side chains (52).
Certain glycopeptides, such as one found on c-JUN, contained O-GlcNAc sites in sequences bearing little resemblance to the homology sequence. To identify sequences that lay outside of the homology sequence, a dissimilarity index for each sequence was created based on the amino acids at positions −4 to +4 surrounding the glycosite with less frequent amino acids contributing more heavily. Sequences that are greater than three standard deviations (>99%) away from the average were considered dissimilar (supplemental Table S6). In several cases, only a single serine or threonine was available in the immediate vicinity of the identified glycosite. Conversely, sequences that exactly matched the homology sequence were also identified.
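The text does not give the formula for the dissimilarity index, so the sketch below shows only one plausible construction that satisfies the stated properties: it scores each nine-residue window (positions −4 to +4) by the summed negative log of the per-position amino acid frequency, so that rarer residues contribute more, and flags windows more than three standard deviations above the mean score. The scoring function, names, and toy data are all assumptions made for illustration.

```python
import math
import statistics
from collections import Counter


def position_frequencies(windows):
    """Per-position amino acid frequencies across equal-length windows."""
    n = len(windows)
    return [
        {aa: count / n for aa, count in Counter(column).items()}
        for column in zip(*windows)
    ]


def dissimilarity_index(window, freqs):
    """Sum of -ln(frequency) per position; an assumed scoring form."""
    return sum(-math.log(freqs[i][aa]) for i, aa in enumerate(window))


def flag_dissimilar(windows, n_sd=3.0):
    """Return windows scoring more than n_sd sample SDs above the mean."""
    freqs = position_frequencies(windows)
    scores = [dissimilarity_index(w, freqs) for w in windows]
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [w for w, s in zip(windows, scores) if s > mean + n_sd * sd]


if __name__ == "__main__":
    # Toy data: 19 windows resembling a proline-rich pattern and one outlier.
    windows = ["AAPPSAAAA"] * 19 + ["WWWWSWWWW"]
    print(flag_dissimilar(windows))
```

Because frequencies are computed from the same set of windows being scored, every observed residue has a nonzero frequency and the logarithm is always defined.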
Label Free Quantification of O-GlcNAc Sites During T Cell Activation-After qualitatively mapping O-GlcNAc sites in T cells, we next sought to identify O-GlcNAc sites undergoing changes in occupancy during T cell activation through label-free quantitative analysis of four of the five biological replicates. Label-free quantitation was performed by alignment of retention times and extraction of the intensity of each glycopeptide ion across different MS runs. A confidence index based on at least two technical replicates was built for each glycopeptide. Approximately 65% of peptides had less than 40% discrepancy in fold change across four technical replicates (supplemental Fig. S5). The difference between stimulated cells (e.g. anti-CD3/CD28 or PMA/ionomycin) and isotype control cells for each glycopeptide was used for comparative analysis of deviation from the norm and compared across biological replicates. In sum, 518 glycopeptides from 227 glycoproteins were responsive to T cell activation across at least two biological replicates (p value < 0.05, n = 4, supplemental Table S7).
Of the significantly regulated glycopeptides, 17% were derived from nuclear pore proteins, which account for 9% of all identified glycopeptides. Analysis of the functional impact, if any, of an individual O-GlcNAcylation event on the nuclear pore complex is challenging to study because of the high modification levels of some nucleoporins. Our data reveal that only certain glycosites on each nucleoporin were significantly up-regulated in response to T cell activation. For example, of 27 glycopeptides identified from NUP98, seven were significantly increased across biological replicates. The glycopeptides from NUP98 detected by chymotrypsin digestion are highlighted in Fig. 4A. Other highly O-GlcNAcylated nucleoporins (e.g. NUP62, NUP153, NUP214) followed a similar trend (supplemental Fig. S6-S7).
T cell activation is a dynamic process with signaling events occurring over multiple timescales. O-GlcNAc modifications likely occur over multiple timescales as well. Large structural perturbations to O-GlcNAc may prevent its removal by OGA, precluding its ability to faithfully recapitulate signaling dynamics, although O-GlcNAz is a reported substrate for OGA (35). To demonstrate the ability of O-GlcNAz to act as a reporter of O-GlcNAc dynamics, we performed IsoTaG on T cells activated by anti-CD3/CD28 at both early (15 min) and late (18 h) time points. Quantification of glycopeptides from a single biological replicate revealed dynamic shifts in the abundance of a subset of glycosites during T cell activation. Glycosites with large differential abundances after activation are presented in Fig. 4B.
Validation of Label Free Analysis of Selected T Cell Glycoproteins by Western Blot and PEG Mass Tags-A gel shift assay, in which azide groups are labeled with 5 kDa polyethylene glycol (PEG-5kDa) mass tags instead of the cleavable biotin tag (53), was used as an independent method to validate our quantitative MS analysis on ten selected glycoproteins. A whole cell lysate from one of the biological replicates was treated with 100 µM DBCO-PEG-5kDa and then analyzed by immunoblotting to detect shifts in electrophoretic mobility for proteins of interest. Because the PEG mass tag introduces a discrete 5 kDa shift for every labeled O-GlcNAz group, the shift in mobility indicates how many O-GlcNAz groups exist per protein molecule. Additionally, the intensity of the shifted bands relative to the unshifted band allows determination of O-GlcNAc stoichiometry.
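The reasoning behind the gel shift readout can be expressed as two small calculations. This is a simplified sketch under the nominal assumption stated in the text, that each tag adds a discrete 5 kDa; real PEG conjugates may migrate differently from their nominal mass, and band intensities would come from densitometry that is not described here. The function names and example numbers are hypothetical.

```python
PEG_TAG_KDA = 5.0  # nominal mass added per labeled O-GlcNAz group


def tags_per_protein(apparent_shift_kda, tag_kda=PEG_TAG_KDA):
    """Estimate the number of tags from the apparent mobility shift."""
    return round(apparent_shift_kda / tag_kda)


def modified_fraction(unshifted_intensity, shifted_intensities):
    """Fraction of protein carrying at least one tag, from band intensities."""
    shifted_total = sum(shifted_intensities)
    total = unshifted_intensity + shifted_total
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return shifted_total / total


if __name__ == "__main__":
    # Illustrative values only: a band about 10 kDa above the unmodified one.
    print(tags_per_protein(10.2))
    print(modified_fraction(75.0, [20.0, 5.0]))
```

Read this way, a protein with one faint shifted band is mono-glycosylated at low stoichiometry, whereas a ladder of intense shifted bands indicates multiple, highly occupied sites.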
Western blot results were generally reflective of label-free quantification trends (Fig. 5). The number of O-GlcNAc sites on a given protein and their relative stoichiometry varied considerably. Some proteins (e.g. MYPT1, CREB) were highly and multiply O-GlcNAcylated whereas others (e.g. ZAP70, STAT3) were only mono-glycosylated at sub-stoichiometric levels. In a few cases (e.g. SHIP1, FYB/ADAP/SLAP130), shifted bands were not apparent, potentially because of modification rates below the limit of detection for this method (est. <5% modification), which does not involve an enrichment step like IsoTaG. An increase in global glycosite modification, likely related to increased expression of the protein, was observed for c-JUN and JUNB proteins as predicted by label-free quantification. A decrease in overall glycosite occupancy was observed in both label free data and Western blot for the CREB, STAT3, CBL, and MYPT1 proteins.

DISCUSSION
Although many studies have reported functionally important alterations in O-GlcNAc levels in various biological settings, such as the regulation of T cell activation (23,31,32), comparatively few have identified specific glycosites and their molecular function (28), and fewer still the interplay between O-GlcNAc and other PTMs (e.g. phosphorylation) (15). The inability to directly observe and quantitatively profile O-GlcNAc sites has inhibited the advancement of such studies to date.
The application of IsoTaG, a method to enrich and directly characterize intact, metabolically-labeled glycopeptides, to the study of human T cells has enabled quantitative insight into glycosite changes during T cell activation. Metabolic labeling with Ac4GalNAz installs a reporter specifically for glycosites as they are formed. We identified 2219 glycopeptides from 1045 glycoproteins in this manner. Although the majority of these are nuclear and cytoplasmic proteins (Fig. 2), 23% were associated with the secretory pathway. GlcNAz is the primary end point for Ac4GalNAz labeling in cell culture (95% of incorporated label (43)), but identified glycoproteins falling in the secretory category may bear glycans other than O-GlcNAz as Ac4GalNAz has the potential to label multiple types of glycans at O-glycosites (e.g. O-GalNAz) (42). Although it is rare to observe the Tn antigen in healthy human T cells, it is possible for activated human T cells to display the Tn antigen (47) or for an extracellular glycoprotein to become intracellular through the ERAD pathway. An alternative possibility is that some of these sites may represent extracellular O-GlcNAc, which has been described on proteins with EGF-like domains (54,55). In addition to O-glycosites, another metabolic end point for Ac4GalNAz is N-GlcNAz. Using the isotopic pattern as a handle for manual annotation of all glycopeptides in one anti-CD3/CD28 stimulated sample, we found only one N-glycopeptide. These results point to intracellular O-GlcNAc as the major species labeled by Ac4GalNAz in T cells based on most of the identified glycoproteins having nuclear and cytoplasmic localizations, manual validation, and the homology sequence of identified glycosites. A systematic study of fragment ion ratios produced by O-GlcNAz and O-GalNAz glycopeptide standards will allow for a definitive assignment between the two glycan structures.
Glycosites from human T cells show only a weak homology sequence, in agreement with previous reports (50,51). The absence of a strict consensus motif as well as the possible transfer of UDP-GlcNAc to a Ser/Thr near the originally targeted Ser/Thr on a protein have stymied the study of O-GlcNAc function. An intriguing possibility, albeit speculative, is that a subset of O-GlcNAc sites that are more discordant from the homology sequence may be more functionally relevant, especially if the targeted Ser/Thr lacks neighboring Ser/Thr, which would make the modification site inherently more site specific. Our data reveal a subset of highly dissimilar sequences from the homology sequence (supplemental Table S6). Future studies will assess the functional significance of these sites in cell signaling pathways.
FIG. 5. Validation of identified glycosites by Western blot. A, Western blot validation by PEG-5kDa mass tags. Metabolically labeled T cell lysates were tagged with PEG-5kDa mass tags and probed for the protein of interest. B, Label-free quantification data across two biological replicates for specific glycosites during stimulation as compared to the isotype control. Starred sequences were identified as significantly different across at least two biological replicates. Unambiguous glycosites identified on the glycopeptide are lower case and bold.
The potential intersection of phosphorylation and O-GlcNAcylation in cellular signaling pathways, whether in an antagonistic or cooperative manner, has been hypothesized for some time, and is only beginning to be tested on a broad scale (50,56). Studies with primary T cells and the Jurkat T cell line have shown a rapid increase in phosphorylation within the first 5-15 min after T cell receptor (TCR) engagement followed by a decay after 60 min (2). However, this pattern may be more specific to phosphorylation of proximal signaling molecules, such as LCK and ZAP-70. Global levels of Ser/Thr phosphorylation have been observed to gradually increase over 18 h, similar to the kinetics of O-GlcNAc (2,32). The high rate of co-localization of O-GlcNAc and known phosphorylation sites revealed herein supports a potential interaction between the two modifications during cellular signaling. Although 50 glycophosphopeptides were identified in the collected data sets, further studies to elucidate whether this represents above or below average co-occurrence rates will need to be performed (supplemental Table S5). Furthermore, glycosite occupancy levels in activated T cells vary over time as shown in Fig. 4B, suggestive of either differential protein expression or glycosite occupancy during T cell activation. These data suggest a potential site-specific role for O-GlcNAc during T cell activation that will be explored further in future studies comparing glycosylation and phosphorylation from individual biological replicates.
Label-free quantification of glycopeptides revealed over 500 glycopeptides as significantly changed in response to T cell activation across at least two of four biological replicates. Some of these glycopeptides originate from proteins with well-established importance to T cell activation, such as c-JUN. We find that both the expression and glycosylation of c-JUN are remarkably enhanced in activated T cells. One glycosylation site on c-JUN is located at the unambiguous site Ser84, which lies near phosphorylation sites at Ser63/Ser73 and Thr91/Thr93. Phosphorylation of these residues by JNK promotes the transcriptional activation as well as degradation of c-JUN (57). Given that one proposed function of O-GlcNAc is to regulate protein stability (15,58), it is thus tempting to speculate that O-GlcNAc protects c-JUN from degradation by suppressing its interaction with an E3 ubiquitin ligase. Indeed, Qiao and colleagues recently reported that expression of c-JUN correlates with its level of O-GlcNAcylation as well as the expression of OGT in hepatocellular carcinoma cells (59). Mutation of Ser73 reduced O-GlcNAcylation of c-JUN in these studies, suggesting Ser73 may be modified by O-GlcNAc. However, direct evidence of O-GlcNAcylation at this residue by mass spectrometry was not obtained. Therefore, another possibility is that phosphorylation of Ser73 promotes glycosylation of c-JUN at other serine residues. Our data revealed that c-JUN molecules contain up to three O-GlcNAc sites (Fig. 5A), one of which is Ser84, in primary human T cells. In the future, it will be important to map the remaining glycosylation sites on c-JUN for site-directed mutagenesis and subsequent studies on how O-GlcNAc, together with phosphorylation, affects c-JUN function.
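The proximity of Ser84 to the JNK phosphorylation sites on c-JUN can be made concrete with a minimal sketch. The residue positions are those given above; the 10-residue window used to call two sites "near" is an illustrative assumption rather than a threshold taken from this study, and the function names are hypothetical.

```python
from typing import Dict, List

# Residue positions on c-JUN as stated in the text.
GLYCOSITE = 84  # Ser84, the unambiguous O-GlcNAc site
PHOSPHOSITES: Dict[str, int] = {"Ser63": 63, "Ser73": 73, "Thr91": 91, "Thr93": 93}

# Assumed cutoff (in residues) for calling two sites "near"; illustrative only.
WINDOW = 10


def residue_distances(glycosite: int, phosphosites: Dict[str, int]) -> Dict[str, int]:
    """Absolute sequence distance from the glycosite to each phosphosite."""
    return {name: abs(pos - glycosite) for name, pos in phosphosites.items()}


def sites_within(distances: Dict[str, int], window: int) -> List[str]:
    """Names of phosphosites no more than `window` residues from the glycosite."""
    return sorted(name for name, dist in distances.items() if dist <= window)


distances = residue_distances(GLYCOSITE, PHOSPHOSITES)
nearby = sites_within(distances, WINDOW)
print(distances)
print(nearby)
```

Under this assumed window, Thr91 and Thr93 fall within range of Ser84, whereas Ser63 and Ser73 lie 21 and 11 residues away, respectively; a wider window would include Ser73 as well. The same calculation could in principle be applied across a full glycosite list against a table of known phosphorylation sites.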
As expected, our data set also contains O-GlcNAc sites from several nucleoporins. O-GlcNAcylation of nuclear pore proteins has been proposed to regulate their stability, and in addition, their permissiveness to nucleocytoplasmic transport (22,60). Specific O-GlcNAc sites (e.g. on NUP98) are significantly up-regulated during T cell activation, as shown in Fig. 4A. Validation of label-free quantification data by Western blot for ten glycoproteins generally reflected the label-free quantification trends and the number of identified modification sites in each protein (with the exception of c-JUN). These data in aggregate expose the number of O-GlcNAcylated proteins and interacting proteins found within the T cell receptor signaling pathway (supplemental Fig. S8).
In conclusion, we report a comprehensive list of 2,219 metabolically-labeled O-linked glycopeptides and glycosites found within resting and activated primary human T cells. The identified glycoproteins are enriched in proteins related to transcriptional regulation, and some, such as ZAP-70, play integral roles in T cell receptor signaling. Quantitative analysis of resting versus activated T cells reveals significant changes in the abundance of a subset of these glycopeptides, supporting the functional significance of O-GlcNAc to T cell activation. We also show that a large percentage of the glycosites occur near known phosphorylation sites, suggesting the potential for combinatorial regulation of protein activity by these two PTMs. Western blot analysis of ten glycoproteins corroborated the MS data both qualitatively and quantitatively. These data will drive the further elucidation of the functional impact of O-GlcNAc within human biology and enable the study of the interplay between multiple PTMs.

DATA AVAILABILITY
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD004559 (61).

‡‡ To whom correspondence should be addressed: Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138. E-mail: cwoo@chemistry.harvard.edu.
¶¶ These authors contributed equally to this work.