Cysteine S-linked N-acetylglucosamine (S-GlcNAcylation), A New Post-translational Modification in Mammals

Intracellular GlcNAcylation of Ser and Thr residues is a well-known and widely investigated post-translational modification. It has been shown to play a significant role in cell signaling and in many regulatory processes within cells. O-GlcNAc transferase is the enzyme responsible for glycosylating cytosolic and nuclear proteins with a single GlcNAc residue on Ser and Thr side-chains. Here we report that the same enzyme may also be responsible for S-GlcNAcylation, i.e. for linking the GlcNAc unit to the peptide by modifying a cysteine side-chain. We also report that O-GlcNAcase, the enzyme responsible for removal of O-GlcNAcylation, does not appear to remove the S-linked sugar. Such Cys modifications have been detected and identified in mouse and rat samples. This work has unambiguously established the occurrence of 14 modification sites assigned to 11 proteins. We have also identified S-GlcNAcylation of human Host Cell Factor 1 isolated from HEK cells. Although these site assignments are primarily based on electron-transfer dissociation mass spectra, we also report that S-linked GlcNAc is more stable under collisional activation than O-linked GlcNAc derivatives.

Protein glycosylation is one of the most frequently occurring post-translational modifications. The generic term "glycosylation" includes a diverse set of modifications featuring different sugar units bound to amino acid side-chains (1). Based on the atom through which the peptide-saccharide linkage is established, C-, N-, O-, and S-glycosylation can be distinguished. C-glycosylation is a mannose modification on the side-chain of Trp (2). N-glycans, which feature a common core structure of GlcNAc2Man3 elongated with a variety of different moieties, are linked to Asn side-chains. This modification occurs within the consensus motif NX(S/T/C), where X cannot be Pro (3,4). O-glycoproteins belong to two distinct categories: GlcNAcylation of Ser and Thr residues occurs within the cytoplasm or nucleus (3,5); proteins destined for secretion or presentation on the cell membrane are modified in the ER and the Golgi, where GalNAc, Xyl, Fuc, Glc, or Man may be directly attached onto hydroxy amino acid side-chains (3). With the introduction of better analytical tools our knowledge about protein glycosylation is constantly expanding. Nonconsensus motif N-glycosylation has been reported (6-9) and O-glycosylation of Tyr residues has recently been discovered (9,10). S-glycosylation is also a new discovery in prokaryotes: the sugar modification of Cys side-chains has been reported in Bacillus subtilis (S-glucosylation) and in Lactobacillus plantarum (S-GlcNAcylation) (11-13). Most of these discoveries are the result of mass spectrometric analysis of glycopeptides found in proteolytic digests of protein mixtures. It has become commonplace to use high mass accuracy collision-induced dissociation (CID) techniques to generate mass spectra of these substances, but the alternative nonergodic technique, electron-transfer dissociation (ETD), has unique advantages for the assignment of sugar/protein linkage sites.
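The N-glycosylation consensus motif quoted above, NX(S/T/C) with X not Pro, can be scanned for in a protein sequence with a short pattern match. The following is a minimal sketch; the function name and the test sequence are invented for illustration and are not taken from this study.

```python
import re

# N-X-(S/T/C) with X != Pro. The zero-width lookahead allows overlapping
# motifs (e.g. "NNTC" contains two) to be reported separately.
SEQUON = re.compile(r"(?=N[^P][STC])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-(S/T/C) motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Toy sequence, invented for illustration only.
example = "MKNGSANPTLNNTCR"
print(find_sequons(example))
```

In the toy sequence, the Asn followed by Pro is skipped, whereas the two adjacent Asn residues near the C terminus are both reported because each starts its own motif. No comparable pattern exists for O- or S-GlcNAcylation, which is why site assignment there depends on fragment ion evidence.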
The CID strategy yields extensive information about the structural nature of glycans in the observation of nonreducing end oxonium (B) ions and reducing end (Y) fragments (14) in the mass spectra, both series formed by glycosidic bond cleavages (15,16). In addition, beam-type CID (in Thermo instruments termed "higher-energy C-trap dissociation," i.e. HCD) usually induces sufficient peptide fragment ions to identify the peptide (17). However, because the glycosidic bond is highly labile upon collisional activation, the information required for glycan site-localization is typically lost. This drawback may not be a problem for studies of N-glycosylation because of its consensus motif, but it is most certainly an obstacle in those cases where consensus motifs do not exist, i.e. in O-GlcNAcylation and extracellular O-glycosylation. The ETD process triggers peptide backbone cleavages without internal energy randomization, thus preserving the side-chains intact, including the fragile oligosaccharide structures attached (9,17-21). Consequently, the information required to assign modification sites is usually present in ETD spectra. There are search engines, for example Protein Prospector and Byonic, that automatically perform the spectral assignments, but an array of potential glycans and the amino acids to be modified have to be specified prior to the database search (9,22,23). Automated interpretation of glycopeptide mass spectra is still in its infancy; therefore, careful inspection of such search results is recommended. Human intervention may help to identify misassignments as well as flag new, unexpected structural clues. In this article we report that S-linked glycosylation occurs in mammals, including humans. The side-chains of Cys residues are modified by a single GlcNAc, and O-GlcNAc transferase (OGT) (24,25), the enzyme required for intracellular O-GlcNAcylation, is responsible for it.

EXPERIMENTAL PROCEDURES
Purification and Proteolytic Digestion of Mouse Synaptic Membranes-Isolation and proteolytic digestion of the mouse synaptosome was performed as described previously (20). Briefly, brains (minus cerebellum) from several adult C57BL/6J mice were harvested and immediately frozen in liquid nitrogen. Tissue was homogenized in sucrose buffer in the presence of 20 µM PUGNAc and phosphatase inhibitors and cleared by centrifugation. The membrane-containing fraction was layered on a sucrose density gradient and fractionated by centrifugation. Synaptic membranes were collected at the 1.0-1.2 M interface. The synaptosome fraction (10 mg) was denatured in 6 M guanidine hydrochloride containing 20 µM PUGNAc and 6× phosphatase inhibitor mixture (I and III, Sigma); samples were reduced with 2 mM Tris(2-carboxyethyl)phosphine hydrochloride at 57°C for 1 h and alkylated with 4.2 mM iodoacetamide in the dark for 45 min at room temperature, then diluted 6-fold and incubated with 5.0% (w/w) TPCK-treated trypsin (ThermoFisher Scientific, Rockford, IL) at 37°C for 18 h.
Lectin Weak Affinity Chromatography-Desalted synaptosome tryptic peptides were resuspended in 600 µl LWAC buffer (100 mM Tris pH 7.5, 150 mM NaCl, 10 mM MgCl2, 10 mM CaCl2, 5% acetonitrile) and 100 µl were run over a 2.0 × 250 mm POROS-WGA column at 100 µl/min under isocratic conditions with LWAC buffer and eluted with a 100 µl injection of 40 mM GlcNAc. Glycopeptides from the rightmost 10% of the flow-through peak through the 40 mM GlcNAc peak were collected inline on a Luna 10u C18 30 × 4.6 mm column (Phenomenex, Torrance, CA). Glycopeptides collected from all 10 mg of total peptides were eluted with 50% acetonitrile, 0.1% formic acid in a single 500 µl fraction. Glycopeptides from the eluate were subsequently enriched twice more for a total of three enrichment steps.
High pH Reverse Phase Chromatography-LWAC enriched glycopeptides were further fractionated by high pH RP chromatography with a 1.0 × 100 mm Gemini 3 µm C18 column (Phenomenex). Peptides were loaded onto the column in 100% buffer A (20 mM HCOONH4, pH 10). Buffer B consisted of buffer A with 50% acetonitrile. A gradient of 1% B to 21% B over 1.1 ml, to 62% B over 5.4 ml, and then directly to 100% B with a flow rate of 80 µl/min was used and 24 fractions were collected.
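Because the high pH gradient is specified in delivered volume rather than time, it can help to convert each segment to minutes at the stated flow rate. The sketch below assumes that the flow rate is 80 microliters per minute and that 1.1 ml and 5.4 ml are the volumes of the individual segments rather than cumulative volumes; both readings are assumptions, and the function name is illustrative.

```python
FLOW_UL_PER_MIN = 80.0  # flow rate from the text, read as microliters per minute

# (start %B, end %B, delivered volume in ml) for the two linear segments.
# Treating each volume as per-segment (not cumulative) is an assumption.
SEGMENTS = [(1.0, 21.0, 1.1), (21.0, 62.0, 5.4)]

def segment_minutes(volume_ml, flow_ul_per_min=FLOW_UL_PER_MIN):
    """Time needed to deliver volume_ml at the given flow rate."""
    return volume_ml * 1000.0 / flow_ul_per_min

for start, end, volume in SEGMENTS:
    minutes = segment_minutes(volume)
    slope = (end - start) / minutes
    print(f"{start:.0f}% -> {end:.0f}% B: {minutes:.2f} min, {slope:.2f} %B/min")
```

Under these assumptions the first segment lasts 13.75 min and the second 67.5 min.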
In Vitro GlcNAcylation and Deglycosylation-Peptides MCAALNSMDQYGGR, QKAPFPATCEAPSR, LDFGQGSGSPVCLAQVK, AVCCDMVYKLPFGR, QKAPFPAACEAPAR, and LDFGQGAGAPVCLAQVK were synthesized by Elim Biopharmaceuticals (Hayward, CA). The O-GlcNAc standard peptide, TAPTS(GlcNAc)TIAPG, was purchased from ThermoFisher Scientific. Recombinant O-GlcNAc transferase expressed and purified from E. coli (30 µM) and synthetic peptide (33 µM) were combined with 30 µM UDP-GlcNAc in 50 mM Tris-HCl, 12.5 mM MgCl2, 1 mM DTT for 18 h at 37°C. Reactions were stopped with the addition of formic acid to 0.3% (v/v) and desalted prior to MS analysis. As a control the above reactions were also carried out without addition of OGT.
Cys-GlcNAc modified peptides produced with OGT under the above reaction conditions were separated from the enzyme using a 10 K Amicon Ultra spin filter. Peptides were then desalted, dried, and resuspended in 20 mM HEPES pH 7.2 with or without 7 nM recombinant O-GlcNAcase from Streptococcus pyogenes (Prozomix, Haltwhistle, UK). The activity of the glycosidase was also tested with the standard O-GlcNAcylated peptide.
Mass Spectrometry Analysis-All samples were analyzed on an LTQ-Orbitrap Velos mass spectrometer (Thermo) equipped with a nano-Acquity UPLC system (Waters, Milford, MA). Peptides were fractionated on a 15 cm × 75 µm ID, 3 µm C18 EASY-Spray column using a linear gradient from 2 to 35% solvent B developed in 60 min. Mass measurements were performed in the Orbitrap. The three most abundant multiply charged ions were computer-selected for HCD as well as ETD analysis. The trigger intensity was set to 2000. Supplemental activation was enabled. The ETD fragments were measured in the linear trap and the HCD fragments were analyzed in the Orbitrap. Each sample was injected twice. In the first analysis, precursor ion selection was restricted to 2+ ions; in the second analysis, doubly charged precursors were excluded. Raw files have been uploaded to MassIVE under the accession number MSV000079722. Peaklists were extracted using Proteome Discoverer 1.4. ETD and HCD data were searched against the UniProt Mus musculus database (73,955 entries, downloaded June 6, 2013, concatenated with a randomized sequence for each entry) using Protein Prospector (v5.10.15). Cleavage specificity was set as tryptic, allowing for 2 missed cleavages. Carbamidomethylation of Cys, acetylation of protein N termini, oxidation of Met, cyclization of N-terminal Gln, HexNAc modification of Ser, Thr, and Asn, and a mass modification of 203-203.1 Da of Cys were set as variable modifications. Three modifications per peptide were permitted. The required mass accuracy was 20 ppm for precursor ions, and 30 ppm or 0.8 Da for HCD and ETD fragments, respectively. Spectra identified as representing peptides featuring a Cys-modification close to the exact mass of HexNAc (203.0794) with a minimum score of 22, a maximum E value of 0.05 and with a SLIP (site localization) score of 5 were tentatively accepted, and carefully scrutinized.
A new mouse synaptosome mixture was analyzed after glycopeptide enrichment in LC/MS EThcD mode on an Orbitrap Fusion Lumos Tribrid Mass Spectrometer in 1 h gradient fractionation/data acquisition. In this experiment both the precursor and fragment ions were measured in the Orbitrap. Thus, the database searches were performed as described above but with a 10 ppm and 20 ppm mass accuracy requirement for the precursors and fragments, respectively. This search was performed with Protein Prospector v5.14.20 and the Cys-GlcNAcylation was accurately defined. All other parameters remained the same. When generating the extracted ion chromatograms (XICs) for the comparison of the relative amounts of S- and O-GlcNAcylated sequences, the monoisotopic masses ±20 ppm were extracted from the MS1 data.
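The acceptance criteria described above can be expressed as a simple filter over search hits. This is only a sketch of the stated thresholds, not Protein Prospector's own logic: the dictionary keys are hypothetical field names, the example values are invented, and reading "a SLIP score of 5" as a minimum of 5 is an assumption.

```python
HEXNAC_MASS = 203.0794       # exact HexNAc mass quoted in the text (Da)
CYS_WINDOW = (203.0, 203.1)  # variable Cys modification window from the text (Da)

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def tentatively_accept(hit, min_score=22.0, max_evalue=0.05, min_slip=5):
    """Apply the thresholds described in the text to one search hit.

    `hit` is a hypothetical dict; its keys are not the search engine's
    actual column names. SLIP >= 5 is an assumed reading of the text.
    """
    low, high = CYS_WINDOW
    return (
        low <= hit["cys_delta_mass"] <= high
        and abs(ppm_error(hit["precursor_mz"], hit["theoretical_mz"])) <= 20.0
        and hit["score"] >= min_score
        and hit["evalue"] <= max_evalue
        and hit["slip"] >= min_slip
    )

# Invented example values, for illustration only.
hit = {
    "cys_delta_mass": 203.0801,
    "precursor_mz": 752.0050,
    "theoretical_mz": 752.0055,
    "score": 30.1,
    "evalue": 0.001,
    "slip": 12,
}
print(tentatively_accept(hit))
```

A filter of this kind only shortlists candidates; as stated in the text, the accepted spectra were still inspected manually.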
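Extracting an ion chromatogram with a ±20 ppm window amounts to summing, in each MS1 scan, the intensities of peaks whose m/z falls inside that window. The sketch below illustrates the idea on an in-memory list of scans; the data layout, function names, and numeric values are assumptions made for illustration, and reading real raw files would require a vendor or open-format reader that is not shown.

```python
def mz_window(mz, ppm=20.0):
    """Lower and upper m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta

def extract_xic(scans, target_mz, ppm=20.0):
    """Return (retention_time, summed_intensity) pairs for target_mz.

    scans: iterable of (retention_time, [(mz, intensity), ...]) tuples,
    a hypothetical in-memory representation of MS1 spectra.
    """
    low, high = mz_window(target_mz, ppm)
    return [
        (rt, sum(intensity for mz, intensity in peaks if low <= mz <= high))
        for rt, peaks in scans
    ]

# Invented MS1 data, for illustration only.
scans = [
    (10.0, [(771.0135, 100.0), (771.1000, 50.0)]),
    (10.1, [(771.0200, 30.0)]),
]
print(extract_xic(scans, 771.0135))
```

Comparing the integrated XIC areas of two peptide variants gives only a rough estimate of their relative amounts, since ionization efficiency may differ between the variants.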

RESULTS
Glycopeptide enrichment studies were conducted with a mouse synaptosome tryptic digest using lectin affinity chromatography with wheat germ agglutinin as described previously (20). The resulting mixtures were further fractionated by high pH reverse phase liquid chromatography and subjected to LC/MS/MS analyses. Precursor ions were fragmented with beam-type collisional activation ("higher energy C-trap dissociation," HCD) as well as electron-transfer dissociation (ETD). Manual inspection of particular spectra revealed that in certain instances the spectral assignments would be more convincing if the HexNAc-modification of Cys-residues was considered. A database search was then performed with a 203-203.1 Da variable modification on Cys side-chains and a significant number of Cys-modified sequences were identified with high confidence (Table I, supplemental Table S1). The ETD spectra in question were carefully evaluated along with their corresponding HCD spectra. In all rigorous cases the interpretation of the ETD spectra unambiguously established the Cys as the site of modification (Fig. 1, supplemental Table S1; supplemental Figs. S1-S10). The single modifying HexNAc was assigned as GlcNAc based on its characteristic fragment ions (Fig. 2A, supplemental Figs. S1-S17, S26) (26). Interestingly, S-linked GlcNAcylation seems to be more stable than O-linked GlcNAcylation as illustrated with the fragmentation of an O-GlcNAcylated peptide of identical amino acid sequence (Fig. 2B). It was also observed that the ETD spectra of these S-GlcNAcylated peptides usually featured significant S-GlcNAc losses from the charge-reduced precursor ions (Fig. 1, supplemental Figs. S1-S17, S26). Additional database searches with rat sciatic nerve (SN) and mouse embryonic stem cell (ESC) glycopeptide datasets, where the sample preparation, glycopeptide enrichment, and mass spectrometry analysis were performed in a similar manner, yielded 7 S-linked GlcNAcylated peptides (Table I, supplemental Table S1, supplemental Figs. S11-S17). One of these peptides, NAC(GlcNAc)IAPAAFSGQPQK from MLX-interacting protein, was also found in the synaptosome data set, whereas a homologous sequence was present among the rat glycopeptides: SAC(GlcNAc)IAPAAFTGQPQK of Protein Mlxip. In addition, identical Protein Fam222b S-linked GlcNAcylated peptides were identified from the mouse ESC and rat SN samples (Table I, supplemental Table S1, supplemental Figs. S12 and S17). The S-linked GlcNAc peptides were reproducibly enriched and detected from a recent additional mouse synaptosome preparation. The sample was analyzed after glycopeptide enrichment but without further fractionation. In this new study three bassoon peptides, LDFGQGSGSPVC(GlcNAc)LAQVK, FPFGSSC(GlcNAc)TGTFHPAPSAPDK, and QKAPFPATC(GlcNAc)EAPSR, were identified (supplemental Figs. S18-S20). Quite a few of the proteins detected with this new modification have been reported previously as being O-GlcNAcylated (18,20,21). Sequences identified as S-GlcNAc-modified were also observed O-GlcNAcylated within the same synaptosome data sets (see supplemental Table S1, supplemental Figs. S21-S24). We attempted to assess the relative amounts of S- and O-linked modifications on the same sequence using the LC/MS data set recorded from the unfractionated synaptosome glycopeptides. Peptide FPFGSSCTGTFHPAPSAPDK was identified in both S- and O-GlcNAcylated forms (residues 1918 and 1919, respectively) under these circumstances, and based on the extracted ion chromatograms and relative peak intensities the two variants were present in approximately the same relative amounts (supplemental Fig. S25); for the other two sequences only the S-GlcNAcylated peptide was detected. To assess whether such modification occurs in humans, we also reanalyzed glycopeptide data sets obtained from human tissue culture samples, and found that recombinant Host Cell Factor 1 isolated from HEK cells (18) was Cys-GlcNAcylated at position 1139 (supplemental Fig. S26). With these assignments in hand we hypothesized that OGT may be flexible concerning the sugar acceptor, and thus, Cys-GlcNAcylation may be part of its repertoire.

FIG. 1 (caption fragment). ... 1931. From these data the identity of the sugar unit cannot be determined, but the modification site assignment is unambiguous. "◆" labels the precursor ion and its charge-reduced form. The insert shows the bond cleavages detected. The modified residue is labeled with an asterisk.

FIG. 2 (caption fragment). ... (26). "◆" labels the precursor ion. Y0 stands for the peptide after the gas-phase GlcNAc elimination (Nomenclature (11)). Peptide fragments that retained the GlcNAc feature a 'G' next to their position number. Combined fragmentation pattern for A and B is shown. B, HCD spectrum of m/z 771.0135 (3+) that was identified from ETD data as Bassoon's 1912 FPFGSSC(Carbamidomethyl)T(HexNAc)GTFHPAPSAPDK 1931. The glycosylation site cannot be determined, but the fragmentation pattern of the m/z 204.087 oxonium ion identifies the modifying sugar as GlcNAc (26). No glycosylated fragments were detected. "◆" labels the precursor ion.
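The precursor m/z quoted in the Fig. 2B caption for the O-GlcNAcylated, carbamidomethylated bassoon peptide (771.0135, 3+) can be checked against a value calculated from standard monoisotopic masses. The sketch below is an illustrative calculation, not the search engine's code; the residue table contains only the amino acids present in this peptide, and the function name is invented.

```python
# Monoisotopic residue masses (Da); only the residues needed here are listed.
RESIDUE = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "F": 147.06841,
    "G": 57.02146, "H": 137.05891, "K": 128.09496, "P": 97.05276,
    "S": 87.03203, "T": 101.04768,
}
WATER = 18.010565
PROTON = 1.007276
CARBAMIDOMETHYL = 57.021464
HEXNAC = 203.079373

def peptide_mz(sequence, charge, mods=()):
    """m/z of a peptide carrying the given modification masses (Da)."""
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER + sum(mods)
    return (neutral + charge * PROTON) / charge

SEQ = "FPFGSSCTGTFHPAPSAPDK"

# O-linked form: alkylated Cys plus HexNAc on Thr.
o_linked = peptide_mz(SEQ, 3, mods=(CARBAMIDOMETHYL, HEXNAC))
# S-linked form: HexNAc on Cys, which therefore cannot be alkylated.
s_linked = peptide_mz(SEQ, 3, mods=(HEXNAC,))

observed = 771.0135  # value quoted in the Fig. 2B caption
print(f"O-linked 3+: {o_linked:.4f}")
print(f"S-linked 3+: {s_linked:.4f}")
print(f"error vs observed: {(observed - o_linked) / o_linked * 1e6:.1f} ppm")
```

The calculated O-linked value agrees with the quoted m/z well within the 20 ppm precursor tolerance. Because the S-linked form lacks the carbamidomethyl group, the two variants of the same sequence differ by 57.02 Da and appear as distinct precursors in the MS1 data.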
In vitro Glycosylation and Deglycosylation-In vitro glycosylation experiments were performed with selected synthetic peptides MCAALNSMDQYGGR, QKAPFPATCEAPSR, LDFGQGSGSPVCLAQVK, AVCCDMVYKLPFGR, and recombinant OGT. Once the reaction mixture was analyzed by LC/MS/HCD and ETD analysis it became obvious that the enzyme indeed modified two of the synthetic peptides, QKAPFPATCEAPSR and LDFGQGSGSPVCLAQVK, and the GlcNAc was linked to the Cys residue. Fig. 3 shows that the ETD spectrum of the in vitro modified QKAPFPATC(GlcNAc)EAPSR is identical to that acquired from the glycopeptide isolated from the mouse synaptosome. We also performed these experiments with versions of the above peptides with Ala for Ser/Thr substitutions and again found that recombinant OGT was able to modify Cys residues (supplemental Figs. S27 and S28). Supplemental Fig. S1B shows the HCD and ETD data for the other in vitro S-GlcNAcylated sequence. In the control mixture where no enzyme was added Cys-GlcNAc-modification was not detected.
The addition and removal of GlcNAc from Ser and Thr residues are each performed by a single enzyme, OGT and O-GlcNAcase (OGA), respectively (27,28). Although our data show that OGT is capable of modifying Cys residues with GlcNAc, we also examined whether or not OGA could remove this S-linked modification. In vitro OGT reactions were carried out as above; peptides were separated from OGT, and then incubated with or without OGA. Unfortunately, in both reactions there is a mixture of modified and unmodified peptides. Ratios of these peptides clearly show that although OGA was capable of deglycosylating the standard O-GlcNAcylated peptide (supplemental Figs. S29A, S29B, S29E), it was unable to remove the sugar from the S-linked GlcNAc peptide (supplemental Figs. S29C, S29D, S29F). This observation was consistent with earlier findings and the proposed mechanism of GlcNAc hydrolysis by the enzyme (29).

DISCUSSION
We discovered GlcNAcylation of Cys side-chains on a series of intracellular proteins. ETD analysis was essential for the identification of the modified residues and sites described here. The ETD spectra of S-linked GlcNAcylation feature the expected fragmentation caused by side-chain loss, based on a similar side-chain loss from alkyl-Cys residues following electron-capture (30). Upon collisional activation S-GlcNAcylated sequences undergo gas-phase deglycosylation similar to O-linked glycopeptides (16,17,31,32). However, when identical O- and S-modified sequence ion series are compared, the latter ones frequently retain the GlcNAc, something rarely observed for their O-linked counterparts (31,33). Some CID data even contain sufficient information for the unambiguous assignment of S-GlcNAcylation.
S-GlcNAcylation has been observed in some proteins that have previously been reported to be O-GlcNAc modified on Ser and Thr residues (supplemental Table S1) (18,20,21). Based on our in vitro experiments the O-GlcNAc transferase that glycosylates the hydroxy amino acids modifies Cys residues as well, and in certain sequence stretches Cys may be preferred. This behavior is not entirely unique. The glycosyltransferase responsible for the S-glycosylation in the prokaryote Bacillus subtilis is also able to perform O-glycosylation (34). Our data also suggest, however, that OGA is not able to remove GlcNAc from Cys residues. Further experimentation will be necessary to assess whether Cys-GlcNAc is a dynamic modification removed by an enzyme other than OGA. Based on our very limited data set we believe that the modification may be conserved among species. Identical sequences (AGISTTSVC(GlcNAc)EGQIANPSPISR representing protein Fam222b) and a homologous peptide pair (NAC(GlcNAc)IAPAAFSGQPQK and SAC(GlcNAc)IAPAAFSGQPQK from Protein Mlxip) were detected in mouse and rat samples, respectively (see Table I).
The interplay of different post-translational modifications on numerous proteins has been reported (35,36), and is documented for phosphorylation and GlcNAcylation in some cases (15,20,26,27,37,38). Both S- and O-GlcNAcylation were detected on the same peptide sequences in some synaptosome glycoproteins but generally were not found concurrently within the same peptide. Based on these results we may speculate that S-GlcNAcylation may prevent or replace O-GlcNAcylation in the proximity of the modified Cys. However, the biological significance of S-GlcNAcylation and its potential interactions with other modifications remain to be investigated.