Characterizing Ubiquitination Sites by Peptide-based Immunoaffinity Enrichment

Advances in high resolution tandem mass spectrometry and peptide enrichment technologies have transformed the field of protein biochemistry by enabling analysis of end points that have traditionally been inaccessible to molecular and biochemical techniques. One field benefitting from this research has been the study of ubiquitin, a 76-amino acid protein that functions as a covalent modifier of other proteins. Seminal work performed decades ago revealed that trypsin digestion of a branched protein structure known as A24 yielded an enigmatic diglycine signature bound to a lysine residue in histone 2A. With the onset of mass spectrometry proteomics, identification of K-GG-modified peptides has emerged as an effective way to map the position of ubiquitin modifications on a protein of interest and to quantify the extent of substrate ubiquitination. The initial identification of K-GG peptides by mass spectrometry initiated a flurry of work aimed at enriching these post-translationally modified peptides for identification and quantification en masse. Recently, immunoaffinity reagents have been reported that are capable of capturing K-GG peptides from ubiquitin and its thousands of cellular substrates. Here we focus on the history of K-GG peptides, their identification by mass spectrometry, and the utility of immunoaffinity reagents for studying the mechanisms of cellular regulation by ubiquitin.

Post-translational modification by ubiquitin and ubiquitin-like proteins represents a major regulatory system in eukaryotic organisms (1,2) and certain pathogenic bacteria (3). A conserved enzymatic cascade couples the C terminus of ubiquitin to the epsilon amino group of lysine residues on substrate proteins (4). Evidence has also emerged implicating ubiquitination via cysteine, serine, and threonine, as well as N-terminal residues (5-8). Depending on signaling context and the enzymes involved, protein substrates can be monoubiquitinated, multiubiquitinated, or polyubiquitinated. Ubiquitin-dependent processes are commonly modulated via formation of polyubiquitin chains, whereby the C terminus of a chain-extending ubiquitin becomes linked to the N terminus or one of seven lysine residues (Lys-6, Lys-11, Lys-27, Lys-29, Lys-33, Lys-48, and Lys-63) within a substrate-bound ubiquitin molecule (9,10). The functions of these structurally diverse modifications have been extensively studied through in vitro biochemistry (11-13), ubiquitin replacement in cellular models (14,15), linkage-specific antibodies (16-19), and mass spectrometry (20-26). A growing body of evidence indicates that polyubiquitin chains regulate biological processes not only by eliciting proteasomal degradation but also by altering subcellular localization, modulating enzymatic activity, and facilitating protein-protein interactions of ubiquitinated substrates (9,24). Although in-depth characterization of model substrates has led to significant advances in our understanding of biochemical mechanisms and cellular processes, much remains to be elucidated regarding the dynamics of ubiquitination on individual substrates. The need for improved methodologies has driven innovation in the field of mass spectrometry proteomics, particularly in the characterization of protein ubiquitination sites. Here we focus on recent advances in ubiquitination site mapping and the questions that are driving this area of discovery biology.
Initial Characterization of Ubiquitination Sites-In 1977, while characterizing peptides from the unique chromosomal protein then known as A24, Ira Goldknopf and Harris Busch (27) described an isopeptide bond between lysine 119 of histone 2A and the tryptic C-terminal diglycine remnant of a "non-histone-like" sequence. The well chronicled flurry of work that followed established the non-histone-like molecule to be ubiquitin and defined its central role in cell regulation, leaving A24 as a remnant of history (28-32). However, from this groundbreaking work, two ideas emerged that define our current understanding of and experimental approaches toward protein ubiquitination. The fundamental characteristic of ubiquitin is indeed conjugation to protein substrates through an isopeptide bond. Moreover, the position of these substrate-ubiquitin bonds may be studied by generating and identifying tryptic K-GG signature peptides where the C-terminal diglycine of ubiquitin remains covalently attached to the substrate (Fig. 1A).
The earliest attempts at mapping ubiquitination sites on protein substrates focused on uncoupling ubiquitin-dependent degradation and stabilizing otherwise labile proteasome substrates. Work on the model substrate β-galactosidase (13), cell cycle regulator cyclin B1 (33), and the NF-κB inhibitor IκBα (34) established several themes. The ligase-binding site on a substrate often defines the region in which available lysines may be modified by ubiquitin. Although ubiquitination occurs within defined regions of a protein, individual lysines are typically dispensable. In this model, stabilization of a proteasome substrate through mutagenesis or truncation can be accomplished either by disrupting the ligase-substrate docking site or by eliminating all lysines capable of being targeted by the E3 ligase.

Fig. 1. Identification of K-GG peptides. A, the first K-GG peptide originally identified by Goldknopf and Busch. The diglycine signature of trypsin-digested ubiquitin extends from Lys-119 of histone 2A. B, general scheme for immunoaffinity enrichment of K-GG peptides from either cultured cells or tissue lysates. Protein lysates are prepared under denaturing conditions, digested to peptides, and desalted, and K-GG peptides are captured using antibodies recognizing the isopeptide-linked diglycine signature. C, dot blot analysis showing specificity of anti-K-GG for tryptic peptides with the isopeptide-linked diglycine signature. A mixture of either synthetic K-GG peptides (upper left) or unbranched ubiquitin peptides (lower left) was spotted adjacent to yeast protein lysate or purified monoubiquitin, respectively. A digestion time course was performed in parallel on both yeast lysate and monoubiquitin. Aliquots were collected at the indicated time points, spotted onto PVDF membrane, and immunoblotted using either anti-K-GG or anti-ubiquitin (P4D1 epitope). Immunoreactivity confirms the presence of K-GG peptides in digested lysate. WB, Western blot; Ub, ubiquitin.
In ubiquitin signaling, the concept of functional redundancy of adjacent lysines extends beyond proteasomal degradation to ligand mediated internalization and trafficking, as extensive combinatorial mutagenesis of the epidermal growth factor receptor kinase domain was required to yield loss of function (35,36). More recently, monoubiquitination of SEC31 was shown to drive the assembly of large COPII coats on vesicles through a mechanism independent of any single lysine (37). Proliferating cell nuclear antigen, the clamp protein involved in post-replicative DNA repair, and Met4, a transcription factor involved with nutrient sensing, represent exceptions to functional redundancy, as single sites appear responsible for coordinating specific ubiquitin-dependent functions (38,39).
Although these and many other studies involving mutagenesis have shaped our fundamental understanding of protein ubiquitination, a cautionary note remains: correlating loss of function with elimination of ubiquitin acceptor sites provides only indirect evidence for ubiquitination at the mutated residues. One alternative mechanism envisions lysine-to-arginine substitutions preventing ubiquitination by interfering with binding of the ligase to the substrate. The two essential pieces of information needed to discriminate between these closely related mechanisms are direct evidence of the linkage between ubiquitin and the modified lysine and confirmation that the abundance of the modified peptide responds to biological perturbation. Demonstrating ubiquitination in vivo is complicated further by the multiplicity of ubiquitin ligases that often act toward individual substrates. The specific lysine residues modified on a particular substrate may vary depending on which enzymes are involved and how each interacts with the substrate. Additional complexity is conferred by deubiquitinating enzymes, which have the potential to selectively edit ubiquitin signals. Focused quantitation of individual ubiquitination sites promises to further our understanding of ligase-substrate dynamics.
Nearly three decades passed between identification of the first diglycine-modified peptide by Edman sequencing and the emergence of LC-MS/MS as the primary tool for mapping ubiquitination sites. Initial proof of concept came through identification of the K-GG peptide of Lys-48-linked polyubiquitin (22,40) and Lys-165 of the yeast G-protein coupled receptor Gpa1 (41). These K-GG peptides require cleavage between Arg-74 and Gly-75 of ubiquitin, typically by trypsin or ArgC proteases (Fig. 1A). An amino group extends from the attached diglycine remnant, giving these peptides an additional N terminus that plays a prominent role in current enrichment strategies.
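The origin of the diglycine remnant can be illustrated with a minimal sketch. The ubiquitin sequence below is not given in this review; it is the commonly cited 76-residue human sequence and should be checked against a reference database before reuse. The helper names are hypothetical. The sketch only confirms that the seven lysines sit at the positions named above and that cleavage after Arg-74 leaves Gly-75 and Gly-76 as the C-terminal fragment that stays attached to the substrate lysine.

```python
# Commonly cited human ubiquitin sequence (assumption: not taken from this
# review; verify against a sequence database before relying on it).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQ"
    "RLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def lysine_positions(sequence):
    """Return the 1-based positions of all lysine residues."""
    return [i + 1 for i, residue in enumerate(sequence) if residue == "K"]


def remnant_after_cleavage(sequence, cleave_after):
    """Return the residues C-terminal to a cleavage after a 1-based position."""
    return sequence[cleave_after:]


if __name__ == "__main__":
    print("Lysines:", lysine_positions(UBIQUITIN))
    print("Residue 74:", UBIQUITIN[73])
    print("Remnant after cleavage at 74:", remnant_after_cleavage(UBIQUITIN, 74))
```

Because the remnant carries its own free amino group, a K-GG peptide has two N termini, which is the feature the enrichment antibodies exploit.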
Mass spectrometry has proven effective at mapping protein ubiquitination sites on individual substrates of interest. The two main strategies have been identification of K-GG peptides following in vitro ubiquitination of a purified substrate or immunoprecipitation of ubiquitinated substrates directly from cells. Analyses from cells have the advantage that modified proteins occur in biological context and can be induced by relevant stimuli, as in the cases of epidermal growth factor receptor, inositol 1,4,5-trisphosphate receptor, and dopamine transporter ubiquitination following treatment with epidermal growth factor, gonadotropin-releasing hormone, or protein kinase C activation, respectively (36,42,43). In the case of caspase-8, ubiquitination was shown to occur on Lys-461 in response to activation of death receptors by an extrinsic stimulus (44). Analogous studies have been carried out in yeast to identify ubiquitination sites on proteins including Gpa1, Met4, and Ypt7 (39,41,45), as well as in tissue from Alzheimer's patients to identify ubiquitination sites on the pathological form of Tau (46). Alternatively, physiologically relevant ubiquitination sites can be identified from in vitro ubiquitination reactions. In the context of neurodegeneration, ubiquitination sites have been mapped on α-synuclein and the E3 ligase CHIP (47,48). For CHIP, studies demonstrated that ubiquitination on Lys-2 near the N terminus mediates interaction with the polyglutamine-expanded ataxin-3 and may contribute to spinocerebellar ataxia (48). Mass spectrometry likewise identified lysine residues within the proteasome-associated ubiquitin receptor Rpn10 that, when modified by ubiquitin, affect its ability to interact with ubiquitinated proteasome substrates (49).
Efforts to successfully purify ubiquitinated substrates from lysate for identification by mass spectrometry initially employed epitope-tagged ubiquitin and immobilized metal affinity resins (22,50). Subsequent efforts taking advantage of improvements in epitope tagging approaches (51-53) or tandem ubiquitin-binding domain affinity resins (54,55) in combination with more sensitive mass spectrometers advanced these studies to the point that hundreds of ubiquitination sites could be identified from a single sample. Following isolation of an enriched population of ubiquitinated substrates, SDS-PAGE can be used to separate proteins based on size. For substrates modified by different numbers of ubiquitin molecules, a technical limitation is that the corresponding K-GG peptides are distributed across multiple gel bands (56). For single substrate ubiquitination site mapping, digestion and MS analysis can be performed on large gel sections spanning a range of 50-100 kDa, effectively pooling ubiquitinated substrate (36). Such an approach is less useful in global ubiquitin enrichment studies because of the added sample complexity. Even for single protein analyses, analysis of large gel sections can compromise sensitivity because of gel-based losses and abundant trypsin autolysis products. These limitations can be partially overcome by instead performing in-solution digestion of the complex mixture followed by peptide level fractionation. Digesting proteins together that would otherwise have distributed across a wide mass range effectively pools K-GG peptides, improving sensitivity (56). This must be balanced against losses encountered in fractionation and the instrument time required to analyze fractionated samples. Given the technical challenges and the low stoichiometry of K-GG peptides relative to unmodified peptides from substrates and ubiquitin itself, methods that rely on protein level ubiquitin enrichment remain suboptimal for global ubiquitination site mapping. Maximal sensitivity requires focused and efficient enrichment of modified peptides from complex mixtures of unmodified peptides.
Enrichment of Diglycine-modified Peptides-Several recent papers have described immunoaffinity enrichment of K-GG peptides from complex protein lysates (Fig. 1B) (57-63). The method is based on a similar strategy originally described for enriching low abundance tyrosine phosphorylated peptides (64) and initially shown to be successful using polyclonal antibodies generated against the -GG signature of ubiquitin (65). In contrast to protein level enrichment of ubiquitin, which has yielded up to ~750 unique ubiquitination sites in a single study (53), direct enrichment of K-GG peptides from cellular lysates has yielded >10,000 unique sites on >4,000 proteins (57,58). This focused approach has enabled discovery of ubiquitination sites to a similar depth as global phosphorylation and acetylation studies (66-69) and added support to the idea that many proteins are ubiquitinated at multiple lysine residues within their sequences. Antibodies broadly recognizing K-GG peptides have been generated using two distinct approaches. In one case, a peptide library immunization strategy was employed to generate a rabbit monoclonal antibody against peptides with the sequence CXXXXXXK(GG)XXXXXX (where X indicates any amino acid except Cys and Trp) (57). In the other case, an antigenic protein was created by chemical modification of purified histones using t-BOC-Gly-Gly-N-hydroxysuccinimide, followed by release of the BOC with TFA. This -GG-modified protein was injected into mice and used to generate a hybridoma for monoclonal production (59). Generation of diglycine-modified proteins such as histones, lysozyme, lactoglobulin, or whole cell lysates provided an additional benefit in visualizing antibody-antigen interactions during validation studies. Similarly, dot blots of undigested and digested lysates demonstrate the high level of specificity for these reagents (Fig. 1C).
A comparison of the K-GG peptides captured with these two antibodies suggests a notable difference in their preference toward or tolerance for certain amino acids adjacent to GG-modified lysines. Of the 23,439 ubiquitination sites reported by Kim et al. (57) and Wagner et al. (58), ~4,300 are held in common. K-GG peptides enriched using the peptide library-derived rabbit monoclonal reagent display increased frequency of small (Ala and Gly) and acidic residues (Asp and Glu) in positions adjacent to the diglycine-modified lysine. In contrast, no such preference toward small residues was observed for the mouse monoclonal, whereas Asp and particularly Glu appear to be disfavored (58). Instead, peptides captured by the GG-histone-derived mouse monoclonal antibody more frequently displayed the hydrophobic residues Leu, Ile, Phe, Tyr, and Trp, particularly in the -3 to +3 region surrounding the modified lysine. Given the notable differences between existing studies in terms of lysis, digest conditions, ratios of input material to antibody, resin incubation, and wash times, a direct, quantitative comparison of the two antibodies will be valuable to the community. Detailed analytical comparisons of binding kinetics and effects of the biological matrix on binding between immunoaffinity reagents and modified peptides will be particularly informative. Based on the existing data, an optimized mixture of the two currently available antibodies may be an even more effective tool for profiling ubiquitinated substrates and may further advance efforts aiming to match E3 ligases and deubiquitinating enzymes with the substrates they regulate.
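One way such sequence preferences might be tabulated is sketched below. This is an illustrative outline only, not the analysis used in the cited studies: the function names, the input layout (protein sequence plus 1-based site position), and the toy sequence are all assumptions. It collects the residues from -3 to +3 around each modified lysine and reports their relative frequencies.

```python
from collections import Counter


def flanking_window(protein, site, flank=3):
    """Residues from -flank to +flank around a 1-based lysine position.

    The modified lysine itself is excluded and the window is truncated
    at the protein termini.
    """
    if protein[site - 1] != "K":
        raise ValueError("site does not point at a lysine")
    left = protein[max(0, site - 1 - flank):site - 1]
    right = protein[site:site + flank]
    return left + right


def residue_frequencies(sites, flank=3):
    """Relative frequency of each residue across all flanking windows."""
    counts = Counter()
    for protein, site in sites:
        counts.update(flanking_window(protein, site, flank))
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


if __name__ == "__main__":
    # Toy input (hypothetical sequence, not a real protein).
    example_sites = [("MAGDEKLIFW", 6)]
    print(residue_frequencies(example_sites))
```

Comparing such frequency tables between two antibody data sets, against a background of all lysines in the identified proteins, would be the natural next step; the background correction is omitted here for brevity.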
Previous reports have suggested the potential for cross-talk between lysine acetylation and ubiquitination in the context of proteins such as p53 and the transcriptional coactivator p300 (70,71). Comparison of the newest ubiquitination mapping studies to similar global acetylation mapping efforts reveals that ~30% of sites previously identified as being acetylated were also found to be ubiquitinated (57,58,66,69). Bioinformatic analysis suggests that peptides from this overlapping group displayed less significant increases in ubiquitination upon proteasome inhibition (58). Of these ~1000 sites, a subset represent proteins involved in translation and mitochondrial metabolism, systems where these two modifications are known to exert significant regulation.
Despite the current data, further research remains to be done to ascertain the extent of cross-talk between lysine ubiquitination and acetylation. Defining the absolute and relative abundances of ubiquitination and acetylation at residues known to accommodate both modifications would shed light on a number of questions. A competitive mechanism whereby acetylation prevents ubiquitination through blocking available modification sites is mechanistically appealing, but in equilibrium would be expected to shift the stoichiometry of acetylation to very high levels for stabilized proteins. Although such a mechanism may operate for specific substrates or in certain subcellular locations, it seems unlikely to be broadly applicable. Because ubiquitinated and acetylated lysines often reside on surface-exposed patches, future work should consider whether individual acetylation events induce or inhibit ubiquitination through modulating interactions between ligases and substrates. More orchestrated mechanisms involving sequential addition and removal of modifications can also be envisioned but may only be defined through detailed time course and dose-response experiments with concurrent readouts of both pathways. A first step will be to understand the abundance of each modification in the basal state and the changes that occur following perturbation of the ubiquitination and acetylation systems with common proteasome and histone deacetylase inhibitors.
One limitation of current enrichment strategies is that tryptic digestion of substrate-bound ubiquitin-like modifiers NEDD8 and ISG15 generates K-GG signatures indistinguishable from those of ubiquitin. Although a detailed analysis comparing ubiquitin, NEDD8, and ISG15 levels has not been carried out in mammalian cells, work from yeast suggests that ubiquitin is expressed at >30× higher levels (not accounting for UBI3/UBI4 expression) than the yeast NEDD8 homolog Rub1 (72). Likewise, global proteomic profiling in nine mouse tissues revealed that peptides from ubiquitin were identified with markedly higher frequency than those from either NEDD8 or ISG15 (68). By extension, it has been argued that ubiquitin (as opposed to other ubiquitin-like modifiers) accounts for the vast majority of K-GG peptides identified during immunoaffinity enrichment. Spectral counting data from immunoaffinity-enriched K-GG peptides lend some support to this conclusion, showing that K-GG peptides of ubiquitin far outnumber those from ISG15 or NEDD8. The utility of these data is limited, however, given that substrates modified by a single ISG15 or NEDD8 cannot be accounted for. To more directly assess the fraction of K-GG signature peptides resulting from neddylation, Kim et al. (57) took a two-pronged approach. In one arm of the study, SILAC was used to compare the abundance of individual K-GG peptides between untreated lysates and lysates incubated with the deubiquitinase USP2. USP2 has the ability to strip ubiquitin modifications from substrates, but not NEDD8 modifications, making it possible to assess the fraction of a given K-GG peptide derived from neddylation. In a complementary approach, a SILAC experiment was performed using MLN4924 to inhibit the NEDD8-activating enzyme NAE1. Based on these data, it was estimated that a maximum of 6% of K-GG peptides resulted from neddylation, even under conditions where the proteasome was inhibited and free ubiquitin pools were depleted (57).
Interestingly, because the ubiquitin E1 enzyme UBA1 is capable of charging NEDD8 (73), it is possible NEDD8 may substitute for ubiquitin as a covalent modifier of otherwise ubiquitinated substrates under conditions where the ubiquitin pool is depleted. For this reason, the finding that the K-GG peptide representing Lys-27-linked polyubiquitin was refractory to USP2 suggests two intriguing possibilities. Lys-27 is a fully conserved residue that plays a key role in maintaining structure of the ubiquitin fold. Future studies will hopefully reveal whether this form of polyubiquitin is indeed "invisible" to the deubiquitinating enzyme active site or rather that the Lys-27 polyubiquitin signature originates from a mixed chain of ubiquitin with either NEDD8 or ISG15.
Recent work has shown that a fraction of ubiquitinated peptides decrease in abundance after inhibition of the proteasome. These sites likely represent ubiquitination events that are not associated with proteasomal degradation. A subset of K-GG peptides might be expected to decrease in abundance because of degradation of their target substrates via the lysosomal pathway in response to proteasome inhibition.
Another critical subset of ubiquitinated substrates expected to display this pattern would be proteins that function as storage depots for cellular ubiquitin. Ubiquitinated histones have historically been considered as such and have been shown to release ubiquitin in response to stress (74). Recent proteomic studies in yeast indicate that approximately half of all conjugated ubiquitin is bound to substrates in the form of monoubiquitin or multiple monoubiquitin (75). In conjunction with current data, one hypothesis is that reserves of conjugated ubiquitin may be more widely distributed in cells than originally appreciated. In principle, distribution of the cellular ubiquitin pool would offer the cell flexibility in maintaining equilibrium under a wider array of conditions. Emphasizing the importance of dynamics between the free and conjugated ubiquitin pools, depletion of cellular ubiquitin is a key mediator of toxicity during inhibition of protein synthesis (76).
K-GG peptide enrichment will be critical to future work in defining the substrates of ubiquitin ligases and deubiquitinating enzymes. Two recent studies have taken advantage of immunoaffinity-based K-GG enrichment in this capacity, focusing on the quality control ligase HRD1 (61) and more broadly on cullin-RING ligases (60). In each case, K-GG enrichment studies were used in parallel with other methods to characterize substrates of each. HRD1 is a RING E3 ligase that functions in the endoplasmic reticulum-associated degradation pathway, and mutations in HRD1 have been associated with autoimmune disorders. Lee et al. (61) performed a SILAC comparison of control and HRD1 small interfering RNA-treated cells to compare ubiquitination events. HRD1 substrates were identified as those proteins whose peptides were decreased in abundance following ligase knockdown, either in K-GG-enriched samples or following multistep protein level enrichment of ubiquitinated substrates. In their K-GG immunoaffinity purification analyses, ~1800 ubiquitination sites were identified on ~900 proteins. Although the majority did not change upon HRD1 small interfering RNA treatment, a series of putative substrates including several previously characterized proteins such as MHC, CD44, and integrins showed consistent decreases of ~2-fold across three biological replicates. The two-pronged approach focusing on K-GG peptides and isolated ubiquitin conjugates in parallel is particularly appealing. As revealed in the study of HRD1, immunoaffinity enrichment of K-GG peptides provides a greater depth of sensitivity than protein level enrichment followed by in-gel digest. In contrast, analysis of isolated ubiquitin conjugates ensures that abundant, modified proteins are not overlooked because of ubiquitination in analytically challenging sequences that might not be well suited to either trypsin digestion or detection by the mass spectrometer. In ideal cases, complementary data were acquired through both methods as cross-validation.
In a study by Elledge and co-workers (60), K-GG enrichment was used to complement global protein stability profiling to examine substrates of the cullin-RING ligase family. As described by Kim et al. (57), these authors used the NAE1 inhibitor MLN4924 in a SILAC experiment to impair neddylation of and inactivate cullin E3 ligases. The results revealed that ~10% of the ~10,000 unique K-GG peptides decreased in abundance by at least 2-fold when neddylation was inhibited. Between biological replicates, >75% of these K-GG peptides were shown to decrease more than 2-fold in both. Global protein stability profiling provided an orthogonal validation strategy, reading out at the total protein level rather than at the level of ubiquitination. Particularly for rapidly turned over proteins, global protein stability profiling offers the opportunity to detect changes that would otherwise be overlooked if considering only ubiquitin-modified proteins or peptides.
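The fold-change and replicate criteria described above can be expressed as a short filter. This is a minimal sketch under stated assumptions, not the cited authors' pipeline: ratios are taken to be treated/control SILAC ratios on a linear scale, a peptide counts as decreased when its ratio is at or below 1/fold, and the function names and toy values are hypothetical.

```python
def is_decreased(ratio, fold=2.0):
    """True when a treated/control ratio reflects a decrease of at least `fold`."""
    return ratio <= 1.0 / fold


def replicate_concordance(ratios, fold=2.0):
    """Summarize K-GG peptides that decrease in one or both replicates.

    `ratios` maps a peptide identifier to a (replicate 1, replicate 2)
    pair of treated/control ratios.
    """
    in_any = [p for p, (r1, r2) in ratios.items()
              if is_decreased(r1, fold) or is_decreased(r2, fold)]
    in_both = [p for p in in_any
               if all(is_decreased(r, fold) for r in ratios[p])]
    fraction = len(in_both) / len(in_any) if in_any else 0.0
    return {"decreased_any": in_any, "decreased_both": in_both,
            "fraction_concordant": fraction}


if __name__ == "__main__":
    # Toy ratios (hypothetical values, for illustration only).
    toy = {"pepA": (0.40, 0.45), "pepB": (0.30, 0.90),
           "pepC": (1.00, 1.10), "pepD": (0.50, 0.50)}
    print(replicate_concordance(toy))
```

In practice the threshold would be applied to log-transformed, normalized ratios with some estimate of measurement variance; the linear cutoff here only mirrors the 2-fold criterion quoted in the text.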
Although proteasome inhibition stabilizes labile substrates and can improve detection, proteasome inhibitors themselves elicit biological changes that are worth considering when designing ubiquitination site mapping experiments. The cascade of events downstream of proteasome inhibition and consequent changes in transcription and translation may alter the substrates available to be modified, depending on the extent and duration of inhibition. Depletion of the free ubiquitin pool and the corresponding stress responses could likewise alter substrate prioritization by the ubiquitination and degradation machineries (75,77). Because proteasome inhibition can trigger cell death programs, an additional consideration is the activation of ubiquitin ligases that regulate apoptosis. Although proteasome inhibitors increase the overall abundance of many K-GG peptides, it remains to be seen whether such increases affect sensitivity toward pathway-specific ubiquitination events of interest. The recent paper from Udeshi et al. (62) extends this one step further, assessing the effects of inhibiting deubiquitination on the profile of ubiquitinated substrates. In that study, ~4900 ubiquitin-modified peptides were quantified, including a subset for which rapid deubiquitination appeared to be a key regulatory process. More work remains to be done to see whether these cellular treatments improve the ability to determine enzyme-substrate relationships using K-GG immunoaffinity enrichment.
Given the large number of uncharacterized ubiquitin ligases, it is conceivable that many ubiquitination events only occur following a specific biological stimulus. Such stimuli have the potential to alter the activity, specificity, and localization of enzymes within the ubiquitin system. An example of allosteric modulation occurs in plant growth and development with the SCF(TIR1) ligase. The dual binding of the plant hormone auxin and inositol hexakisphosphate activates SCF(TIR1), stimulating ligase activity by enhancing substrate binding to the TIR1 F-box protein (78). Similar observations have been made with regard to SCF(FBXL5), a ligase in which the F-box component directly senses iron levels through its hemerythrin-like domain as a means of modulating its activity (79,80). In the case of mammalian DNA damage signaling, recruitment of ubiquitin ligases such as RNF8, RNF168, and BRCA1 to sites of DNA damage is initiated through phosphorylation of H2AX (81). The activity of ubiquitin ligases toward their substrates may also be controlled via post-translational modification of the substrate, as in the case of the cellular oxygen sensor HIF1α by the VHL-Elongin BC-CUL2 ligase complex (82) or the prosurvival protein Mcl1 by the SCF(Fbw7) ligase (83,84). Although K-GG immunoaffinity enrichment studies thus far have revealed important insights into the function of ubiquitin ligases in the basal state, connecting biological stimuli to E3 ligase activity and substrate targeting remains an area of significant interest.
Alkylating Reagents, C-terminal Lysines, and Pseudo-K-GG Peptides-Iodoacetamide is commonly used in mass spectrometry proteomics to alkylate cysteine residues and create uniform, nonreactive populations of cysteine-containing peptides for analysis. Previous work has also shown the potential for iodoacetamide to generate undesired modifications on amino acid side chains, as well as the N and C termini of peptides (85). The elemental composition of these modifications is C2H3NO, the same as that of a glycine residue. Because artifactual alkylation can occur twice on free amines such as the N terminus of a peptide or the side chain of lysine, it was recently reported that these modifications could be mistaken for diglycine remnants of ubiquitin (86). Previous work has shown that modification of free amines can be significant when peptides are incubated with iodoacetamide at high temperatures for extended periods of time (85). The use of chloroacetamide in place of iodoacetamide can limit, but does not eliminate, these artifactual 114.0429-Da lysine modifications. In consideration of these findings, detailed studies have been carried out using peptides from ubiquitin (24). This work has shown that for 1-h reactions at 37°C in 50 mM iodoacetamide, ~0.004% of ubiquitin peptides are adducted, although lower concentrations (10-30 mM) or lower temperatures (21°C) decreased adducts to undetectable levels.
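The mass coincidence behind this artifact can be checked with a few lines of arithmetic. The sketch below is illustrative only; the monoisotopic atomic masses are standard values that are not quoted in this review, and the names are hypothetical. It shows that two C2H3NO units sum to the 114.0429-Da shift cited above, whether those units are glycine residues from ubiquitin or acetamide adducts from the alkylating reagent.

```python
# Standard monoisotopic atomic masses in Da (assumption: not from this review).
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}

# Shared elemental composition of a glycine residue and an acetamide adduct.
C2H3NO = {"C": 2, "H": 3, "N": 1, "O": 1}


def monoisotopic_mass(composition):
    """Sum the monoisotopic masses for an elemental composition."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in composition.items())


if __name__ == "__main__":
    single = monoisotopic_mass(C2H3NO)
    double = 2 * single
    print(f"One C2H3NO unit:  {single:.4f} Da")
    print(f"Two C2H3NO units: {double:.4f} Da")
```

Because the two species are isobaric and share an elemental composition, mass accuracy alone cannot separate them; retention time, fragmentation behavior, and the position of the modified lysine are needed instead.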
For existing data sets, a number of features can be used to identify the presence and determine the extent of acetamide adducts. As originally reported, bona fide K-GG peptides and doubly acetamide-adducted "pseudo-K-GG" peptides are typically resolved by standard reversed phase HPLC gradients (86). For samples where enrichment of ubiquitin or the ubiquitin-modified substrate has been performed at the protein level, unmodified peptides from ubiquitin are among the most abundant peptides in the sample. The absence of peak doublets for the predicted K-GG peptides of ubiquitin provides evidence against the presence of pseudo-K-GG peptides in the larger data set. Another analytical diagnostic is the presence of confidently scoring peptide spectral matches displaying pseudo-K-GG signatures on C-terminal lysines (87). Because trypsin cannot cleave at lysines modified by ubiquitin (88-90), a fully cleaved C-terminal K-GG signature may stem from post-digestion alkylation by iodoacetamide. We have found that ubiquitin peak doublets and C-terminal pseudo-K-GG peptides effectively diagnose samples that contain prevalent internal pseudo-K-GG peptides. Interestingly, these more deceptive pseudo-K-GG peptides with the adduct on an internal lysine are themselves distinct from K-GG spectra, because they display a high frequency of -57 and -114 Da neutral losses, as seen in the HCD MS/MS spectrum shown in Fig. 2A. This feature is uncommon among bona fide ubiquitinated peptides. Recent data have further shown that K-GG peptides are enriched >1000-fold relative to adducted peptides during immunoaffinity purification (Fig. 2). While in-depth inspection of chromatographic and spectral data is paramount, careful application of reduction and alkylation procedures and the post-analysis diagnostic checks make the continued use of iodoacetamide a viable option for mapping ubiquitination sites.
Although current database search algorithms efficiently match post-translationally modified peptides, work on phosphorylation from many groups has shown that they fall short when it comes to determining the precise position of the modification within the peptide (91). One benefit of mapping ubiquitination sites in trypsin-digested samples, over modifications such as phosphorylation, is that the frequency of internal lysine residues is relatively low. Nonetheless, the mislocalization of diglycine signatures to C-terminal lysine residues still occurs when using standard search algorithms. Along with the recently published K-GG data sets, the field has seen the extension of tools, originally designed to localize phosphorylation modifications based on site-determining ions, for use with K-GG-modified peptides (57,58). Even in the absence of a corrective tool, we routinely notice that peptide spectral matches proposed to carry C-terminal K-GG signatures commonly have an unmodified lysine residue within several positions of the C terminus, permitting efficient manual relocalization. To minimize false positive identifications where the reported peptide displays a K-GG signature on the C-terminal lysine, these peptide spectral matches have customarily been omitted from final lists of modified peptides. Because the presence of peptides with C-terminal K-GG signatures within a data set can be diagnostic and informative, we would instead argue in favor of marking and retaining these putative false positives within systematically filtered data sets to permit post hoc data assessment. An additional consideration is that in the rare cases when a protein sequence ends in a C-terminal lysine that is modified by ubiquitin, discarding these K-GG peptides would be inappropriate.
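The recommendation to mark, rather than discard, C-terminal K-GG matches can be sketched as a small annotation step. The example below is a hypothetical illustration: the `Psm` record, the flag names, and the three-residue search window are assumptions made for this sketch and are not part of any published localization tool.

```python
from dataclasses import dataclass

@dataclass
class Psm:
    sequence: str                    # peptide sequence, one-letter code
    kgg_sites: tuple                 # 0-based indices of lysines assigned a K-GG remnant
    is_protein_c_term: bool = False  # True if the peptide ends the protein sequence

def annotate_c_terminal_kgg(psm, window=3):
    """Flag a PSM whose K-GG sits on the C-terminal lysine and list nearby
    unmodified lysines that are candidates for manual relocalization."""
    last = len(psm.sequence) - 1
    if not (psm.sequence[last] == "K" and last in psm.kgg_sites):
        return {"flag": "ok", "candidates": []}
    if psm.is_protein_c_term:
        # A protein that genuinely ends in lysine can carry a real C-terminal K-GG.
        return {"flag": "protein_c_term", "candidates": []}
    start = max(0, last - window)
    candidates = [i for i in range(start, last)
                  if psm.sequence[i] == "K" and i not in psm.kgg_sites]
    return {"flag": "suspect_c_term_kgg", "candidates": candidates}

# Hypothetical peptide with K-GG assigned to the final lysine (index 5).
print(annotate_c_terminal_kgg(Psm("LSKADK", (5,))))
```

Keeping the flag alongside the match, instead of filtering the match out, preserves the diagnostic value of these peptides for later assessment of the data set.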
Along with the emergence of large scale K-GG data sets has come the need for robust and efficient methods of validating substrate-specific ubiquitination events. Although the reliance on automated tools for peptide spectral matching and site localization will remain for some time, interaction with the MS and chromatographic results remains essential to identifying systematic issues at the raw data level. Moving into the future, spectral libraries of validated K-GG-modified peptides, along with their relative retention times, would be a valuable resource to biology-focused groups that will study individual substrates in detail. Likewise, isotopically labeled internal standard peptides that can be used to assay critical ubiquitination events would be valuable, as they have been in characterizing polyubiquitin linkages (21,23). Methods that combine site-specific elimination of ubiquitin-modified lysines by mutagenesis with focused, multiplexed mass spectrometry assays that read out the modified and unmodified forms of a protein should be expected to emerge as a means of assessing the temporal dimension of substrate-specific ubiquitination. Experimental systems capable of high throughput expression and enrichment will likewise be valuable, as they begin to provide both Western blot and mass spectrometry validation of fleeting ubiquitination events demonstrated in other systems. Ultimately, however, orthogonal validation methods and functional readouts in physiologically relevant systems remain the gold standard. The multipronged strategy utilized to define HRD1 (61) substrates and the orthogonal functional validation of SCF substrates (60) provide a template that can be replicated moving into the future, whereas the growing availability of genetically engineered mouse models promises to open the door to true in vivo exploration.
Current Challenges in Ubiquitination Site Mapping-While ubiquitination site mapping by LC-MS/MS and database searching has revealed thousands of modification sites, bottom-up approaches have inherent limitations. By its nature, the proteolysis event that generates the branched K-GG signature peptide effectively destroys information about which polyubiquitin linkages were attached at the substrate modification site. Similarly, it is difficult to ascertain whether ubiquitination events at nonadjacent lysines co-exist or represent modifications on separate members of a population. Methods that permit analysis of concurrent modifications on a single protein molecule will open the door to connecting functions with combinatorial modification states (93). Middle-down approaches with partial proteolysis have shown potential with regard to characterizing intact polyubiquitin chains (25), although limited throughput has so far prevented widespread application. Extension of these and top-down methods capable of distinguishing isoforms will be essential to determining the extent to which a "ubiquitin code" directs cellular processes.
A complementary approach involves serial application of linkage specific enrichment at the protein level, followed by K-GG peptide immunoaffinity enrichment at the peptide level. Such studies have thus far been limited by the amounts of protein captured during high stringency linkage-specific immunoprecipitation and by the observation that an unexpectedly large fraction of substrates may be modified by more than a single polyubiquitin linkage (17,21). Nonetheless, further improvements in enrichment methods and mass spectrometry will better enable researchers to probe these complex ubiquitin signals.
Another challenge associated with bottom-up mapping of ubiquitination sites is that many occur within analytically inaccessible sequences. Dual digestion methods where ubiquitinated proteins are serially digested, first with trypsin or ArgC to generate the K-GG signature and subsequently by an alternative enzyme like GluC, chymotrypsin, or pepsin, may be useful for mapping ubiquitination sites in regions of sequence where lysines and arginines are sparse. Because generation of the K-GG signature requires cleavage C-terminal to an arginine residue, modifications residing in highly basic regions are also inherently insensitive to trypsin-based mapping. Even when generated, the chromatographic profiles and irregular MS/MS spectra for highly basic peptides can pose additional challenges. Chemical modification methods that block lysines and convert trypsin into an arginine-specific protease hold potential for revealing modifications in these situations, as do second generation antibodies with the ability to recognize alternative ubiquitin remnants longer than the K-GG signature. Such antibodies, along with alternative digestion methods, would be useful in dispelling ambiguity between ubiquitin, NEDD8, and ISG15, while making it possible to systematically investigate other ubiquitin-like modifiers such as SUMO1/2/3.
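The effect of a basic sequence context on the peptides available for analysis can be illustrated with a simplified in silico digest. This is a sketch under stated assumptions: the cleavage rule is reduced to "cut C-terminal to the listed residues unless the position is blocked," proline effects and missed cleavages are ignored, and the sequence `MAKKRSAKTGRLE` is an invented example rather than a real protein.

```python
def digest(sequence, cleave_after="KR", blocked=frozenset()):
    """Return (start_index, peptide) tuples for a simplified protease that cuts
    C-terminal to residues in `cleave_after`, except at `blocked` positions
    (for example, a lysine carrying a K-GG remnant)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_site = residue in cleave_after and i not in blocked
        if is_site and i < len(sequence) - 1:
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    peptides.append((start, sequence[start:]))
    return peptides

seq = "MAKKRSAKTGRLE"   # hypothetical basic region; K-GG assumed at index 3
kgg = frozenset({3})

# Trypsin-like digestion: the modified lysine ends up in a two-residue peptide.
print(digest(seq, "KR", blocked=kgg))

# Lysines chemically blocked, so only arginine is cleaved (ArgC-like behavior).
print(digest(seq, "R"))
```

In this toy case the trypsin-like digest places the modified lysine in the dipeptide `KR`, which is too short to be informative, whereas arginine-only cleavage yields the longer peptide `MAKKR` that retains sequence context around the site.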
Unexpected post-translational modifications pose a particular challenge in the context of matching K-GG peptides by traditional database search approaches. It remains unclear how frequently post-translational modifications within a protein sequence trigger ubiquitination of nearby lysine residues such that tryptic peptides contain both modifications. The existence of phosphodegrons (94) and the role of the ubiquitin-proteasome system in degrading oxidatively damaged proteins (95) suggest that certain ubiquitination events may only be identified if the correct concurrent modification is considered. Error tolerant or wildcard style searches have thus far provided only a limited solution. Application of automated, iterative search algorithms and advanced tools for discerning true positives from false positive matches in such searches should provide improvements in this area.
An area of acute interest within the ubiquitin field lies in the investigation of non-lysine ubiquitination. Multiple reports have described ubiquitination on residues such as cysteine, serine, and threonine (6,7,96,97), although to date, there remains a dearth of mass spectral data to support the existence of ester- and thioester-linked ubiquitination. In the case of cysteine-linked thioester ubiquitin, the basic biochemistry of the ubiquitin system provides a strong mechanistic precedent. HECT and IBR/RBR family E3s are modified on cysteine residues as part of their biological activities, mandating that E2 enzymes have the capacity to transfer ubiquitin to certain cysteine residues. The inherent instability of cysteine-linked thioesters has been a limitation in structural studies (98) and likely explains the difficulty in detecting them by mass spectrometry.
Consensus is less clear for ester-linked ubiquitination through serine or threonine, because focused efforts toward mapping ester-linked -GG peptides have been undertaken by several groups without definitive resolution (97). It remains possible that ester-linked diglycine is unstable under standard LC-MS conditions or that the exact nature of this bond remains to be elucidated. Arguing against analytical instability, we have observed Ser-GG peptides from a ubiquitin-charged E2 variant (C→S) under standard LC-MS conditions.² Database searches for Ser/Thr-GG peptides in single protein ubiquitination analyses and among immunoaffinity enriched K-GG peptides have not yielded evidence to support the existence of ester-linked ubiquitin remnants. Definitive mass spectrometry evidence showing the biological nature of interaction between ubiquitin and Ser/Thr will be essential to establishing the significance of this modification in cellular regulation.
Quantitative analysis of specific ubiquitination sites has been performed using both metabolic labeling and isotopically labeled internal standards. With growing interest in multiplexing approaches, chemical labeling of K-GG peptides also has potential for dose-response and temporal profiling studies (99). One current challenge is that labeling must be performed following K-GG enrichment. The amount of starting material required to identify ubiquitination sites by K-GG enrichment is beyond the practical scale of iTRAQ or TMT labeling methods. Optimization studies have shown that for mouse brain lysate, the number of unique K-GG peptides identified correlates with input material up to ~40 mg.² Even if peptides were labeled prior to enrichment, modification of the N terminus of the K-GG signature itself by an amine-reactive multiplexing reagent would interfere with immunoaffinity enrichment. Preliminary studies involving post-enrichment labeling have been successful by coupling carefully controlled sample handling with internal standards that report on IP efficiency across samples. It is not yet clear whether recent improvements in mass spectrometry methods for performing multiplexed chemical labeling (100,101) will be sensitive enough for analysis of enriched K-GG peptides. Future advances in K-GG peptide enrichment and multiplexed quantitation hold great promise for characterization of ubiquitination in vivo, where the need for replicate measurements is a paramount consideration.
While the published work using K-GG immunoaffinity enrichment has focused on identifying ubiquitination events globally from whole cell lysates, it seems likely that these methods may have additional utility in the analysis of single proteins purified from eukaryotic cells. Identification of ubiquitination sites on individual substrates remains a challenge, particularly for proteins that are large in size or are substoichiometrically modified on many lysines. For these studies, two main strategies seem to hold promise. Performing substrate-specific enrichment by immunoprecipitation prior to enriching K-GG peptides is a viable strategy, particularly when the substrate is of relatively low abundance compared with other cellular targets of ubiquitin. For cells overexpressing a putative substrate, another approach might involve using targeted MS methods such as the multiple reaction monitoring-initiated detection and sequencing workflow (MIDAS) (92) or inclusion list approaches to focus specifically on ions representative of K-GG peptides from the substrate. In this approach, pre-enrichment of the target would not necessarily be a prerequisite to K-GG immunoaffinity enrichment. Replacing in-gel separation with K-GG immunoaffinity enrichment might be expected to increase sensitivity toward individual K-GG peptides by minimizing the losses stemming from postdigest extraction, avoiding dilution of individual peptides across multiple gel regions, and eliminating competition between modified and unmodified peptides in data-dependent MS analysis.
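An inclusion list of the kind mentioned above amounts to a table of predicted precursor m/z values for candidate K-GG peptides. The sketch below is a minimal illustration using standard monoisotopic residue masses; the function names and the choice of charge states are assumptions, cysteine alkylation and other modifications are not handled, and the ubiquitin Lys-48 tryptic peptide LIFAGKQLEDGR is used only as a familiar example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565      # added once per peptide for the free termini
PROTON = 1.007276      # mass of the charge carrier
DIGLYCINE = 114.042927 # mass shift of one K-GG remnant

def kgg_peptide_mz(sequence, n_kgg=1, charge=2):
    """Predicted m/z of a peptide carrying n_kgg diglycine remnants."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_kgg * DIGLYCINE
    return (neutral + charge * PROTON) / charge

def inclusion_list(peptides, charges=(2, 3)):
    """Build (sequence, charge, m/z) rows from (sequence, n_kgg) pairs."""
    return [(seq, z, round(kgg_peptide_mz(seq, n, z), 4))
            for seq, n in peptides for z in charges]

for row in inclusion_list([("LIFAGKQLEDGR", 1)]):
    print(row)
```

In practice the candidate peptides would come from an in silico digest of the substrate with each lysine considered in turn as the modified, uncleaved site.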
The extent to which K-GG immunoaffinity reagents can be used to identify low level substrate peptides from limited sample amounts remains uncharacterized. The presence of abundant polyubiquitin species represents a challenge, as these species can suppress ionization of lower abundance K-GG peptides. Although preclearing abundant K-GG peptides of polyubiquitin and histones might be possible, it seems more likely that the number of possible species may preclude establishment of a sufficiently comprehensive depletion reagent. Using linkage-specific antibodies for protein level depletion would seem particularly suboptimal, as this would altogether remove the ubiquitinated substrates covalently attached to polyubiquitin chains. Instead, fractionation of K-GG peptides before or after immunoaffinity enrichment seems the likely path forward for characterization of low abundance species (58,62). The existing data suggest that it is now possible to detect ubiquitination events on important regulatory proteins, albeit from large amounts of starting material. Future work will focus on pushing the limits of detection and utilizing these new methods to understand the ever expanding roles of ubiquitin.