A novel proteomics approach to identify SUMOylated proteins and their modification sites in human cells

The small ubiquitin-related modifiers (SUMOs) are a small group of proteins that are reversibly attached to protein substrates to modify their functions. The large-scale identification of SUMOylated proteins and their modification sites in mammalian cells represents a significant challenge due to the relatively small number of in vivo substrates and the dynamic nature of this modification. We report here a novel proteomics approach to selectively enrich and identify SUMO conjugates from human cells. We stably expressed different SUMO paralogs in HEK293 cells, each containing a 6xHis tag and a strategically located tryptic cleavage site at the C-terminus to facilitate the recovery and identification of SUMOylated peptides by affinity enrichment and mass spectrometry. Tryptic peptides with short SUMO remnants offer significant advantages in large-scale SUMOylome experiments, including the generation of paralog-specific fragment ions following CID and ETD activation and the identification of modified peptides using conventional database search engines such as Mascot. We identified 205 unique protein substrates together with 17 precise SUMOylation sites present in 12 SUMO protein conjugates, including three new sites (K380, K400 and K497) on the promyelocytic leukaemia (PML) protein. Label-free quantitative proteomics analyses on untreated and arsenic trioxide-treated cells revealed that all identified SUMOylated sites of PML were differentially SUMOylated upon stimulation.


Introduction
The small ubiquitin-like modifier (SUMO) proteins are structurally similar to ubiquitin although they share less than 20% sequence identity [1]. Like ubiquitylation, protein SUMOylation is regulated by a cascade of reactions involving SUMO-activating enzymes (SAE1/SAE2), conjugating enzymes (Ubc9) and one of several SUMO-E3 ligases (e.g. PIAS1, PIAS3, PIASxα, PIASxβ, PIASy, RanBP2 and Pc2) that covalently attach SUMO to specific protein substrates [2,3]. SUMO proteins are expressed as an immature proform that comprises an invariant Gly-Gly motif followed by a C-terminal stretch of variable length (2-11 amino acids). Removal of this C-terminal extension by sentrin-specific proteases (SENPs) to expose the di-glycine motif is necessary for the conjugation of SUMO to protein targets. These SUMO proteases are able to cleave both a peptide bond during the formation of mature SUMO and an isopeptide bond to deconjugate modified protein substrates [4]. The covalent modification arises from the formation of an isopeptide bond between the ε-amino group of a lysine within the protein substrate and the C-terminal carboxyl group of the SUMO glycine residue. SUMO conjugation frequently occurs at the lysine residue within the consensus motif ψKxE (where ψ is an aliphatic residue and x any amino acid) that is recognized by Ubc9 [5,6].
rare exceptions or reflect the presence of other E2-conjugating enzymes is presently unknown.
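The ψKxE consensus motif described above lends itself to a simple sequence scan. The following is a minimal sketch, not part of the reported workflow: the set of residues treated as aliphatic (`ALIPHATIC`), the function name and the toy sequence are all assumptions made for illustration.

```python
import re

# Assumption: residues counted as "aliphatic" (psi). The text only says
# "an aliphatic residue", so this set is illustrative and can be changed.
ALIPHATIC = "AILMV"

# A lookahead is used so that overlapping motif occurrences are all reported.
MOTIF = re.compile(rf"(?=([{ALIPHATIC}]K[A-Z]E))")


def find_consensus_lysines(sequence):
    """Return 1-based positions of Lys residues lying in a psi-K-x-E motif."""
    seq = sequence.upper()
    # m.start() is the 0-based index of psi; the Lys follows it, so its
    # 1-based position is m.start() + 2.
    return [m.start() + 2 for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    toy_sequence = "MAVKDEGGLKPELK"  # hypothetical sequence, not a real protein
    print(find_consensus_lysines(toy_sequence))
```

A scan of this kind only flags candidate sites; it cannot predict modification at lysines outside the consensus motif.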
In lower eukaryotes, a single SUMO gene is expressed (Smt3 in Saccharomyces cerevisiae), whereas in vertebrates three paralogs designated as SUMO1, SUMO2 and SUMO3 are ubiquitously expressed in all tissues. The human genome also encodes a fourth gene for SUMO4 that seems to be uniquely expressed in the spleen, lymph nodes and kidney [12]. However, its role remains enigmatic, as its in vivo maturation into a conjugation-competent form remains unclear [13]. Interestingly, SUMO2 and SUMO3 share 97% sequence identity, and are expressed at much higher levels than SUMO1, with which they only share about 50% identity [1]. Although SUMO paralogs use the same conjugation machinery and have partially overlapping subsets of target proteins, they respond differently to stress [14] and can be distinguished by their ability to form self-modified polymers in vivo and in vitro [15,16]. SUMO1 lacks a consensus modification site and does not form polySUMO1 chains in vivo, although RanBP2 was reported to be hypermodified by SUMO1 chains in vitro [17]. In contrast, SUMO2 and SUMO3 can form polymeric chains in vivo and in vitro through their consensus motif [15], whereas SUMO1 acts as a chain terminator on poly-SUMO2 or poly-SUMO3 conjugates [16].
Protein SUMOylation is an essential cellular process conserved from yeast to mammals and plays an important role in the regulation of intracellular trafficking, cell cycle, DNA repair and replication, cell signaling and stress responses [2,18,19]. Protein SUMOylation imparts significant structural and conformational changes on the substrate proteins by masking existing surfaces and/or by conferring additional scaffolding surfaces for protein interactions. At present, a few hundred protein substrates are known to be SUMOylated in vivo. These protein targets include regulators of gene expression (e.g. transcription factors, co-activators or repressors) as well as oncogenes and tumor suppressor genes, such as promyelocytic leukaemia (PML), Mdm2, c-Myb, c-Jun, and p53, whose misregulation leads to tumorigenesis and metastasis [20]. There is growing evidence of cross-talk between protein SUMOylation and ubiquitylation processes [21,22]. Earlier reports indicated that SUMOylation can antagonize the ubiquitylation of NFκB [23], whereas recent data also suggest that SUMOylation can be a prerequisite for ubiquitylation and subsequent proteasome-dependent degradation. A case in point is the identification of RNF4, an E3 ubiquitin ligase that specifically recognizes and ubiquitylates polySUMO chains of PML [24,25]. Interestingly, PML SUMOylation can also be enhanced using arsenic trioxide (As2O3), a therapeutic agent used for the treatment of acute promyelocytic leukemia (APL) [25-28].
The relatively low abundance of protein SUMOylation is a significant analytical challenge for the identification and quantitation of this modification in vivo.

Results
To facilitate in vivo identification of SUMOylated proteins, we developed pcDNA-His-SUMO expression vectors comprising strategically located mutations at the C-terminus of each SUMO paralog (Figure 1a). These mutations confer important properties to the stably expressed protein products. First, His6-SUMO mutants with an Arg substitution introduce a convenient tryptic cleavage site on the side chain of modified Lys residues, whereby individual paralogs can be identified by mass-specific signature fragment ions.
Second, the short five amino acid segment appended to the modified Lys residues (e.g.

His6-SUMO mutants are functional and can be used to monitor protein SUMOylation in vitro and ex vivo
To determine that site-directed mutagenesis did not impair the transfer of His6-SUMO mutants by the SAE1/2-Ubc9 conjugation machinery, we conducted in vitro assays using well-established protein SUMOylation substrates. We compared the SUMOylation profiles of His6-SUMO-1/2/3 wild type (WT) and the corresponding mutants using RanGAP1 and the ubiquitin-conjugating enzyme E2 (E2-25K), two proteins that are SUMOylated ex vivo and in vitro [11,40]. An intact E2-25K and a GST-tagged C- proteins in APL cells [25-28]. Three SUMOylation sites (K65, K160 and K490) within PML have been reported previously [43], though only K160 is required for As2O3-triggered degradation [28,44]. Immunoblots showed increased polySUMOylation of PML for the His6-SUMO WT and mutants upon As2O3 treatment (Figure 3a Figure 4).
Altogether, these experiments established that His6-SUMO mutants have functional characteristics similar to those of their WT counterparts.

Subcellular distribution and induction of protein SUMOylation by As2O3
To determine the global distribution of SUMOylated proteins, we performed subcellular fractionation to isolate cytosolic and nuclear extracts from HEK293 cells stably expressing WT or mutant His6-SUMO. Immunoblot analyses of these extracts using anti-His antibody revealed that a higher proportion of SUMOylated proteins was found in nuclear fractions of cells expressing His6-SUMO1 and His6-SUMO3 mutants (Figure 4a, lanes 4 and 6). While polySUMOylation chains were observed for high molecular weight bands of these two paralogs, higher polymerization levels were noted for proteins modified with His6-SUMO3, consistent with previous reports [48]. Interestingly, free His6-SUMO1 and His6-SUMO3 were more abundant in the cytosol compared to nuclear extracts (Figure 4a, lanes 3 and 5), as previously noted by Seeler et al. [49]. It is noteworthy that anti-His immunoblots also revealed the presence of non-specific proteins in nuclear extracts of mock HEK293 cells (Figure 4a, lanes 1 and 2). MS analyses of these NTA-purified nuclear extracts (see below) identified several non-specific proteins including Forkhead box and homeobox proteins, POU domain transcription factors and histidine triad nucleotide-binding proteins, which contain multi-His sequences, and Zn metal-binding proteins known to bind Ni2+ ions (Supplementary Table 1).
Overall changes in protein SUMOylation were also evaluated in NTA-purified nuclear extracts from cells treated or not with As2O3 (Figure 4b).

Large scale identification of protein SUMOylation
To identify SUMOylated proteins present in nuclear extracts, we performed large-scale NTA-affinity purification experiments from HEK293 cells expressing the His6-SUMO3 mutant, exposed or not to As2O3. Similar experiments were also performed on mock HEK293 cells to identify proteins binding non-specifically to the NTA affinity column. We typically obtained 40-60 μg of NTA-purified proteins from 10^8 HEK293 cells in any of the conditions and biological replicates examined. Protein extracts following NTA purification (2 μg/injection) were subjected to MS analyses using a nano 2D-LC system (SCX/C18) coupled to an LTQ-Orbitrap Velos instrument. Tandem mass spectra were acquired using CID and ETD in a decision tree manner to enhance the overall number of identifications [38]. In total, we acquired more than 15,000 MS/MS spectra corresponding to 6282 unique peptides identified using the Mascot database search engine.
To reduce the number of ambiguous identifications, we compared proteins that were identified by at least 2 peptides in each condition with an FDR of less than 2%. By using these conservative selection criteria, we identified a total of 639 unique proteins, of which 232 proteins (36%) were common to all three different cell extracts (Figure 5a). We also developed a script to make use of the specific SUMO fragment ions in order to retrieve all MS/MS spectra of potential SUMOylated peptide candidates (Experimental procedures). Altogether, we identified 17 unique SUMOylation sites on 12 different protein substrates from these large-scale proteomics experiments (Table 1). To profile proteins that showed differential regulation upon cell treatment with As2O3, we fragment ions arising from the cleavage of the SUMO3 mutant side chain (e.g. c2*, c3*, c4*). As indicated in Figure 5b, the abundance of this peptide was increased in samples from cells treated with As2O3. Residue K490 is one of three known PML SUMOylation sites, the other two being K65 and K160 [43]. However, we could not identify tryptic peptides harboring these two modified residues in any of the cell extracts examined, presumably due to the relatively large molecular weight and hydrophobicity of the corresponding peptides that precluded their successful separation by C18 chromatography.
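The retrieval of candidate spectra by their SUMO signature fragment ions can be illustrated with a short sketch. This is not the script referred to in the Experimental procedures, whose details are not given here. The m/z values in `SIGNATURE_MZ`, the mass tolerance, the intensity threshold and the minimum number of matching ions are all hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# HYPOTHETICAL placeholder values: not the real signature ions of any SUMO
# remnant. Replace with the ions appropriate to the construct being studied.
SIGNATURE_MZ = [132.08, 260.14, 374.18]


def count_signature_hits(peaks: Sequence[Peak],
                         signature_mz: Sequence[float],
                         tol_da: float = 0.02,
                         min_rel_intensity: float = 0.01) -> int:
    """Count how many signature ions are matched by at least one peak."""
    if not peaks:
        return 0
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return 0
    # Ignore very weak peaks (below a fraction of the base peak intensity).
    strong = [mz for mz, intensity in peaks
              if intensity / base >= min_rel_intensity]
    return sum(
        any(abs(mz - target) <= tol_da for mz in strong)
        for target in signature_mz
    )


def select_candidate_spectra(spectra: Dict[str, Sequence[Peak]],
                             signature_mz: Sequence[float],
                             min_hits: int = 2,
                             tol_da: float = 0.02) -> List[str]:
    """Return identifiers of spectra matching at least `min_hits` ions."""
    return [
        scan_id for scan_id, peaks in spectra.items()
        if count_signature_hits(peaks, signature_mz, tol_da) >= min_hits
    ]


if __name__ == "__main__":
    toy_spectra = {  # invented peak lists, for demonstration only
        "scan_1": [(132.0801, 50.0), (260.1402, 80.0), (500.3, 100.0)],
        "scan_2": [(132.0801, 50.0), (700.4, 100.0)],
    }
    print(select_candidate_spectra(toy_spectra, SIGNATURE_MZ))
```

Requiring more than one signature ion per spectrum is a design choice that trades sensitivity for fewer chance matches; the appropriate threshold depends on the instrument resolution and activation mode.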
Our MS analyses also revealed the presence of three new PML SUMOylation sites at K380, K400 and K497. Residues K380 and K400 were previously identified as sites of polyubiquitylation in response to As2O3 [25]. Site-directed mutagenesis indicated that mutation of K400 delayed but did not prevent PML ubiquitylation and its subsequent proteasome-mediated degradation [25]. Residues K380 and K400 are located between the B box domains and the nuclear localization sequence (NLS) whereas K497 is next to the NLS of PML (Figure 6a). To confirm the identification of these new SUMOylation sites, we first examined possible site-specificities of different SUMO paralogs with transfected PML and SUMOylation-site mutants thereof. We compared the ex vivo SUMOylation efficiency by each SUMO WT paralog in PMLIII WT and a PMLIII 3K mutant (K65R, K160R and K490R) in extracts from HEK293 cells co-transfected with the different PML and SUMO constructs (Figure 6b). Anti-PML immunoblots showed an increase in PMLIII WT SUMOylation by SUMO2 and SUMO3 when cells were exposed to As2O3, consistent with results presented in Figure 3a. This was clearly evidenced for the same protein extracts purified using NTA columns (Figure 6b, His pull down). It is noteworthy that similar experiments performed using more sensitive ECL immunoblots revealed the SUMOylation of PML by SUMO1, but to a lower level than that observed for SUMO2 and SUMO3 (Supplementary Figure 6). In all cases, PML showed significantly higher SUMOylation levels by SUMO3 compared to other SUMO paralogs.
In contrast, PMLIII 3K displayed one band in immunoblot analysis of extracts from cells co-transfected with PMLIII 3K and His6-SUMO WT treated or not with As2O3 (see input Table 2).
The subcellular localization of SUMOylated proteins using immunoblotting and immunofluorescence experiments revealed that a large proportion of substrates are nuclear, an observation that also accounts for the significant role of this modification in transcription, DNA repair, nuclear bodies and nucleocytoplasmic transport. This distribution is partly attributed to the enrichment of SUMO-modifying enzymes in this compartment, although a sizable number of substrates are also present in the cytoplasm, plasma membrane, mitochondria and endoplasmic reticulum [51]. We performed large-scale proteomics analyses of nuclear protein extracts from mock HEK293 cells and HEK293 cells stably expressing the His6-SUMO3 mutant to identify the nature of SUMOylated substrates, including those that could be regulated by As2O3. By using strict comparison criteria, we found more than 205 proteins unique to the His6-SUMO3 mutant, such as proteins involved in chromatin remodeling, organelle organization, and nuclear transport (Supplementary Figure 5). Interestingly, we found several proteins involved in the regulation of ribosome biogenesis including hnRNP proteins, RNA helicases, and ribosomal subunits, suggesting that SUMO3 modification may regulate the assembly of these macromolecular complexes. Recent reports indicated that several of these substrates were identified in nucleolus extracts and appeared to be suppressor through the regulation of p53 response to oncogenic signals [54].
Quantitative proteomics revealed that PML showed a more than 15-fold increase in abundance upon cell stimulation with As2O3. In response to As2O3, PML is phosphorylated through the mitogen-activated protein kinase (MAPK) pathway, leading to its transfer from the nucleoplasm to the nuclear matrix and to an increase in PML SUMOylation and nuclear body (NB) size [44,55]. Interestingly, a PMLIII 3K mutant in which all three known sites of SUMOylation were mutated to Arg is still transferred to the nuclear matrix but is resistant to As2O3-induced PML degradation [44]. The exact mechanism by which PML is transferred to the nuclear matrix in a SUMO-independent manner upon As2O3 treatment is still unclear, but could involve its prior phosphorylation. It is noteworthy that the SUMOylated forms of PML were barely detectable when total extracts from control and As2O3-treated cells expressing PMLIII 3K and SUMO paralogs were analyzed by immunoblot with anti-PML antibody ([44] and Figure 6b). Furthermore, we observed that NTA enrichment of protein extracts revealed residual SUMOylation of PMLIII 3K by SUMO3 and that SUMOylation of PMLIII 3K by SUMO2 and SUMO3 increased in response to As2O3. Our data demonstrated that As2O3-mediated SUMOylation of PMLIII could still occur at sites other than the three known residues K65, K160 and K490. Detailed proteomics analyses enabled the unambiguous identification of K380, K400, and K497 as additional SUMOylation sites regulated by As2O3 treatment. Interestingly, two of these sites (K380 and K400) were previously shown to be ubiquitylated in vitro [24].
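A fold change such as the one reported for PML is, in label-free experiments, commonly derived from the ratio of summarized peptide intensities between conditions. The sketch below illustrates that arithmetic only. It is not the quantitation procedure used in this study: the use of the median across replicates is an assumption, and the intensity values are invented.

```python
import math
from statistics import median
from typing import Sequence


def fold_change(treated: Sequence[float], control: Sequence[float]) -> float:
    """Ratio of median treated intensity to median control intensity."""
    if not treated or not control:
        raise ValueError("both conditions need at least one measurement")
    control_median = median(control)
    if control_median <= 0:
        raise ValueError("control median must be positive")
    return median(treated) / control_median


if __name__ == "__main__":
    # Invented replicate intensities (arbitrary units), for illustration only.
    control_intensities = [100.0, 110.0, 90.0]
    treated_intensities = [1600.0, 1500.0, 1700.0]
    fc = fold_change(treated_intensities, control_intensities)
    print(f"fold change = {fc:.1f}, log2 = {math.log2(fc):.1f}")
```

Reporting the log2 value alongside the ratio makes increases and decreases symmetric around zero, which is convenient when comparing many proteins.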