Methylation-dependent SUMOylation of the architectural transcription factor HMGA2

High mobility group A2 (HMGA2) is a chromatin-associated protein involved in the regulation of stem cell function, embryogenesis and cancer development. Although the protein does not contain a consensus SUMOylation site, it has been shown to be SUMOylated. In this study, we demonstrate that the first lysine residue in the reported K66KAE SUMOylation motif in HMGA2 can be methylated in vitro and in vivo by the Set7/9 methyltransferase. By editing the lysine, the increased hydrophobicity of the resulting 6-N-methyl-lysine transforms the sequence into a consensus SUMO motif. This posttranslational editing dramatically increases the subsequent SUMOylation of this site. Furthermore, similar putative methylation-dependent SUMO motifs are found in a number of other chromatin factors, and we confirm methylation-dependent SUMOylation of a site in one such protein, the Polyhomeotic complex 1 homolog (PHC1). Together, these results suggest that crosstalk between methylation and SUMOylation is a general mode for regulation of chromatin function. © 2021 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Protein structure and function are regulated by post-translational modifications (PTMs), and over the past decades, lysine has emerged as a crucial amino acid residue that is subject to SUMOylation, acetylation, and methylation. Lysine methylation is known to regulate gene transcription by targeting histones, but also a number of transcription factors. SET (Su(var)3-9, Enhancer of Zeste and Trithorax) domain-containing proteins are able to transfer one, two, or three methyl groups to lysine residues. SET7/9, which was initially found to monomethylate histone H3 lysine 4 [1], was subsequently shown to also target a variety of non-histone proteins, especially transcription-related factors such as TAF10 [2], p53 [3], pRB [4], ERα [5], SOX2 [6], FOXO3 [7] and several others. Lysine methylation of histones provides a docking site for association of certain "reader" domains, like chromo, tudor, and PHD domains [8], in line with the histone code hypothesis [9]. Methylation of non-histone chromatin factors is important for regulating their activity and interactions [10] and could provide a platform for therapeutic targeting [11].
The high mobility group A (HMGA) family of proteins contains three members: HMGA1a and HMGA1b, both encoded by the HMGA1 gene, and HMGA2. The HMGA family is characterised by three conserved DNA-binding AT-hooks and an acidic tail. They preferentially bind to the minor groove of DNA [12] and are often referred to as "architectural transcription factors" [13] because they regulate the expression of a large number of target genes by altering chromatin structure [14]. HMGA2 is frequently disrupted or aberrantly expressed both in benign tumours [15,16] and in cancers [17], and is one of the most frequently rearranged genes in human neoplasias.
HMGA2 can be modified by SUMO (small ubiquitin-related modifier) [18]. SUMO is covalently coupled to target proteins in a three-step process including an E1 activating enzyme (AOS1-UBA2), an E2 conjugating enzyme (Ubc9), and an E3 ligase (PIAS, Pc2) [19]. SUMOylation is reported to modulate protein function through nucleocytoplasmic translocation, and protein-protein and protein-DNA interaction [20-25]. The majority of the presently identified SUMO targets are transcription factors or co-regulators. Thus, SUMOylation has turned out to be a key player in shaping the epigenetic landscape, and a major regulator of pluripotency, stemness and cell identity [26]. In this study, we show that HMGA2 is lysine-methylated in vitro and in vivo, and that this PTM is required for its subsequent SUMOylation. This is the first report showing a direct interdependency between methylation and SUMOylation, mediated through editing of the actual SUMO modification site. Importantly, such sites are found in many chromatin-associated proteins, indicating a widely used coupling of methylation and SUMOylation.

Materials and methods
In vitro methyltransferase assay. Recombinant, active SET7/9 protein (Upstate) was mixed with equal amounts of purified GST or GST-HMGA2 protein on glutathione-Sepharose beads using either S-adenosyl-L-[methyl-14C]methionine (Amersham) or S-adenosylmethionine (NEB) as the methyl donor. After incubation at 4 °C for 1 h, the proteins were separated by 4-12% SDS-PAGE and Coomassie blue stained.
In vitro SUMOylation assay. This assay was performed with the SUMOlink™ SUMO-1 kit (40120, Active Motif) as described by the manufacturer.

Results
HMGA2 is lysine methylated in vitro. As an increasing number of chromatin factors have been shown to be methylated, we used the MeMo prediction server [28] and identified a putative site for lysine methylation in HMGA2 at position 66 (K66), in a region conserved from zebrafish to humans (Fig. 1A). This was confirmed by in vitro methylation of a recombinant GST-HMGA2 fusion protein by the lysine methyltransferase SET7/9, known to prefer non-histone substrates such as TAF10 and p53 [1,3]. Using S-adenosyl-L-[methyl-14C]methionine as methyl donor, a clear autoradiographic signal was observed with GST-HMGA2, but not with the GST control (Fig. 1B). Substitution of K66 with alanine (K66A) completely abolished the methylation signal, confirming that lysine 66 in HMGA2 is the methylation site (Fig. 1C). The methylation was not affected by truncation of the C-terminal portion of the protein (GST-HMGA2-Tr), a mutation that is observed in cancer [17,29]. The positive control p53 appears to be a better substrate for SET7/9 (Fig. 1C), even though it is a truncated version.
HMGA2 is lysine methylated in mammalian cells. We investigated the methylation status of HMGA2 in vivo by studying a mesenchymal stromal cell line (hMSC-Tert20) that ectopically overexpresses HMGA2 fused to EGFP [30] (Fig. 1D, lane 2). Immunoprecipitation of HMGA2 followed by Western blotting using a pan-methyl-lysine antibody showed that the protein is methylated in this cell line. This result was further confirmed in a liposarcoma (SW872) and an osteosarcoma (OSA) cell line (Fig. 1E). HMGA2 is amplified and rearranged in OSA [31], resulting in a truncated protein that still retains K66. In both the liposarcoma and osteosarcoma cell lines the proteins were strongly methylated, confirming that this modification occurs in vivo. The methylation level was similar in both cell lines, and was therefore not affected by the truncation of the C-terminal part [29], consistent with the in vitro results (GST-HMGA2-Tr; Fig. 1C).
HMGA2 directly interacts with SET7/9. Co-immunoprecipitation experiments in the hMSC-Tert20 cell line overexpressing HMGA2 showed that HMGA2 and SET7/9 physically interact (Fig. 1F). To determine whether this interaction is direct, wild-type HMGA2 fused to GST or GST alone was challenged with recombinant SET7/9 in a pull-down assay. This showed that SET7/9 interacted directly with HMGA2 in vitro (Fig. 1G), and not with GST alone.
Methylation of K66 is required for SUMOylation in vitro. HMGA2 is reported to be SUMOylated and lysines 66 and 67 are required for SUMOylation [18]. This sequence (KKAE) is not a canonical SUMOylation motif (ΨKXE/D) [32]. However, we reasoned that methylated K66 might mimic a bulky, hydrophobic group (Ψ) and thus create a functional SUMOylation motif. Indeed, premethylated recombinant HMGA2 was SUMOylated in vitro (Fig. 2A). GST-HMGA2, which when unmodified migrates as a 40 kDa protein (Fig. 1B), now migrated as a 50 kDa protein (Fig. 2A). When K66 was replaced with a hydrophobic valine residue, creating a consensus SUMO motif, the K66V mutant was SUMOylated directly and did not require any prior methylation. This contrasted with wild-type HMGA2, which could not be SUMOylated directly (Fig. 2B; lower panel). However, when HMGA2 was methylated (Fig. 2B; middle panel), this allowed SUMOylation (Fig. 2B; upper panel). Mutating the SUMO acceptor lysine to an alanine (K67A) did not affect methylation of HMGA2. However, it completely abrogated HMGA2 SUMOylation, as expected. The same was seen with the non-methylatable K66A (Fig. 2B).
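The sequence logic of this paragraph can be sketched as a simple pattern check. This is an illustration only and not part of the study's methods: the function name `matches_sumo_consensus` is hypothetical, and the residue set for the hydrophobic first position is taken from the hydrophobic side chains reported at that position (V, I, L, M, A, F, P).

```python
# Hydrophobic residues reported at the first position of SUMO consensus motifs.
HYDROPHOBIC = set("VILMAFP")


def matches_sumo_consensus(tetrapeptide: str) -> bool:
    """Return True if a 4-residue window fits hydrophobic-K-x-E/D."""
    if len(tetrapeptide) != 4:
        raise ValueError("expected exactly four residues")
    first, acceptor, _, acidic = tetrapeptide.upper()
    return first in HYDROPHOBIC and acceptor == "K" and acidic in "ED"


# Windows covering residues 66-69 for the variants discussed in the text.
for label, motif in [("wild type", "KKAE"), ("K66V", "VKAE"), ("K67A", "KAAE")]:
    print(label, motif, matches_sumo_consensus(motif))
```

Such a check is purely sequence-based. It captures why KKAE falls outside the consensus while the K66V window fits it, but it cannot represent a methylated lysine and says nothing about whether a site is actually modified.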
Methylation of HMGA2 may facilitate interaction with Ubc9. The interaction surface between the SUMO E2 conjugase Ubc9 and its targets has been investigated, using both biochemical [33,34] and structural approaches [35,36]. Nevertheless, the Ψ position in the SUMOylation motif has not been given much attention. The Ubc9 surface is rather flat, and the hydrophobic residue at the Ψ position does not dock into a conserved pocket [35,37]. Instead, it is thought that sequence conservation at this position arises from exclusion of hydrophilic residues. Indeed, a wide variety of hydrophobic side chains (V, I, L, M, A, F, and P) have been reported in this position. For the canonical SUMOylation target RanGAP1, the side chain atoms of Leu525 in the Ψ position have been shown to form van der Waals (VDW) contacts with atoms from the Ubc9 residues Pro128, Ala129, Gln130, and Ala131 [35]. Using the published structure of complexed RanGAP1 and Ubc9 [35], we modelled the putative interaction surface between Ubc9 and a theoretical structure of the HMGA2 SUMOylation motif (Fig. 2C). Using PROPKA 2.0 [38,39], the predicted pKa value for K66 was 8.5, suggesting that this charged, hydrophilic residue resides in a hydrophobic milieu. A nonproductive interaction therefore seems likely (Fig. 2C, left panel). Although the pKa value does not change when Lys66 is methylated, the methyl group seems to engage in favourable VDW contacts with Tyr87 and Pro88 in Ubc9 (Fig. 2C, right panel). This might explain why the methylated SUMO motif supports a productive interaction with Ubc9.
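To make the pKa argument concrete, the Henderson-Hasselbalch relation gives the fraction of a basic side chain that is protonated at a given pH. The sketch below is illustrative only: the pH of 7.4 is an assumed physiological value that does not appear in the study, and `protonated_fraction` is a hypothetical helper name.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a basic group in its protonated (charged) form.

    Henderson-Hasselbalch for a base: [BH+]/[B] = 10 ** (pKa - pH).
    """
    ratio = 10.0 ** (pka - ph)
    return ratio / (1.0 + ratio)


PREDICTED_PKA_K66 = 8.5  # PROPKA value quoted in the text
ASSUMED_PH = 7.4         # assumption: typical physiological pH, not from the study

print(f"{protonated_fraction(PREDICTED_PKA_K66, ASSUMED_PH):.2f}")
```

Under this assumed pH, roughly nine in ten K66 side chains would carry a charge, which is in line with the description of K66 as a charged, hydrophilic residue.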
Fig. 1 (partial legend). In vitro methylation of methylation-deficient (K66A), wild-type, and cancer-associated, truncated HMGA2 (HMGA2-Tr) fused to GST. GST-p53 was used as a positive control. D) Immunoprecipitation of exogenous, stably integrated EGFP-HMGA2 in hMSC-Tert20 cells, and E) endogenous HMGA2 in SW872 and OSA cell lines, followed by Western blot analysis with indicated antibodies. F) Co-immunoprecipitations of HMGA2 and SET7/9 from hMSC-Tert20 cell lysates followed by Western blot analysis with the indicated antibodies. G) GST pull-down of recombinant SET7/9 using GST-HMGA2, followed by SDS-PAGE and visualized by Coomassie Blue staining of the gel.

The KKxE motif is found in other chromatin factors. Based on the relationship between methylation and SUMOylation of HMGA2, we searched the proteome for proteins containing the KKxE motif and also having predicted methylation sites at the first lysine using the MeMo server [28], followed by prediction of non-canonical SUMOylation motifs at the same site, in the same proteins, using SUMOsp 2.0 [40]. This generated a list of proteins containing putative methylation-dependent SUMOylation motifs (Table 1). Almost all of these proteins are involved in chromatin remodelling. We chose the Polycomb group protein PHC1 (Polyhomeotic complex 1) for further study. A recombinant N-terminal fragment (aa 1-443; ΔPHC1 in Fig. 3A) was expressed to avoid interference by other methylation and SUMOylation motifs present in the C-terminal part (Fig. 3A). In PHC1, too, a prior methylation step strongly activated SUMOylation of the protein (Fig. 3B). Although a fraction of ΔPHC1 is SUMOylated in the unmethylated control, methylation is needed for complete SUMOylation as seen by exhaustion of free SUMO (Fig. 3B). Taken together, our results show that both HMGA2 and PHC1 are primed for SUMOylation when the adjacent, pseudo-Ψ lysine is methylated, generating a functional SUMOylation motif.
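The first step of such a search, locating KKxE windows in a set of sequences, can be sketched with a regular expression. This is a minimal illustration, not the pipeline used in the study: the predictions from the MeMo and SUMOsp 2.0 servers are not reproduced, the function names are hypothetical, and the two sequences are invented toy strings rather than real protein entries.

```python
import re
from typing import Dict, List, Tuple

# Lookahead so that overlapping windows (e.g. in KKKEE) are all reported.
KKXE = re.compile(r"(?=(KK[A-Z]E))")


def find_kkxe(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the first lysine, motif) for each KKxE hit."""
    return [(m.start() + 1, m.group(1)) for m in KKXE.finditer(sequence.upper())]


def scan(proteins: Dict[str, str]) -> Dict[str, List[Tuple[int, str]]]:
    """Keep only the entries that contain at least one KKxE window."""
    hits = {name: find_kkxe(seq) for name, seq in proteins.items()}
    return {name: found for name, found in hits.items() if found}


# Hypothetical toy sequences, not real protein entries.
toy = {
    "toy_with_motif": "MSAGKKAEPTL",
    "toy_without_motif": "MSAGKAAEPTL",
}
print(scan(toy))
```

Each reported position is a candidate first lysine that would then need to be checked by a methylation predictor and a SUMOylation predictor, as described in the text.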

Discussion
The HMGA2 protein is an architectural transcription factor closely associated with chromatin [41,42]. Since an increasing number of chromatin factors have been shown to be methylated, we investigated whether HMGA2 could be methylated, and indeed confirmed a methylation site (K66) both in vitro and in vivo. This modification has probably escaped detection in previous screens using mass spectrometry of trypsinized peptide fragments [12] because the resulting fragment containing K66 is too small (4 amino acids). We could show that SET7/9 was able to methylate HMGA2 in vitro and also verified an interaction between this enzyme and HMGA2 in vivo, indicating that it is at least partly responsible for methylating HMGA2. The lysine K66 is located in a region conserved from zebrafish to humans, in a linker region between the DNA-binding domains of HMGA2 (Fig. 1A). This specific conserved lysine residue is not present in HMGA1 proteins, indicating that the function is specific to HMGA2. Consequently, methylation could affect several of the numerous functions of HMGA2.
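The point about small tryptic fragments can be illustrated with an in silico digest. The sketch below assumes the commonly used simplified trypsin rule (cleavage C-terminal to K or R unless the next residue is P); the function name and the peptide are hypothetical, and the toy sequence is not the HMGA2 sequence.

```python
from typing import List


def trypsin_digest(sequence: str) -> List[str]:
    """Cleave after K or R, except when the following residue is P."""
    seq = sequence.upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal remainder without K/R
    return peptides


# Hypothetical toy peptide containing a KKAE window; not the HMGA2 sequence.
toy_peptide = "MSARGEPKKAEATGEK"
for peptide in trypsin_digest(toy_peptide):
    print(peptide, len(peptide))
```

In this toy case the two adjacent lysines are cut apart, so the first lysine of the KK pair ends up at the C-terminus of a four-residue peptide and the second is released as a single residue. Fragments this short are easily missed in peptide-based mass spectrometry.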
Lysines 66 and 67 have been reported to be SUMOylated [18], although this site does not match the consensus for SUMO sites. Here we present a possible mechanism where only K67 is SUMOylated, while the previously reported inhibition of SUMOylation by mutation of K66 is most likely due to loss of the required methyl-lysine. A number of SUMOylated proteins lacking the classical consensus motifs have been reported [40,43], and experimental data have shown that about 33% (313/954) of real SUMOylation sites do not follow the classical consensus sequence [44]. The mechanism described here might explain some of these discrepancies. By methylation of the conserved lysine 66 (K66KAE), a motif resembling the consensus ΨKXE/D site is formed, with the methyl-lysine substituting for the hydrophobic Ψ residue. Indeed, we could demonstrate a requirement for methylation of K66 for subsequent SUMOylation of K67 in HMGA2. SUMOylation of HMGA2 has been shown previously to regulate its ability to modulate PML [18]. Most likely both SUMOylation and methylation of HMGA2 have further implications that need to be determined. One obvious possibility is that these modifications are important for regulating the protein-protein interaction repertoire of HMGA2. Pathway integration is another, given that SET7/9 methylates both p53 [3] and pRB [4], suggesting that SET7/9 integrates different tumor suppressor pathways.
It is well established that some protein marks are mutually exclusive, whereas others act in concert. This kind of modification crosstalk has primarily been studied with histones [9]. In recent years, it has also become evident that the "histone code" has a parallel in a "transcription factor code" [10,45,46], which has been studied extensively for e.g. p53 [47] and SRC-3 [48]. This is to our knowledge the first report that shows a direct dependency of SUMOylation on methylation, a modification crosstalk which could be relevant for several chromatin-associated proteins, in particular proteins involved in chromatin remodelling (Table 1). As the mechanism was confirmed for the Polyhomeotic homolog 1 (PHC1), this shows that HMGA2 is not an isolated case.
Recent mass spectrometry approaches aimed at identifying novel endogenous SUMO sites have detected certain KKxE sites [49-52]; however, most of the putative sites in Table 1 remain unconfirmed. This is most likely due to technical limitations stemming from small fragments, unusual branched-chain peptides in tryptic digests, as well as protective effects of the SUMO and/or methyl group [53-55]. Interestingly, this raises the question of how long these modifications co-exist on a target protein, and at what stoichiometry. It is tempting to speculate that the priming methyl group becomes trapped under the bulky SUMO moiety [56], where it is protected against demethylases [57] and in turn protects the lysine-lysine peptide bond against trypsin cleavage [55]. Deciphering this part of the transcription factor code is necessary to design a proteomics protocol that will return all methylation-dependent SUMOylation sites [53].

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
This work was supported by grants from the Research Council of Norway to the FUGE NucPro Consortium, from Helse Sør-Øst, and from the Norwegian Cancer Society. The GST C-terminal p53 construct plasmid was a generous gift from D. Reinberg.

Table 1 (footnote). All but two (marked by asterisks) of the proteins are involved in chromatin remodelling. The candidate lysine residues subject to methylation are shown in italics, and the putative SUMOylation motifs are highlighted. Proteins in bold were investigated in this study.

Abbreviations
HMGA: high mobility group A
PHC1: Polyhomeotic homolog 1
PTM: post-translational modification
SET: Su(var)3-9, Enhancer of Zeste and Trithorax
SUMO: small ubiquitin-related modifier
GST: glutathione S-transferase
EGFP: enhanced green fluorescent protein