Lysine Ethylation by Histone Lysine Methyltransferases

Abstract Biomedicinally important histone lysine methyltransferases (KMTs) catalyze the transfer of a methyl group from S‐adenosylmethionine (AdoMet) cosubstrate to lysine residues in histones and other proteins. Herein, experimental and computational investigations on human KMT‐catalyzed ethylation of histone peptides by using S‐adenosylethionine (AdoEth) and Se‐adenosylselenoethionine (AdoSeEth) cosubstrates are reported. MALDI‐TOF MS experiments reveal that, unlike monomethyltransferases SETD7 and SETD8, methyltransferases G9a and G9a‐like protein (GLP) do have the capacity to ethylate lysine residues in histone peptides, and that cosubstrates follow the efficiency trend AdoMet>AdoSeEth>AdoEth. G9a and GLP can also catalyze AdoSeEth‐mediated ethylation of ornithine and produce histone peptides bearing lysine residues with different alkyl groups, such as H3K9meet and H3K9me2et. Molecular dynamics and free energy simulations based on quantum mechanics/molecular mechanics potential supported the experimental findings by providing an insight into the geometry and energetics of the enzymatic methyl/ethyl transfer process.


Introduction
Histone proteins are subject to diverse post-translational modifications (PTMs), including methylation, acetylation, crotonylation, phosphorylation, citrullination, and ubiquitination, which regulate the activity of human genes through epigenetic mechanisms. [1] Methylation of lysine residues in unstructured histone tails is associated with both gene activation and repression, depending on the histone, methylation state, and methylation site. [2] Histone lysine methylation is catalyzed by S-adenosylmethionine (AdoMet)-dependent histone lysine methyltransferases (KMTs) that install one (Kme), two (Kme2), or three (Kme3) methyl groups on the Nε-amino group of lysine (Figure 1 A). [3] Histone lysine methylation is removed by flavin-dependent lysine-specific demethylases and Fe(II)/2-oxoglutarate (2OG)-dependent histone demethylases (KDMs), [4] and recognized by a large number of Nε-methyllysine-binding epigenetic reader proteins, [5] which collectively spread the epigenetic landscape of post-translational modifications.
Histone KMTs contain the conserved SET (Su(var)3-9, enhancer of zeste, trithorax) domain responsible for the enzymatic activity; DOT1L is the only member of the histone KMT family known to date that does not contain the SET domain. [6] Structural analyses revealed that KMTs possess distinct binding pockets for the AdoMet cosubstrate and the histone substrate (Figure 1 B). [6e] In the ternary complex, the nucleophilic Nε-amino group of lysine is well aligned with the electrophilic methyl group of AdoMet for an efficient SN2 reaction that takes place in a narrow hydrophobic channel typically composed of side chains of Tyr and Phe residues (Figure 1 B). The presence of Tyr and Phe in the active sites of KMTs appears to define the methylation state of the product; Tyr to Phe substitutions result in the formation of higher methylation states of lysine. [6e] The target lysine needs to be deprotonated for nucleophilic attack, and an active-site Tyr may be responsible for deprotonation of the protonated lysine, although a water channel has also been suggested to play a role as a general base. [7] Despite recent success in structural, mechanistic, and inhibition studies on KMTs, [3a, 8] the biocatalytic potential of KMTs remains to be established. [9] Enzymatic assays revealed that human KMTs exhibited a high degree of specificity for the methylation of lysine analogues that differed in stereochemistry, side-chain length, and main chain. [10]

Results and Discussion
Analogues of AdoMet with methyl group replacements, including AdoEth and AdoSeEth, can be enzymatically synthesized from l-methionine derivatives and adenosine triphosphate (ATP) using methionine adenosyltransferases (MATs) from different organisms. [12] A pronounced product inhibition of the MAT enzymes, however, often limits the synthesis to small amounts with isolated enzymes. [13] Larger amounts of cosubstrate analogues can be obtained by chemical synthesis (Scheme 1). AdoHcy [14] or Se-adenosylselenohomocysteine (AdoSeHcy) [11a,b,h] are typically reacted with alkylating agents under slightly acidic conditions with a mixture of formic and acetic acid. These conditions guide regioselective alkylation of the sulfur or selenium atom because all other nucleophilic positions are transiently protected by protonation. During the synthesis of the more reactive selenonium analogue AdoSeEth, we noticed many byproducts. Fortunately, the formation of these byproducts could be efficiently suppressed by adding water to the mixture of formic and acetic acid (Scheme 1). Generally, alkylations of AdoHcy under acidic conditions lead to both diastereoisomers (epimers) at sulfur in almost equal amounts. [14a] In the case of AdoEth, the two epimers formed in a 45:55 ratio (S/R). Both epimers were separated by means of reversed-phase HPLC and only the S epimer (corresponding to the biologically active S epimer of AdoMet) was used in this study. However, in the case of AdoSeEth, the separation of both epimers by reversed-phase HPLC was not possible and AdoSeEth was used as an epimeric mixture.
We then performed comparative enzymatic assays for KMT-catalyzed methylation (with AdoMet) and ethylation (with AdoEth and AdoSeEth) of synthetic histone peptides by using MALDI-TOF MS, as recently described; [10a,b] histone H3 1-15 was used for studies with SETD7 (also known as KMT7), G9a (also known as KMT1C and EHMT2), and GLP (also known as KMT1D and EHMT1), and histone H4 13-27 was used for studies with SETD8 (also known as KMT5A). MALDI-TOF MS data confirmed that human KMTs catalyzed nearly quantitative methylation of histone peptides in the presence of AdoMet: H3K4me, H4K20me, H3K9me3, and H3K9me3 were formed in the presence of SETD7, SETD8, G9a, and GLP, respectively (Figure 2, top). In contrast to monomethylation, SETD7 and SETD8 did not catalyze the ethylation of H3K4 and H4K20 with AdoEth or the more reactive AdoSeEth, within detection limits.
Having shown that G9a and GLP had the ability to catalyze monoethylation of H3K9, we next investigated potential enzymatic ethylation of the biologically important methylated histones H3K9me and H3K9me2. G9a and GLP both poorly catalyzed ethylation of H3K9me (traces detected) in the presence of AdoEth, even upon prolonged incubation (Figures 3 A and B and S7). Both enzymes, however, produced detectable amounts of H3K9meet in the presence of the more reactive AdoSeEth (Figures 3 A and B and S8). G9a, in particular, produced significant amounts (55 % after 3 h and 75 % after 5 h) of H3K9meet (Figures 3 A and B and S8). The observation that G9a and GLP have the capacity to catalyze the formation of H3K9meet is interesting because the functionally related histone lysine demethylases PHF8, FBXL11, and JMJD2E were found to catalyze the removal of methyl and ethyl groups in H3K9meet, thus producing unmodified H3K9 (a substrate for G9a and GLP). [15] Moreover, our MALDI-TOF MS assays revealed that G9a and GLP also catalyzed the ethylation of H3K9me2, producing approximately 25 % of bulky H3K9me2et in the presence of AdoSeEth; only traces of H3K9me2et were observed if AdoEth was used as a cosubstrate (Figure 3 C and D). Prolonged incubation (5 h at 37 °C) led to increased amounts (41 %) of H3K9me2et in the presence of G9a and AdoSeEth, whereas prolonged incubation with AdoEth did not enhance the formation of the trialkylated product (Figures S9 and S10). Control experiments in the absence of G9a/GLP or AdoSeEth showed no ethylation of H3K9me and H3K9me2, thus implying that the reactions are catalyzed by KMT and that the ethyl groups in the H3K9meet and H3K9me2et products are derived from the AdoSeEth cosubstrate (Figures S11 and S12).
Although AdoSeEth and AdoEth can both act as ethylating agents in enzymatic assays, they differ significantly in reactivity. In analogy with AdoMet/AdoSeMet, [11b, 16] AdoSeEth appears to be more reactive, that is, a better alkylating agent than AdoEth. One notable difference between the two molecules is the bond length; the C–Se bond is longer (2.0 Å) than the C–S bond (1.8 Å), which makes the selenonium analogues more reactive. Due to the longer C–Se bond and higher reactivity of AdoSeEth, we hypothesized that KMTs might have the ability to catalyze the ethylation of ornithine, which is the lysine analogue shorter by one methylene group. In line with our earlier observation, [10b] we observed that G9a and GLP did not methylate H3Orn9 in the presence of AdoMet, within the limits of detection (Figure 4, top). Similarly, no G9a/GLP-catalyzed ethylation of H3Orn9 was observed if AdoEth was used as a cosubstrate, even upon longer incubation times (Figures 4, middle, and S13). Interestingly, our MALDI-TOF MS data showed that G9a and GLP predominantly catalyzed monoethylation of H3Orn9 in the presence of AdoSeEth; as in the case of H3K9, no diethylation of H3Orn9 was observed (Figure 4, bottom). A longer incubation time (5 h at 37 °C) led to nearly complete (96 %) and significant (78 %) formation of H3Orn9et with G9a and GLP, respectively (Figure S14). Controls without G9a/GLP or AdoSeEth again verified that ethylation reactions were catalyzed by KMT (Figure S15). We also examined potential G9a/GLP-catalyzed ethylation of the H3Dab9 peptide, which possesses the lysine analogue 2,4-diaminobutyric acid (Dab), shorter than lysine by two methylene groups. However, we did not detect any ethylated products in the presence of AdoEth or AdoSeEth, within detection limits (Figures S16 and S17). Finally, we also performed enzymatic assays with the ornithine-containing histone peptides H3Orn4 and H4Orn20 with human SETD7 and SETD8. Unlike monomethylation of H3K4, human SETD7 did not catalyze monoethylation of H3Orn4 in the presence of AdoEth or AdoSeEth (Figure S18). Similarly, despite high degrees of monomethylation of H4K20 by SETD8, the enzyme did not yield any H4Orn20et in the presence of AdoEth or AdoSeEth (Figure S19). This is in line with the absence of ethylation of H3K4 and H4K20 with AdoEth and AdoSeEth, as discussed above (Figure 2 A and B). Collectively, our enzymatic assays revealed that human G9a and GLP possessed the biocatalytic potential for ethylation of the shorter ornithine residue.
To obtain a better understanding of the G9a- and GLP-catalyzed ethylation of H3K9 and H3Orn9, we performed kinetic experiments with different cosubstrate concentrations (Table 1 and Figures S20 and S21). However, G9a- and GLP-catalyzed ethylation reactions with AdoEth were so slow that no multiple turnovers were obtained within the extended reaction time. Comparing the single-turnover rate constant of G9a in the presence of saturating AdoEth concentrations, k = 0.046 min⁻¹, with kcat = 11.6 min⁻¹ obtained with AdoMet under multiple-turnover (steady-state) conditions indicates that alkylation with the natural cosubstrate is at least 200-fold faster. Very similar kinetics results were obtained with GLP. Such a reduction in activity of two to three orders of magnitude with AdoEth, compared with that of AdoMet, was also observed with DNA methyltransferases, [14a] and could be attributed to increased steric strain in the SN2-type transition state for ethyl transfer, relative to that of methyl transfer. With AdoSeEth, the ethylation rate under saturating cosubstrate concentrations was about three to four times faster than with AdoEth for both G9a and GLP; however, the true rate enhancement upon going from AdoEth to AdoSeEth might be even larger because AdoSeEth was employed as an epimeric mixture at the selenonium center (separation of the epimers by means of reversed-phase HPLC was not possible, see above) and the epimer with the non-natural R configuration might act as an inhibitor, as observed for (R)-AdoMet. [17] Furthermore, the ethylation rate of the side-chain-shortened substrate H3Orn9 with AdoSeEth is only reduced by 50 % compared with that of the H3K9 substrate with the natural target amino acid.
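The fold difference quoted above follows directly from the two reported rate constants. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative and do not come from the original work. It compares a single-turnover constant with a steady-state kcat, so the result should be read only as the lower-bound estimate described in the text.

```python
import math


def fold_difference(k_fast: float, k_slow: float) -> float:
    """Return the ratio of two first-order rate constants given in the same units."""
    if k_fast <= 0 or k_slow <= 0:
        raise ValueError("rate constants must be positive")
    return k_fast / k_slow


# Values reported in the text for G9a, both in min^-1.
K_CAT_ADOMET = 11.6  # steady-state k_cat with AdoMet
K_ADOETH = 0.046     # single-turnover rate constant with saturating AdoEth

ratio = fold_difference(K_CAT_ADOMET, K_ADOETH)
orders = math.log10(ratio)  # the same ratio expressed in orders of magnitude
print(f"AdoMet/AdoEth rate ratio: {ratio:.0f}-fold ({orders:.1f} orders of magnitude)")
```

The ratio exceeds 200 and lies between two and three orders of magnitude, consistent with the range stated for AdoEth above.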
We then performed quantum mechanics/molecular mechanics (QM/MM) investigations to rationalize experimental observations on the KMT-catalyzed ethylation of lysine residues. Because the parameters for selenium are still not available in the semiempirical QM DFTB3 method, only simulations for the ethylation reactions involving AdoEth have been performed herein. The average active-site structures of the reactant complexes for the first methylation and first ethylation in SETD8 are compared in Figure 5 (active-site structures near the transition state are shown in Figure S22); the distribution maps of r(C_M–Nε)/r(C_M1–Nε) and θ are also given in each case. As observed, the alignment of the electron lone pair on Nε of the target lysine with the transferable ethyl group (Figure 5 B) is significantly worse than that with the transferable methyl group (Figure 5 A). For instance, the average distance between Nε of lysine and C_M1 is 4.3 Å in Figure 5 B, whereas it is 3.4 Å between Nε and C_M in Figure 5 A. The poor alignment between the ethyl donor and acceptor positions is also reflected in the distribution map. The free energy profiles for the first methylation and first ethylation reactions in SETD8 are given in Figure 5 C. Consistent with the structures of the reactant complexes, the free energy barrier for the ethylation reaction (22.7 kcal mol⁻¹) is considerably higher than that of methylation (19.4 kcal mol⁻¹). The simulation results indicate that, although SETD8 can catalyze the monomethylation of H4K20, it may not be able to do so for the ethylation reaction; this is consistent with the experimental data (Figure 2 B).
The average active-site structures of the reactant complexes for the first and second ethyl transfers in GLP are given in Figure 6 A and B, respectively (active-site structures near the transition state are shown in Figure S23). Figure 6 A and B shows that the alignment of the electron lone pair on Nε of the target lysine with the transferable ethyl group in GLP is significantly better for the first ethyl transfer step (with a shorter average C_M1···Nε distance of 4.08 Å and higher population of near attack conformations) compared with that for the second ethyl transfer step (with an average C_M1···Nε distance of 4.56 Å). This is in contrast with cases for the first and second methyl transfers in GLP in which the target lysine and methyllysine can be well aligned, respectively, with the transferable methyl group (Figure S24). The free energy profiles for the first and second ethylation reactions in Figure 6 C demonstrate that, although GLP may catalyze the first ethylation reaction (with a free energy barrier of 18.3 kcal mol⁻¹, which is higher than that of 17.0 kcal mol⁻¹ for the first methylation reaction [10b]), it is unlikely to be able to catalyze the second ethylation reaction with AdoEth because the free energy barrier becomes significantly higher (23.4 kcal mol⁻¹). These results are in agreement with the observed monoethylated product, H3K9et, and lack of the diethylated product H3K9et2 in MALDI-TOF assays (Figure 2 D).
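To give a feel for what the computed barrier differences mean kinetically, the sketch below converts a difference in free energy barriers into a relative rate by using the transition-state-theory relation k1/k2 = exp(ΔΔG‡/RT). This conversion is an illustrative assumption and not part of the analysis reported here; it assumes equal pre-exponential factors for both reactions and uses the simulation temperature of 298.15 K. The names in the code are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def tst_rate_ratio(barrier_low: float, barrier_high: float,
                   temperature: float = 298.15) -> float:
    """Rate of the low-barrier reaction divided by that of the high-barrier one.

    Assumes simple transition-state theory with identical prefactors,
    so only the barrier difference (kcal mol^-1) matters.
    """
    return math.exp((barrier_high - barrier_low) / (R_KCAL * temperature))


# Free energy barriers quoted in the text (kcal mol^-1):
# (first methylation, first ethylation) for each enzyme.
BARRIERS = {
    "SETD8": (19.4, 22.7),
    "GLP": (17.0, 18.3),
}

ratios = {enzyme: tst_rate_ratio(dg_me, dg_et)
          for enzyme, (dg_me, dg_et) in BARRIERS.items()}
for enzyme, value in ratios.items():
    print(f"{enzyme}: methylation/ethylation rate ratio ~ {value:.0f}")
```

Because the rate depends exponentially on the barrier, a difference of a few kcal mol⁻¹ translates into a large rate ratio; the numbers are only a rough guide and do not replace the measured kinetics.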

Conclusion
We have demonstrated that human KMTs G9a and GLP have the capacity to catalyze monoethylation of H3K9 in the presence of AdoEth and AdoSeEth cosubstrates. Enzymatic assays revealed that AdoSeEth was a superior ethylating agent to AdoEth, but a comparatively poorer cosubstrate than natural AdoMet. The ability of AdoEth to act as an ethylating agent for histone ethylation in cells [18] and in vitro, as investigated herein, might have some biological relevance because its precursor, ethionine, is a toxic compound that can be con- [19] Computational work revealed that the molecular origin for more efficient enzymatic methylation over that of ethylation of lysine residues in histones lay in more optimal alignment of the smaller methyl group of AdoMet, relative to that of the larger ethyl group of AdoEth. Our examinations also revealed that, in the presence of AdoSeEth, G9a and GLP catalyzed the ethylation of histone H3 peptides bearing biologically relevant methylated lysine residues (H3K9me and H3K9me2) and of a histone H3 peptide possessing ornithine (H3Orn9), which is shorter than lysine by one methylene group. Collectively, this work highlights the biocatalytic potential of selected human KMTs and expands the substrate scope for KMT-catalyzed alkylation of histones. It is envisioned that this work, along with recent investigations into KMT-catalyzed alkylation of proteins, including histones, will advance our basic understanding of KMT catalysis.

Synthesis
Ethyl triflate: Ethyl triflate was obtained by following a slightly modified literature procedure. [20] Polyvinylpyridine (1.21 g) was suspended in dichloromethane (27 mL
AdoSeEth: AdoSeHcy (20 mg, 46.4 µmol) was dissolved in a 1:1 mixture of formic and acetic acid (1.95 mL) and water (0.98 mL) was added. The solution was supplemented with ethyl triflate (1.49 g, 8.37 mmol), and the reaction mixture was stirred at room temperature for 1 h. The reaction was supplemented with water (6 mL), and the aqueous phase was extracted three times with diethyl ether (7.5 mL each time). Purification of the product in the aqueous phase was performed by means of preparative reversed-phase HPLC (Prontosil-ODS 5 µm, 120 Å, 250 × 20 mm, Bischoff, Leonberg, Germany). Compounds were eluted with methanol (linear gradients from 0 to 7.8 % in 15 min and to 78 % in 5 min) in aqueous trifluoroacetic acid (0.01 %) at a flow rate of 10 mL min⁻¹. Compounds were detected at λ = 260 and 272 nm. The two epimers (at selenium) both eluted with a retention time of around 9.9 min and could not be separated. Product-containing fractions were collected and the solvents were removed by lyophilization. The remaining solid was dissolved in water (1.5 mL) and the yield
Expression and purification of KMTs: The four wild-type human KMTs (SETD7, SETD8, G9a, and GLP) were expressed and purified as described previously. [10a,b]
water bath at all times. After centrifugation, the supernatant was loaded onto Ni-charged His-tag binding resin equilibrated with column buffer. Resins were washed thoroughly with column buffer, followed by washing buffer, and protein was eluted with elution buffer under a linear gradient concentration of imidazole. The protein was then subjected to size exclusion chromatography (SEC) by using a Superdex 75 column (GE Healthcare). Purified proteins were concentrated by employing Amicon ultra centrifugal filter units (Millipore) with suitable molecular weight cutoffs (10 kDa). Protein concentration was determined by employing a Nanodrop DeNovix DS-11 spectrophotometer and the purity was monitored by means of SDS-PAGE on a 4-15 % gradient polyacrylamide gel (Bio-Rad). Enzymes were aliquoted and stored at −80 °C for future use.
Enzymatic assays by means of MALDI-TOF MS: The reactions were performed in a total volume of 25 µL in an Eppendorf vial by using a thermomixer. A typical enzymatic assay included histone peptides (40 µM), cosubstrate AdoMet (200 µM with SETD8 and SETD7; 500 µM with GLP and G9a), AdoEth (1 mM) or AdoSeEth (1 mM), and KMT enzyme (2 µM) in a reaction buffer of 50 mM Tris·HCl at pH 8.0. The reactions were incubated at 37 °C and aliquots were removed from the reaction vial at different time points (1, 3, and 5 h) to measure the conversion of histone peptide substrates into alkylated products. The reaction was stopped by mixing the reaction mixture (3 µL) with MeOH (3 µL). An aliquot (3 µL) of these samples was directly mixed on the MALDI target plate with α-cyano-4-hydroxycinnamic acid (CHCA) matrix (3 µL, 5 mg mL⁻¹ in 50 % (v/v) acetonitrile/water) and dried in air. Peptide substrate masses were measured in positive-ion reflector mode. Full mass scans were acquired in the m/z range of 500-4000. Each mass spectrum was generated from data derived from 3-5 single laser shots in 200 shot steps from different positions of the sample spot. All spectra were manually acquired by using a Microflex mass spectrometer and FlexControl software, and the data were annotated by employing FlexAnalysis software (Bruker Daltonics, Germany). The following methylation and ethylation species were observed: mono- (+14 Da), di- (+28 Da), and trimethylation (+42 Da), and monoethylation (+28 Da). All methylation and ethylation experiments were performed in replicate.
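The nominal mass shifts listed above can be combined to predict where mixed alkylation products should appear relative to the unmodified peptide. The Python sketch below does this bookkeeping; the helper name and the species table are illustrative and not taken from the original work, and only nominal (integer) masses are used.

```python
METHYL_SHIFT_DA = 14  # nominal mass added per methyl group (net CH2), as listed in the text
ETHYL_SHIFT_DA = 28   # nominal mass added per ethyl group (net C2H4), as listed in the text


def nominal_shift(n_methyl: int = 0, n_ethyl: int = 0) -> int:
    """Nominal mass shift (Da) relative to the unmodified lysine peptide."""
    if n_methyl < 0 or n_ethyl < 0:
        raise ValueError("group counts must be non-negative")
    return n_methyl * METHYL_SHIFT_DA + n_ethyl * ETHYL_SHIFT_DA


# (number of methyl groups, number of ethyl groups) for each alkylation state.
SPECIES = {
    "Kme": (1, 0),
    "Kme2": (2, 0),
    "Kme3": (3, 0),
    "Ket": (0, 1),
    "Kmeet": (1, 1),
    "Kme2et": (2, 1),
}

shifts = {name: nominal_shift(*counts) for name, counts in SPECIES.items()}
for name, shift in shifts.items():
    print(f"{name}: +{shift} Da")
```

Dimethylation and monoethylation give the same nominal shift, so a mass spectrum alone does not distinguish them; the assignment relies on which cosubstrate and which starting peptide were used in the assay.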
Enzyme kinetics analyses: The kinetics parameters (k, K) were determined by incubating G9a or GLP (2 µM for AdoEth; 1 µM for AdoSeEth) and histone peptide (25 µM 15-mer H3K9 or H3Orn9) in 50 mM Tris·HCl, pH 8.0, in the presence of various concentrations of AdoEth (0-250 µM) or AdoSeEth (0-250 µM). The reactions (final volume = 20 µL) were incubated at 37 °C (700 rpm) for 15 min, after which they were quenched by adding an equal volume of MeOH. For analysis by MALDI-TOF MS, the quenched reaction mixture (2 µL) was mixed with a saturated solution of CHCA (6 µL; 1:1 From this, 1 µL was spotted onto the MALDI plate for crystallization. The enzymatic activity was determined by using the peak areas (including all isotopes) for each alkylation state. To obtain the kinetic parameters, data were plotted and fitted to the nonlinear regression enzyme kinetics function kcat by using GraphPad PRISM software (Figure S20). The kinetics profiles for AdoMet (Figure S21) were obtained by incubation of G9a or GLP (50 nM), histone peptide (10 µM 15-mer H3K9), and AdoMet (0-15 µM) in 50 mM Tris·HCl, pH 8.0, buffer for 3 min at 37 °C (700 rpm). MALDI-TOF MS measurements and data analysis were performed in the same way as that for AdoEth and AdoSeEth described above.
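The conversion of peak areas into fractional product formation can be sketched as follows. This is a minimal illustration in Python, not the analysis pipeline used in this work: the peak areas are invented example numbers, the function names are hypothetical, and the hyperbolic saturation form is an assumed stand-in for the fitting function applied in GraphPad PRISM.

```python
def conversion_fractions(peak_areas: dict) -> dict:
    """Fraction of each alkylation state from summed isotope peak areas."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {state: area / total for state, area in peak_areas.items()}


def hyperbolic_rate(s: float, k_max: float, k_half: float) -> float:
    """Assumed saturation model: rate = k_max * s / (k_half + s)."""
    return k_max * s / (k_half + s)


# Hypothetical peak areas (arbitrary units) for substrate and monoethylated product.
example_areas = {"H3K9": 1200.0, "H3K9et": 3600.0}
fractions = conversion_fractions(example_areas)
print(f"H3K9et fraction: {fractions['H3K9et']:.2f}")

# At a cosubstrate concentration equal to k_half the assumed model gives half of k_max.
print(hyperbolic_rate(10.0, 2.0, 10.0))
```

Dividing the converted amount by the incubation time and enzyme concentration would then give an observed rate at each cosubstrate concentration, which is the quantity plotted against concentration for fitting.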
QM/MM studies: QM/MM free energy (potential of mean force) and MD simulations were performed to study the active-site dynamics of SETD8 and GLP and to calculate the free energy profiles of ethyl transfers from AdoEth to lysine and its ethylated form by using the CHARMM program. [21] The –CH2–CH2–S+(Et)–CH2– part of AdoEth and the lysine/ethyllysine side chain were treated by QM and the rest of the system by MM. The link-atom approach [22] was applied to separate the QM and MM regions. A modified TIP3P water model [23] was employed for the solvent, and the stochastic boundary MD method [24] was used for the QM/MM simulations. The system was separated into a reaction zone and a reservoir region, and the reaction zone was further divided into a reaction region and a buffer region. The reaction region was a sphere with radius, r, of 20 Å, and the buffer region extended over 20 Å ≤ r ≤ 22 Å. The reference center for partitioning the system was chosen to be the Nε atom of the target lysine. The resulting systems contained around 5800 atoms for GLP (or 5400 atoms for SETD8), including about 700-900 water molecules. The DFTB3 method [24,25] implemented in CHARMM was used for the QM atoms. The semiempirical approach adopted herein was used previously on a number of systems, and the results seemed to be reasonable. [26] The all-hydrogen CHARMM potential function (PARAM27) [27] was used for the MM atoms.
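The partitioning into reaction, buffer, and reservoir regions described above can be illustrated with a simple distance test around the reference center. The Python sketch below is only a conceptual illustration with hypothetical names; it is not the CHARMM implementation, and assigning atoms lying exactly at 20 Å to the reaction region is an arbitrary choice made here.

```python
import math

REACTION_RADIUS = 20.0  # Å, radius of the reaction region (from the text)
BUFFER_OUTER = 22.0     # Å, outer limit of the buffer region (from the text)


def classify_atom(position, center) -> str:
    """Assign an atom to a region by its distance from the reference center.

    The reference center corresponds to the N-epsilon atom of the target lysine.
    """
    r = math.dist(position, center)
    if r <= REACTION_RADIUS:
        return "reaction"
    if r <= BUFFER_OUTER:
        return "buffer"
    return "reservoir"


# Hypothetical coordinates (Å) with the reference center placed at the origin.
center = (0.0, 0.0, 0.0)
for atom in [(3.0, 4.0, 0.0), (21.0, 0.0, 0.0), (30.0, 0.0, 0.0)]:
    print(atom, classify_atom(atom, center))
```

In a stochastic boundary setup the three classes are treated differently during dynamics, which is why the partition is defined before the simulation starts.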
The initial coordinates for the reactant complexes of methylation/ethylation were based on the crystallographic complexes (PDB IDs: 2BQZ and 3HNA for SETD8 and GLP, respectively); a methyl/ethyl group was manually added to SAH to change it to AdoMet/AdoEth and the methyl group on methyllysine was manually deleted to generate the target lysine. The initial structures for the entire stochastic boundary systems were optimized by using the steepest descent (SD) and adopted-basis Newton-Raphson (ABNR) methods. The systems were gradually heated from 50.0 to 298.15 K in 50 ps. A 1 fs time step was used for integration of the equation of motion, and the coordinates were saved every 50 fs for analyses. The 1.5 ns QM/MM MD simulations were performed for each of the reactant complexes, and similar approaches have been used previously. [7a, 28] The umbrella sampling method [29] implemented in the CHARMM program, along with the weighted histogram analysis method (WHAM), [30] was applied to determine changes of the free energy (potential of mean force, PMF) as a function of the reaction coordinate for methyl/ethyl transfer from AdoMet/AdoEth to the target lysine/ethyllysine in SETD8 and in GLP, respectively. The reaction coordinate was defined as a linear combination of r(C_M–Sδ) and r(C_M–Nε) for methylation (R = r(C_M–Sδ) − r(C_M–Nε)) or r(C_M1–Nε) and r(C_M1–Sδ) for ethylation (R = r(C_M1–Sδ) − r(C_M1–Nε); see Figure 5 for the atom designation). Thirty windows were used and 50 ps production runs were performed for each window after 50 ps equilibration. The force constants of the harmonic biasing potentials used in the PMF simulations were 50-400 kcal mol⁻¹ Å⁻².
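The reaction coordinate defined above is simply the difference between the breaking C–S distance and the forming C–N distance. The Python sketch below evaluates it from Cartesian coordinates and lays out evenly spaced umbrella window centers; the coordinates, the window range, and the function names are hypothetical, since the range spanned by the thirty windows is not stated here.

```python
import math


def reaction_coordinate(c_transfer, s_donor, n_acceptor) -> float:
    """R = r(C-S) - r(C-N): negative in the reactant, positive in the product."""
    return math.dist(c_transfer, s_donor) - math.dist(c_transfer, n_acceptor)


def window_centers(r_min: float, r_max: float, n_windows: int = 30) -> list:
    """Evenly spaced umbrella window centers between r_min and r_max (Å)."""
    if n_windows < 2:
        raise ValueError("need at least two windows")
    step = (r_max - r_min) / (n_windows - 1)
    return [r_min + i * step for i in range(n_windows)]


# Hypothetical collinear reactant-like geometry (Å): C-S = 1.8, C...N = 3.4.
carbon = (0.0, 0.0, 0.0)
sulfur = (1.8, 0.0, 0.0)
nitrogen = (-3.4, 0.0, 0.0)
r_value = reaction_coordinate(carbon, sulfur, nitrogen)
print(f"R = {r_value:.2f} Å")

# Assumed range for illustration only; the actual range is not given in the text.
centers = window_centers(-2.0, 2.0)
print(len(centers), centers[0], centers[-1])
```

Each window center would be paired with a harmonic biasing potential, and the biased distributions from all windows combined with WHAM to recover the PMF along R.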