Rapid Cross-Metathesis for Reversible Protein Modifications via Chemical Access to Se-Allyl-selenocysteine in Proteins

Cross-metathesis (CM) has recently emerged as a viable strategy for protein modification. Here, efficient protein CM has been demonstrated through biomimetic chemical access to Se-allyl-selenocysteine (Seac), a metathesis-reactive amino acid substrate, via dehydroalanine. On-protein reaction kinetics reveal a rapid reaction with rate constants of Seac-mediated-CM comparable or superior to off-protein rates of many current bioconjugations. This use of Se-relayed Seac CM on proteins has now enabled reactions with substrates (allyl GlcNAc, N-allyl acetamide) that were previously not possible for the corresponding sulfur analogue. This CM strategy was applied to histone proteins to install a mimic of acetylated lysine (KAc, an epigenetic marker). The resulting synthetic H3 was successfully recognized by antibody that binds natural H3-K9Ac. Moreover, Cope-type selenoxide elimination allowed this putative marker (and function) to be chemically expunged, regenerating an H3 that can be rewritten to complete a chemically enabled “write (CM)–erase (ox)–rewrite (CM)” cycle.

Developing strategies for site-specific protein modifications is an ongoing challenge in chemical biology. 1−3 For over a decade, allylic alcohols have been identified as reactive in olefin metathesis. 4,5 However, only in recent years has a generalized allylic heteroatom effect begun to emerge in cross-metathesis (CM). 5 A mechanism involving pre-coordination to the metal center of catalyst 1 has been proposed to explain the observed reactivity of privileged substrates, such as amino acid S-allylcysteine (Sac, Scheme 1). 6 This "heteroatom-relay" has enabled CM to be applied as a bioconjugation technique in selective protein modifications, 7−11 and so greatly increased the benchmark for complexity in the substrate scope. Such putative beneficial effects of allylic chalcogens suggest a likely increase in the reactivity of higher chalcogens as Lewis bases that may be more preferred, softer ligands for ruthenium complexes. Indeed, simple allyl selenides are reactive substrates in aqueous CM. 11 We show here that installation into peptides and proteins of the simplest amino acid residue containing an allyl selenide moiety, the unnatural amino acid Se-allyl-selenocysteine (Seac), enables rapid and efficient CM.
Possible approaches to incorporating Seac into proteins were considered that relied on installation of the Se heteroatom either co-translationally (using Seac directly or via a precursor such as selenocysteine (Sec)) or post-translationally. Direct co-translational incorporation of Seac has not yet been demonstrated, likely due to rapid oxidative cellular metabolism, also observed for selenomethionine and S-methyl-selenocysteine. 12−14 Despite the precedent for incorporation of Sec into recombinant proteins in Escherichia coli, this approach requires more complex genetic manipulation of both DNA and RNA 15,16 that limits widespread utility. Moreover, our attempts to convert Sec to Seac via a desulfurative 2,3-sigmatropic rearrangement failed in model systems (see Supporting Information (SI) p S9). Notably, a recent study has revealed a biosynthetic mechanism for selenocysteine generation via conjugate C−Se bond formation from a dehydroalanyl-tRNA Sec intermediate. 17 Inspired by nature, we therefore considered an alternative biomimetic route to incorporate Seac into proteins via dehydroalanine (Dha), which has recently emerged as a versatile chemical handle for protein modifications. 18−22 First, the generation of an appropriate conjugate, allyl selenolate nucleophile, was explored on model Dha 3 (Scheme 2a). In situ generation of allyl selenolate via cleavage of methyl benzoselenoate 2a failed (Table S1). Next, we considered reductive access. Diallyl diselenide (2b) is unstable and exists in mixture with diallyl selenide. 23 Moreover, allyl selenolate has previously only been reductively generated under aprotic conditions. 23 Nevertheless, treatment of 2b with a stoichiometric amount of NaBH4 in degassed MeOH generated allyl selenolate; subsequent addition of this solution to Dha 3 in water yielded 53% of the desired Seac product 4 (Table S1). Allyl selenocyanate (2c), a potentially more stable, 24 alternative allyl selenolate source, was investigated next.
Pleasingly, due to cleaner pre-reduction, the desired addition product 4 was obtained with an improved yield of 86% after 1 h at room temperature (Scheme 2a). Notably, although allyl selenolate is sensitive to acidic conditions (allylselenol quickly decomposes to 2b in the absence of radical inhibitor 23 ), under buffered basic conditions the lifetime of allyl selenolate is apparently long enough to successfully react with Dha in aqueous media.
With promising results observed on amino acid models, we next explored installation of Seac via this biomimetic strategy in two differing protein structural motifs/folds: three-layer α/β-Rossmann-fold protein subtilisin from Bacillus lentus (SBL) and all-β-helix protein 275−276 from Nostoc punctiforme (termed Npβ), 25 respectively. Accordingly, Dha substrate proteins SBL-156Dha (5) and Npβ-61Dha (7) were synthesized from corresponding cysteine mutants following reported bis-alkylative elimination procedures. 21 Use of selenolate derived from 2a or 2b failed on proteins and led to only a mixture of products. However, allyl selenolate solution generated from 2c allowed successful biomimetic conjugate addition (Scheme 2b); LC-MS analysis revealed >95% conversion of SBL-156Dha (5) to the desired product SBL-156Seac (6a) after 1 h at room temperature. Even addition to the more sterically demanding 26 61 site in Npβ-61Dha (7) single Dha mutant under slightly elevated temperature (37°C) afforded expected Seac-containing protein with >95% conversion after 2 h (8a, see SI). Importantly, resulting SBL-derivative proteins retained fold and functional (peptidase) activity (see SI p S23). Generation of Dha and addition likely leads to the formation of a 1:1 dr of adducts. 26
With model Seac-tagged protein 6a in hand, CM with allyl alcohol was tested using catalyst 1 under previously optimized reaction conditions 11 (which can be used at pH 4−8). This reaction was strikingly rapid and was complete after only 15 min at room temperature (Scheme 3). In comparison, the sulfur-equivalent protein substrate SBL-156Sac (9a) required 4 h to reach completion under the same reaction conditions. Rapid protein modification reactions are rare but powerful. 27,28 We conducted a quantitative comparison of the enhanced rate observed here for CM of SBL-156Seac with popular bioconjugation techniques such as Staudinger ligation, 29,30 azide−alkyne cycloaddition, 31,32 and inverse electron-demand Diels−Alder cycloaddition, 33−36 which have typically been studied in small-molecule models ("off-protein"). Under pseudo-first-order conditions with respect to the protein, starting material and product were monitored by LC-MS at several time points during reactions (Figure 1a). 37 Data analyzed using nonlinear, single-exponential regression gave pseudo-first-order rate constants (Figure 1b).
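The single-exponential analysis described above can be illustrated with a minimal sketch. It assumes the fraction of starting protein remaining follows exp(−k_obs·t) and fits k_obs by a one-parameter least-squares search. The function name `fit_k_obs`, the search bounds, the time points, and the rate value are all hypothetical illustrations; none of them are data or software from this work.

```python
import math

def fit_k_obs(times_s, frac_remaining, k_lo=1e-6, k_hi=0.1, iters=100):
    """Nonlinear least-squares fit of f(t) = exp(-k_obs * t).

    Uses a golden-section search over k_obs (in s^-1). The bounds k_lo and
    k_hi are assumed search limits, not values taken from the paper.
    """
    def sse(k):
        # Sum of squared residuals between data and the single-exponential model.
        return sum((f - math.exp(-k * t)) ** 2
                   for t, f in zip(times_s, frac_remaining))

    phi = (math.sqrt(5) - 1) / 2  # golden-ratio step fraction
    a, b = k_lo, k_hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(c) <= sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2

# Hypothetical, noise-free time course generated from an assumed k_obs.
assumed_k = 0.005                              # s^-1, illustrative only
times = [0, 30, 60, 120, 300, 600, 900]        # s, illustrative sampling times
fractions = [math.exp(-assumed_k * t) for t in times]

k_fit = fit_k_obs(times, fractions)
half_life_s = math.log(2) / k_fit              # t1/2 under pseudo-first-order conditions
print(f"k_obs = {k_fit:.5f} s^-1, t1/2 = {half_life_s:.0f} s")
```

With real LC-MS data, the fractions would come from the relative intensities of starting material and product, and a standard fitting routine would normally be used in place of this hand-written search.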
Determination of pseudo-first-order rate constants at various allyl alcohol concentrations and analysis of the results using linear regression revealed second-order rate constants for Seac-mediated CM. The on-protein second-order rate constant for the CM of allyl alcohol with Seac-protein was determined to be 0.31 ± 0.004 M⁻¹ s⁻¹ (Figure 1c,d), whereas the value for on-protein Sac-mediated CM in an identical protein context (site 156) was some 10-fold lower (0.031 ± 0.0015 M⁻¹ s⁻¹) under the same conditions (Figures S21 and S22). Interestingly, Figure 1d showed that the observed rate initially increased linearly with allyl alcohol concentration and gradually plateaued at concentrations >20 mM; this suggests that the catalyst plays a critical role under these conditions, presumably becoming "saturated" at high concentrations of allyl alcohol, leading to a shift in rate-limiting step (see also SI section 2.10 for off-protein catalyst dependency). Preliminary computations (using methyl allyl chalcogenides, see SI section 3) suggest a reaction profile in which the Ru-carbene intermediate after "Se-relay" is unusually stable and enhanced in stability over "S-relay"; this intermediate is also accessed through a lowered transition state.
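The step from pseudo-first-order to second-order rate constants can be sketched as an ordinary least-squares line through k_obs versus allyl alcohol concentration, restricted to the initial linear region (up to 20 mM, where the plateau is reported to begin). In the sketch below the concentrations are hypothetical, and the k_obs values are generated from the reported second-order constant rather than measured. The helper `linear_fit` is an illustrative name and is not the analysis software used in this work.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Second-order rate constants reported in the text (M^-1 s^-1).
K2_SEAC = 0.31
K2_SAC = 0.031

# Hypothetical allyl alcohol concentrations (M) within the linear region.
conc_M = [0.002, 0.005, 0.010, 0.015, 0.020]

# Synthetic pseudo-first-order constants (s^-1), assuming k_obs = k2 * [allyl alcohol].
k_obs = [K2_SEAC * c for c in conc_M]

k2_fit, intercept = linear_fit(conc_M, k_obs)
fold_enhancement = K2_SEAC / K2_SAC  # Se-relay versus S-relay at site 156

print(f"k2 = {k2_fit:.3f} M^-1 s^-1, intercept = {intercept:.2e} s^-1")
print(f"Seac/Sac rate ratio = {fold_enhancement:.1f}")
```

Points above the plateau concentration would bias the slope downward, which is why only the linear region is included in this illustration.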
Scheme 2. Biomimetic Synthesis of Se-Allyl-selenocysteine via Conjugate Addition of Allyl Selenolate to Dehydroalanine (Table S1)
Scheme 3. Enhanced CM of Seac-SBL 6a, cf. Sac-SBL 9a
When comparing the second-order rate constants with those from widely used bioconjugation techniques in chemical biology, on-protein Seac-mediated CM is comparable or superior to the off-protein rates of many of them (Figure 2). 27,28,43,44 It seems likely that local environment strongly affects on-protein reactions compared with off-protein; indeed, here, crowded (e.g., Npβ-61Seac) sites showed negligible reactivity (see SI). However, detailed on-protein measurements are rare, 42,44 so other relevant direct comparisons are not possible. Although here directly comparable reaction conditions could not be achieved, we extrapolate (see SI section 2.10) off-protein rates for Seac-mediated CM that are at least comparable and may be enhanced, consistent with CM's steric sensitivity. 5
Next, having established usefully enhanced on-protein rates, we examined substrate breadth. Model Seac-tagged protein SBL-156Seac 6a was tested with challenging metathesis substrate allyl GlcNAc 10; this olefin is essentially unreactive in sulfur-relayed CM with SBL-156Sac (9a). Pleasingly, reaction reached 75% conversion after 1.5 h at 37°C (Scheme 4). Another biologically relevant substrate, N-allyl acetamide (11), was used in CM with protein 6a. Acetamide 11 has been suggested to poison catalyst 1, and also fails with SBL-156Sac (9a). 11,45,46 Despite poisoning, 47 after 20 min at 37°C, LC-MS revealed conversion to desired 6d (Scheme 4), albeit in 30% yield.
After demonstrating CM on SBL-Seac, we tested Se-relayed CM on more functionally relevant protein systems. Histone proteins are key chromatin components; post-translational modifications (e.g. K-Ac) are critical regulators of structure and function. 48,49 First, Seac was installed into key epigenetic site 9 of histone H3. After generation of H3-9Dha, 22 subsequent allyl selenolate addition to Dha 12 proceeded successfully to yield Seac protein 13 after 1 h at room temperature (Scheme 5).
With H3-9Seac (13) in hand, we carried out model CM with allyl alcohol. Coloration implying metal coordination to 13 was observed; such coordination 6,50 can necessitate metal scavenging prior to MS analysis. Use of 3-mercaptopropionic acid 50,51 allowed ready MS monitoring of H3 CM reactions. Pleasingly, this revealed CM of H3-9Seac and allyl alcohol with nearly full conversion to CM-modified H3 14 (Scheme 6a). This ability to modify site 9 suggested a strategy for chemical recapitulation (under benign conditions) of a "write−read−erase" cycle that is observed (enzyme-mediated) in histone epigenetics. 48,49 As a "write" step we used the more challenging N-allyl acetamide olefin in CM; complete conversion was observed to 15 (Scheme 6a). As a "read" step, CM-modified H3 (15) was analyzed by an antibody raised to natural epigenetic marker N-acetyl lysine at position 9 of H3. We were pleased to find that this antibody successfully cross-reacted and recognized the CM-installed modification at position 9 as a K9Ac PTM mimic (Scheme 6b). Next, as an "erase" step the CM-installed K9Ac mimic "mark" was removed with full conversion using Cope-type elimination through the generation of labile selenoxide using mild peroxide oxidation to regenerate Dha. Met-containing H3 may be prone to side reaction; 52 this was tested and confirmed through the use of a Met-free H3 variant in which the mark was cleanly expunged without side reaction (see SI sections 2.11−2.14). Finally, a full chemical "write−read−erase−rewrite−read" cycle was performed (SI section 2.12) to demonstrate iterative histone "switching".
In conclusion, we have demonstrated efficient chemical incorporation of Seac into proteins. This has facilitated rapid cross-metathesis for protein modifications; determination of on-protein rates shows that these outstrip or are comparable to many of the so-called "click" reactions in chemical biology. Using such allyl selenide (Seac) proteins, we were able to access a broader substrate scope in Se-relayed CM than was possible with allyl sulfides. Direct access to Seac in proteins creates opportunities for potential PTM mimicry via CM as demonstrated here: a K9Ac mimic was successfully installed, recognized, and removed (write−read−erase) in histone H3. Although, to the best of our knowledge, such a chemical "write−read−erase" approach has not been explored before, one could envisage a system based on creation, e.g., of disulfide-linked modification mimics 53 and subsequent reductive cleavage. This synthetic manipulation of biology also highlights opportunities for new metathesis catalysts; uses of CM on protein/cell systems may be enabled by tuned solubilities, compatibilities, and permeabilities. We note too that this work provides access to a protected (allylated) Sec in proteins (that could be revealed using, e.g., Pd(0)) 54 and also further motivates developments in genetic incorporation of allyl chalcogenide amino acids into proteins to access more generally the wider biological applications of cross-metathesis.