Towards Reassignment of the Methionine Codon AUG to Two Different Noncanonical Amino Acids in Bacterial Translation

Genetic encoding of noncanonical amino acids (ncAAs) through sense codon reassignment is an efficient tool for expanding the chemical functionality of proteins. Incorporation of multiple ncAAs, however, is particularly challenging. This work describes the first attempts to reassign the sense methionine (Met) codon AUG to two different ncAAs in bacterial protein translation. Escherichia coli methionyl-tRNA synthetase (MetRS) charges two tRNAs with Met: tRNAfMet initiates protein synthesis (starting AUG codon), whereas elongator tRNAMet participates in protein elongation (internal AUG codon(s)). Preliminary in vitro experiments show that these tRNAs can be charged with the Met analogues azidohomoalanine (Aha) and ethionine (Eth) by exploiting the different substrate specificities of EcMetRS and the heterologous MetRS / tRNAMet pair from the archaeon Sulfolobus acidocaldarius, respectively. Here, we explored whether this configuration would allow a differential decoding during in vivo protein initiation and elongation. First, we eliminated the elongator tRNAMet from a methionine auxotrophic E. coli strain, which was then equipped with a rescue plasmid harboring the heterologous pair. Although the imported pair was not fully orthogonal, it was possible to incorporate preferentially Eth at internal AUG codons in a model protein, suggesting that in vivo AUG codon reassignment is possible. To achieve full orthogonality during elongation, we imported the known orthogonal pair of Methanosarcina mazei pyrrolysyl-tRNA synthetase (PylRS) / tRNAPyl and devised a genetic selection system based on the suppression of an amber stop codon in an important glycolytic gene, pfkA, which restores enzyme functionality and normal cellular growth. Using an evolved PylRS able to accept Met analogues, it should be possible to reassign the AUG codon to two different ncAAs by using directed evolution.


Classical Methods for in vivo Multiple Incorporation of ncAAs
Over the last years, the genetic encoding of noncanonical amino acids (ncAAs) has allowed the introduction of many new chemical functionalities into target proteins. Generally, this is achieved by two main methods. On the one hand, ncAAs that are structurally analogous (isostructural) to the canonical ones, and that can be recognized by the endogenous host cell machinery, can be encoded residue-specifically (i.e., global reassignment) using the supplementation-based incorporation method (SPI). [1,2] On the other hand, ncAAs which are orthogonal to the host cell translation system are usually incorporated site-specifically by suppression of stop codons (SCS) using orthogonal aminoacyl-tRNA synthetase (aaRS) / tRNA pairs (o-pairs). [3]
Although the incorporation of a single ncAA into proteins has provided fascinating insights into sequence-function relationships [2][3][4] as well as potential applications in enzymology, [5][6][7] it is often desirable to introduce different chemical functionalities simultaneously by using two or more ncAAs. [8,9] For example, Budisa and coworkers introduced three ncAAs in parallel into the model protein barstar using the SPI method: [10] the incorporation of the Trp analogue 4-azatryptophan (4AzaTrp) endowed the protein with a blue fluorescent probe for bio-imaging, [11] the Met analogue homopropargylglycine (Hpg) equipped it with a bioorthogonal handle for post-translational conjugation by copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC), [12] while the Pro analogue (4S)-fluoroproline ((4S)-FPro) provided a stabilizing effect on its structure. [13]
Global replacement of two or more amino acids, however, requires bacterial polyauxotrophic strains, its effects on protein function are not always predictable, and it is difficult to fully substitute a high number of target residues with ncAAs. Thus, it is often difficult to correlate sequence to function. In contrast, codon suppression methodologies offer the possibility to incorporate multiple ncAAs at defined sites in a more controlled manner, as well as to greatly expand the repertoire of new chemical functionalities. For example, the combination of two o-pairs, pyrrolysyl-tRNA synthetase / tRNA Pyl (PylRS / tRNA Pyl ) and M. jannaschii tyrosyl-tRNA synthetase / tRNA Tyr (TyrRS / tRNA Tyr ), allowed the simultaneous suppression of the stop codons UAG and UGA with azide- and alkyne-containing amino acids for a bioorthogonal cycloaddition reaction. [14] Further, the combination of the two above o-pairs with an evolved orthogonal ribosome for specific suppression of amber and quadruplet codons on an orthogonal mRNA was reported to be efficient. [15] Generally, it was possible to insert different pairs of ncAAs and label them selectively with fluorophores for protein conformation and dynamics studies by Förster resonance energy transfer (FRET). [16,17]
The problem of suppression methodologies, however, is that they often require sophisticated systems (e.g., specific o-pairs for each ncAA and o-ribosomes evolved by directed evolution) to enable efficient labeling of proteins, and usually only the successful attempts are reported (i.e., permissible sites of ncAA incorporation).
Importantly, the beneficial effects provided by the SPI method can be combined with the selective insertion of useful reactive groups by SCS. For example, a lipase mutant with enhanced activity was produced by globally replacing Met residues with norleucine (Nle), while site-specifically introducing the photo-crosslinker ncAA 4-benzoylphenylalanine (Bpa). [18] In another experiment, all the Met residues in the model protein green fluorescent protein (GFP) were replaced with the "clickable" ncAAs Aha or Hpg, whereas the photo-crosslinker 3,4-dihydroxyphenylalanine (L-DOPA) was incorporated at a specific site. [19]
Recent improvements in the SCS methodology, i.e., evolving more efficient aaRSs in a genomically recoded E. coli strain, have greatly increased the suppression efficiency, allowing the multi-site incorporation of a single ncAA at up to 30 positions in protein-polymers with high yields. [20] Multiple site-specific incorporation of different ncAAs, however, is still dependent on the number of o-pairs available and, especially, restricted to three stop codons. It is expected that this limitation will be overcome in the future by recoding sense codons.

"Rewiring" the Genetic Code: Sense Codons Reassignment
A new strategy for incorporating multiple ncAAs is gaining increasing attention. It exploits the degeneracy of the genetic code to reassign sense codons, thus requiring o-pairs whose anticodon can be changed without affecting the tRNA recognition by the cognate synthetase, while still keeping their orthogonality towards the endogenous aaRSs. Compared to the SCS method, which usually entails competition between the suppressor tRNAs and the translation release factors, in sense codon reassignment the main limitation is the competition of the exogenous tRNA with endogenous tRNAs. Since tRNA levels in the cells correlate with codon usage, rare codons are expected to be more amenable to reassignment. Furthermore, reassigning rare codons should have less disruptive effects on the proteome. Recently, the rarest E. coli sense codon, AGG, was recoded from arginine (Arg) to L-homoarginine (LHR) using various tricks: all the AGG codons in essential genes were replaced with synonymous ones, the competing tRNAs Arg were eliminated or their cellular levels reduced, and a heterologous PylRS / tRNA Pyl CCU mutant specific for LHR was expressed. [21] In another example, it was possible to liberate the rare isoleucine codon AUA from its natural decoding mechanism. This was achieved by eliminating the enzyme catalyzing the tRNA modification C-to-L (2-lysyl-cytidine) at the wobble position of tRNA Ile CAU, which is important for codon recognition, and by rescuing the engineered cells with a heterologous IleRS / tRNA Ile UAU pair. [22]
Interestingly, reassignment of abundant sense codons is also feasible. Tirrell used an orthogonal yeast PheRS / tRNA Phe AAA mutant to outcompete the endogenous tRNA Phe GAA in reading the UUU codon, one of the two Phe codons. This allowed almost quantitative reassignment of UUU codons to the Phe analogue 2-naphthylalanine in murine dihydrofolate reductase (mDHFR), but some misincorporation at the UUC codon also occurred. [23] Recently, O'Donoghue et al. showed that the machinery of selenocysteine can be modulated to recognize 58 of the 64 naturally occurring codons, in many cases completely outcompeting the endogenous tRNA, although other codons resulted in ambiguous translation. [24] Söll and co-workers showed that the serine sense codon AGU can be reassigned to 3-iodo-L-phenylalanine (3-I-Phe), [25] while Kwon and co-workers recently showed that it is possible to achieve ambiguous reading of the leucine UUG codon with 2-naphthylalanine. [26]
In general, amino acids encoded by several codons (e.g., serine, leucine and arginine) offer more chances to be recoded, since the wobble pairing at the third base could be outcompeted by a more energetically favorable Watson-Crick base pairing with the orthogonal tRNA. On the other hand, the reassignment of the methionine codon, AUG, the only codon encoding this amino acid and a highly frequent one (around 2.5 %), [27] represents a real challenge. The AUG decoding system, however, has a peculiar mechanism which could be exploited to insert two different ncAAs. The starting AUG codon and internal AUG codons are read by two distinct tRNAs: initiator tRNAi fMet (in prokaryotes) and elongator tRNA Met , respectively (Figure 1). Charging them by two different and mutually orthogonal MetRSs can help to deliver one ncAA site-specifically to the N-terminus and the other globally at internal methionine residues. In this article we present some preliminary results for engineering this system.
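The context-dependent decoding idea can be illustrated with a small sketch. This is only a toy model under stated assumptions, not a description of the cellular mechanism: it assumes the first codon of the input is the start codon, that the two tRNA pools are perfectly orthogonal, and that Aha and Eth are the residues delivered at the start and internal AUG codons. The function name and the toy mRNA are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_context_dependent(mrna, start_residue="fMet", internal_residue="Met"):
    """Translate an mRNA, decoding the start AUG and internal AUGs separately.

    Assumption (idealized): the first codon is the start codon and the
    initiator and elongator tRNAs never cross-react.
    """
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "AUG":
            # Position 0 is read by the initiator tRNA, all others by the elongator.
            residues.append(start_residue if i == 0 else internal_residue)
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break  # stop codon terminates translation
        residues.append(amino_acid)
    return residues


toy_mrna = "AUGAAAAUGGGCUAA"  # hypothetical sequence, for illustration only
natural = translate_context_dependent(toy_mrna)
reassigned = translate_context_dependent(toy_mrna, "Aha", "Eth")
print(natural)
print(reassigned)
```

In this idealized picture the same AUG triplet yields two different residues depending only on its position, which is the behaviour the engineering described in this article aims for.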

Global Reassignment of the AUG Codon
Met plays an important role in cellular processes. Its sulphur atom can be involved in binding to metal ions, whereas its methyl group is used in DNA methylation through its derivative S-adenosylmethionine (SAM). Met is also the universal starting residue in translation, even though 60 % of these residues are removed by N-terminal methionine excision (NME) in E. coli. [28] It is believed that Met starts translation because it is the most metabolically expensive amino acid to synthesize. [29,30] Partial recycling would therefore help the cells to save a part of the energetic cost of translation. Furthermore, Met functions as a scavenger of reactive oxygen species (ROS) to protect proteins. In the presence of ROS, the thioether moiety of Met can be oxidized to its sulfoxide form, Met(O), and then be reduced by methionine sulfoxide reductase. [31] Cells exposed to oxidative stress showed mis-methionylation of specific tRNAs, which increases the insertion of protective Met residues in the whole proteome. [32]
Given its physiological importance, it might be more challenging to completely replace Met with a ncAA in the manner of the recently reported complete trophic reassignment of Trp with L-β-(thieno[3,2-b]pyrrolyl)alanine under experimentally designed evolutionary pressure. [33,34] Nevertheless, like Trp, Met is one of the least common amino acids in proteins: it represents only about 2.5 % of all residues, [35] mostly buried in the hydrophobic core of proteins where it is rarely directly involved in catalytic function. This feature makes it a good candidate for proteome-wide substitution using the SPI method. Indeed, the relatively relaxed substrate specificity of MetRS has allowed the incorporation of a large repertoire of Met analogues for different purposes. Labeling of proteins with selenomethionine (SeMet) and telluromethionine (TeMet) has become a common practice for structure determination by X-ray crystallography. [36,37][40] The more hydrophobic [41] analogues norleucine (Nle) and ethionine (Eth) were used to study overall effects on enzyme activity. Their incorporation in the Met-rich active site of calmodulin lowered its activity, [42] whereas incorporation of Nle into the lipase from Thermoanaerobacter thermohydrosulfuricus (TTL) led to an enzyme highly active in the aqueous phase without the need for thermal activation [43] and with enhanced stability against denaturing agents. [44] Other examples of Met analogue applications include the use of methoxinine to study the effects of Met oxidation on prion proteins, [45] of photo-methionine in protein-protein interaction studies, [46] and of homoallylglycine (Hag) for potential use in olefin metathesis. [47] Even Met analogues which are generally translationally inactive can be incorporated by simple overexpression of MetRS, as Tirrell and coworkers showed with trans-crotylglycine (Tgc), cis-crotylglycine (Ctg), 2-aminoheptanoic acid, norvaline, 2-butynylglycine, allylglycine [48][49][50] and S-allylhomocysteine. [51]
Figure 1. Natural (A) and hypothetical (B) AUG decoding system. In E. coli, the MetRS normally activates methionine and charges it onto both the initiator tRNAi fMet (red) and elongator tRNA Met (black). Initiator tRNAi fMet is directed to the P-site of the ribosome, where it participates in initiation and inserts an fMet residue at the N-terminus of the protein. Elongator tRNA Met , instead, enters the A-site of the ribosome, where it reads internal AUG codons and inserts Met in the protein sequence. Our hypothetical model includes context-dependent AUG-codon reassignments whereby one ncAA can be encoded site-specifically at the N-terminus and the other ncAAs can be inserted globally at the internal AUG positions.
Met analogues containing azides and alkynes are of particular interest since they can be used as chemical handles to mimic post-translational modifications using "click chemistry". Thus far, azidohomoalanine (Aha) and homopropargylglycine (Hpg) have been incorporated using the native EcMetRS, [52,53] while azidonorleucine (Anl) [54] and propargylglycine (Pra) [51] were only translationally active using mutants of EcMetRS. In particular, Aha and Hpg have been used to artificially attach post-translational modifications such as sugars [55] or biotin [56] to proteins. The recently developed BONCAT (bioorthogonal non-canonical amino acid tagging) strategy takes advantage of the incorporation of azide-bearing ncAAs in the whole proteome and selective enrichment by affinity tags (e.g., biotin) to study temporal and spatial characteristics of newly synthesized proteins. [57] Cell-selective labeling in co-culture was achieved by expressing an EcMetRS mutant specific for Anl (NNL-EcMetRS), and one mutant specific for Pra (PraRS), only in a subset of cells. [58,59] Figure 2 presents a survey of various Met analogues incorporated so far in a residue-specific fashion. The extent of their incorporation in newly expressed proteins was quantified by precise analytical measurements in some instances and found to be almost quantitative. [53,60,61]

Site-Specific Reassignment of the AUG Codon
Met analogues contain single atomic substitutions like H → F, C → N or S → CH / CH2 / O / N / Se / Te, which are interesting for studying the effects of small perturbations on protein structure and function, [62] as well as bioorthogonal reactive groups useful for post-translational modifications (Figure 2). It would therefore be interesting to incorporate them site-specifically into target proteins. As mentioned earlier, this could be achieved via the SCS method. However, this method appears to be limited to bulky (aromatic or long aliphatic) amino acids, which are the substrates of the commonly used o-pairs PylRS and MjTyrRS, and is not available for Met analogues. The only known example of this kind is based on sense codon reassignment. Site-specific labelling of proteins at the N-terminus with azidonorleucine (Anl) was achieved in mammalian cells expressing NNL-EcMetRS, since the bacterial synthetase recognizes the mammalian initiator tRNAi Met but does not cross-react with the endogenous elongator tRNA Met . [63]
Generally, the N-terminus appears to be a favorable target for site-specific functionalization, because it is predominantly exposed on the protein surface and because protein initiation, in theory, could be reprogrammed separately from elongation (Figure 1). [66][67][68] Interestingly, several Met analogues are known to work in protein initiation in vivo, indicating that they are formylated by methionyl-tRNA fMet formyltransferase (FMT) and recognized by the initiation factors (IFs). [69] In addition, their N-terminal processing is attenuated [69] and can be regulated by changing the identity of the second amino acid in the protein sequence. [70,71] Even bacteria devoid of N-terminal protein formylation were developed by Mutzel and Marliere. [72] In the evolved E. coli strain, which has both transformylase (fmt) and deformylase (fms) deleted, protein synthesis can be initiated by unmodified methionine instead of N-formylmethionine. Based on the observation that Pseudomonas aeruginosa with inactivated formylase is still viable, Newton et al. concluded that formylation is not essential for initiation of protein synthesis in all eubacteria. [73]
In this work, we attempted the site-specific incorporation of azidohomoalanine (Aha) at the N-terminus, while simultaneously replacing the internal Met residues with its analogue ethionine (Eth) during in vivo protein translation. We also report the development of a genetic selection system based on the suppression of an amber stop codon introduced in place of a Met codon in an important glycolytic gene, pfkA; successful suppression restores enzyme functionality and normal cellular growth.

Separating AUG and Internal AUG Codons by Two Different tRNAs
To decode the initial and internal AUG codons differently, their respective tRNAs Met should be loaded with two different ncAAs. This can be achieved using two MetRSs, one endogenous and the other exogenous, with different substrate specificities. Most importantly, the two systems should not cross-react, i.e., they need to be mutually orthogonal. The heterologous MetRS should aminoacylate its cognate tRNA Met with one specific ncAA, while not interacting with endogenous EctRNA Met . In addition, the heterologous tRNA Met should not be recognized by endogenous EcMetRS.
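The mutual orthogonality requirement can be written down as a simple check: every observed charging event must belong to a cognate synthetase / tRNA pair. The sketch below is a simplified, boolean illustration under stated assumptions. The function names and identifiers are hypothetical, low-efficiency charging is treated as absent, and the example input encodes only the qualitative in vivo cross-reaction reported later in this work (EcMetRS charging the heterologous SatRNA Met ), not measured activities.

```python
def cross_reactions(cognate_pairs, observed_charging):
    """Return the (synthetase, tRNA) combinations that violate orthogonality.

    cognate_pairs:     dict mapping each synthetase to the set of its cognate tRNAs
    observed_charging: set of (synthetase, tRNA) combinations found to be charged
    """
    allowed = {(rs, trna) for rs, trnas in cognate_pairs.items() for trna in trnas}
    return sorted(observed_charging - allowed)


def mutually_orthogonal(cognate_pairs, observed_charging):
    """True only if no synthetase charges a non-cognate tRNA."""
    return not cross_reactions(cognate_pairs, observed_charging)


# Intended design in the metT / metU-deficient strain (identifiers are illustrative).
design = {
    "EcMetRS": {"EctRNAi_fMet"},
    "SaMetRS": {"SatRNA_Met"},
}

# Qualitative outcome: the archaeal tRNA is still charged by the host enzyme.
observed = {
    ("EcMetRS", "EctRNAi_fMet"),
    ("SaMetRS", "SatRNA_Met"),
    ("EcMetRS", "SatRNA_Met"),
}

violations = cross_reactions(design, observed)
print(violations)
print(mutually_orthogonal(design, observed))
```

A quantitative version would replace the boolean set with relative aminoacylation efficiencies and a threshold. Both are left out here because the choice of threshold would be an additional assumption.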
Cross-aminoacylation experiments with tRNAs Met and MetRSs from different organisms suggest the existence of two types of MetRSs (Figure 3). [74] The "type A" from bacteria and organelles aminoacylates bacterial or organellar initiator tRNAi fMet and elongator tRNA Met . Moreover, it also recognizes eukaryotic and archaeal initiator tRNAi Met . [75,76] In contrast, the "type B" includes archaeal or eukaryotic cytoplasmic MetRSs that aminoacylate archaeal or eukaryotic initiator tRNAi Met and elongator tRNA Met . However, these MetRSs aminoacylate the bacterial or organellar tRNA Met counterparts with low efficiency. [74] Consequently, it should be feasible to use a type B MetRS / tRNA Met pair to reassign AUG codons to a Met analogue exclusively or at least preferentially during elongation in E. coli.
Previous experiments in our group identified a putative orthogonal MetRS / tRNA Met pair from the archaeon Sulfolobus acidocaldarius (Sa). [77] In particular, in vitro experiments showed that SaMetRS aminoacylates EctRNAi fMet and elongator EctRNA Met with low efficiency and that, among several Met analogues, it has a marked substrate preference towards Eth. [77] In comparison, endogenous EcMetRS activates this analogue less efficiently. Eth has an ethyl group attached to the sulfur atom rather than the methyl group of Met (Figure 2). Given this slight difference, its global incorporation in target proteins is expected to have a minor impact on their structures, while it can be used to study overall effects on their biochemical properties. [42] On the other hand, when Eth is incorporated in the whole proteome, it inhibits growth in many microorganisms including E. coli. [78]
It was found that EcMetRS activates Aha with higher efficiency than Eth, while SaMetRS activates this analogue 4.4 times less efficiently than Eth. [77] In vivo experiments with the model protein barstar [10] containing two Met residues (b*2M), one at the N-terminus and the other at the internal position E47M, confirmed the overall preference of the endogenous system for Aha over Eth. [77] In particular, Aha was incorporated 1.8 times more at the N-terminus than at the internal position, suggesting that this ncAA works better in translation initiation than in elongation. [77]
In contrast to Eth, Aha is not toxic in mammalian cell lines, nor in whole organisms for short incubation times, and it has been used to selectively label the newly synthesized proteome via BONCAT. [57,79]
These results provided a starting point for the differential reassignment of the N-terminus to Aha using the natural substrate preference of EcMetRS, while globally reassigning the internal AUG codons to Eth using the putative orthogonal SaMetRS / SatRNA Met pair.

Removal of metT / metU from E. coli and Rescue With the Heterologous SaMetRS/tRNA Pair
As discussed above, sense codon reassignment entails competition with the endogenous tRNAs. Considering that tRNA abundance correlates with codon usage [80] and that the AUG codon is highly used in E. coli (frequency: ~25 codons per thousand, of which ~22 at internal positions), [81,82] cellular concentrations of elongator EctRNA Met might be too high to be outcompeted by an exogenous tRNA Met . Therefore, we decided to eliminate this gene, which is present in the E. coli genome in two versions, metT and metU. Since the genomic deletion of both copies is lethal for the cells (the internal AUG codons become unassigned), the knockout strain was supplemented with a rescue vector from the pMEc series [83] containing an elongator tRNA Met (Figure S1). Two putative elongator tRNAs Met were identified in S. acidocaldarius: [84] SatRNA Met (8) and SatRNA Met (16). They differ from each other and from elongator EctRNA Met at several positions (Figure 4). For example, they have a different base composition at the conserved positions G2 and C3 in the acceptor stem, which are determinants for aminoacylation by EcMetRS. [85] A main difference is the size of the D-loop, which is composed of 10 bases in SatRNA Met (16), compared to 9 bases in SatRNA Met (8) and elongator EctRNA Met (Figure 4). It was hypothesized that this difference in size might be a determinant for orthogonality in E. coli, [77] similar to the way in which enlarging the anticodon loop renders elongator EctRNA Met a good substrate for mammalian MetRS, while decreasing its interaction with bacterial MetRS. [74] Changing its size, however, does not seem to affect the rate of aminoacylation. [85] Nevertheless, in the present study we decided to test SatRNA Met (16). A metT / metU-deficient E. coli strain was successfully generated (see Figure S2). However, by using the pMEc vector containing different combinations of EcMetRS / SaMetRS and elongator EctRNA Met / SatRNA Met (Figure S1), we found that our system is not fully orthogonal in vivo: heterologous SatRNA Met is still recognized and charged by endogenous EcMetRS.
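The codon usage figures quoted above (AUG codons per thousand, split into start and internal positions) are the kind of numbers that can be obtained by simple counting over a set of coding sequences. The following sketch shows one way to do this. It is not the procedure used in the cited codon usage references; the function name is hypothetical, each sequence is assumed to be an in-frame coding sequence beginning at its start codon, and stop codons are counted as codons.

```python
def aug_usage(coding_sequences):
    """Count start and internal AUG codons over a collection of coding sequences.

    Assumptions: every sequence is in frame, starts at its start codon, and may
    be given as DNA (T) or RNA (U). Trailing bases that do not fill a codon are ignored.
    """
    start = internal = total = 0
    for cds in coding_sequences:
        seq = cds.upper().replace("T", "U")
        usable = len(seq) - len(seq) % 3
        codons = [seq[i:i + 3] for i in range(0, usable, 3)]
        total += len(codons)
        for index, codon in enumerate(codons):
            if codon == "AUG":
                if index == 0:
                    start += 1
                else:
                    internal += 1
    scale = 1000.0 / total if total else 0.0
    return {
        "total_codons": total,
        "start_aug": start,
        "internal_aug": internal,
        "start_per_thousand": start * scale,
        "internal_per_thousand": internal * scale,
    }


# Toy input, for illustration only; real use would pass all annotated genes.
toy_genes = ["ATGAAAATGTAA", "ATGGGCTAA"]
usage = aug_usage(toy_genes)
print(usage)
```

Applied to a complete set of annotated genes, the ratio of internal to start AUG codons gives a rough idea of how much of the decoding load falls on the elongator tRNA Met .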

Simultaneous Reassignment of the Initial and Internal AUG Codons to L-Azidohomoalanine and L-Ethionine
Although the SaMetRS / SatRNA Met pair was not fully orthogonal in E. coli, we set out to test how far the previously observed preferential Eth incorporation [77] can be pushed in the newly designed metT / metU-deficient E. coli strain. This experiment combines sense codon reassignment with the SPI method in an E. coli strain auxotrophic for Met, B834(DE3), deficient in the elongator tRNA Met (Figure S3).
Barstar (b*) congeners (proteins with the same gene sequence as the wild-type protein but containing a fraction of synthetic amino acids) were expressed in the metT / metU-deficient B834(DE3) strain, as described in the SI. The purified congeners were analyzed by ESI mass spectrometry, in which three different species were identified: b*2Aha, b*1Aha-1Eth, and b*2Eth. The second congener can exist in two forms: b*1Aha-47Eth and b*1Eth-47Aha. The distribution of each species was calculated from the peak intensity in the MS spectra (Table 1).
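The calculation of the congener distribution from peak intensities amounts to a normalization to 100 %, followed by averaging over replicates. A minimal sketch of this arithmetic is given below. The function names are hypothetical, the intensities are arbitrary placeholders and not measured values, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def congener_distribution(peak_intensities):
    """Convert raw peak intensities into percentages that sum to 100 %."""
    total = sum(peak_intensities.values())
    if total <= 0:
        raise ValueError("combined peak intensity must be positive")
    return {name: 100.0 * value / total for name, value in peak_intensities.items()}


def summarize_replicates(replicates):
    """Mean and sample standard deviation of each congener percentage."""
    distributions = [congener_distribution(r) for r in replicates]
    summary = {}
    for name in distributions[0]:
        values = [d[name] for d in distributions]
        summary[name] = (mean(values), stdev(values))
    return summary


# Placeholder intensities (arbitrary units) for three hypothetical replicates.
replicates = [
    {"b*2Aha": 2.0, "b*1Aha-1Eth": 1.0, "b*2Eth": 1.0},
    {"b*2Aha": 4.0, "b*1Aha-1Eth": 3.0, "b*2Eth": 3.0},
    {"b*2Aha": 3.0, "b*1Aha-1Eth": 1.0, "b*2Eth": 1.0},
]
summary = summarize_replicates(replicates)
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.1f} % (SD {sd:.1f})")
```

Note that this treatment assumes equal ionization efficiency for all congeners, and that it cannot distinguish the two isobaric forms of b*1Aha-1Eth.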
Notably, the relative abundances of the three b* congeners expressed in the metT / metU-deficient B834(DE3) strain and in the B834(DE3) wild-type strain differ from each other. Elimination of endogenous elongator EctRNA Met led to an overall shift towards incorporation of Eth, as indicated by an increase in the abundance of b*1Aha-1Eth (+8 % and +22 %) and b*2Eth (+9 % and +19 %) in the presence of SatRNA Met and EctRNA Met , respectively (Table 1). The observed variations can probably be related to the altered intracellular levels of elongator tRNA Met . Analysis of congener b*1Aha-1Eth by N-terminal sequencing remains to be performed for determining the exact distribution of the two incorporated analogues at the N-terminus and at internal position 47.
Contrary to our expectations, the expression of the SaMetRS / SatRNA Met pair in the metT / metU-deficient strain did not significantly alter the incorporation efficiencies in favor of Eth. In fact, the distribution of Aha and Eth in b* was similar to that obtained in the presence of EcMetRS / EctRNA Met .
Figure 4 (caption fragment; the beginning is missing): ... and of C) EctRNA Met (metT). The required identity elements for elongation (green) and aminoacylation (red) are shown. The dotted line indicates that a base modification is required for hydrogen bonding. In Archaea, the CCA sequence at the 3´ terminus of pre-tRNA is added by a CCA-adding enzyme after gene transcription.

A Novel Genetic Selection System for AUG Reassignment via MmPylRS-Based Orthogonal Pairs
Due to the lack of orthogonality in the examined SaMetRS / SatRNA Met system, we turned our efforts to modifying the Methanosarcina mazei pyrrolysyl-tRNA synthetase (PylRS) / tRNA Pyl pair for use in the AUG decoding system. Its natural orthogonality in E. coli makes it a useful tool for both SCS and sense codon reassignment approaches. [86] Furthermore, PylRS shows a low selectivity toward the anticodon of tRNA Pyl , which could be mutated to recognize the Met AUG codon, and a naturally broad substrate specificity, which could be evolved to accept Met analogues.
In order to identify a PylRS mutant able to charge Met, we devised a selection system based on the functionality of the glycolytic enzyme phosphofructokinase-1 (PFK-1). This enzyme catalyzes the first committed step of glycolysis, the conversion of fructose 6-phosphate to fructose 1,6-bisphosphate. Its activity, together with that of phosphofructokinase-2 (PFK-2), is essential for bacterial growth in M9 minimal medium. [87] Importantly, it contains a conserved Met residue at position 169 crucial for substrate binding. It was shown, in fact, that substitutions at this position with alanine, lysine and isoleucine drastically decrease its catalytic efficiency. [88,89] Amber suppression at this position by an MmPylRS mutant able to incorporate Met is expected to restore PFK-1 functionality and, consequently, normal bacterial growth (Figure 5C).
Based on our selection experiments (depicted in Figure 5 and described in detail in the SI) we conclude that our system is reliable. Namely, the incorporation of Met at position 169 in PFK-1 can be used as a selection marker, since it confers an advantage in cellular growth compared to incorporation of any of the other 19 canonical amino acids (Figure S7). However, a high background was found on the negative control plate, as several colonies grew after 1.5-2 days. This indicates that the system needs to be rendered more stringent in order to minimize false positives, which may interfere with the subsequent selection. The high background might be caused by UAG read-through by one of the nonsense tRNA suppressors (tRNA Gln , tRNA Leu , tRNA Ser and tRNA Tyr ) [90] or near-cognate aminoacyl-tRNAs identified in E. coli. [91] Incorporation of one of these amino acids at position 169 might produce a full-length PFK-1 variant which is active enough to support bacterial growth, although impaired compared to cells expressing wild-type PFK-1. To test this hypothesis, the 20 possible variants could be constructed. Nevertheless, a more probable reason for the high number of colonies on the negative control plate could be the presence of some residual LB medium after transformation. The nutritionally rich LB medium can restore normal growth because some of its components may allow the auxotrophic cells to enter glycolysis circumventing PFK-1.
Table 1. Abundance of the three b* congeners obtained using pMEc1 and pMEc3 plasmids in B834(DE3)_ΔmetT::frt_ΔmetU::npt and B834(DE3) WT. The combined intensities of the corresponding peaks sum up to 100 %, from which each species percentage was calculated and annotated. Mean and standard deviation (SD) of each congener percentage from triplicate experiments with the double knockout mutant are shown. Heterologous expression of SaMetRS / SatRNA Met (highlighted in brown) did not significantly alter the distribution of the two Met analogues compared to the negative control represented by homologously expressed EcMetRS / EctRNA Met (highlighted in red). Natural incorporation efficiencies measured previously in the Met auxotrophic E. coli strain CAG18491 are also reported. [77]
To solve these problems, we have introduced a second stop codon at position M224TAG, which completely eliminated the background suppression. This position was chosen because it is not conserved and does not participate in the substrate or cofactor binding, being located rather in a loop on the protein surface. The applicability of the optimized selection system with two stop codons in identifying the desired PylRS mutant is currently being tested using different libraries constructed in our group.
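At the sequence level, the selection construct described above corresponds to replacing the Met codons at positions 169 and 224 of the pfkA coding sequence with the amber codon TAG. The sketch below illustrates this substitution on a toy sequence. It is an illustration only: the function name is hypothetical, the real pfkA sequence is not reproduced here, and the numbering convention (codon 1 is the start codon) is an assumption, because the text does not state whether the initiator Met is counted.

```python
def introduce_amber(cds, positions, expected_codon="ATG"):
    """Replace the codons at the given 1-based positions with the amber codon TAG.

    Assumption: codon 1 is the start codon. Each targeted codon must equal
    expected_codon (Met by default), which guards against numbering mistakes.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3].upper() for i in range(0, len(cds), 3)]
    for position in positions:
        original = codons[position - 1]
        if original != expected_codon:
            raise ValueError(
                f"codon {position} is {original}, expected {expected_codon}"
            )
        codons[position - 1] = "TAG"
    return "".join(codons)


# Toy coding sequence: ATG AAA ATG GGC ATG TAA (Met codons at positions 1, 3 and 5).
toy_cds = "ATGAAAATGGGCATGTAA"
double_amber = introduce_amber(toy_cds, [3, 5])
print(double_amber)
```

For the construct described in the text, the call would take the pfkA coding sequence and the positions 169 and 224, provided the numbering assumption above matches the one used for PFK-1.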

Conclusions
This work describes the first attempts to reassign the sense codon AUG to two different Met analogues, Aha and Eth, in E. coli. Based on in vitro data, [77] we tried exploiting the natural substrate preference of the EcMetRS / EctRNAi fMet pair to incorporate Aha at the N-terminus, while using an exogenous archaeal SaMetRS / SatRNA Met pair to insert Eth at internal AUG positions. However, without strict orthogonality it was not possible to unambiguously reassign the AUG codon in initiation and elongation in vivo. Even though the elimination of the elongator EctRNA Met altered the distribution of the two analogues in favor of Eth, it was not sufficient to achieve a significant AUG reassignment. Orthogonality and the differentiation of in vitro vs. in vivo data are important criteria for developing o-pairs. [92]
The orthogonality of the Sulfolobus acidocaldarius-based heterologous pair could be achieved by identifying and mutating positions in the SatRNA Met sequence which are critical for its aminoacylation by EcMetRS, as compared to EctRNA Met . [85] Alternatively, the other putative elongator SatRNA Met (tRNA8, Figure 4) could also be tested for its orthogonality in vivo. Furthermore, the substrate specificity of EcMetRS towards Aha should be further evolved in order to achieve exclusive incorporation of this ncAA at the N-terminus. Interestingly, it is also possible to change the incorporation efficiencies at the N-terminus and internal positions by tuning the concentration of the two Met analogues in the media. [77]
We also presented the first steps towards the design of a PylRS / tRNA Pyl pair for use in the Met decoding system, based on the suppression of an amber stop codon with Met in an important glycolytic gene, pfkA, which, in turn, restores enzyme functionality and normal cellular growth. Importantly, any orthogonal aaRS / tRNA pair that accepts Met could be tested in the E. coli strain lacking EctRNA Met . We envision a PylRS library specifically customized for Met recognition that could be screened with the selection system developed here. Once such a synthetase is identified, its substrate specificity could be further evolved towards interesting Met analogues, such as Aha, Anl, Hpg, Nle, Eth and many others.
On the way to the evolution of synthetic cells with alternative biochemical building blocks, we also envision a strategy to exclusively reassign either the internal or the initial AUG codons, but not both, because Met, via S-adenosylmethionine, is the major methyl group donor in cellular metabolism. Thus, its elimination from the genetic code would impair cellular growth, unless other sources of S-adenosylmethionine could be provided by advanced metabolic engineering. The most reasonable strategy would be to use a properly configured metT/metU-deficient E. coli MG1655 strain, which might be slowly adapted to use a different amino acid in place of Met using an evolution strategy recently reported by our group. [33,34] This configuration would also require the availability of mutually orthogonal MetRSs capable of generating the context-dependent AUG codon reassignments: one ncAA (or Met itself) can be encoded site-specifically at the N-terminus and the other ncAA (or Met itself) can be inserted globally at the internal AUG positions.

Experimental Details
Chromosomal deletion of metT and metU in E. coli
Four rescue vectors and two controls were constructed using the pMEc plasmid as backbone, which is a shuttle vector for E. coli (p15A, CmR) and yeast (2μ ori and URA3) [1] to ease in vivo assembly (Figure S1). Plasmid pMEc1 was used to study whether the SaMetRS/SatRNAMet pair can rescue the metT/metU-deficient E. coli strain and reassign the internal AUG codons to Eth. Plasmids pMEc2 (SaMetRS/EctRNAMet) and pMEc4 (EcMetRS/SatRNAMet) were used to study whether the endogenous and exogenous systems cross-react with each other, whereas plasmid pMEc3 (EcMetRS/EctRNAMet) was used as positive control. Finally, two other vectors, pMEc5 (SaMetRS) and pMEc6 (EcMetRS), each carrying a synthetase gene but lacking the cognate tRNAMet, served as negative controls. Expression of SaMetRS and EcMetRS was under control of the inducible propionate promoter, which should allow good overall synthetase production without impairing the target protein's yield [2]. The gene for tRNAMet was under control of the constitutive promoter proK. The pMEc plasmid contains the low-copy origin of replication p15A.
The E. coli strain MG1655 was transformed with each of the six pMEc vectors, and knockout of metT and metU was carried out using the λ-red recombination system [3]. Replacement of genomic metT and metU with npt (neomycin phosphotransferase) was judged successful based on the appearance of colonies on LB plus kanamycin (KanR) selection plates. As expected, MG1655 cells not rescued by a plasmid-borne elongator tRNAMet (pMEc5 or pMEc6) did not grow on LB KanR plates. Colonies containing one of the four pMEc rescue plasmids were analyzed by PCR with different sequencing primers (Table S1). Successful recombination occurred in all the analyzed clones, generating the MG1655_ΔmetT::frt_ΔmetU::npt strain (Figure S2). Importantly, cells transformed with pMEc4 survived deletion of metT/metU, indicating that heterologous SatRNAMet can be recognized and charged by endogenous EcMetRS.
Subsequently, the npt cassette was moved from MG1655 to the Met auxotrophic strain B834(DE3) by phage P1 transduction [4]. The successful integration of npt at the metT and metU loci was confirmed by PCR (Figure S3). The generated B834(DE3)_ΔmetT::frt_ΔmetU::npt strain was rescued using the plasmid for AUG codon reassignment, pMEc1, and the control plasmid pMEc3.
Both clones were cultured in the presence of a limiting concentration of Met (45 µM), leading to its exhaustion in the medium during the fermentation, at which point 0.5 mM L-Aha and 1 mM DL-Eth were added. Subsequently, expression of b*2M and MetRS was induced using 1 mM IPTG and 20 mM propionate, respectively. Expression of the two MetRSs was detected after enrichment of the synthetases from cell lysates using a Strep-Tactin® column (Figure S4). Barstar congeners were purified by ion exchange chromatography and analyzed by ESI mass spectrometry (Figure S5).
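When reading the ESI mass spectra of the barstar congeners, it helps to know the mass shift expected for each Met position replaced by Aha or Eth. The following minimal sketch computes these shifts from the standard elemental compositions of the free amino acids (Met C5H11NO2S, Aha C4H8N4O2, Eth C6H13NO2S) and average atomic masses. The function names are illustrative and not part of this work, and no measured masses from Figure S5 are reproduced.

```python
# Average atomic masses (Da); standard values, rounded.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Elemental compositions of the free amino acids.
FORMULAS = {
    "Met": {"C": 5, "H": 11, "N": 1, "O": 2, "S": 1},
    "Aha": {"C": 4, "H": 8, "N": 4, "O": 2},
    "Eth": {"C": 6, "H": 13, "N": 1, "O": 2, "S": 1},
}


def average_mass(formula):
    """Sum of average atomic masses for an elemental composition."""
    return sum(AVERAGE_MASS[element] * count for element, count in formula.items())


def shift_per_site(analogue):
    """Mass change (Da) when one Met residue is replaced by the analogue.

    The water lost on peptide bond formation cancels out, so the
    difference between the free amino acids equals the residue shift.
    """
    return average_mass(FORMULAS[analogue]) - average_mass(FORMULAS["Met"])


def expected_shift(n_aha, n_eth):
    """Total shift for a protein with n_aha and n_eth substituted Met sites."""
    return n_aha * shift_per_site("Aha") + n_eth * shift_per_site("Eth")


if __name__ == "__main__":
    for name in ("Aha", "Eth"):
        print(f"Met -> {name}: {shift_per_site(name):+.2f} Da per site")
```

A protein carrying Aha at the N-terminus and Eth at one internal position would therefore be expected roughly 9 Da above the all-Met species, which is how mixed congeners can be told apart in a deconvoluted spectrum.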

PFK-1-based selection system
As recipient for genetic selection, E. coli strain RL257, which is deficient in PFK-1 and PFK-2 and contains the lacIq allele [5], was used. This strain does not grow in M9 minimal medium containing glucose as the sole carbon source over a period of 25 hours [5] and longer, as observed in our experiments. As selection plasmid, we used the low-copy-number pNB26'1_pfkA(M169TAG) carrying the gene for PFK-1 (pfkA) under the lac promoter and having an amber stop codon at position 169.
To test the stringency of the selection system, a library of pfkA variants containing the random mutation M169NNK (N = any base, K = G or T) was generated by site-saturation mutagenesis using primer X (FW) and Y (RV). With the NNK degenerate codon, all 20 cAAs are theoretically covered by just 32 of the 64 codons. The diversity of the saturation library was estimated by sequencing (Figure S6). Subsequently, the pfkA(M169NNK) library was transformed by electroporation into RL257 cells. In parallel, RL257 cells were transformed with plasmid pNB26'1_pfkA or plasmid pNB26'1_pfkA(M169TAG) to serve as positive and negative control, respectively. The transformants were plated onto M9 plates containing 0.2% glucose and kanamycin and incubated at 37 °C. Interestingly, RL257 colonies transformed with the positive control grew overnight even though PFK-1 expression was not induced (Figure S7, A). This finding suggests that leaky expression of pfkA in the presence of the LacI repressor produces enough PFK-1 to rescue the host. The negative control, instead, showed colonies after 1.5 days. After 2 days, satellite colonies appeared on the same plate (Figure S7, B). Finally, in the case of the RL257 strain transformed with the pfkA(M169NNK) library, the first colonies appeared after 1 day of incubation, followed by smaller ones after 2 days (Figure S7, C). Six colonies were picked from the plate containing the library and the pNB26'1_pfkA(M169NNK) plasmid was isolated from them. The pfkA gene was then sequenced and in all six cases it was found to carry the methionine codon, ATG, at position 169. A second analysis was made by pooling all the colonies on the plate and again isolating the pNB26'1_pfkA(M169NNK) plasmid. In this case too, sequencing results showed the presence of the ATG codon at position 169 (Figure S8).
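The coverage of the NNK library mentioned above can be checked with a short enumeration. This is a minimal sketch using the standard genetic code. The variable names are illustrative, and it assumes an ideal, unbiased library, which Figure S6 suggests is only approximately the case.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product(BASES, BASES, "GT")]

# Amino acids and stop codons reachable within the NNK set.
encoded_amino_acids = {CODON_TABLE[c] for c in nnk_codons if CODON_TABLE[c] != "*"}
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

# Codons in the library that encode Met (the variant recovered by selection).
met_codons = [c for c in nnk_codons if CODON_TABLE[c] == "M"]

if __name__ == "__main__":
    print(len(nnk_codons), "codons,", len(encoded_amino_acids), "amino acids")
    print("stop codons:", stop_codons)
    print("Met codons:", met_codons)
```

Under these assumptions, ATG is one of 32 equally likely codons, so recovering only ATG from the selected colonies is consistent with a strong requirement for Met at position 169. The only stop codon in the NNK set is the amber codon TAG, which is also the codon used in the negative control.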

Figure 3. Two types of MetRS / tRNAMet pairs. Type A includes the MetRSs from eubacteria and eukaryotic organelles, which preferentially aminoacylate their initiator tRNAifMet and elongator tRNAMet counterparts, as well as archaeal or eukaryotic cytoplasmic initiator (tRNAiMet). Type B includes archaeal or eukaryotic cytoplasmic MetRS, which exclusively aminoacylate their cognate elongator tRNAMet and initiator tRNAiMet.

Figure 4. Structures of elongator tRNAsMet. Secondary structures of the two putative S. acidocaldarius elongator tRNAMet, (A) tRNA8 and (B) tRNA16, [84] and of (C) EctRNAMet (metT). The required identity elements for elongation (green) and aminoacylation (red) are shown. The dotted line indicates that a base modification is required for hydrogen bonding. In Archaea, the CCA sequence at the 3′ terminus of pre-tRNA is added by a CCA-adding enzyme after gene transcription.

Figure 5. Selection system based on amber suppression for the identification of a PylRS mutant able to charge Met. (A) The E. coli RL257 strain expressing exogenous PFK-1 can grow on minimal medium plates; (B) When a TAG stop codon is inserted in pfkA at position 169 in place of a critical Met residue, cell growth is impaired. Slow growth might be restored when M169TAG is suppressed by one natural amber suppressor tRNA; (C) A PylRS library is subsequently transformed into these cells. A PylRS mutant able to suppress the M169TAG codon with Met (in red) would restore normal cell growth, whereas incorporation of any of the other 19 cAAs (in grey) would result in a slow-growth phenotype. MM = minimal medium.

Figure S1. The six pMEc plasmids generated for AUG codon reassignment. Plasmid pMEc1 contains the putative orthogonal pair SaMetRS/SatRNAMet used to direct the incorporation of Eth preferentially at the internal AUG codons. Plasmids pMEc2 and pMEc4 were used to test the orthogonality of SaMetRS and elongator SatRNAMet, respectively, in vivo. Plasmid pMEc3 was used as positive control in comparison with pMEc1. Plasmids pMEc5 and pMEc6 were used to test the effect of MetRS overexpression. CmR = chloramphenicol resistance. URA3 = uracil auxotrophic marker. Origins of replication 2μ and p15A are used for plasmid propagation in yeast and E. coli, respectively.

Figure S2. Verification of metT and metU knockout in MG1655. Water was used as a negative control (C-), while E. coli MG1655 wild-type represented the positive control (C+). A) Knockout of the metT gene in four clones (1-4) was confirmed by PCR using a forward and reverse primer that respectively amplify inside and downstream of the metT gene locus (PB328 + PB698, PCR product = 256 bp). B) Integration of the KanR cassette (npt) in place of metT in the same clones was verified by PCR using a primer for amplifying the npt gene and an external primer targeting downstream of the metT gene locus (PB327 + PB329, PCR product = 754 bp). C) Elimination of the KanR cassette was confirmed by PCR using primers PB327 + PB329. Lanes 1 and 4 show the PCR of two clones before "plasmid curing", containing the KanR cassette (754 bp). Lanes 2, 3, 5 and 6 show four clones "cured" by transformation with plasmid pCP20. The cured strain was termed MG1655_ΔmetT::frt. D) Verification of knockout of metU and replacement with npt in MG1655_ΔmetT::frt. Left: knockout of the metU gene in eight clones (2 clones each for pMEc1-4) was confirmed via PCR with oligos priming inside and downstream of the metU gene locus (PB328 + PB662, PCR product = 441 bp). Right: integration of the KanR cassette (npt) in place of metU in the same clones was verified with an internal primer directed inside the npt gene and an external one downstream of the metU gene (PB329 + PB662, PCR product = 981 bp). The resulting strain is called MG1655_ΔmetT::frt_ΔmetU::npt (pMEc).

Figure S6. Sequencing results of site-directed saturation mutagenesis at position M169 in pfkA. The M169NNK mutation is highlighted in the box. The sequencing chromatogram is represented using the software ApE (by M. Wayne Davis). Green = A; Blue = C; Red = T; Black = G. The randomization frequencies at positions N1 and N2 are higher for the nucleotide "A".

Figure S8. Sequencing results of the pfkA(M169NNK) library extracted from RL257 colonies grown on M9 + 0.2% glucose. The randomized codon (169NNK) in the pfkA library corresponded to the methionine ATG codon in all the pooled RL257 colonies. The sequencing chromatogram is represented using the software ApE (by M. Wayne Davis).