Expansion of the genetic code via expansion of the genetic alphabet

https://doi.org/10.1016/j.cbpa.2018.08.009

Highlights

  • An unnatural base pair has been developed and used as the foundation of semi-synthetic organisms.

  • The semi-synthetic organisms can use the unnatural base pair to efficiently make unnatural proteins.

  • Hydrophobic and packing interactions can replace hydrogen bonds to control base pairing.

Current methods to expand the genetic code enable site-specific incorporation of non-canonical amino acids (ncAAs) into proteins in eukaryotic and prokaryotic cells. However, current methods are limited by the number of codons possible, their orthogonality, and possibly their effects on protein synthesis and folding. An alternative approach relies on unnatural base pairs to create a virtually unlimited number of genuinely new codons that are efficiently translated and highly orthogonal because they direct ncAA incorporation using forces other than the complementary hydrogen bonds employed by their natural counterparts. This review outlines progress and achievements made towards developing a functional unnatural base pair and its use to generate semi-synthetic organisms with an expanded genetic alphabet that serves as the basis of an expanded genetic code.

Introduction

Biological diversity allows life to adapt to different environments and, over time, to evolve new forms and functions. The source of this diversity is the variation within protein sequences provided by the twenty natural amino acids, variation that is encoded in an organism’s genome by the four natural DNA nucleotides. Although the functional diversity provided by the natural amino acids may be high, the vastness of sequence space dramatically limits what might actually be explored, and moreover, some functionality is simply not available. Nature’s use of cofactors for hydride transfer, redox activity, electrophilic bond formation, and so on attests to these limitations. Furthermore, with the increasing focus on developing proteins as therapeutics [1], these limitations are problematic, as the physicochemical diversity of the natural amino acids is dramatically restricted compared to that of the small-molecule drugs designed by chemists. In principle, it should be possible to circumvent these limitations by expanding the genetic code to include additional, non-canonical amino acids (ncAAs) with desired physicochemical properties.

Almost 20 years ago, Peter Schultz increased the diversity available to living organisms by expanding the genetic code using the amber stop codon (UAG) to encode ncAAs in Escherichia coli [2••,3••]. This landmark accomplishment was achieved using a tRNA–aminoacyl-tRNA synthetase (aaRS) pair from Methanococcus jannaschii, in which the tRNA was recoded to suppress the stop codon and the aaRS was evolved to charge the tRNA with an ncAA. This method of codon suppression has since been extended to the other stop codons [4] and even to quadruplet codons [5], as well as to several other orthogonal tRNA–aaRS pairs (most notably the pyrrolysyl (Pyl) tRNA–synthetase pair from Methanosarcina barkeri/mazei [6,7,8,9]), broadening the scope of ncAAs that may be incorporated into proteins. These methods have already begun to revolutionize both chemical biology [10,11,12] and protein therapeutics [13].
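The logic of amber suppression can be illustrated with a toy decoder. The following is a minimal sketch, not a model of the actual experiments: the codon table is deliberately partial, the example mRNA is made up, and the one-letter symbol "Z" for the ncAA is a hypothetical placeholder. It also assumes perfectly efficient suppression, ignoring any competition at the stop codon.

```python
# Partial codon table: only the codons used in the example below.
# "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GCU": "A",  # alanine
    "AAA": "K",  # lysine
    "UAG": "*",  # amber stop
    "UAA": "*",  # ochre stop
    "UGA": "*",  # opal stop
}


def translate(mrna, suppress_amber=False, ncaa="Z"):
    """Translate an mRNA string codon by codon.

    If suppress_amber is True, UAG is read as the ncAA (placeholder
    symbol `ncaa`) instead of terminating translation.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and suppress_amber:
            protein.append(ncaa)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # a stop codon ends translation
        protein.append(residue)
    return "".join(protein)


example_mrna = "AUGGCUUAGAAAUAA"  # hypothetical message with an in-frame UAG
truncated = translate(example_mrna)
full_length = translate(example_mrna, suppress_amber=True)
print(truncated)    # MA
print(full_length)  # MAZK
```

Without suppression the in-frame UAG truncates the product; with suppression the same message yields a full-length protein carrying the ncAA at the chosen site.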

Though these methods enable incorporation of up to two different ncAAs in both prokaryotic [14] and eukaryotic [15] cells, the heterologous recoded tRNAs must compete with endogenous release factors (RFs), or in the case of quadruplet codons, with normal decoding [16], which limits the efficiency and fidelity of ncAA incorporation. To eliminate competition with RF1, which recognizes the amber stop codon and terminates translation, efforts have been directed toward removal of many or all instances of the amber stop codon in the host genome [17,18] or modification of RF2 [19] to allow for the deletion of RF1. However, eukaryotes have only one release factor, and while it may be modified [20], it cannot be deleted; in prokaryotes, deletion of RF1 results in greater mis-suppression of the amber stop codon by other tRNAs, which reduces the fidelity of ncAA incorporation [21]. Though Herculean efforts to further exploit codon redundancy to liberate natural codons for reassignment to ncAAs are underway [22], codon reassignment may be complicated by pleiotropic effects, as codons are not truly redundant, for example due to their effects on the rate of translation and protein folding [23]. In addition, codon reappropriation is limited by the challenges of large-scale genome engineering, especially for eukaryotes [24].

An alternative approach to natural codon reassignment is the creation of entirely new codons that are free of any natural function or constraint, and whose recognition at the ribosome is inherently more orthogonal. This may be accomplished through the creation of organisms that harbor a fifth and sixth nucleotide that form an unnatural base pair (UBP). Such semi-synthetic organisms (SSOs) would need to faithfully replicate DNA containing the UBP, efficiently transcribe it into mRNA and tRNA containing the unnatural nucleotides, and then efficiently decode unnatural codons with cognate unnatural anticodons. Such SSOs would have a virtually unlimited number of new codons to encode ncAAs.
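The scale of this expansion follows from simple counting. As a rough sketch, the snippet below enumerates triplet codons over the four natural RNA letters and over a six-letter alphabet; the letters "X" and "Y" are hypothetical placeholders for the two unnatural nucleotides, and the count is purely combinatorial, so it says nothing about which of the new codons would actually be decoded efficiently.

```python
from itertools import product

NATURAL = "ACGU"
UNNATURAL = "XY"  # hypothetical placeholder letters for the two unnatural nucleotides


def codons(alphabet, length=3):
    """Return every codon of the given length over the alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


natural_codons = codons(NATURAL)
expanded_codons = codons(NATURAL + UNNATURAL)

# New codons are those containing at least one unnatural letter.
new_codons = [c for c in expanded_codons if any(b in UNNATURAL for b in c)]

print(len(natural_codons))   # 64
print(len(expanded_codons))  # 216
print(len(new_codons))       # 152
```

A single UBP therefore adds, in principle, 152 triplet codons to the 64 natural ones, and each additional UBP would enlarge the set further.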

Section snippets

Development of a functional UBP

The first challenge in developing an expanded genetic alphabet is the identification of unnatural nucleotides that selectively pair in duplex DNA and during replication by a DNA polymerase. Although the Benner group (who has contributed to this special issue) has approached this challenge with synthetic nucleotides that pair via hydrogen bonding (H-bonding) patterns that are orthogonal to the natural base pairs [25], our group, as well as the Hirao group (who has also contributed to this

An SSO with an expanded genetic alphabet

E. coli was an obvious target for the creation of the first SSO, being an extensively studied organism with vast resources and tools for genetic manipulation. To overcome the crucial obstacle of making unnatural triphosphates available for replication within the SSO, we turned to the plastids of Phaeodactylum tricornutum that use nucleoside triphosphate transporters (NTTs) to import triphosphates from the cytosol. Expression of the transporter PtNTT2 enables uptake of a wide variety of natural

An SSO with an expanded genetic code

Having developed SSOs with an expanded genetic alphabet, we turned our attention to using the expanded alphabet as the basis of an expanded code (Figure 3b). In addition, in order to explore whether the UBP on its own could productively participate in every step of information storage and retrieval, despite completely lacking hydrogen bonds, our initial efforts employed an SSO that did not benefit from either the error elimination or avoidance mechanisms described above [44••]. We incorporated

Conclusions and future perspectives

Amber suppression represented a milestone in the biological sciences, but it is not without its limitations, and efforts to overcome these limitations are challenged by fidelity, generality, cost, and possibly pleiotropic effects. The approach based on the development of UBPs requires a significant amount of chemical optimization, but it should be more portable to other host organisms, as it does not require significant genome modification, and it may even prove more optimal due to increased

Conflict of interest statement

Patent applications have been filed by Synthorx and The Scripps Research Institute covering the UBPs and their use to produce proteins containing ncAAs. FER also has shares in Synthorx, Inc., a company that has commercial interests in the UBP.

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

This work was supported by the National Institutes of Health (GM118178 to FER and GM128376 to RJK) and the National Science Foundation (Graduate Research Fellowship NSF/DGE-1346837 to SEM).

References (44)

  • H. Neumann et al.

    Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome

    Nature

    (2010)
  • G. Srinivasan et al.

    Pyrrolysine encoded by UAG in Archaea: charging of a UAG-decoding specialized tRNA

    Science

    (2002)
  • W. Wan et al.

    Pyrrolysyl-tRNA synthetase: an ordinary enzyme but an outstanding genetic code expansion tool

    Biochim Biophys Acta

    (2014)
  • J.C.W. Willis et al.

    Mutually orthogonal pyrrolysyl-tRNA synthetase/tRNA pairs

    Nat Chem

    (2018)
  • I. Coin

    Application of non-canonical crosslinking amino acids to study protein–protein interactions in live cells

    Curr Opin Chem Biol

    (2018)
  • R.E. Kelemen et al.

    Synthesis at the interface of virology and genetic code expansion

    Curr Opin Chem Biol

    (2018)
  • T. Courtney et al.

    Recent advances in the optical control of protein function through genetic code expansion

    Curr Opin Chem Biol

    (2018)
  • T. Mukai et al.

    Codon reassignment in the Escherichia coli genetic code

    Nucleic Acids Res

    (2010)
  • D.B. Johnson et al.

    RF1 knockout allows ribosomal incorporation of unnatural amino acids at multiple sites

    Nat Chem Biol

    (2011)
  • W.H. Schmied et al.

    Efficient multisite unnatural amino acid incorporation in mammalian cells via optimized pyrrolysyl tRNA synthetase/tRNA expression and engineered eRF1

    J Am Chem Soc

    (2014)
  • H.R. Aerni et al.

    Revealing the amino acid composition of proteins within an expanded genetic code

    Nucleic Acids Res

    (2015)
  • N. Ostrov et al.

    Design, synthesis, and testing toward a 57-codon genome

    Science

    (2016)
1 These authors contributed equally.
