Programming cells: towards an automated ‘Genetic Compiler’

https://doi.org/10.1016/j.copbio.2010.07.005

One of the visions of synthetic biology is to be able to program cells using a language that is similar to that used to program computers or robots. For large genetic programs, keeping track of the DNA at the level of nucleotides becomes tedious and error-prone, requiring a new generation of computer-aided design (CAD) software. To increase the size of projects, it is important to abstract the designer from the process of part selection and optimization. The vision is to specify genetic programs in a higher-level language, which a genetic compiler could automatically convert into a DNA sequence. Steps towards this goal include: defining the semantics of the higher-level language, algorithms to select and assemble parts, and biophysical methods to link DNA sequence to function. These will be coupled to graphic design interfaces and simulation packages to aid in the prediction of program dynamics, optimize genes, and scan projects for errors.

Introduction

Numerous genetic circuits have been built that encode functions analogous to those of electronic circuits [1, 2, 3]. When multiple circuits are connected to sensors and actuators, they form a genetic program. For example, we constructed an ‘edge detector’ program that combines an ANDN gate, a light sensor, and cell–cell communication to give bacteria the ability to draw the edge between the light and dark regions of an image projected onto a plate [4]. Other genetic programs have been built that combine circuits to produce a push-on/push-off circuit [5], implement a counter [6••], and reproduce predator–prey dynamics [7]. These represent toy systems, but the implementation of such programs in applications for industrial biotechnology is inevitable.

Automated DNA synthesis gives genetic engineers an unprecedented design capacity [8]. This technology enables the specification of every basepair for long sequences, without having to be concerned about the path to construction. With methods to rapidly combine genetic parts [9••, 10] and assembly methods that scale to whole genomes [11, 12, 13], our capacity for DNA construction has far outpaced our capacity for design [14]. A good example is the 2006 UCSF iGEM team's project to build a ‘remote-controlled bacterium.’ DNA synthesis was used to build the first construct (requiring a few weeks), but four years of additional tinkering followed, and the paper will only be submitted in 2010.

Our ability to design programs has been hampered by three problems. First, there is a lack of good, robust genetic circuits that can be easily connected. Second, there are few design rules that are sufficiently quantitative to be carried out algorithmically. Modeling can be helpful before the experiments to determine the topologies and parameter regimes required to obtain a particular function. However, simulations cannot be used to ‘reach down’ to the DNA and suggest a specific mutation or select a part. Third, the frequency of mistakes in the DNA sequence increases quickly with size. Currently, scanning for potential errors (e.g. transposon insertion sites or putative internal promoters) requires running multiple (usually web-based) programs. To date, there is no unified software package that addresses all of these issues.
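As a minimal illustration of the kind of error scan described above, the following Python sketch searches both strands of a sequence for a small table of motifs. The motif table, the function names, and the idea of matching bare consensus strings are assumptions made for illustration; a real scan would rely on curated, organism-specific models rather than exact string matches.

```python
import re

# Hypothetical motif table for illustration only. Exact consensus matching is a
# crude stand-in for the curated models a real error scan would use.
MOTIFS = {
    "EcoRI site": "GAATTC",
    "sigma70 -10 consensus": "TATAAT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def scan(seq, motifs=MOTIFS):
    """Return sorted (position, strand, name) hits for each motif on both strands."""
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        # A lookahead reports overlapping occurrences as separate hits.
        for m in re.finditer(f"(?={motif})", seq):
            hits.append((m.start(), "+", name))
        rc = reverse_complement(motif)
        if rc != motif:  # palindromic sites would otherwise be reported twice
            for m in re.finditer(f"(?={rc})", seq):
                hits.append((m.start(), "-", name))
    return sorted(hits)


if __name__ == "__main__":
    for position, strand, name in scan("AAGAATTCGGTATAATCCATTATAGG"):
        print(position, strand, name)
```

Positions are 0-based and refer to the forward strand, so a hit on the minus strand marks where the reverse complement of the motif begins in the input sequence.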

The creation of a simulation environment for genetic engineering is complicated by the diversity of cellular functions. When studying natural networks, there is a feeling of ‘peeling an onion,’ where there are seemingly endless redundancies and classes of biochemical interactions. Even within the Registry of Standard Biological Parts (www.partsregistry.org), there is a wide variety of cellular functions: from enzymes and transcription factors to multi-gene gas vesicles and secretion chaperones. Each specific problem requires its own style of simulation; a dynamic program may be well described by sets of differential equations, pattern formation by cellular automata, and enzymes by metabolic flux analysis. It would be daunting to create a simulation package that could encompass all of this diversity.
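To make the differential-equation style of simulation concrete, here is a minimal Python sketch of a single repressed gene, integrated with the forward Euler method. The model (Hill-type repression with first-order decay), the function names, and all parameter values are assumptions chosen for illustration and are not taken from any characterized part.

```python
def hill_repression(repressor, alpha, K, n):
    """Production rate of a promoter repressed with Hill coefficient n."""
    return alpha / (1.0 + (repressor / K) ** n)


def simulate(repressor, alpha=10.0, K=1.0, n=2.0, gamma=1.0, dt=0.01, t_end=20.0):
    """Integrate dp/dt = hill_repression(...) - gamma * p from p(0) = 0.

    All parameters are hypothetical; forward Euler is used for brevity.
    """
    p = 0.0
    for _ in range(int(round(t_end / dt))):
        p += dt * (hill_repression(repressor, alpha, K, n) - gamma * p)
    return p


if __name__ == "__main__":
    for r in (0.0, 1.0, 10.0):
        print(r, simulate(r))
```

For a constant repressor level, the protein level approaches the steady state alpha / (gamma * (1 + (R/K)^n)), which offers a simple check on the integration.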

To reduce the problem complexity and to frame recent computational work, we introduce the concept of a ‘Genetic Compiler,’ whose inputs are high-level instructions (equivalent to VHDL or Verilog) and whose output is a DNA sequence. The sequence can be sent to a company for DNA synthesis or a robot for automated assembly. The problem is constrained by focusing on genetic programs that encode a desired logical or dynamical function, which can be integrated into many applications in biotechnology (Figure 1). This avoids the application-specific portions of the problem; for example, building a butanol sensor or a particular metabolic pathway. It is distinct from tools for protein or metabolic engineering [15].
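The input-to-output mapping of such a compiler can be sketched with a toy example. The Python code below accepts a two-statement language invented here for illustration ('y = NOT x' and 'REPORT y') and emits a concatenated string of parts. The language, the part tables, and every sequence are placeholders and assumptions, not real characterized parts, and the sketch omits the part selection and optimization that a genuine compiler would need.

```python
# Placeholder part library: all names and sequences are invented for illustration.
PROMOTERS = {"x": "AAAAAA", "y": "CCCCCC"}  # promoter whose activity carries each signal
REPRESSORS = {"y": "ATGGGGTAA"}             # CDS whose product switches off promoter "y"
RBS = "AGGAGG"
REPORTER = "ATGTTTTAA"
TERMINATOR = "TTTTTT"


def compile_program(source):
    """Compile lines such as 'y = NOT x' or 'REPORT y' into one DNA string."""
    units = []
    for line in source.strip().splitlines():
        tokens = line.split()
        if len(tokens) == 4 and tokens[1] == "=" and tokens[2] == "NOT":
            out, inp = tokens[0], tokens[3]
            # Inverter: the input promoter drives the repressor of the output promoter.
            units.append(PROMOTERS[inp] + RBS + REPRESSORS[out] + TERMINATOR)
        elif len(tokens) == 2 and tokens[0] == "REPORT":
            # Read-out: the named promoter drives a reporter gene.
            units.append(PROMOTERS[tokens[1]] + RBS + REPORTER + TERMINATOR)
        else:
            raise SyntaxError(f"cannot parse: {line!r}")
    return "".join(units)


if __name__ == "__main__":
    print(compile_program("y = NOT x\nREPORT y"))
```

Each statement becomes one transcription unit (promoter, ribosome binding site, coding sequence, terminator), so the signal passed between statements is promoter activity.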

The scope of this review is on the underlying algorithms and biophysical methods that would power such a compiler (Figure 2). Realizing this goal will require: 1. libraries of reliable genetic circuits designed specifically to be part of a CAD program, 2. the definition of a higher-level language, 3. algorithms to assemble circuits according to a specified program, 4. biophysical methods to connect and optimize circuits, 5. simulation programs to debug the program dynamics, 6. algorithms for DNA assembly and experimental design. The scope has been limited to exclude several topics that are crucial to synthetic biology, but have been well-reviewed elsewhere, notably codon optimization and tools from systems biology and metabolic engineering [15, 16, 17].

Robust combinatorial logic

Combinatorial logic is implemented by Boolean circuits and is the basis for digital computing. It is used to build circuits that apply Boolean algebra to a set of inputs to transform them into a set of desired outputs. Simple circuits can be layered in different configurations in order to achieve a computational operation. This has enabled the automated design that underlies VLSI. The ability of digital circuits to be flexibly used and easily captured by CAD comes at a cost of speed, design
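The layering of simple circuits can be illustrated in a few lines of Python. The sketch below builds NOT, OR, AND, and ANDN from a single NOR primitive and lists their truth tables. The choice of NOR as the primitive and the reading of ANDN as 'a AND (NOT b)' are assumptions made for this example.

```python
from itertools import product


def NOR(a, b):
    """The only primitive gate; every other gate below is layered from it."""
    return not (a or b)


def NOT(a):
    return NOR(a, a)


def OR(a, b):
    return NOT(NOR(a, b))


def AND(a, b):
    return NOR(NOT(a), NOT(b))


def ANDN(a, b):
    """a AND (NOT b), assumed here to be the meaning of an ANDN gate."""
    return NOR(NOT(a), b)


def truth_table(gate):
    """Outputs for inputs (F,F), (F,T), (T,F), (T,T), in that order."""
    return [gate(a, b) for a, b in product([False, True], repeat=2)]


if __name__ == "__main__":
    for gate in (NOR, OR, AND, ANDN):
        print(gate.__name__, truth_table(gate))
```

In this sketch ANDN needs two layers of NOR, and its output is true only when the first input is on and the second is off.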

Conclusions

Genetic engineering is moving towards becoming an information science. The model of storing and distributing genetic material is slowly losing relevance. It is routine to outsource the task of constructing DNA from the designer to synthesis facilities. This has created a strong need for computer-aided design programs that are able to facilitate the organization and construction of large projects. Once the parts are experimentally characterized, it is unnecessary to distribute the DNA. Rather,

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

The authors thank Ron Weiss (MIT), Rahul Sarpeshkar (MIT), Alan Mishchenko (UC-Berkeley), Jean Peccoud (VPI), Costas Maranas (Penn State), and Douglas Densmore (BU) for helpful discussions. CAV is supported by Life Technologies, ONR, Packard Foundation, NIH, NSF (SynBERC: Synthetic Biology Engineering Research Center, www.synberc.org) and a Sandpit on Synthetic Biology hosted by EPSRC/NSF.

References (97)

  • K.F. Murphy et al.

    Combinatorial promoter design for engineering noisy gene expression

    Proc Natl Acad Sci USA

    (2007)
  • J.M. Pedraza et al.

    Noise propagation in gene networks

    Science

    (2005)
  • T. Lu et al.

    A molecular noise generator

    Phys Biol

    (2008)
  • J. Stricker et al.

    A fast, robust and tunable synthetic gene oscillator

    Nature

    (2008)
  • T.S. Ham et al.

    A tightly regulated inducible expression system utilizing the fim inversion recombination switch

    Biotechnol Bioeng

    (2006)
  • H. Salis et al.

    Computer-aided design of modular protein devices: Boolean AND gene activation

    Phys Biol

    (2006)
  • C. Lou et al.

    Synthesizing a novel genetic sequential logic circuit: a push-on push-off switch

    Mol Syst Biol

    (2009)
  • A.E. Friedland et al.

    Synthetic gene networks that count

    Science

    (2009)
  • F.K. Balagadde et al.

    A synthetic Escherichia coli predator–prey system

    Mol Syst Biol

    (2008)
  • M.J. Czar et al.

    Gene synthesis demystified

    Trends Biotechnol

    (2008)
  • D.G. Gibson et al.

    Enzymatic assembly of DNA molecules up to several hundred kilobases

    Nat Methods

    (2009)
  • R.P. Shetty et al.

    Engineering BioBrick vectors from BioBrick parts

    J Biol Eng

    (2008)
  • D.G. Gibson et al.

    One-step assembly in yeast of 25 overlapping DNA fragments to form a complete synthetic Mycoplasma genitalium genome

    Proc Natl Acad Sci USA

    (2008)
  • D.G. Gibson et al.

    Creation of a bacterial cell controlled by a chemically synthesized genome

    Science

    (2010)
  • H.H. Wang et al.

    Programming cells by multiplex genome engineering and accelerated evolution

    Nature

    (2009)
  • P.E. Purnick et al.

    The second wave of synthetic biology: from modules to systems

    Nat Rev

    (2009)
  • B.-K. Cho et al.

    Microbial regulatory and metabolic networks

    Curr Opin Biotechnol

    (2007)
  • S.M. Richardson et al.

    GeneDesign 3.0 is an updated synthetic biology toolkit

    Nucleic Acids Res

    (2010)
  • A. Villalobos et al.

    Gene Designer: a synthetic biology tool for constructing artificial DNA segments

    BMC Bioinformatics

    (2006)
  • K.I. Ramalingam et al.

    Forward engineering of synthetic bio-logical AND gates

    Biochem Eng J

    (2009)
  • A. Kinkhabwala et al.

    Uncovering cis regulatory codes using synthetic promoter shuffling

    PLoS ONE

    (2008)
  • R.S. Cox et al.

    Reprogramming gene expression with combinatorial promoters

    Mol Syst Biol

    (2007)
  • O. Rackham et al.

    Synthesizing cellular networks from evolved ribosome-mRNA pairs

    Biochem Soc Trans

    (2006)
  • Y. Benenson

    RNA-based computation in live cells

    Curr Opin Biotechnol

    (2009)
  • V. Sharma et al.

    Engineering complex riboswitch regulation by dual genetic selection

    J Am Chem Soc

    (2008)
  • J.E. Dueber et al.

    Reprogramming control of an allosteric signaling switch through modular recombination

    Science

    (2003)
  • N.E. Buchler et al.

    On schemes of combinatorial transcription logic

    Proc Natl Acad Sci USA

    (2003)
  • Y. Yokobayashi et al.

    Directed evolution of a genetic circuit

    Proc Natl Acad Sci USA

    (2002)
  • B. Canton et al.

    Refinement and standardization of synthetic biological parts and devices

    Nat Biotechnol

    (2008)
  • H.M. Salis et al.

    Automated design of synthetic ribosome binding sites to control protein expression

    Nat Biotechnol

    (2009)
  • C. Tan et al.

    Emergent bistability by a growth-modulating positive feedback circuit

    Nat Chem Biol

    (2009)
  • S. Hooshangi et al.

    Ultrasensitivity and noise propagation in a synthetic transcriptional cascade

    Proc Natl Acad Sci USA

    (2005)
  • E.S. Groban et al.

    Kinetic buffering cross talk between bacterial two-component systems

    J Mol Biol

    (2009)
  • Y. Cai et al.

    A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts

    Bioinformatics

    (2007)
  • Y. Cai et al.

    Modeling structure–function relationships in synthetic DNA sequences using attribute grammars

    PLoS Comput Biol

    (2009)
  • Densmore D, Kittleson JT, Bilitchenko L, Liu A, Anderson JC, Rule based constraints for the construction of genetic...
  • M.J. Czar et al.

    Writing DNA with GenoCAD

    Nucleic Acids Res

    (2009)
  • Y. Cai et al.

    GenoCAD for iGEM: a grammatical approach to the design of standard-compliant constructs

    Nucleic Acids Res

    (2010)