Emerging engineering principles for yield improvement in microbial cell design

Metabolic Engineering has undertaken a rapid transformation in the last ten years making real progress towards the production of a wide range of molecules and fine chemicals using a designed cellular host. However, the maximization of product yields through pathway optimization is a constant and central challenge of this field. Traditional methods used to improve the production of target compounds from engineered biosynthetic pathways in non-native hosts include: codon usage optimization, elimination of the accumulation of toxic intermediates or byproducts, enhanced production of rate-limiting enzymes, selection of appropriate promoter and ribosome binding sites, application of directed evolution of enzymes, and chassis re-circuit. Overall, these approaches tend to be specific for each engineering project rather than a systematic practice based on a more generalizable strategy. In this mini-review, we highlight some novel and extensive approaches and tools intended to address the improvement of a target product formation, founded in sophisticated principles such as dynamic control, pathway genes modularization, and flux modeling.


Introduction
The utilization of industrial biotechnology for the conversion of energy, chemicals and materials to value-added products is centuriesold. Traditional methods of strain development based on evolution, random mutagenesis, mating and selection strategies were successfully used for the optimization of production hosts. However, early applications of microbial production of chemicals were limited to native metabolites, such as amino acids, alcohols, organic acids, fatty acids or antibiotics. The foundational development of genetic and metabolic engineering made possible the production of a wide and diverse range of molecules including biofuels, pharmaceuticals, biopolymers, precursors and specialty chemicals [1]. By definition, metabolic engineering is the field that involves the construction, redirection, and manipulation of cellular metabolism through the alteration of endogenous and/or heterologous enzyme activities and levels to achieve the biosynthesis or biocatalysis of desired molecules [2]. Thus, the design of a cell factory requires a thoughtful understanding of the metabolic reactions involved in the synthesis of a target product, the consideration of the regulatory elements that affect metabolic throughput and the analysis of interconnectedness of cellular metabolism. In this sense, the development of the collectively named "omics technologies" has prompted the progress of metabolic engineering in the past decades: 1) the improvements in genome sequencing and DNA synthesis technologies; 2) the expansion of gene expression, metabolic reactions and enzyme structures databases; 3) the generation of new genetic tools to exert a strict control over metabolic pathways; 4) the setting up of new analytical methods to detect and quantify RNA, protein and cell metabolites; and 5) the creation of detailed biological models aided to the design of enzymes and metabolic pathways [3].
The potential of metabolic engineering could be exemplified by the early successful heterologous production of polyketide compounds. Polyketides are an important class of natural products with complex chemical structures widely used as antibiotics, immunosuppressant, antitumor, antifungal and antiparasitic agents [4]. These biomolecules are synthesized from simple building blocks such as acetyl coenzyme A (acetyl-CoA), propionyl-CoA, malonyl-CoA, and methylmalonyl-CoA through the action of multienzyme complexes or large modular megasynthases called PKSs [5]. The structural complexity of polyketides often precludes the development of practical chemical synthetic routes, leaving fermentation as the only viable source for the commercial production of these pharmaceutically and agriculturally useful agents. Since most polyketide-producing organisms are often difficult to culture at industrial scale, and the genetic tools available for them are scarce, the use of a more genetically and physiologically tractable heterologous host became a reasonable alternative. A significant achievement in this field was the production of the 6-deoxyerythronolide B (6dEB) aglycone in an engineered strain of Escherichia coli [6]. This was followed by the successful biosynthesis of yersiniabactin [7], a polyketidenonribosomal peptide hybrid, and an ansamycin polyketide precursor [8]. These results provide the platform, not only for the expression of new clusters of PKS genes but also to greatly enhance the rate at which combinatorial biosynthesis tools are developed and PKS are engineered. Remarkably, the expression of 23 foreign genes in E. coli enabled the conversion of 6dEB into the bioactive erythromycin analogs Ery C and Ery D. Further chassis optimization provided a robust proof of principle that mature erythromycin analogs can be efficiently produced in a single heterologous expression system [9,10].
Nevertheless, metabolic engineering is not only restricted to "compound manufacture". Nielsen J. grouped different examples of metabolic engineering into the following categories [11]: 1) CSBJ Abstract: Metabolic Engineering has undertaken a rapid transformation in the last ten years making real progress towards the production of a wide range of molecules and fine chemicals using a designed cellular host. However, the maximization of product yields through pathway optimization is a constant and central challenge of this field. Traditional methods used to improve the production of target compounds from engineered biosynthetic pathways in non-native hosts include: codon usage optimization, elimination of the accumulation of toxic intermediates or byproducts, enhanced production of rate-limiting enzymes, selection of appropriate promoter and ribosome binding sites, application of directed evolution of enzymes, and chassis re-circuit. Overall, these approaches tend to be specific for each engineering project rather than a systematic practice based on a more generalizable strategy. In this mini-review, we highlight some novel and extensive approaches and tools intended to address the improvement of a target product formation, founded in sophisticated principles such as dynamic control, pathway genes modularization, and flux modeling.

Emerging engineering principles for yield improvement in microbial cell design
Santiago Comba a , Ana Arabolaza a,* , Hugo Gramajo a heterologous protein production; 2) extension of substrate range; 3) creation of pathways leading to new products; 4) degradation of xenobiotics; 5) engineering of cellular physiology for process improvement; 6) elimination or reduction of byproduct formation; 7) improvement of yield and productivity.
Maximization of product yields through pathway optimization is a central challenge for every metabolic engineering project [3]. Natural metabolic pathways are controlled by myriad of regulatory systems such as global and specific transcription factors, promoter strength, biochemical regulation of enzymes and substrate availability, among others. Metabolic engineers can potentially repurpose these features to modulate pathway components in order to improve the efficiency of product formation. In this sense, the ability to tune pathways has improved as the fundamental principles of metabolism and biological regulation continue to be discovered [12]. Because the yield and productivity of a process are linked to its commercial viability, the optimization of production of a target metabolite often requires precisely controlled expression of several natural and/or heterologous genes to avoid that individual conversion steps in the pathway limit the desired product yield, ensuring that cellular resources and energy are being efficiently utilized.
In essence, pathway optimization is a multivariate problem and the answer of this challenge is not universal and depends on the pathway to be expressed, the compound to be produced, the host, the availability of genetic tools, and the substrates utilized. In this context, the ability of the metabolic engineer to identify potential bottlenecks and eliminate them is critical and has a decisive impact on project success. This mini-review will dig into the existing novel approaches intended to improve the efficiency of metabolite-compounds production in designed microbial cells.

Dynamic control of biosynthetic pathways to balance metabolism
The simple overexpression of a biosynthetic pathway often is not enough to achieve high yield production of the desired compound. In this context, the depletion of essential cellular precursors (for the production of needless RNAs, proteins or metabolites) and the accumulation of intermediates can cause toxic effects on the host and lead therefore to a decrease in productivity [13][14][15]. On the other hand, the suboptimal expression of the genes coding the rate-limiting enzymes of a metabolic pathway will also limit the production of the target compound. Metabolic flux imbalances are always governing the yield of a desired product, therefore methods to tightly control gene expression and/or protein levels are essential in the design of any metabolic engineering approach. Besides well known "static" methods or devices used to regulate gene transcription and RNA translation, such as modification of promoter strength, customization of RBS and modulation of RNA stability [16]; novel and more flexible strategies are being designed to exert a dynamic control over the heterologous systems. For example, dynamic regulation would allow an organism to adapt its metabolic flux to changes within the host or in its environment in real time [17]. An even better regulation system for an engineered pathway would sense the concentration of critical pathway intermediates and dynamically regulate the production and consumption of the intermediates, which would allow the delivery of intermediates at the appropriate levels and rates, in order to optimize the pathway for its highest productivity as conditions change in the cell's environment [18]. The common principle of these novel strategies is the application of "sensor-actuator systems" based either in proteins (specifically transcription factors) or in RNA-regulatory elements that could exert a fine and tunable control over the level of gene expression in a concerted manner by sensing the intracellular pool of a specific metabolite.
Farmer and Liao introduced this idea by demonstrating that dynamic control of a flux could improve yields and productivity of lycopene production with an engineered E. coli strain [19]. The system was design to sense an excess of glucose by using an acetyl phosphate-activated transcription factor and its target promoter. Excess glucose flux and diversion of carbon to acetate formation increased the intracellular concentration of acetyl phosphate and reduced the productivity of lycopene. Thus, when acetyl phosphate accumulates inside the cell the transcription of two genes of the engineered pathway was up-regulated in order to redirect carbon flux from acetate to lycopene. This engineered strain produced 180-fold higher titers of lycopene than the strain that contains the biosynthetic genes under control of tac promoter [19]. Recently, Zhang et al.
described a dynamic sensor-regulator system (DSRS) to optimize a fatty acid-derivative compound production in E. coli [18]. DSRS is based on a ligand-responsive transcription factor that regulates the expression levels of genes involved in fatty acid ethyl ester (FAEE) production. In this system, a biosensor was design in order to coordinately regulate the expression of the enzymatic steps that provide ethanol and condense it with fatty acyl-CoA according to the availability of fatty acids in the cell (Figure 1). This biosensor is based on the E. coli FadR transcription factor, capable of binding fatty acyl-CoA molecules, and on hybrid promoters created by the combination of FadR and LacI responsive elements [18]. The FAEE biosynthetic pathway contains three modules. Module A uses the native E. coli fatty acid biosynthetic pathway and a thioesterase, TesA, to produce free fatty acids. Module B is an ethanol biosynthetic pathway that converts pyruvate into ethanol. Module C contains an acyl-CoA synthase (FadD) and a wax-ester synthase (AtfA). AtfA condenses the fatty acyl-CoA product (generated from module A plus FadD activity) and ethanol (module B) to form FAEEs (Figure 1). Following this approach, 20 different strains were constructed and the best ones reached a productivity of 1.5 g/l after 3 days of incubation, corresponding to 28% of the maximum FAEE theoretical yield [18].
Up to date, there are few published examples of engineered dynamic control of fluxes in heterologous pathways. However, in general, this strategy can be extended to design biosensors and regulatory systems for other molecules and metabolic pathways using the large pool of natural ligand-responsive transcription factors [20]. These systems should allow to sense in real time the metabolism of the host cells and adjust the expression of the biosynthetic machinery in such a way to prevent toxicity and maximize the efficiency of product formation.
The above mentioned mechanisms were based on proteins which are involved in regulating gene expression at the transcriptional level. However, as it is widely known, the modulation of gene expression can be achieved by intervening at post transcriptional stages. One remarkable example of a biological mechanism that has been successfully adapted to control the expression of engineered systems is based on the use of RNA-regulatory elements called riboswitches. These natural elements can recognize a variety of structurally diverse ligands, controlling gene expression in a concentration dependent manner [21]. Moreover, RNA elements can be subjected to successive rounds of mutagenesis and selection in order to generate synthetic riboswitches that could respond to a desired compound. Modularization and systematization of metabolic engineering As we already mentioned, classical optimization of metabolic pathways is achieved by modification of promoters, RBS, codon composition, copy number, construction of operons, etc [26]. However, these approaches can rarely be extrapolated among different hosts since the genetic toolboxes available for each microorganism differ greatly and the understanding of regulatory mechanisms prevailing into the defined cell host is often incomplete. This scenario thus implies that an almost complete optimizing strategy has to be designed ad hoc for each new metabolic engineering project, making this discipline costly and very time-consuming [27].
In order to achieve the same flexibility of chemical engineering for the production of structurally diverse molecules, metabolic engineering need to establish principles of systematization to achieve their goals in terms of production and optimization. An excellent application of this concept was applied to the design and development of engineered pathways for the production of terpenoid derived moieties.
Studies addressing the heterologous pathway optimization of isoprenoid-derived molecules on E. coli resulted in innovative and interesting advances not only for their particular yield improvement, but also because they laid down strategies capable to systematize metabolic engineering approaches, generating universal principles capable of been transferred to other pathways and/or hosts [27]. Chronologically, this case moved from particular to more universal optimization principles developed and applied in order to maximize product titers of these pathways. Due to their polymer-forming functional groups, the natural terpenoid family of products is composed of very diverse functional and structural molecules, ranging from antioxidant pigments to therapeutic drugs. Amorphadiene (amorpha-4,11-diene) and taxadiene (taxa-4,11-diene) are precursors for the synthesis of the antimalarial compound artemisinin and the anticancer drug taxol, respectively. In nature, terpenoids are synthesized by one of two possible routes: the mevalonate (MVA) pathway, which starts with the condensation of two acetyl-CoA and is present in eukaryotic cells, and the non-mevalonate 2-methyl-(D)erythriol-4-phosphate (MEP) pathway, which initiates with the condensation of glyceraldehyde-3-phosphate and pyruvate, and can be found in bacteria and plant plastids [27].
First, amorphadiene production was achieved by heterologous expression of the MVA pathway from yeast and a codon-optimized amorphadiene synthase gene from Artemisia annua, in E. coli [28].
Although this system could produce significant levels of the precursor moiety, it was far from the theoretical maximum yield. Further optimization of amorphadiene titers was done by Anthony et al. [26]. Importantly, in this study the authors gave the first outline for systematization of amorphadiene metabolic engineering. They constructed a plasmid in which combinations of various promoters and operon constructs could be easily substituted and tested. In this way they addressed the problems of having many plasmids for high level of gene expression in the same cell (i.e., the metabolic burden imposed to the cell because of the active synthesis of plasmid DNA, and the corresponding proteins for antibiotic resistance [29]; and the flux imbalances between the segments of the pathway which are in different replicons due to plasmid instability and fluctuations in copy numbers). Using this backbone they were able to combine on one vector two transcriptional units previously located in different plasmids, increasing the amorphadiene titers by 3-fold. Furthermore, they were able to identify the limiting enzymatic step by overexpressing one by one each gene of the pathway and quantifying the final amount of amorphadiene being produced. This approach constitutes a clear demonstration in which classical optimization techniques can be used in a systematic way, providing a strategy that can be transferred to other systems.
Ajikumar et al., went further with systematization and they laid down a novel approach known as "multivariate modular metabolic modeling" (MMME) [30]. They developed a system for efficient taxadiene synthesis in E. coli by expressing the native host MEP pathway, the geranylgeranyl diphosphate (GGPP) synthase and the taxadiene synthase. The MEP pathway has the advantage over the MEV pathway in being more balanced and efficient in producing isoprenoids from sugars. The MMME approach is based on grouping enzymes with similar turnovers in modules, and then doing a combinatorial screening varying the expression of the different modules. For this, operons are constructed and the toolbox of promoters, RBS and plasmid vector can be used to achieve different expression levels which will maximize the product titers of the system. The combinatorial approach can easily circumvent the non-linear interactions between enzymes in the pathway and offer the opportunity to broadly sample parameter space [30]. In this way, the authors constructed the "upstream module", an operon coding for the genes of the E. coli MPE pathway, which generates IPP (isopentenyl pyrophosphate) and DMAPP (dimethylallyl pyrophosphate); and the "downstream module", a GGPP synthase and taxadiene synthase biscistronic transcriptional unit. Thus, instead of dealing with six genes, the problem is reduced to two modules, considerably simplifying the system, making it more tractable and easy to handle. In this way, the carbon flux balance along the pathway is achieved by changing the expression levels (plasmid copy number or promoter strength) of each module, and doing a combinatorial screening of all the possible combinations ( Figure 2). Using this approach, the authors needed to construct 32 strains in order to obtain a substantial increase of 15,000 times in taxadiene titers over the control strain without modularization. In this case, this was enough to carefully balance the flux trough the heterologous pathway, avoiding the accumulation of toxic intermediaries and matching the output with the input of the upstream and downstream modules respectively. It is worth to mention that the same combinatorial approach without modularization involves the tuning of eight genes for MEP pathway and two more to reach taxadiene, implying the construction of at least 10,000 strains and the necessity of having a high-throughput screening method.
Despite its outstanding medical and industrial relevance, the progress in metabolic engineering of this family of products is laying down the basis for systematization of metabolic engineering in general. Principles of codon-adjustment, identification of pathway bottlenecks and, more recently, modularization and combinatorial experimentation can be transferred and reused to perform the heterologous production of completely different products. These principles, together with the ever declining costs of DNA synthesis and sequencing, and the development of computational software for analyzing and predicting the behavior of these heterologous systems will lead microbial metabolic engineering to be a systematic and flexible tool for production of an expanding array of molecules for pharmaceutical and other industrial uses.

Metabolic network flux analysis for yield improvements
The last two decades encompassed what is called "post-genomics era" of metabolic engineering, an overwhelming compendium of data coming from genomics, transcriptomics, proteomics and more recently, metabolomics fields. In this context, the paradigm of metabolic engineering has clearly shifted away from one focused in perturbing individual pathways, to one which considers cell reactions in their entirety, and which integrates metabolic (native and heterologous) pathways in a genome-scale metabolic network.
Metabolic engineering makes particular emphasis in the metabolic fluxes and their control under in vivo conditions in order to maximize product formation [2]. Since empirical determination of the wholecell metabolic fluxes is an almost impossible task, they have to be inferred from constraint based computational simulations that include measurable parameters (i.e. substrate uptake, growth rate) in their equations. Metabolic networks are described as a collection of the biochemical reactions that occur in the cell. Each of these reactions is linked to the plethora of putative enzymes found in the organism genome [31]. In the flux balance analysis (FBA) approaches, the set of reactions is displayed as a stoichiometric matrix, in which each reaction has an assigned constraint that defines the intervening molecules, the direction and the flow through it [32,33]. Once defined the laying metabolic network, FBA seeks to maximize or minimize an objective function, which can be any linear combination of fluxes. Classically, the objective function for microorganisms is to maximize biomass-related reactions, i.e. simulating maximum growth. Thus by knowing a restricted set of measurable parameters, quantitative predictions and testable hypothesis can be generated [34]. By altering the constraints in the stoichiometric matrix, FBA can simulate different media or gene deletions, calculate the new metabolic flux distribution and predict the concomitant parameters of growth and yield [33,35,36]. While it is generally accepted that microorganisms metabolism is suited to achieve optimal growth, this may not be the case of a mutant organism obtained under laboratory conditions. Frequently, when a Figure 2. The genes of the biosynthetic pathway are grouped into modules 1 and 2 and different promoters are cloned upstream of each one. The generated strains contain all the possible combinations between the constructs. All these strains are then tested for the production of the target compound and the yield of product is plotted as a function of both modules.
Yield improvement in microbial cell design mutation is introduced, the topology and flux distribution of a metabolic network is so perturbed that the mutant organism does not follow the optimal growth principle. In such cases, the objective of "minimization of metabolic adjustment" (MOMA) is better suited for predicting growth and product yields, since it is not based on an optimal growth objective function, but in the concept that the mutant metabolism remains initially as close as possible to the wild type optimum in terms of flux values [34]. In this context, MOMA uses the same stoichiometric matrix as FBA, but the solution is a metabolic flux distribution which is an intermediate scenario between the optimal wild type and the optimal mutant flux distribution [34].
These two different approaches to analyze metabolic networks are frequently complementary, and understanding the general principles and the biological interpretation of each method provide an extremely powerful advantage to the metabolic engineer [ . More recently, malonyl-CoA supply for heterologous production of target molecules was improved by means of genome-scale metabolic modeling analysis [40]. Malonyl-CoA is an essential building block for the production of many industrial and medical relevant polyketides [6,42] and flavanones [43], and also the starting point for microdiesel [44]. Since malonyl-CoA is also the precursor for the essential fatty acid biosynthesis, the flux towards it is tightly controlled and the cell will not easily commit to overproduce recombinant molecules derived from such essential metabolites [45]. Many researchers have improved the availability of malonyl-CoA by tuning enzymatic steps that replenish or deplete malonyl-CoA pools [46]. However, Xu et al., provide the first rational approach of in silico analysis of carbon metabolism in E. coli for redirecting carbon flux to malonyl-CoA, in order to optimize the synthesis of the flavanone naringenin [40]. They used OptFORCE algorithm to identify the minimal set of genetic interventions that led to maximization of product titers. The main targets identified were: A) up-regulation of glycolytic enzymes; B) down-regulation of tricarboxylic acid flux; C) increased pyruvate dehydrogenase (PDH) and acetyl-CoA carboxylase (ACC) activities. In this way, the authors improved the production of naringenin by 560% over the current state of art by introducing a total of five genetic interventions: overexpression of ACC, PDH, glyceraldehyde-3-phosphate dehydrogenase and phosphoglycerate kinase, and a knockout of the succinyl-CoA synthetase and fumarase enzymes.
Competition with native pathways and metabolites, accumulation of side products and consequent growth inhibition are only some of the possible effects that a heterologous pathway can have on cell metabolism. These examples are a clear demonstration of how metabolic networks and metabolic flux analysis can be used to identify non-trivial targets for genetic interventions that lead to further optimization of product titers. In silico simulation tools open the field of metabolic engineering possibilities that can be explored to circumvent the problems that arise when expressing a foreign pathway, contributing to accelerate the design cycle of metabolic engineering and leading to the cost-effective production of industrial and medical important molecules.

Perspectives
The rapid rate of fossil resources consumption has drawn the attention towards the development of an economy based on renewable biological raw materials. In this context, the term "Biorefineries" is used to describe biological entities capable of generating high valueaggregated directly from biomass, in an analogy with petroleum-based refineries [47]. Metabolic engineering thus appears as a key discipline, modifying and creating metabolic pathways in microbial hosts in order to produce complex chemicals of therapeutic and industrial relevance [26]. However, these pathways often alter significantly the hosts native metabolic network, by either increasing toxic metabolite levels or depleting essential building blocks for the cell, and thus resulting in sub-optimal yield of the desired product [12].
The intersection of metabolic engineering with other emerging areas of systems and synthetic biology presents exciting opportunities to develop solutions to many of the global challenges we face nowadays. In particular, these disciplines will be of fundamental importance to promote sustainable energy, reduce the environmental impact of several industrial processes and to improve the cell factory production of natural and semisynthetic drugs that are currently synthesized at very low yields, hopefully having an impact in the costs and availability of medicines.