A mathematical framework for yield (vs. rate) optimization in constraint-based modeling and applications in metabolic engineering

Background: The optimization of metabolic rates (as linear objective functions) represents the methodical core of flux-balance analysis techniques which have become a standard tool for the study of genome-scale metabolic models. Besides (growth and synthesis) rates, metabolic yields are key parameters for the characterization of biochemical transformation processes, especially in the context of biotechnological applications. However, yields are ratios of rates, and hence the optimization of yields (as nonlinear objective functions) under arbitrary linear constraints is not possible with current flux-balance analysis techniques. Despite the fundamental importance of yields in constraint-based modeling, a comprehensive mathematical framework for yield optimization is still missing.

Results: We present a mathematical theory that allows one to systematically compute and analyze yield-optimal solutions of metabolic models under arbitrary linear constraints. In particular, we formulate yield optimization as a linear-fractional program. For practical computations, we transform the linear-fractional yield optimization problem to a (higher-dimensional) linear problem. Its solutions determine the solutions of the original problem and can be used to predict yield-optimal flux distributions in genome-scale metabolic models. For the theoretical analysis, we consider the linear-fractional problem directly. Most importantly, we show that the yield-optimal solution set (like the rate-optimal solution set) is determined by (yield-optimal) elementary flux vectors of the underlying metabolic model. However, yield- and rate-optimal solutions may differ from each other, and hence optimal (biomass or product) yields are not necessarily obtained at solutions with optimal (growth or synthesis) rates. Moreover, we discuss phase planes/production envelopes and yield spaces; in particular, we prove that yield spaces are convex and provide algorithms for their computation. We illustrate our findings by a small example and demonstrate their relevance for metabolic engineering with realistic models of E. coli.

Conclusions: We develop a comprehensive mathematical framework for yield optimization in metabolic models. Our theory is particularly useful for the study and rational modification of cell factories designed under given yield and/or rate requirements.
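The transformation of the linear-fractional yield problem into a higher-dimensional linear problem can be illustrated with a minimal sketch. It assumes the standard Charnes-Cooper substitution (w = t·v with the denominator normalized to 1), which may differ in detail from the formulation used in the paper. The four-reaction network, its bounds, and all names (`N`, `G`, `h`, `c`, `d`, `rate_optimal`, `yield_optimal`) are hypothetical and chosen only so that the result can be checked by hand.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; flux order: [r_S, v1, v2, r_P]
#   r_S: substrate uptake -> A
#   v1 : A -> P        (high yield, capacity-limited)
#   v2 : A -> 0.5 P    (low yield)
#   r_P: product export
N = np.array([[1.0, -1.0, -1.0, 0.0],    # steady state of A
              [0.0, 1.0, 0.5, -1.0]])    # steady state of P
G = np.array([[1.0, 0.0, 0.0, 0.0],      # r_S <= 10
              [0.0, 1.0, 0.0, 0.0]])     # v1  <= 2
h = np.array([10.0, 2.0])
c = np.array([0.0, 0.0, 0.0, 1.0])       # numerator: product rate r_P
d = np.array([1.0, 0.0, 0.0, 0.0])       # denominator: uptake rate r_S


def rate_optimal():
    """Maximize r_P with an ordinary LP (FBA-style rate optimization)."""
    res = linprog(-c, A_ub=G, b_ub=h, A_eq=N, b_eq=np.zeros(N.shape[0]),
                  bounds=(0, None))
    return res.x


def yield_optimal():
    """Maximize r_P / r_S via the substitution w = t*v with d.w = 1."""
    m, n = N.shape
    cost = np.append(-c, 0.0)                         # variables: (w, t)
    A_eq = np.vstack([np.hstack([N, np.zeros((m, 1))]),
                      np.append(d, 0.0)])             # N w = 0 and d.w = 1
    b_eq = np.append(np.zeros(m), 1.0)
    A_ub = np.hstack([G, -h[:, None]])                # G w - h t <= 0
    b_ub = np.zeros(G.shape[0])
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None))
    w, t = res.x[:n], res.x[n]
    return w / t                                      # t > 0 in this toy case


v_rate = rate_optimal()
v_yield = yield_optimal()
print("rate-optimal : r_P = %.3f, yield = %.3f"
      % (v_rate[3], v_rate[3] / v_rate[0]))
print("yield-optimal: r_P = %.3f, yield = %.3f"
      % (v_yield[3], v_yield[3] / v_yield[0]))
```

In this toy case the rate optimum uses both pathways at full uptake (product rate 6, yield 0.6), whereas the yield optimum uses only the capacity-limited high-yield pathway (yield 1, product rate at most 2). The two optima therefore differ, as stated in the abstract.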

Given: a flux polyhedron P with bounded yield, a set of "bounded" EFVs v (generators of the polytope associated with P), and a set of "unbounded" EFVs u (generators of the recession cone of P), with index sets I*, Iu (for EFVs v with maximum/undefined yield) and J*, Ju (for EFVs u with maximum/undefined yield).
[...] a polytope, then the maximum is attained.
An optimal solution is a
• conical sum of EFVs u with maximum or undefined yield,
• convex sum of EFVs v with maximum or undefined yield.
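A short calculation indicates why optimal solutions can be assembled from EFVs with maximum yield. As a simplified sketch, write the yield as Y(x) = c^T x / d^T x, where c and d are notation introduced here for illustration. Let x = Σ λ_i e_i be a nonnegative combination of generators e_i, and assume d^T e_i > 0 for every generator, which excludes the undefined-yield case (d^T e_i = 0) from this argument.

```latex
Y\Big(\sum_i \lambda_i e_i\Big)
  = \frac{\sum_i \lambda_i \, c^\top e_i}{\sum_j \lambda_j \, d^\top e_j}
  = \sum_i \mu_i \, Y(e_i),
\qquad
\mu_i = \frac{\lambda_i \, d^\top e_i}{\sum_j \lambda_j \, d^\top e_j} \ge 0,
\qquad
\sum_i \mu_i = 1 .
```

The yield of the combination is thus a weighted mean of the generators' yields, so it cannot exceed the largest of them. Under these assumptions it reaches that maximum exactly when all positive weights fall on generators with maximum yield.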

Text S1. Production envelope and yield space for ethanol production in a genome-scale model of E. coli
To exemplify the analysis of production envelopes (PEs) and yield spaces (YSs) in a genome-scale metabolic model (GSMM) as well, we analyze the trade-off between biomass and ethanol production in the E. coli GSMM iJO1366 [1]. Here, computation of elementary flux modes (EFMs) or elementary flux vectors (EFVs) is not possible, so we use CellNetAnalyzer to compute the biomass-ethanol YS and PE via the approximate algorithms given in Section 4.4.
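One way to approximate a yield space without enumerating EFVs is to scan one yield over a grid and, at each grid point, minimize and maximize the other yield by a linear program. The sketch below illustrates this idea on a hypothetical one-metabolite network. It is not the algorithm of Section 4.4 or the CellNetAnalyzer implementation, and all names and stoichiometric coefficients are invented for illustration. It uses scaled variables w = t·r with the substrate uptake normalized to 1, so that yields become linear in w.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; scaled variables (w_S, w_BM, w_P, w_W, t), w = t * r.
# One internal metabolite A:  r_S - 10 r_BM - 0.5 r_P - r_W = 0
# r_S: substrate uptake (<= 10), r_BM: biomass, r_P: product, r_W: waste.
STEADY = [1.0, -10.0, -0.5, -1.0, 0.0]
NORM = [1.0, 0.0, 0.0, 0.0, 0.0]       # w_S = 1, so w_BM and w_P are yields
BM = [0.0, 1.0, 0.0, 0.0, 0.0]
P = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
A_UB = [[1.0, 0.0, 0.0, 0.0, -10.0]]   # r_S <= 10  ->  w_S - 10 t <= 0
B_UB = [0.0]


def max_biomass_yield():
    """Largest biomass yield, used as the upper end of the scan."""
    res = linprog(-np.array(BM), A_ub=A_UB, b_ub=B_UB,
                  A_eq=[STEADY, NORM], b_eq=[0.0, 1.0], bounds=(0, None))
    return -res.fun


def product_yield_range(y_bm):
    """Minimum and maximum product yield at a fixed biomass yield y_bm."""
    kw = dict(A_ub=A_UB, b_ub=B_UB, A_eq=[STEADY, NORM, BM],
              b_eq=[0.0, 1.0, y_bm], bounds=(0, None))
    lo = linprog(P, **kw)       # lower boundary of the yield space
    hi = linprog(-P, **kw)      # upper boundary of the yield space
    if lo.status != 0 or hi.status != 0:
        return None             # grid point not solvable
    return lo.fun, -hi.fun


y_bm_max = max_biomass_yield()
for y_bm in np.linspace(0.0, y_bm_max, 5):
    print(round(y_bm, 4), product_yield_range(y_bm))
```

Because yield spaces are convex, the sampled lower and upper boundary points can be joined by straight segments to obtain an inner approximation of the region. A finer grid tightens the approximation at the cost of two LPs per grid point.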
The YS of the flux cone (without additional flux constraints) is shown in Figure S3(a) and looks similar to that for acetate in the core model, see Figure 6(a). The maximal biomass yield is almost exactly the same as in the core model, and the maximal ethanol yield is, as expected, 2 mmol/(mmol glucose).
Next we add the same flux bounds for maximal glucose uptake and for minimal adenosine triphosphate (ATP) non-growth associated maintenance demand as in the acetate example. However, here we assume fully anaerobic growth (with the formate-hydrogen lyase (FHL) reaction enabled, which was set inactive in the original iJO1366 model [1] to reflect aerobic growth). In this case, YS and PE have an identical shape, differing only by a scaling factor of ten [the maximal glucose uptake rate; see Figure S3(b) and (c)]. In contrast to the acetate example (where oxygen uptake was limited but not zero), the rate-optimal solutions for anaerobic growth and ethanol synthesis correspond to the respective yield-optimal solutions. We see that growth-optimal behavior (with respect to both yield and rate) may be accompanied by ethanol synthesis [with an ethanol yield of up to 0.78 mmol/(mmol glucose)]. Yet, ethanol synthesis is not mandatory for maximal growth rate, since other pathways with zero ethanol yield are feasible as well.
Regarding optimal ethanol yield, we find that the maximum yield can also be obtained at smaller ethanol synthesis rates, down to 3.15 mmol/gDW/h. This minimal synthesis rate is required to obtain maximum ethanol yield under the constraint of ATP synthesis for non-growth associated maintenance.
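The smallest synthesis rate compatible with maximum yield can be found in two steps: first compute the maximum yield Y*, then minimize the synthesis rate subject to the linear constraint r_EtOH ≥ Y*·r_S. The sketch below does this for a hypothetical fermentation-like network. The stoichiometry, the alternative pathway `v_b`, and the bounds are assumptions for illustration and are not taken from iJO1366. The maintenance demand is set to 3.15 by assumption, so the toy result says nothing about how the genome-scale value arises.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network; flux order: [r_S, v_a, v_b, r_EtOH, r_ATPM]
#   v_a: glucose -> 2 ethanol + 2 ATP
#   v_b: glucose -> 1 ethanol + 3 ATP   (invented lower-yield alternative)
N = np.array([[1.0, -1.0, -1.0,  0.0,  0.0],    # glucose balance
              [0.0,  2.0,  1.0, -1.0,  0.0],    # ethanol balance
              [0.0,  2.0,  3.0,  0.0, -1.0]])   # ATP balance
UPTAKE_MAX = 10.0    # assumed upper bound on r_S
ATPM_MIN = 3.15      # assumed lower bound on r_ATPM
S, E, M = 0, 3, 4    # indices of r_S, r_EtOH, r_ATPM


def max_ethanol_yield():
    """Step 1: maximize r_EtOH / r_S using w = t*v with w_S = 1."""
    m, n = N.shape
    cost = np.zeros(n + 1)
    cost[E] = -1.0
    norm = np.zeros(n + 1)
    norm[S] = 1.0
    A_eq = np.vstack([np.hstack([N, np.zeros((m, 1))]), norm])
    b_eq = np.append(np.zeros(m), 1.0)
    A_ub = np.zeros((2, n + 1))
    A_ub[0, S], A_ub[0, n] = 1.0, -UPTAKE_MAX    # w_S - UPTAKE_MAX * t <= 0
    A_ub[1, M], A_ub[1, n] = -1.0, ATPM_MIN      # ATPM_MIN * t - w_ATPM <= 0
    res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(2), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None))
    return -res.fun


def min_rate_at_yield(y, tol=1e-9):
    """Step 2: minimize r_EtOH subject to r_EtOH >= (y - tol) * r_S."""
    m, n = N.shape
    cost = np.zeros(n)
    cost[E] = 1.0
    A_ub = np.zeros((3, n))
    A_ub[0, S] = 1.0                             # r_S <= UPTAKE_MAX
    A_ub[1, M] = -1.0                            # r_ATPM >= ATPM_MIN
    A_ub[2, S], A_ub[2, E] = y - tol, -1.0       # yield constraint, linear in v
    b_ub = [UPTAKE_MAX, -ATPM_MIN, 0.0]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=N, b_eq=np.zeros(m),
                  bounds=(0, None))
    return res.fun


y_star = max_ethanol_yield()
r_min = min_rate_at_yield(y_star)
print("max yield:", round(y_star, 4), " min rate at max yield:", round(r_min, 4))
```

The small tolerance `tol` keeps the second problem feasible when the numerically computed Y* is slightly above the true optimum. In this toy network the high-yield pathway delivers one ATP per ethanol, so the minimal rate at maximum yield equals the assumed maintenance demand.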
Biased and unbiased strain designs (similar to D1-D3 discussed for the acetate example) for growth-coupled ethanol production could again be computed, even in this GSMM. For example, linear inequalities can be used to specify the undesired and desired regions in the PE and YS, which then serve as input for the computation of minimal cut sets (MCSs) via the dual approach presented in [2]. An example of enumerating intervention strategies for growth-coupled ethanol synthesis in a GSMM of E. coli can be found in [2] as well.

Figure S3 (caption fragment): [...], non-growth associated maintenance ATP demand (r_ATPmaint), and zero oxygen uptake (anaerobic growth). In (b) and (c), the colors indicate the location of the following optimal flux vectors: red: maximal ethanol yield; yellow: maximal biomass yield; green: maximal ethanol synthesis rate; gray: maximal growth rate. The red/green circle in (b) as well as the gray/yellow lines in (b) and (c) correspond to exactly the same points/lines and have been slightly shifted for better visibility.