Genome-scale model of C. autoethanogenum reveals optimal bioprocess conditions for high-value chemical production from carbon monoxide

Clostridium autoethanogenum is an industrial microbe used for the commercial-scale production of ethanol from carbon monoxide. While significant progress has been made in the attempted diversification of this bioprocess, further improvements are desirable, particularly in the formation of high-value platform chemicals such as 2,3-butanediol (2,3-BD). A new, experimentally parameterised genome-scale model of C. autoethanogenum predicts dramatically increased 2,3-BD production under non-carbon-limited conditions when thermodynamic constraints on hydrogen production are considered.


Introduction
The adverse environmental and societal consequences of continued fossil-fuel dependence represent arguably the defining challenge for scientific research in the 21st century [1,2]. One component of the solution to this problem is carbon recycling, which can be achieved through the application of bioprocesses enabling conversion of industrial waste-gas components into commodity chemicals [3]. For example, the Chicago-based company LanzaTech (www.lanzatech.com) uses Clostridium autoethanogenum to convert carbon monoxide (CO) into ethanol on a commercial scale [4-6]. Diversification of the product portfolio of this sustainable technology will help to secure its continued success.
The fundamental metabolic pathway enabling C. autoethanogenum to grow on CO is the Wood-Ljungdahl pathway (WLP) [7,8]. This ancient biochemical pathway is split into two 'branches': the methyl branch and the carbonyl branch [9]. The methyl branch proceeds by constructing a methyl group from carbon dioxide (CO₂) via a series of biochemical conversions including an adenosine triphosphate (ATP)-consuming reaction [catalysed by formyl-tetrahydrofolate (THF) ligase, FtfL] and three redox reactions collectively requiring nicotinamide adenine dinucleotide (NADH), NAD phosphate (NADPH) and reduced ferredoxin (Fd_red). The carbonyl branch is simply the reduction of CO₂ to an enzyme-bound carbonyl group as catalysed by the acetyl-CoA synthase/CO dehydrogenase complex (ACS/CODH). This same enzyme complex is responsible for the final step of the pathway, in which the methyl group and enzyme-bound carbonyl group are combined with CoA to form one molecule of acetyl-CoA.
Clearly, in the case of CO-fed growth, sources of CO₂, reducing power and the means to generate ATP are needed for the WLP to function. As with all known acetogens [10], acetate and ATP are produced from acetyl-CoA via phosphotransacetylase and acetate kinase, thus fulfilling the energetic requirement of FtfL but not enabling a positive ATP yield [5]. The CO₂ and reducing power requirements are met by two additional monofunctional CODH enzymes encoded in the C. autoethanogenum genome, which generate Fd_red and CO₂ from CO (and water). Fd_red is then oxidised to generate NADH and subsequently NADPH, which in turn satisfy the redox requirements of the WLP. Finally, electrons are transferred from Fd_red to NAD by a membrane-bound Rnf complex [11], which produces a transmembrane proton gradient and enables the generation of ATP by F₁Fₒ ATP synthase [12,13]. The regeneration of NAD for this essential process is achieved through further redox reactions leading to excreted by-products, e.g. ethanol [5].
One such product of particular importance is the platform chemical 2,3-butanediol (2,3-BD), a trace component of the native C. autoethanogenum product profile, the downstream products of which have an estimated global market value of $43 billion in sales [14]. Other compounds in the native product profile of C. autoethanogenum include acetate, ethanol and, in smaller amounts, lactate [14,15]. Hydrogen gas is another common product in carboxydotrophic microorganisms [16]. While the potential for 2,3-BD synthesis from industrial waste gas is well-established [14], a clear bioprocess/metabolic engineering strategy for its increased production has not been published despite the construction and analysis of a genome-scale model (GSM) of C. autoethanogenum [17], an approach which has achieved success in guiding metabolic engineering through the investigation of optimal steady states [18]. Previous attempts to model C. autoethanogenum used parameters from GSMs of other organisms [17,19,20], a potential weakness identified by Dash et al. [21]. The availability of a new, manually annotated genome sequence of C. autoethanogenum [22], therefore, provides an opportunity to construct an improved, experimentally parameterised GSM and pursue novel strategies for enhanced 2,3-BD production.
Experimental studies aimed at manipulating the product profile of C. autoethanogenum have so far focused on carbon-limited growth regimes [17,19], i.e. the scenario in which the rate of growth of a bacterial population is limited by the supply of its carbon source. Such studies tend to neglect non-carbon limitation, the case where some other nutrient becomes limiting and the carbon source is in excess. We argue that non-carbon-limited growth regimes are worth investigating because an excess supply of energy (i.e. electrons from CO) may result in the diversion of flux toward increasingly reduced compounds (e.g. 2,3-BD), especially in strict anaerobes such as C. autoethanogenum whose product profile is constrained by the redox balance of internal metabolites [23].
This work presents the construction of a new GSM of C. autoethanogenum ('MetaCLAU') including parameterisation and validation with data from continuous-culture chemostat experiments. The resulting model is used to test the hypothesis that 2,3-BD is an optimal sink for excess reducing power supplied under non-carbon-limited conditions, both before and after a thermodynamic limit on hydrogen production is imposed.

Data
This section describes the data used for model construction, parameterisation and validation. Data used in the interpretation of model analyses are also specified.

Genome sequences:
The task of constructing a GSM requires an annotated genome sequence. Table 1 summarises relevant genomic data available for use in this work. The ScrumPy-readable pathway genome database (PGDB) generated from the genome sequence described in [24] is available from www.metacyc.org. Flatfiles for the PGDB derived from the genome annotation described in [22] were generated with Pathway Tools [25] and can be found in [26].

Fermentation data:
Optical density (OD), gas chromatography and cell-mass data collected at steady state during fermentation experiments were used to derive model parameters in this work. Bioreactor data were collected and processed using in-house software (see Section 2.7). Details of fermentation experiments are provided in the supplementary materials (S4).
The data for the estimation of growth-associated and non-growth-associated maintenance costs are shown in the supplementary materials (S6).

Biomass:
Proportions of macromolecular biomass components were measured experimentally with continuous-culture samples taken at steady state. Details of experimental procedures for the measurement of total DNA, RNA, protein, lipid and polysaccharide are given in the supplementary materials. Liquid chromatography-mass spectrometry (LC-MS) was used to estimate trace metabolite concentrations as described in [29].

Model construction
In the context of this paper, 'construction' refers to the selection of a set of reaction stoichiometries which form the metabolic network, whereas 'curation' refers to the identification and correction of errors and omissions in the definition of these reactions, e.g. errors in mass-balance and thermodynamic favourability.
ScrumPy models are formed of a top-level 'module' and several sub-modules containing either automatically generated or manually defined reaction stoichiometries (see Fig. 1). This helps to organise the model components during construction and curation.

Draft network:
Construction of the GSM began with the Tier 3 BioCyc [25,30] database for C. autoethanogenum JA1-1 (strain DSM 10061) generated from the genome sequence published by Brown et al. [24]. This database (referred to here as the 'CAETHG database') formed the foundation of a draft reconstruction of the organism's metabolic network. Automatically generated Tier 3 BioCyc databases are not fully curated; thus, an additional manual genome annotation [22] was required to complete construction in line with methods described by Fell et al. [31] and Hartman et al. [32]. This additional annotation was used to create a second genome database (referred to as the 'CLAU database') with the PathoLogic algorithm as implemented in Pathway Tools [25,30]. Information contained in this second database formed the basis for the continued curation of the model, the aim being to establish gene-protein-reaction relationships (GPRs) based solely on the manual annotation. For a detailed, step-by-step model construction methodology see supplementary materials (S1).

Curation:
Initial curation steps involved removing chemically undefined metabolites from the automatically generated network. Subsequently, reactions without gene associations were investigated and removed if evidence for the necessary encoded enzymes could not be found [22]. Atomically unbalanced reactions were also corrected or removed. The thermodynamic consistency of the model, i.e. its adherence to energy conservation, was regularly checked using a specific linear programming (LP) problem detailed in Section 2.2.4. Inconsistent enzyme subsets containing erroneous reaction reversibility constraints [33] were identified using ScrumPy [34] and removed from the model (see Section 2.5).

Electron transport chain:
A model of the electron transport chain (ETC) of C. autoethanogenum was manually constructed based on the biochemical literature [9,11,14,35,36]. The separate construction of an ETC sub-network enables the computation of its elementary modes (i.e. the set of minimal steady-state flux distributions across the network [37,38]), since smaller networks avoid combinatorial difficulties encountered with larger networks. Inspection of the resulting elementary modes ensures that legitimate routes for the generation of energy are available in model simulations (see Fig. 2). One feature of the ETC in C. autoethanogenum is that the number of translocated protons required for the generation of one ATP by ATP synthase is not known. In previous studies, a value of 3.66 H⁺/ATP was assumed based on the structure of the Clostridium paradoxum ATP synthase (the rotor of which consists of 11 c-subunits) [39], taking this stoichiometry as an indication of the Clostridium phenotype in general [35]. We adopt this assumption here but also show that our modelling results are not qualitatively sensitive to this parameter [see supplementary materials (S3)].
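The assumed proton stoichiometry can be recovered from simple rotor arithmetic. As a sketch, assuming the usual picture of F-type ATP synthases (one proton translocated per c-subunit per rotation, three ATP formed per full rotation), which is not spelled out in the text above, and using n_c as an illustrative label for the number of c-subunits:

```latex
\[
\frac{\mathrm{H^+}}{\mathrm{ATP}} \;=\; \frac{n_c}{3} \;=\; \frac{11}{3} \;\approx\; 3.67
\]
```

The text quotes this ratio as 3.66.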
The details of the ETC are as follows: C. autoethanogenum maintains a transmembrane proton gradient enabling energy conservation via coupling of an Rnf complex (1) with an ATP synthase (2) [11]. Reactions (1)-(6) represent the ETC when an ATPase c-ring consisting of 11 c-subunits is assumed (as in [35]). The stoichiometry of the reaction catalysed by methylene-THF reductase (MetFV) is unknown [40]; the definition used here (4) represents the electron-bifurcating mechanism proposed by Köpke et al. [28] and further developed in [35]. An electron-bifurcating hydrogenase, HytA-E, is responsible for the utilisation of hydrogen as an energy source [36] and is also thought to be responsible for hydrogen production [41] (5):

HytA-E: $2\,\mathrm{H_2} + \mathrm{NADP^+} + \mathrm{Fd_{ox}} \rightarrow 3\,\mathrm{H^+} + \mathrm{NADPH} + \mathrm{Fd_{red}}$ (5)

Finally, since the WLP in C. autoethanogenum involves an NADPH-dependent methylene-THF dehydrogenase, the ETC must include a mechanism for the generation of NADPH (this is also necessary to support biomass). This is achieved by the Nfn complex (6). An example steady-state flux distribution of the ETC which yields ATP and NADPH is shown in Fig. 2.

Energy conservation:
The following analysis ensured thermodynamic consistency across the metabolic network: flux balance analysis (FBA) was computed with a fixed positive ATP demand and without uptake of any carbon, nutrient or energy source, in line with methods described in [32]. In this LP, transport reactions are denoted $v_{t_i}$, where t is a vector of length $X_{trans}$ containing the indices of all transport reactions in v. Any feasible solution indicated a thermodynamic error, meaning that the definitions of one or more of the participating reactions required amendment or removal.
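One way to write this consistency check as an LP is sketched below. The constraints follow the description above, with N the stoichiometric matrix and v the flux vector (subject to the model's usual reversibility bounds). The objective and the symbol c for the fixed ATP demand are assumptions made for illustration, since any feasible point already signals an error.

```latex
\[
\begin{aligned}
\text{minimise:}\quad   & \sum_{l} |v_l| \\
\text{subject to:}\quad & N v = 0, \\
                        & v_{ATPase} = c, \qquad c > 0, \\
                        & v_{t_i} = 0, \qquad i = 1, \dots, X_{trans}
\end{aligned}
\]
```

A thermodynamically consistent network makes this problem infeasible: with all transporters closed, no flux distribution can hydrolyse ATP at a positive rate.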

Transporters:
The module 'Transporters.spy' defines all reactions involving transfer between the organism and the environment. Transporters were added based on known carbon/energy sources and products of C. autoethanogenum [15]. Transporters were also added for individual biomass precursors [31] and components of the growth media used to culture C. autoethanogenum in this paper [see supplementary materials (S4)].

Model characteristics
A GSM of C. autoethanogenum consisting of 755 reactions and 772 metabolites has been constructed through the refinement of a draft network derived from a Pathway Tools database consisting of 1429 reactions and 1097 metabolites. Of the reactions in the curated model, 73 were unique to the initial CAETHG database. Gene associations for 47 of these reactions were mapped to CLAU locus tags using EDGAR [42]. Of the remaining model reactions unique to the CAETHG database, 15 were defined as spontaneous and thus required no association to genes. Gene associations for 3 of the remaining 11 reactions were defined manually, and 8 were retained as they proved essential for the production of biomass on a range of growth substrates (CO, CO + H₂ and fructose). These reactions, forming a pathway for teichoic acid production, have been retained as hypothetical reactions and are detailed in the supplementary material (S1). Teichoic acid is an assumed component of the model's biomass equation, consistent with the established physiology of Gram-positive bacteria [43].

Parameterisation
Where possible, species-specific parameters should be derived experimentally, improving the accuracy of model calculations and discouraging the unhelpful culture of 'parameter borrowing'. In the case of bacterial cells, two key parameters are required: biomass composition and ATP maintenance costs.

Biomass composition:
A GSM should be capable of producing essential cellular materials (in realistic proportions) using feasible biosynthetic routes to enable increasingly accurate calculations of optimal network behaviours [44,45]. Despite a large number of biotechnologically interesting Clostridia, detailed physiological information suitable for deriving cellular biomass composition is available for a small range of species only [46-48].
The macromolecular biomass composition was measured experimentally including protein, DNA, RNA, lipid and polysaccharide [see supplementary materials (S4) for experimental methods]. The relative abundances of various lipids in C. autoethanogenum are taken from [15]. DNA nucleotide ratios were estimated from the full genome sequence [22], whereas ribonucleotide and amino acid ratios were estimated from transcribed rRNA and translated ribonucleoprotein sequences, respectively [see supplementary materials (S4)]. The use of ribosomal RNA and protein sequences to characterise the prevalence of ribonucleotides and amino acids is based on the observation that the largest allocation of protein synthesis capacity is in the production of ribosomes [49]. The 'Other' class of biomass component contains [...]. The overall biomass composition including essential trace metabolite pools is given in Table 2. Tables showing the estimated relative proportion of each biomass precursor within the macromolecular divisions are given in the supplementary materials (S4).

ATP maintenance costs:
The following experimental procedure was conducted to measure the growth-associated and non-growth-associated ATP maintenance costs (GAM and NGAM, respectively) in CO-fed continuous culture of C. autoethanogenum. The set dilution rate D was increased incrementally from 0.01 to 0.028 h⁻¹. Steady state was reached after each change, as determined by stable culture OD. For each steady-state period, the average CO uptake rate was calculated and plotted against the corresponding specific growth rate μ, which was taken as equivalent to the set dilution rate according to (8). The slope of a linear fit to these data is the growth-associated CO uptake rate; the y-intercept is the non-growth-associated CO uptake rate [see supplementary materials (S6)]. GAM and NGAM were calculated by multiplying the growth-associated and non-growth-associated CO uptake rates by the model-calculated maximal ATP yield ($Y_{ATP}$) from CO. The growth-associated and non-growth-associated CO uptake rates are 415 ± 92 mmol gDCW⁻¹ and 5.7 ± 1.4 mmol gDCW⁻¹ h⁻¹, respectively, and the optimal $Y_{ATP}$ value is 0.375 [see (12)]. These values give a GAM of 155.7 ± 34.6 mmol gDCW⁻¹ and an NGAM of 2.2 ± 0.5 mmol gDCW⁻¹ h⁻¹.
Since a proportion of GAM is accounted for by biomass precursor production in the metabolic network, this proportion must be subtracted from the total GAM before incorporation as a flux constraint. After subtracting the proportion of growth-associated ATP maintenance incorporated in the model's metabolic network (44.5 mmol gDCW⁻¹) from the experimentally derived GAM parameter, the actual GAM value used as a constraint in FBA simulations was 111.2 mmol gDCW⁻¹. Values of total ATP maintenance ($r_{ATP}$) resulting from GAM/NGAM parameters sourced from published GSMs and an assumed growth rate of 0.028 h⁻¹ are shown in Table 3.
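The arithmetic behind these parameters can be retraced in a few lines. The sketch below is illustrative only: it starts from the rounded values quoted in the text, so its results differ slightly from the reported 155.7 and 2.2 (presumably computed from unrounded fit coefficients), and it assumes the usual linear relation r_ATP = GAM·μ + NGAM for total ATP maintenance. All function and variable names are hypothetical.

```python
# Rounded values as quoted in the text.
GROWTH_CO_UPTAKE = 415.0     # slope of the linear fit, mmol CO per gDCW
NON_GROWTH_CO_UPTAKE = 5.7   # y-intercept, mmol CO per gDCW per h
Y_ATP = 0.375                # model-calculated mol ATP per mol CO
GAM_IN_NETWORK = 44.5        # mmol ATP per gDCW already spent in the network
MU = 0.028                   # specific growth rate, 1/h


def maintenance_costs(slope, intercept, y_atp):
    """Convert CO uptake coefficients into (GAM, NGAM) ATP costs."""
    return slope * y_atp, intercept * y_atp


def total_atp_maintenance(gam, ngam, mu):
    """Assumed linear relation: r_ATP = GAM * mu + NGAM."""
    return gam * mu + ngam


gam, ngam = maintenance_costs(GROWTH_CO_UPTAKE, NON_GROWTH_CO_UPTAKE, Y_ATP)
# GAM left to impose as a constraint once the network's own share is removed.
gam_constraint = gam - GAM_IN_NETWORK
r_atp = total_atp_maintenance(gam, ngam, MU)

print(f"GAM = {gam:.1f}, NGAM = {ngam:.2f}")
print(f"GAM constraint = {gam_constraint:.1f}, r_ATP = {r_atp:.2f}")
```

The same two functions can be applied to GAM/NGAM pairs from other models to compare total maintenance at a common growth rate.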

Steady-state metabolism
The fundamental concept underlying the analyses presented in this work is metabolic steady state. This occurs when the concentrations of metabolites involved in a metabolic system do not change over time [56,57]. The steady-state constraint is enforced with a set of linear differential equations representing the production and consumption of m metabolites by n chemical reactions. These equations can be represented as the product of a matrix of reaction stoichiometries N and a vector of reaction fluxes (i.e. net reaction rates) v [58]. The steady-state condition is fulfilled when

$N v = 0$ (9)

i.e. when the distribution of metabolic fluxes v lies in the nullspace of N. Basis vectors for the nullspace of N can subsequently be analysed to reveal useful properties of the subject metabolic network such as 'dead reactions', which always carry zero flux at steady state [59]. Another useful concept from the nullspace of N is 'enzyme subsets' [60]. An enzyme subset is a subset of enzymes within a metabolic network whose members carry flux in a fixed ratio at steady state [59]. Inconsistent enzyme subsets contain one or more enzymes whose flux direction violates irreversibility constraints, effectively inactivating every enzyme in the subset.
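The nullspace properties described above can be illustrated on a toy network. The following is a minimal sketch, not the ScrumPy implementation: the six-reaction network, the reaction names and the helper functions are all hypothetical, and exact rational arithmetic from the Python standard library is used in place of a numerical linear algebra package.

```python
from fractions import Fraction


def nullspace(matrix):
    """Basis of the right nullspace of `matrix` (list of rows), via RREF."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][f]
        basis.append(vec)
    return basis


def dead_reactions(basis, names):
    """Reactions whose flux is zero in every nullspace basis vector."""
    return [n for j, n in enumerate(names) if all(v[j] == 0 for v in basis)]


def enzyme_subsets(basis, names):
    """Group live reactions whose kernel rows are proportional."""
    groups = {}
    for j, name in enumerate(names):
        row = [v[j] for v in basis]
        lead = next((x for x in row if x != 0), None)
        if lead is None:
            continue  # dead reaction, not part of any subset
        groups.setdefault(tuple(x / lead for x in row), []).append(name)
    return list(groups.values())


# Toy network: R1: -> A, R2: A -> B, R3: B ->, R4: A -> C (dead end),
# R5: A -> D, R6: D ->. Rows are metabolites A, B, C, D.
names = ["R1", "R2", "R3", "R4", "R5", "R6"]
N = [
    [1, -1, 0, -1, -1, 0],
    [0, 1, -1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, -1],
]
K = nullspace(N)
dead = dead_reactions(K, names)
subsets = enzyme_subsets(K, names)
print(dead, subsets)
```

In this toy case R4 is dead because its product C has no consumer, while R2/R3 and R5/R6 form two linear branches whose members must carry equal flux.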

Flux balance analysis
FBA is a fundamental technique for the analysis of GSMs which formulates steady-state metabolism as an LP problem given some assumed, biologically relevant objective function [61-63]. Since (9) only specifies constraints on the relative values of fluxes, further constraints are required to achieve a solution space which is sufficiently bounded for LP [64]. For example, transport 'reactions' responsible for the influx (and efflux) of metabolites into the system are often subject to an upper-bound constraint. Following this, an objective function is specified which maximises or minimises a selection of the flux variables. An objective is usually chosen which mimics an assumed biological objective [65] such as maximising ATP generation. The resulting solution to the LP problem is thus an optimal flux distribution.
Results obtained through the application of FBA provide insight into the capabilities of a metabolic network, thus helping to investigate the steady-state behaviour of an organism. In this paper, FBA is used to calculate the growth rate of C. autoethanogenum as a means of model validation. Following this, FBA is used to investigate the model's response to changes in industrially relevant culture conditions including non-carbon limitation and constraints on hydrogen production.

Network flux minimisation:
The preferred objective function in this paper is the minimisation of absolute flux through all enzyme-catalysed reactions (10). For a mass-balanced metabolic network, total network flux minimisation reduces the occurrence of thermodynamically infeasible energy-generating cycles, while providing a reasonable approximation of enzymatic economy [31,32,34,66,67]. In (10), N is the stoichiometric matrix and v is a vector of all reaction flux values included in the metabolic network; $v_{cat}$ is the set of network reactions catalysed by enzymes, i.e. the subset of v which excludes transport by diffusion (in this paper: CO, CO₂ and H₂) and spontaneous reactions. This objective requires information on whether or not a reaction can occur spontaneously in the context of metabolism, as provided in Pathway Tools. $Y_{xATP}$ and $m_{ATP}$ are equivalent to the growth-associated and non-growth-associated ATP maintenance costs (GAM and NGAM), respectively, and μ is the specific growth rate. $v_{ATPase}$ is the flux through the network's ATPase reaction (ATP + H₂O → ADP + P_i + H⁺). Reaction fluxes representing the production of $X_{bio}$ biomass precursors are denoted $v_{k_i}$, where k is a vector containing the corresponding indices of these reactions in v. The vector b contains relative abundance values for each biomass precursor which, when multiplied by μ, determine the value of each $v_{k_i}$. Reaction fluxes representing the transport of non-biomass-associated metabolites not required as growth substrates (i.e. potential products) are denoted $v_{p_j}$, where p is a vector of length $X_{prods}$ containing the corresponding indices in v.
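Using the symbols defined above, the flux-minimisation problem can be sketched as the following LP. This is a reconstruction from the prose rather than the authors' exact statement of (10): the summation index l and the sign convention for product transport (positive meaning efflux) are assumptions.

```latex
\[
\begin{aligned}
\text{minimise:}\quad   & \sum_{l \in cat} |v_l| \\
\text{subject to:}\quad & N v = 0, \\
                        & v_{ATPase} = Y_{xATP}\,\mu + m_{ATP}, \\
                        & v_{k_i} = b_i\,\mu, \qquad i = 1, \dots, X_{bio}, \\
                        & v_{p_j} \geq 0, \qquad j = 1, \dots, X_{prods}
\end{aligned}
\]
```

With μ fixed, the ATPase and biomass-precursor constraints set the demand that the network must meet, and the objective selects the flux distribution meeting it with the least total enzyme-catalysed flux.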

Growth yield/ATP maximisation:
An alternative objective function is the maximisation of biomass precursor production, an approximation of growth rate optimisation [44], in which μ is maximised subject to the steady-state and substrate-uptake constraints (11). Here, μ is represented as a lumped chemical equation including all biomass precursors as reactants with their relative abundance given as stoichiometric coefficients. $Y_{xATP}$ (GAM) is incorporated in the lumped equation as the stoichiometric coefficient on ATP. Reaction fluxes representing each of $X_{growth}$ substrate-uptake reactions are denoted $v_{k_i}$, where k is a vector containing the corresponding indices in v. The vector g contains fixed-flux values for each of $X_{growth}$ substrate-uptake reactions. In this paper, the fixed substrate-uptake rate is taken as the average experimentally observed uptake rate over the time period during which the culture is at steady state. An alternative form of this LP problem maximises ATP dissipation, i.e. flux carried by ATPase ($v_{ATPase}$) in the ATP-hydrolysing direction. Solutions to this problem achieve optimal ATP yields. Furthermore, a complete hierarchy of products ordered in terms of $Y_{ATP}$ can be generated by iteratively computing the $Y_{ATP}$-optimal solution while blocking (i.e. constraining to 0) flux across the transporter of the product formed in the previous iteration until ATP production is infeasible.
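The iterative blocking procedure for the product hierarchy can be sketched independently of any particular LP solver. In the example below, `solve_max_atp` is a hypothetical caller-supplied function standing in for the ATP-maximising LP, and `toy_solver` is a lookup-table stand-in that uses only the two ATP yields quoted in this paper (acetate and ethanol). It also makes the simplifying assumption that each optimal solution forms a single product.

```python
def product_hierarchy(solve_max_atp, max_rounds=50):
    """Rank products by ATP yield by repeatedly blocking the last product.

    `solve_max_atp(blocked)` is a hypothetical solver hook: it maximises
    ATPase flux with every transporter in `blocked` constrained to zero and
    returns (product, atp_yield), or None when ATP production is infeasible.
    """
    blocked = set()
    hierarchy = []
    for _ in range(max_rounds):
        result = solve_max_atp(frozenset(blocked))
        if result is None:
            break  # no remaining route to ATP: the hierarchy is complete
        product, atp_yield = result
        hierarchy.append((product, atp_yield))
        blocked.add(product)  # force the next-best product in the next round
    return hierarchy


# Stand-in for the LP: ATP yields per mole CO as quoted in the text.
TOY_YIELDS = {"acetate": 0.375, "ethanol": 0.342}


def toy_solver(blocked):
    """Return the best unblocked product, mimicking an ATP-optimal solution."""
    candidates = {p: y for p, y in TOY_YIELDS.items() if p not in blocked}
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best, candidates[best]


hierarchy = product_hierarchy(toy_solver)
print(hierarchy)
```

With a real model, the hook would set the relevant transporter bounds to zero, solve the LP and report which product transporter carries flux in the optimum.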

Software
Reconstruction steps and model analyses were carried out using the ScrumPy package (http://mudshark.brookes.ac.uk/ScrumPy) [34]. Analysis of the previously published iCLAU786 [17] model was carried out in COBRApy [68]. Bioreactor data were analysed using an automated Python/MATLAB-based software pipeline designed for the BioCommand software (BioCommand®, New Brunswick Scientific), in which data were transferred to a web server, allowing real-time updates before calibration and post-processing using a suite of MATLAB® scripts. Further details of this toolset including source code and descriptions are available on request.

Comparison of MetaCLAU with published models
Genome-scale metabolic models have been published for both Clostridium ljungdahlii [50] and C. autoethanogenum [17,20]. The C. autoethanogenum model, iCLAU786, was first made available in [17] and then updated in [20]. Additionally, the repository of metabolic models automatically generated by CarveMe [69] contains a draft network for C. autoethanogenum. This section shows the results of a general comparison between MetaCLAU and these published models.
Differences in the general model statistics are shown in Table 4. Importantly, the more recent iCLAU786 version [20] fails the essential condition set out in Section 2.2.4 that no ATP can be generated without the uptake and excretion of metabolites. For this reason, this version of iCLAU786 is excluded from further analysis. Similarly, the CarveMe draft model is excluded from further analysis as it is unable to generate ATP from CO and water.
An important difference is the number of dead reactions; iCLAU786 has 966 (97%) dead reactions, whereas MetaCLAU has 327 (43%). MetaCLAU also accounts for the transport of more metabolites. Finally, unlike the other models, MetaCLAU contains no inconsistent enzyme subsets [33]. Details of this comparison are provided in the externally hosted files [26].

Validation
Validation of key behaviours against experimental data was carried out to assess the model's reliability. Substrate utilisation, the growth rate on CO and the product spectrum are the validating behaviours tested in this paper.

Substrate utilisation:
The ability of C. autoethanogenum to use different carbon/energy sources for the production of biomass (and maintenance) was calculated with FBA. Each compound was used as the sole carbon source in the in silico minimal medium by allowing influx across the corresponding transporter while applying the flux-minimisation objective. The simulation results were compared with experimental data (Table 5), showing good agreement with known carbon/energy sources.

Growth rate:
An average experimentally observed CO uptake rate of 16.52 ± 0.02 mmol gDCW⁻¹ h⁻¹ was applied as a fixed-flux constraint on CO transport, which, with the objective function set to maximise biomass production, resulted in a calculated optimal specific growth rate of 0.027 h⁻¹. This value is within the error margin of the measured specific growth rate of 0.028 ± 0.001 h⁻¹. A comparison of model-simulated versus reported experimental growth rates including fructose and syngas is shown in Table 6.

Product spectrum:
The ratio of product fluxes in optimal solutions with CO as the sole source of carbon and energy was sensitive to the choice of the objective function. Maximisation of growth yield (11) resulted in acetate and CO₂ production, with no flux to ethanol. In contrast, when minimisation of network flux was the objective function (10), acetate and ethanol formed the main products. Optimal flux distributions computed with published models of C. autoethanogenum (iCLAU786 [17,19]) and C. ljungdahlii (iHN637 [50]) did not include simultaneous production of ethanol and acetate without additional constraints (see Table 7). The min $v_{cat}$ objective could not be applied to iCLAU786 or iHN637 since reactions that are known to occur spontaneously (in the context of metabolism) are not identified in these models. The optimal ATP yield ($Y_{ATP}$) per mole substrate uptake with CO as the sole carbon and energy source was 0.375 with acetate as the sole product [net stoichiometry shown in (12)]. A sub-optimal $Y_{ATP}$ value (0.342) is associated with ethanol production. A hierarchy of products based on their associated ATP yield demonstrates that the model can return flux distributions associated with the expected range of liquid products.

Gas shift
A range of CO uptake rates exceeding the uptake rate required to support a fixed growth rate ($v_{CO}$ = 16.97 mmol gDCW⁻¹ h⁻¹, μ = 0.028 h⁻¹) was applied to simulate non-carbon-limited growth. FBA with CO uptake in the range 17-42.5 mmol gDCW⁻¹ h⁻¹ and a fixed growth rate of 0.028 h⁻¹ resulted in a shift from majority acetate production to ethanol production (Fig. 3) and subsequent hydrogen production in optimal solutions (Fig. 4). When the model was constrained to allow no production (or uptake) of hydrogen, 2,3-BD and lactate efflux appeared in optimal solutions after the initial switch from acetate to ethanol (2,3-BD efflux increased with CO uptake, whereas lactate efflux remained constant; see Fig. 5). The constraint on hydrogen efflux was applied to mimic possible limitations on hydrogen production caused by increasing hydrogen partial pressure in the external environment [73].

Discussion
A GSM of C. autoethanogenum has been constructed and parameterised with experimentally derived values representing ATP maintenance costs and cellular biomass composition. The experimental parameterisation was designed to make the model more accurately represent C. autoethanogenum metabolism, as confirmed by the validation in Section 3.2 above. Model-calculated yields for growth on CO agree with experimentally observed growth rate data, while the production of acetate and ethanol in optimal flux distributions supporting biomass production and maintenance costs is consistent with the established native product profile of C. autoethanogenum.
Oversupply of CO to the metabolic network leads to excess reducing equivalents, forcing flux through pathways (see Fig. 6) allowing the removal of superfluous reducing power (electrons) from the system [14]. This has been tested using FBA by increasing the ratio between energy-substrate supply (in this case, CO) and growth rate over a fixed range (Fig. 3) representing non-carbon-limited conditions. Results show that increasing the influx of CO beyond the rate necessary to achieve a fixed growth rate causes a switch from acetate to ethanol production over the uptake range 17-18.5 mmol gDCW⁻¹ h⁻¹. This provides a new strategy for increasing ethanol production. It has also been hypothesised that bacterial hydrogen production will be restricted by external constraints during continuous culture. More specifically, hydrogen production may become thermodynamically unfavourable if its partial pressure in the external environment exceeds some critical value. This will cause metabolism to favour the production of other electron sinks (such as lactate) [73-75]. To test this, the CO uptake scan was repeated with hydrogen transport restricted to zero. Optimal flux distributions resulting from this simulation showed 2,3-BD production increasing monotonically with CO uptake after the acetate/ethanol switch (Fig. 5). This result is significant since 2,3-BD production has not been reported from the analysis of previous models [17,20]. Furthermore, lactate production appeared in optimal flux distributions but was unchanged with increasing CO uptake, suggesting an association with biomass production. Thus, the analysis has uncovered a testable set of bioprocess conditions for which 2,3-BD is included in optimal metabolic flux distributions, increasing with the gas inflow.
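The role of each product as an electron sink can be made concrete with a simple electron balance. The sketch below is a back-of-envelope illustration, not part of the model analysis: it counts the electrons released on complete oxidation of a compound C_cH_hO_o to CO₂ and water (4c + h − 2o), and from this the number of CO molecules (two electrons each) needed to supply the electrons held in one molecule of product. It ignores biomass formation, ATP requirements and hydrogen, and uses the neutral acid formulae for acetate and lactate; all names are hypothetical.

```python
def available_electrons(c, h, o):
    """Electrons released when CcHhOo is fully oxidised to CO2 and H2O."""
    return 4 * c + h - 2 * o


# (carbon, hydrogen, oxygen) counts of the neutral compounds.
PRODUCTS = {
    "acetate": (2, 4, 2),   # C2H4O2
    "ethanol": (2, 6, 1),   # C2H6O
    "lactate": (3, 6, 3),   # C3H6O3
    "2,3-BD": (4, 10, 2),   # C4H10O2
}
CO_ELECTRONS = available_electrons(1, 0, 1)  # CO carries 2 electrons

# Moles of CO whose electrons end up in one mole of each product.
co_per_product = {
    name: available_electrons(*formula) / CO_ELECTRONS
    for name, formula in PRODUCTS.items()
}
# Electrons stored per carbon atom of product (degree of reduction).
electrons_per_carbon = {
    name: available_electrons(*formula) / formula[0]
    for name, formula in PRODUCTS.items()
}

for name in PRODUCTS:
    print(name, co_per_product[name], electrons_per_carbon[name])
```

On this measure, ethanol and 2,3-BD hold more electrons per carbon than acetate or lactate, which is consistent with their appearance as sinks when CO is supplied in excess.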

Model availability
The model is available to download as an SBML, JSON (COBRA-readable) or ScrumPy-formatted file from the BioModels database [76] using accession number MODEL1810120001. All model and database files are available in the supplementary materials, including an IPython notebook demonstrating all presented analyses (requires installation of IPython and COBRApy).

Acknowledgments
This work was supported by the Biotechnology and Biological Sciences Research Council, as part of the BBSRC Longer and Larger Grant GASCHEM [grant number BB/K00283X/1]; the BBSRC/EPSRC Synthetic Biology Research Centre Nottingham [grant number BB/L013940/1]; and the industrial partner LanzaTech Inc. Furthermore, the authors thank their colleagues at the SBRC for their support during the preparation of the paper. MP gratefully acknowledges support from Oxford Brookes University. Finally, we thank C1net for facilitating both the collaboration between Oxford Brookes University and the University of Nottingham and for funding and organising the C1net metabolic modelling workshops. The responsibility for the content of this paper lies with the authors.