Simultaneous application of enzyme and thermodynamic constraints to metabolic models using an updated Python implementation of GECKO

ABSTRACT Genome-scale metabolic (GEM) models are knowledge bases of the reactions and metabolites of a particular organism. These GEM models allow for the simulation of the metabolism (for example, calculating growth and production yields) based on the stoichiometry, reaction directionality, and uptake rates of the metabolic network. Over the years, several extensions have been added to take into account other actors in metabolism, going beyond pure stoichiometry. One such extension is enzyme-constrained models, which enable the integration of proteomics data into GEM models containing the necessary kcat values for their enzymes. Given its relatively recent formulation, there are still challenges in standardization and data reconciliation between the model and the experimental measurements. In this work, we present geckopy 3.0 (genome-scale model with enzyme constraints, using Kinetics and Omics in Python), a from-scratch update of the previous Python implementation of the same name. This update tackles the aforementioned challenges, to reach maturity in enzyme-constrained modeling. With the new geckopy, proteins are typed in the Systems Biology Markup Language (SBML) document, taking advantage of the SBML Groups extension, in compliance with community standards. In addition, a suite of relaxation algorithms (in the form of linear and mixed-integer linear programming problems) has been added to facilitate the reconciliation of raw proteomics data with the metabolic model. Several functionalities to integrate experimental data were implemented, including an interface layer with pytfa for the usage of thermodynamics and metabolomics constraints. Finally, the relaxation algorithms were benchmarked against public proteomics data sets in Escherichia coli for different conditions, revealing targets for improving the enzyme-constrained model and/or the proteomics pipeline.
IMPORTANCE The metabolism of biological cells is an intricate network of reactions that interconvert chemical compounds, gathering energy, and using that energy to grow. The static analysis of these metabolic networks can be turned into a computational model that can efficiently output the distribution of fluxes in the network. With the inclusion of enzymes in the network, we can also interpret the role and concentrations of the metabolic proteins. However, the models and the experimental data often clash, resulting in a network that cannot grow. Here, we tackle this situation with a suite of relaxation algorithms in a package called geckopy. Geckopy also integrates with other software to allow for adding thermodynamic and metabolomic constraints. In addition, to ensure that enzyme-constrained models follow the community standards, a format for the proteins is postulated. We hope that the package and algorithms presented here will be useful for the constraint-based modeling community.

Reviewer #2 (Comments for the Author):

Preparing Revision Guidelines
To submit your modified manuscript, log onto the eJP submission site at https://spectrum.msubmit.net/cgi-bin/main.plex. Go to Author Tasks and click the appropriate manuscript title to begin the revision process. The information that you entered when you first submitted the paper will be displayed. Please update the information as necessary. Here are a few examples of required updates that authors must address:
• Point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER.
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file.
• Each figure must be uploaded as a separate file, and any multipanel figures must be assembled into one file.
For complete guidelines on revision requirements, please see the journal Submission and Review Process requirements at https://journals.asm.org/journal/Spectrum/submission-review-process. Submissions of a paper that does not conform to Microbiology Spectrum guidelines will delay acceptance of your manuscript.
Please return the manuscript within 60 days; if you cannot complete the modification within this time period, please contact me. If you do not wish to modify the manuscript and prefer to submit it to another journal, please notify me of your decision immediately so that the manuscript may be formally withdrawn from consideration by Microbiology Spectrum.
If your manuscript is accepted for publication, you will be contacted separately about payment when the proofs are issued; please follow the instructions in that e-mail. Arrangements for payment must be made before your article is published. For a complete list of Publication Fees, including supplemental material costs, please visit our website.
Corresponding authors may join or renew ASM membership to obtain discounts on publication fees. Need to upgrade your membership level? Please contact Customer Service at Service@asmusa.org.
Thank you for submitting your paper to Microbiology Spectrum.
The manuscript "Simultaneous application of enzyme and thermodynamic constraints to metabolic models using an updated Python implementation of GECKO" provides a Python-based software implementation for facilitated integration of proteomics and thermodynamic constraints into metabolic models. A genome-scale enzyme-constrained model of the metabolism of Escherichia coli is used to test the different data flexibilization/reconciliation algorithms available in the software. Additionally, a reduced model of E. coli's metabolism is used to assess the reduction in the solution space (allowable predictions) by the introduction of different combinations of constraints (thermodynamic and proteomic, with and without addition of omics measurements at the molecular level).
The presented study offers substantial methodological advancements to the field of metabolic modeling of microorganisms, in particular to the subfield of enzyme constraints, which is undergoing recent significant developments. Namely, the authors improve the functioning and systematized the incorporation of proteomics constraints into metabolic models by totally free open source software; provide further compatibility of this kind of models and omics data with the SBML standard and the widely used simulation tool COBRApy; demonstrate how the study of proteomics data across diverse conditions, together with other layers of constraints and contextualized metabolic knowledge, contribute to pinpoint model and/or data inconsistencies at the reaction and protein levels.
Overall, this is a high-quality research product that offers tools and insights of relevance for publication in Microbiology Spectrum, as it touches upon two of the points listed in its scope (Findings that are of primary interest to smaller sub-fields within microbiology, and re-analyses of large datasets that provide additional insights). I recommend accepting this manuscript for publication after the following major and minor points of concern have been clarified and considered.
Major concerns:
1.-The geckopy implementation presented here is a software pipeline for integration of omics and thermodynamic constraints into an enzyme-constrained model of metabolism. This is well explained by the title. However, the abstract and introduction sections do not specify that the enzymatic constraints, or catalytic constants are not treated here and are taken as inputs in the geckopy pipeline. The current text may sound self-explanatory to the authors or very specialized researchers, but the readers of Microbiology Spectrum include a broader audience.
2.-The tool is presented as a version 3.0 of the originally proposed geckopy by Sánchez, et al. 2017. At the same time it is highlighted that this is a reconstruction from scratch, which implies independent development, which I do agree that is presented here, but breaks strict software versioning practices. A problem, intrinsic to GECKO available at: https://github.com/SysBioChalmers/GECKO, is that development of the python module stopped practically at its first version, whilst MATLAB versions presented methodological changes in data integration (GECKO 2.0), and even at the model format level (GECKO 3.0). Therefore, it is strictly recommended to be consistent with the title of this manuscript and refer to the new tool as an independent implementation of data constraints in GECKO throughout the rest of the text and associated materials.
3.-Across the whole manuscript there are several terms that differ from the ones that have been used in the studies of this modeling field. In particular, the term enzyme constraint model is repeatedly used in this manuscript. The habit has been to name these models as enzyme, enzymatically or protein models. Consistency with the terminology of the subfield is recommended unless a re-discussion of the term is presented.
4.-The manuscript and supplementary materials do not provide enough detail on the parameterization procedure of the reduced metabolic model (i.e. selection of kcat values). It has been reported by several studies that kcat distributions play a major role in flux distributions, therefore, this factor would be expected to majorly impact the conclusions extracted from the results of the flux variability analysis. It is necessary to provide such parameterization details and complement discussion with more details regarding the selection of parameters. Particularly those discussed at the reaction level. A property of enzyme constrained models is that systems-level distribution of kinetic parameters translates to changes at the reaction level, and even though this is a reduced model, the number of reactions and metabolites suggest that it represents the conjunction of multiple pathways. It is also recommended to mention the sectors of metabolism that this small model accounts for, in order to highlight the relevance of conclusions to a wider public, such as non-modeler microbiologists.
5.-The average enzyme saturation across enzymes is analyzed at a genome-scale by using different relaxation algorithms. This analysis returned an overall low saturation value as all individually constrained enzymes were taken into account. Looking separately at the number of "used" enzymes (the flux carrying ones) may offer an additional factor for comparison, and also calculating a separated saturation factor for this group of enzymes may help the discussion of the commonly adopted average saturation factor of 0.5 in models constrained at the protein pool level.
6.-It is not clear in the manuscript and methods how the carbon source uptake rate is used as a constraint in addition to those at the proteomic level. This information is of particular importance for constraining a metabolic network, as it is indicative of the metabolic state, mode and even stress levels for a given growth phenotype.
7.-Comparison of the proposed LP and MILP algorithms for selection of IIS to the brute-force algorithms available in GECKOmat and caffeine is lacking and is strongly recommended, for the sake of providing quantitative evaluation of approaches to the interested community. When referring to the LP approach in the benchmarking section it is not clear which of the two methods presented for selection of IIS was used (optimization of elastic variables or elastic variables + objective). As growth rate is used as a constraint, this may suggest the former, but not explicitly said and also not clear for researchers starting to explore the subfield.
8.-What does "a large constant K" (line 169) mean in the context of the MILP problem? How large is it? What was the rationale behind the chosen value? What is the impact of this parameter in the performance of the MILP-based approach?
9.-Table 1 is not informative enough of the modifications, condition parameters and assumptions used to model the conditions of origin for the different proteomic datasets.
10.-Provide statistical tests for the comparison of flux variability ranges across different layers of constraints and add significance arguments to the discussion of these results.

Minor comments:
11.-Line 24: replace "Genome-scale model Enzyme constraints" by "Genome-scale model with Enzyme constraints" as originally named by Sánchez, et al., 2017.
12.-Line 65: "Now by enforcing mass conservation over the network, Sv=0". The explanation is not as straightforward as mentioned here and has been extensively described in other specialized reviews. Better cite those and rephrase.
11.-Line 76: "an update of an existing open-source enzyme-constraint software: geckopy 3.0" please follow the recommendation in the major concerns and be more descriptive here, the presented geckopy is a software for incorporation of omics constraints into models with enzymatic parameters.
13.-Line 78: recommended to call proteins as "pseudometabolites" in order to avoid confusion.
14.-Line 112: "v is the vector of reaction variables". More than 20 years of FBA related papers have named this as a vector of reaction fluxes, recommended to indicate that the prediction outcome is the distribution of fluxes as a separate idea.
15.-Line 116: Provide references to the use of the biomass pseudoreaction or the ATP synthase as objective functions.
16.-Line 154: "a LP problem", acronyms starting with L are usually referred to with "an".
17.-Line 159: "the total flux", not clear if the authors refer to the metabolic flux that those enzymes carry or if they mean total use of elastic variables instead.
18.-There is a gap in between lines 163 and 164 that does not clarify what the listed points are specifically referring to.
19.-Line 225: not clear what is meant with "when optimizing biomass for all reactions" in this context.
20.-A one line explanation describing the thermodynamic solution will help to clarify this section.
21.-Line 298 and 306: The terms "number of conditions" and "number of samples" are not clear when reading this and also figure 1 with its corresponding caption. What does it mean? Do the authors refer to the identifiers of the conditions/samples (e.g. sample 1, sample 2, …, sample 17)? Clarifying this will majorly help this section and figure 1.
22.-The average number of relaxed proteins across algorithms reported in lines 299-302 does not seem to correspond with the counts values shown in figure 1 (y-axis). Or do they refer to different variables?
23.-Line 317: Here fig. 4 is referred to before introducing figure 3, therefore the order of these figures should be interchanged so that it reflects the sequence in the text.
24.-Line 322: typo in "one deviation form this trend"
25.-Current figure 4 shows growth rate sum of relaxation values, but units are not reported for any of the axes. According to equation 8, I assume that the sum of relaxation values is presented in mmol/gDw (such as enzyme usages). It is recommended to convert the sum of relaxation values to mass units, by multiplying the contribution of every flexibilized enzyme by its molecular weight. In this way it is possible to assess what is the proportion of flexibilized data in comparison to the total protein content of the cell (global constraint) in terms of a conserved quantity.
26.-Additionally, I recommend to add a color code or different markers to the data points in figure 4, indicating the modeled conditions, as the association between growth rates and experimental conditions is not provided anywhere else in the text or associated materials. This will help to clarify what the authors mean with expressions such as "protein constraints are less important when moved away from optimal conditions" (line 327).
27.-Line 334: the sentence "… samples that exceeded the growth rate of and what were capped …" is hard to understand, probably missing words between "for" and "and".
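
The unit conversion suggested in comment 25 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protein identifiers, relaxation values, molecular weights and the total protein content of 0.5 g/gDW are hypothetical placeholders, not values from the manuscript, and the function is not part of geckopy.

```python
# Hypothetical relaxation values (mmol/gDW) and molecular weights (g/mol).
# Identifiers and numbers are illustrative only, not taken from the study.
relaxation_mmol_per_gdw = {"prot_A": 2.0e-5, "prot_B": 1.0e-5}
molecular_weight_g_per_mol = {"prot_A": 50_000.0, "prot_B": 30_000.0}


def relaxation_mass(relaxation, mol_weights):
    """Return the relaxed amount of each protein in g/gDW.

    mmol/gDW * g/mol gives mg/gDW, so dividing by 1000 yields g/gDW.
    """
    return {
        protein: amount * mol_weights[protein] / 1000.0
        for protein, amount in relaxation.items()
    }


mass_g_per_gdw = relaxation_mass(relaxation_mmol_per_gdw, molecular_weight_g_per_mol)
total_relaxed_mass = sum(mass_g_per_gdw.values())

# Assumed total protein content of the cell (g protein per gDW); placeholder value.
TOTAL_PROTEIN_G_PER_GDW = 0.5
relaxed_fraction = total_relaxed_mass / TOTAL_PROTEIN_G_PER_GDW

print(f"relaxed mass: {total_relaxed_mass:.6f} g/gDW")
print(f"fraction of total protein: {relaxed_fraction:.4%}")
```

Expressed this way, the sum of relaxation values becomes directly comparable with the global protein constraint, which is the point of the reviewer's suggestion.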

Review comments
Muriel et al. present a new implementation of geckopy, a python version of the GECKO software implemented in MATLAB. The GECKO software is one method for introducing enzyme constraints in GEMs, and this extension of "standard" GEMs has proved valuable across a range of organisms and scientific questions. As such, a python implementation that makes this method more accessible will be a good contribution to the COBRA community. Also, by formalizing the enzyme constraints in SBML language the authors make these models in better compliance with FAIR principles. Furthermore, the authors extend the standard GECKO method to incorporate thermodynamic constraints, an important improvement that further improves the value of the developed geckopy 3.0. Finally, the authors use a reduced E. coli GEM to evaluate different methods for relaxing experimental constraints that restrict the model from achieving the experimentally observed growth rate. In summary, I find that this work addresses several improvements and questions that in principle could be of broad interest. The software has a decent documentation online and it is easy to install with pip.
Unfortunately, I find that both the software itself (geckopy 3.0) and the manuscript are of insufficient comprehensiveness / quality to reach its potential value. First and foremost, I would have liked to see the ability in geckopy to actually make the ecGEMs, ideally including both the extraction of enzyme coefficients from Brenda (or with DLKcat) and the conversion from a standard GEM to an ecGEM. In this way, geckopy would be a true alternative to GECKO in matlab. That said, once you have an ecGEM, geckopy 3.0 seems useful for simulating the model with methods like FBA and FVA, and for integration of proteomics data, thermodynamics and metabolomics data. And for t
The manuscript itself seems to be hastily written, it is hard to get the main message and how the results relate to each other and to importance of the developed software. Specific comments are given in bullet points below:
- The authors don't put this work into context, and leave out relevant work like ECMpy and MOMENT
- The introduction reads strange: first the authors spend several lines on explaining FBA (not sure if this is necessary), then summarize the paper, line 76-83, and then go back to the GECKO formulation.
- In the latest version of GECKO, the stoichiometric coefficient of each enzyme is Mw/Kcat, not 1/Kcat. Why haven't the authors aligned their work with this formulation, which I believe is also used by MOMENT?
- Despite multiple figures, it is not clear what's the recommended method for doing relaxation.
- The formulation of the relaxation linear programs seems to miss the constraint on reaction bounds?
- Also is the numbering of the equations in the text correct, e.g. on line 164 the authors refer to Equation 9 (which is also the one below the bullet point), but should this be Eq. 6?
- The numbering of the equations is messy, without any space between # and the equation.
- Eq. 9, it is not clear that this optimizes the original objective, as this seems to minimize Z, while in the original it maximizes Z
- Line 190: two commas next to each other.
- I would like to see the FVA comparison on full GEM (not E. coli core)
  o Maybe also for yeast to confirm that these trends are more general
- The enzyme-constrained E. coli GEM has been relaxed in other publications, do the findings here align with previous results? Are the same enzymes relaxed in similar work on yeast or S. coelicolor?
- Line 118: Biomass components should sum up to 1 gDW, so the unit is actually just 1/h
- Line 197: ODml?
- Line 383: Figure 3, not 3
- Line 384: Write out Glucose, not Glc
- Line 335: rightmost
- I would like to see the jupyter notebooks in the .ipynb format and not .html to be able to reproduce the results.
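
The difference between the 1/kcat and Mw/kcat conventions raised in the bullets above can be made concrete with a short Python sketch. The function names, the example numbers and the chosen units (kcat in 1/s, flux in mmol/gDW/h, molecular weight in g/mol) are assumptions made for illustration; they are not taken from GECKO, MOMENT or geckopy, whose exact unit conventions should be checked in their own documentation.

```python
SECONDS_PER_HOUR = 3600.0


def enzyme_demand_molar(flux_mmol_gdw_h, kcat_per_s):
    """Enzyme needed to carry a flux, in mmol enzyme/gDW (coefficient 1/kcat)."""
    return flux_mmol_gdw_h / (kcat_per_s * SECONDS_PER_HOUR)


def enzyme_demand_mass(flux_mmol_gdw_h, kcat_per_s, mw_g_per_mol):
    """Enzyme needed to carry a flux, in g enzyme/gDW (coefficient Mw/kcat).

    g/mol equals mg/mmol, so Mw is divided by 1000 to obtain g/mmol.
    """
    mw_g_per_mmol = mw_g_per_mol / 1000.0
    return mw_g_per_mmol * flux_mmol_gdw_h / (kcat_per_s * SECONDS_PER_HOUR)


# Illustrative numbers only: flux of 10 mmol/gDW/h, kcat of 100 1/s, Mw of 50 kDa.
molar = enzyme_demand_molar(10.0, 100.0)
mass = enzyme_demand_mass(10.0, 100.0, 50_000.0)
print(f"molar demand: {molar:.3e} mmol/gDW")
print(f"mass demand:  {mass:.3e} g/gDW")
```

The two conventions describe the same enzyme demand and differ only by the molecular weight factor; the practical difference is whether measured concentrations and the protein pool are supplied in molar or in mass units.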

Point-by-point responses Reviewer #1 (Comments for the Author):
Major concerns: 1.-The geckopy implementation presented here is a software pipeline for integration of omics and thermodynamic constraints into an enzyme-constrained model of metabolism.This is well explained by the title.However, the abstract and introduction sections do not specify that the enzymatic constraints, or catalytic constants are not treated here and are taken as inputs in the geckopy pipeline.The current text may sound self-explanatory to the authors or very specialized researchers, but the readers of Microbiology Spectrum include a broader audience.
We have rephrased a sentence in the abstract to make the requirement of kinetic data clear.The introduction has been expanded to explain in greater detail the rationality and behavior of enzyme-constrained models for a broader audience.
2.-The geckopy implementation presented here is a software pipeline for integration of omics and thermodynamic constraints into an enzyme-constrained model of metabolism.This 2.-The tool is presented as a version 3.0 of the originally proposed geckopy by Sánchez, et al. 2017.At the same time it is highlighted that this is a reconstruction from scratch, which implies independent development, which I do agree that is presented here, but breaks strict software versioning practices.A problem, intrinsic to GECKO available at: https://github.com/SysBioChalmers/GECKO, is that development of the python module stopped practically at its first version, whilst MATLAB versions presented methodological changes in data integration (GECKO 2.0), and even at the model format level (GECKO 3.0).Therefore, it is strictly recommended to be consistent with the title of this manuscript and refer to the new tool as an independent implementation of data constraints in GECKO throughout the rest of the text and associated materials.
A couple of sentences has been added in the introduction to clarify this.We believe that the version number is perfectly fine with semantic versioning, since a major version number change corresponds to breaking changes and this makes it possible to publish it in PyPi without removing the history of a previously existing package with the same name.
3.-Across the whole manuscript there are several terms that differ from the ones that have been used in the studies of this modeling field.In particular, the term enzyme constraint model is repeatedly used in this manuscript.The habit has been to name these models as enzyme, enzymatically or protein models.Consistency with the terminology of the subfield is recommended unless a re-discussion of the term is presented.
All instances of the concept have been normalized to "enzyme-constrained model" in the text, which is the term used by Sánchez et al., 2017.
4.-The manuscript and supplementary materials do not provide enough detail on the parameterization procedure of the reduced metabolic model (i.e. selection of kcat values). It has been reported by several studies that kcat distributions play a major role in flux distributions; therefore, this factor would be expected to majorly impact the conclusions extracted from the results of the flux variability analysis. It is necessary to provide such parameterization details and complement the discussion with more details regarding the selection of parameters, particularly those discussed at the reaction level. A property of enzyme-constrained models is that the systems-level distribution of kinetic parameters translates to changes at the reaction level, and even though this is a reduced model, the number of reactions and metabolites suggests that it represents the conjunction of multiple pathways. It is also recommended to mention the sectors of metabolism that this small model accounts for, in order to highlight the relevance of the conclusions to a wider public, such as non-modeler microbiologists.
This has been added to the results of the FVA comparisons. Also, to make it clearer, we have added a note in the results to indicate that the benchmark was performed with eciML1515 and not the reduced model.
5.-The average enzyme saturation across enzymes is analyzed at a genome scale by using different relaxation algorithms. This analysis returned an overall low saturation value, as all individually constrained enzymes were taken into account. Looking separately at the number of "used" enzymes (the flux-carrying ones) may offer an additional factor for comparison, and calculating a separate saturation factor for this group of enzymes may help the discussion of the commonly adopted average saturation factor of 0.5 in models constrained at the protein pool level.
Figure 2 has been modified to show this, agreeing with the reviewer's note.
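The distinction raised in comment 5 can be illustrated with a minimal sketch. It assumes that saturation is defined as enzyme usage divided by measured concentration; the protein identifiers, the numerical values and the helper `mean_saturation` are hypothetical and are not taken from geckopy or from the manuscript data.

```python
# Illustrative values only (mmol/gDW); identifiers are hypothetical.
usage = {"P1": 0.0, "P2": 2.0e-6, "P3": 5.0e-7, "P4": 0.0}
measured = {"P1": 1.0e-6, "P2": 4.0e-6, "P3": 5.0e-7, "P4": 2.0e-6}


def mean_saturation(usage, measured, only_used=False, tol=1e-12):
    """Average usage/measured ratio, optionally over flux-carrying enzymes only."""
    ratios = [
        usage[p] / measured[p]
        for p in measured
        if measured[p] > 0 and (not only_used or usage[p] > tol)
    ]
    return sum(ratios) / len(ratios) if ratios else float("nan")


# Averaging over every constrained enzyme includes the unused ones (ratio 0).
all_enzymes = mean_saturation(usage, measured)
# Restricting to enzymes that carry flux gives a higher average.
used_enzymes = mean_saturation(usage, measured, only_used=True)
print(all_enzymes, used_enzymes)
```

In this toy case the unused enzymes pull the overall average down, which is the effect the reviewer suggests separating out.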
6.-It is not clear in the manuscript and methods how the carbon source uptake rate is used as a constraint in addition to those at the proteomic level. This information is of particular importance for constraining a metabolic network, as it is indicative of the metabolic state, mode and even stress levels for a given growth phenotype.
A paragraph in the Methods was updated about the glucose carbon source and the growth rate.

7.-Comparison of the proposed LP and MILP algorithms for selection of IIS to the brute-force algorithms available in GECKOmat and caffeine is lacking and is strongly recommended, for the sake of providing quantitative evaluation of approaches to the interested community. When referring to the LP approach in the benchmarking section, it is not clear which of the two methods presented for selection of IIS was used (optimization of elastic variables or elastic variables + objective). As growth rate is used as a constraint, this may suggest the former, but it is not explicitly said and also not clear for researchers starting to explore the subfield.
We agree that it is very illustrative and have included the results. To make the comparison clearer, Figure 1 was replaced with a table. The comment about the objective was clarified in the Methods.

8.-What does "a large constant K" (line 169) mean in the context of the MILP problem? How large is it? What was the rationale behind the chosen value? What is the impact of this parameter on the performance of the MILP-based approach?
We added a clarification: "1000, the default upper bound for a COBRA reaction flux, sufficient enough to block the flux through a protein pseudoexchange reaction".
9.-Table 1 is not informative enough of the modifications, condition parameters and assumptions used to model the conditions of origin for the different proteomic datasets.
A paragraph was added in the Methods explaining the conditions themselves and the reasoning about the modifications to the model.
"About the conditions, they consist on different strains constructed to modulate three substrate limitations: Ammonia limitation (A), Carbon limitation (C) and Ribosome limitation (R). As explained in Table 1, just one modification was done to the model, in the case of ammonia limitation, where the GLUDy reaction was knocked out. The rest of modifications were assumed to be accounted by proteomics data, since they refer to the modulation of the expression of either a protein or the whole proteome (R-limitation). The implementation can be found at S2 Files, proteomics_data_relaxations.ipynb."

10.-Provide statistical tests for the comparison of flux variability ranges across different layers of constraints and add significance arguments to the discussion of these results.
We have done as asked, running pairwise Student's t-tests and reporting them in the following table. The table shows that we cannot reject the null hypothesis that the "Proteomics" method and "Thermo + Proteomics" method fluxes are generated from the same underlying normal distribution under the usual p-value threshold of 0.05. In contrast, we can reject the null hypothesis for the "FBA" and "Proteomics" pair or the "Proteomics" and "Pool Constraint" pair, for instance.
However, we think that this kind of significance test would confuse the readers. First, the assumptions of a t-test do not hold: the widths of the flux ranges in a linear programming problem are correlated, whereas a t-test requires independent and identically distributed observations. Second, the meaning of a p-value in this case is how likely it would be to observe a t-statistic at least as extreme as the computed one if we were to generate a new sample of data. However, in this case, there is no such underlying sampling process: we are observing the full population of variables (not a sample), which would be exactly the same every time we run the FVA for a particular set of constraints.
Thus, we think that discussing the summary statistics displayed in the box-plot should be enough to discuss the differences and similarities between results.
12.-Line 65: "Now by enforcing mass conservation over the network, Sv=0". The explanation is not as straightforward as mentioned here and has been extensively described in other specialized reviews. Better cite those and rephrase.
Added citation and rephrased.
11.-Line 76: "an update of an existing open-source enzyme-constraint software: geckopy 3.0". Please follow the recommendation in the major concerns and be more descriptive here; the presented geckopy is a software for incorporation of omics constraints into models with enzymatic parameters.
Rephrased to accommodate the comments.

Fixed.
13.-Line 78: recommended to call proteins as "pseudometabolites" in order to avoid confusion.
Agreed and added.
14.-Line 112: "v is the vector of reaction variables". More than 20 years of FBA-related papers have named this as a vector of reaction fluxes; it is recommended to indicate that the prediction outcome is the distribution of fluxes as a separate idea.
Fixed accordingly.
15.-Line 116: Provide references to the use of the biomass pseudoreaction or the ATP synthase as objective functions.
Added the reference.
16.-Line 154: "a LP problem", acronyms starting with L are usually referred to with "an".

Noted.
17.-Line 159: "the total flux", not clear if the authors refer to the metabolic flux that those enzymes carry or if they mean total use of elastic variables instead.
Rephrased to "total value" so that it is less confusing.
18.-There is a gap in between lines 163 and 164 that does not clarify what the listed points are specifically referring to.
That was a formatting error from LaTeX to DOCX, we hope it is clearer now.
19.-Line 225: not clear with what is meant with "when optimizing biomass for all reactions" in this context.

Clarified.
20.-A one line explanation describing the thermodynamic solution will help to clarify this section.
Not sure which section they are referring to.

21.-Line 298:
The term "the number of conditions" is not clear when reading this and also figure 1 with its corresponding caption. What does it mean? Do the authors refer to the identifiers of the conditions/samples (e.g. sample 1, sample 2, …, sample 17)? Clarifying this will majorly help this section.
Figure one was removed in favor of the table to make it clearer.

22.-The average number of relaxed proteins across algorithms reported in lines 299-302 does not seem to correspond with the counts values shown in figure 1 (y-axis). Or do they refer to different variables?
Figure 1 was the number of samples (experimental conditions) where a given protein showed up in the IIS. This is different from the average number of proteins in the IIS per method. In any case, it was removed in favor of the table. Please note that the new table was recomputed and there is some slight numerical variability in the final numbers from run to run.
23.-Line 317: Here fig. 4 is referred to before introducing figure 3, therefore the order of these figures should be interchanged so that it reflects the sequence in the text.
Agreed, we changed it.
25.-Current figure 4 shows growth rate and the sum of relaxation values, but units are not reported for any of the axes. According to equation 8, I assume that the sum of relaxation values is presented in mmol/gDw (such as enzyme usages). It is recommended to convert the sum of relaxation values to mass units, by multiplying the contribution of every flexibilized enzyme by its molecular weight. In this way it is possible to assess what is the proportion of flexibilized data in comparison to the total protein content of the cell (global constraint) in terms of a conserved quantity.
That is right, we changed it to g/gDw as recommended and annotated the units.
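The unit conversion discussed in comment 25 can be sketched as follows. This is a minimal illustration that assumes relaxation values in mmol/gDW and molecular weights in g/mol; the identifiers, the values and the helper `relaxation_mass` are hypothetical and do not come from the manuscript or from geckopy.

```python
# Illustrative relaxation values (mmol/gDW) and molecular weights (g/mol).
relaxation_mmol = {"P1": 1.0e-5, "P2": 2.0e-6}
molecular_weight = {"P1": 50_000.0, "P2": 100_000.0}


def relaxation_mass(relaxation, mw):
    """Sum of relaxations in g/gDW.

    mmol/gDW * g/mol gives mg/gDW; dividing by 1000 converts to g/gDW.
    """
    return sum(relaxation[p] * mw[p] / 1000.0 for p in relaxation)


total_g_per_gdw = relaxation_mass(relaxation_mmol, molecular_weight)
print(total_g_per_gdw)
```

Expressed in g/gDW, the relaxed amount can be compared directly against the total protein content of the cell.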
26.-Additionally, I recommend adding a color code or different markers to the data points in figure 4, indicating the modeled conditions, as the association between growth rates and experimental conditions is not provided anywhere else in the text or associated materials. This will help to clarify what the authors mean with expressions such as "protein constraints are less important when moved away from optimal conditions" (line 327).
We changed it to "optimal growth conditions" to make it clearer and added colors for the kind of limitation present in the sample.
27.-Line 334: the sentence "… samples that exceeded the growth rate of and what were capped …" is hard to understand, probably missing words between "for" and "and".
Added dashes to distinguish it.

Review #2
Unfortunately, I find that both the software itself (geckopy 3.0) and the manuscript are of insufficient comprehensiveness / quality to reach its potential value. First and foremost, I would have liked to see the ability in geckopy to actually make the ecGEMs, ideally including both the extraction of enzyme coefficients from Brenda (or with DLKcat) and the conversion from a standard GEM to an ecGEM. In this way, geckopy would be a true alternative to GECKO in matlab.
The generation of the ecGEM was left out on purpose after discussing it with the authors of GECKOmat to concentrate that part in one place. This aligns with the kind of work that both packages focus on: while GECKOmat has more functionalities about the Kcats (generation pipeline, DLKcat), geckopy is focused on the experimental integration; i.e., the proteomics and metabolomics relaxations. That being said, there has been interest from the GECKO authors to take over geckopy and unify both in a single python package for the next iteration.
That said, once you have an ecGEM, geckopy 3.0 seems useful for simulating the models with methods like FBA and FVA, and for integration of proteomics data, thermodynamics and metabolomics data. As for the manuscript itself, it seems to be hastily written; it is hard to get the main message and how the results relate to each other and to the importance of the developed software. Specific comments are given in bullet points below:
-The authors don't put this work into context, and leave out relevant work like ECMpy and MOMENT.
The introduction was elaborated to expand upon the context.
-The introduction reads strange: first the authors spend several lines on explaining FBA (not sure if this is necessary), then summarize the paper, lines 76-83, and then go back to the GECKO formulation.
We think that explaining FBA in the introduction makes sense from the point of view of the broad audience of Microbiology Spectrum that might not be familiar with this field (as pointed out by the other reviewer). We have elaborated the introduction further.
-In the latest version of GECKO, the stoichiometric coefficient of each enzyme is Mw/Kcat, not 1/Kcat. Why haven't the authors aligned their work with this formulation, which I believe is also used by MOMENT?
At the moment of writing, the latest version of GECKO (3.0) has not been published, so we thought it would be respectful to avoid including it here. Nonetheless, we have implemented support for reading and writing SBML documents with this encoding at the user option (see PR#10), which is not supported by GECKO 3.0 yet, as far as we know.
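The difference between the two encodings mentioned by the reviewer can be sketched numerically. This is only an illustration under stated assumptions: kcat is given in 1/s and converted to 1/h, molecular weight is in g/mol, and the helper names `coeff_per_mole` and `coeff_per_mass` are hypothetical (they are not part of geckopy or GECKO).

```python
SECONDS_PER_HOUR = 3600.0


def coeff_per_mole(kcat_per_s):
    """1/kcat encoding: mmol protein/gDW needed per unit flux (mmol/gDW/h)."""
    return 1.0 / (kcat_per_s * SECONDS_PER_HOUR)


def coeff_per_mass(kcat_per_s, mw_g_per_mol):
    """Mw/kcat encoding: mg protein/gDW needed per unit flux (mmol/gDW/h).

    g/mol equals mg/mmol, so multiplying by Mw turns mmol protein into mg.
    """
    return mw_g_per_mol / (kcat_per_s * SECONDS_PER_HOUR)


# Illustrative enzyme: kcat = 10 1/s, Mw = 36,000 g/mol.
molar = coeff_per_mole(10.0)
mass = coeff_per_mass(10.0, 36_000.0)
print(molar, mass)
```

Under these assumptions the two coefficients differ only by the factor Mw, so a model in one encoding can be rescaled into the other enzyme by enzyme.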
-Despite multiple figures, it is not clear what's the recommended method for doing relaxation.
We have added a few lines discussing how the LP fashion gives smaller IIS and replaced Figure 1 with a table that better compares the results. We hope that makes it clearer. Another aspect that is already explained is the advantage of exploring the relaxation in the LP case. Nonetheless, it is not a black and white answer, as explained in the Methods section: "The different variants of formulations and objectives reflect different assumptions about the uncertainty of the experimental methods in place. Hence, if it is suspected that the uncertainty is uniformly distributed over all measurements, one of the elastic filtering methods should be chosen over the MILP problem. Correspondingly, if the a priori knowledge points to a reduced subset of the enzymes with high uncertainty, the MILP might be a better fit."
-The formulation of the relaxation linear programs seems to miss the constraint on reaction bounds?
We added it, thank you for noting it.
-The numbering of the equations is messy, without any space between # and the equation.
We apologize, these are artifacts from converting it from LaTeX to DOCX. We hope that now it is better formatted.
-Eq. 9, it is not clear that this optimizes the original objective, as this seems to minimize Z, while in the original it maximizes Z.
We have added a minus to Z to clarify it.
-Line 190: two commas next to each other.
Fixed.
-I would like to see the FVA comparison on full GEM (not E. coli core). Maybe also for yeast to confirm that these trends are more general.
We respectfully disagree. The point is that a naive layering of constraints (and its subsequent relaxation) may not make a more constrained model (thermodynamics + enzyme constraints), which is sufficiently shown for a reduced model that is easier to interpret.
-The Enzyme-constrained E. coli GEM has been relaxed in other publications; do the findings here align with previous results? Are the same enzymes relaxed in similar work on yeast or S. coelicolor?
We have not been able to access the relaxations/flexibilizations performed in Domenzain et al., 2022. The proteome flexibilization that they use is the same greedy relaxation algorithm that is also implemented in geckopy. We have added the results for this algorithm to compare with the rest. We do not know of other publications where this Enzyme-constrained E. coli GEM has been used.
-Line 118: Biomass components should sum up to 1 gDW, so the unit is actually just 1/h.
Fixed accordingly.
Added a dot to clarify it. It uses the same nomenclature as in the original publication (see Supplementary material Figure S2 of Balakrishnan et al., 2022).
-I would like to see the jupyter notebooks in the .ipynb format and not .html to be able to reproduce the results.
I agree with that and tried it, but the submission form does not allow for uploading ipynb extensions. I have added a link to a repository that contains the notebooks in jupyter format with the necessary data to run them (referenced in the text). Note that the relaxation notebook was modified to account for the comments of the other reviewer.

Editor minor modifications:
Line 21: "for its enzymes" is not grammatically correct
Lines 45-47: "Additionally, to ensure that enzyme-constrained models follow the community standards, a format for the proteins is postulated." can be moved to line 43.
Line 63: Sv = 0. It would be useful to introduce what v is here.
Line 79: "... its corresponding reactions ..." is not grammatically correct
Line 90: "equal the total amount" is not grammatically correct
Line 128: Figures ??, 1, 3, 2; has to be fixed
Line 129: "to be reproduced at" lacks the verb
Line 139: "pseudorreaction" -> pseudoreaction
Line 149: Flux should be in lower case
Line 195: Why is Relaxation upper case here?
Line 203: fp,i is different from the symbol used in eq. 11. Please check the indices used in eqs. 12 and 13
Line 221: "consist on" -> consist of
Line 251: "which block" is not correct
Line 300: "that reflect the" is not correct
Line 304: "on top the" is not correct
Line 297: "Enzyme constraint models" is to be fixed
Line 416: "formulation provides" would require a comma in between

Thank you for submitting your paper to Microbiology Spectrum.

Point-by-point responses: 2nd review
Reviewer #2 (Comments for the Author)
1. There are several spelling errors in lines 400-401 in the marked-up manuscript (e.g. TCA cylce)
We have fixed it, thank you.

I am not sure if it makes sense to capitalize metabolite names (Glucose)
We have changed glucose to lower case.

Editor minor modifications
Line 21: "for its enzymes" is not grammatically correct
Fixed, thank you.
Lines 45-47: "Additionally, to ensure that enzyme-constrained models follow the community standards, a format for the proteins is postulated." can be moved to line 43.
Instead, we have moved it to line 48; we believe it achieves the desired effect.
Line 63: Sv = 0. It would be useful to introduce what v is here.
We have included it now.
Line 90: "equal the total amount" is not grammatically correct
Line 128: Figures ??, 1, 3, 2; has to be fixed
Line 129: "to be reproduced at" lacks the verb
Line 139: "pseudorreaction" -> pseudoreaction
Line 149: Flux should be in lower case
Line 195: Why is Relaxation upper case here?
Fixed.
Line 203: fp,i is different from the symbol used in eq. 11. Please check the indices used in eqs. 12 and 13
All indices were changed to "i" on the left-hand side for consistency and the equations were updated appropriately.
Line 221: "consist on" -> consist of
Line 251: "which block" is not correct
Line 300: "that reflect the" is not correct
Line 304: "on top the" is not correct
Line 297: "Enzyme constraint models" is to be fixed
Line 416: "formulation provides" would require a comma in between
All of the above were corrected accordingly.

Point-by-point responses: 1st review
Reviewer #1 (Comments for the Author):
Major concerns:
1.-The geckopy implementation presented here is a software pipeline for integration of omics and thermodynamic constraints into an enzyme-constrained model of metabolism. This is well explained by the title. However, the abstract and introduction sections do not specify that the enzymatic constraints, or catalytic constants, are not treated here and are taken as inputs in the geckopy pipeline. The current text may sound self-explanatory to the authors or very specialized researchers, but the readers of Microbiology Spectrum include a broader audience.
We have rephrased a sentence in the abstract to make the requirement of kinetic data clear.
The introduction has been expanded to explain in greater detail the rationale and behavior of enzyme-constrained models for a broader audience.
2.-The tool is presented as a version 3.0 of the originally proposed geckopy by Sánchez et al., 2017. At the same time it is highlighted that this is a reconstruction from scratch, which implies independent development, which I do agree is presented here, but breaks strict software versioning practices. A problem, intrinsic to GECKO available at: https://github.com/SysBioChalmers/GECKO, is that development of the python module stopped practically at its first version, whilst MATLAB versions presented methodological changes in data integration (GECKO 2.0), and even at the model format level (GECKO 3.0). Therefore, it is strictly recommended to be consistent with the title of this manuscript and refer to the new tool as an independent implementation of data constraints in GECKO throughout the rest of the text and associated materials.
A couple of sentences have been added in the introduction to clarify this. We believe that the version number is perfectly fine with semantic versioning, since a major version number change corresponds to breaking changes and this makes it possible to publish it in PyPI without removing the history of a previously existing package with the same name.
3.-Across the whole manuscript there are several terms that differ from the ones that have been used in the studies of this modeling field. In particular, the term enzyme constraint model is repeatedly used in this manuscript. The habit has been to name these models as enzyme, enzymatically or protein models. Consistency with the terminology of the subfield is recommended unless a re-discussion of the term is presented.
All instances of the concept have been normalized to "enzyme-constrained model" in the text, which is the term used by Sánchez et al., 2017.
4.-The manuscript and supplementary materials do not provide enough detail on the parameterization procedure of the reduced metabolic model (i.e. selection of kcat values). It has been reported by several studies that kcat distributions play a major role in flux distributions; therefore, this factor would be expected to majorly impact the conclusions extracted from the results of the flux variability analysis. It is necessary to provide such parameterization details and complement the discussion with more details regarding the selection of parameters, particularly those discussed at the reaction level. A property of enzyme-constrained models is that the systems-level distribution of kinetic parameters translates to changes at the reaction level, and even though this is a reduced model, the number of reactions and metabolites suggests that it represents the conjunction of multiple pathways. It is also recommended to mention the sectors of metabolism that this small model accounts for, in order to highlight the relevance of the conclusions to a wider public, such as non-modeler microbiologists.
This has been added to the results of the FVA comparisons. Also, to make it clearer, we have added a note in the results to indicate that the benchmark was performed with eciML1515 and not the reduced model.
5.-The average enzyme saturation across enzymes is analyzed at a genome scale by using different relaxation algorithms. This analysis returned an overall low saturation value, as all individually constrained enzymes were taken into account. Looking separately at the number of "used" enzymes (the flux-carrying ones) may offer an additional factor for comparison, and calculating a separate saturation factor for this group of enzymes may help the discussion of the commonly adopted average saturation factor of 0.5 in models constrained at the protein pool level.
Figure 2 has been modified to show this, agreeing with the reviewer's note.
6.-It is not clear in the manuscript and methods how the carbon source uptake rate is used as a constraint in addition to those at the proteomic level. This information is of particular importance for constraining a metabolic network, as it is indicative of the metabolic state, mode and even stress levels for a given growth phenotype.
A paragraph in the Methods was updated about the glucose carbon source and the growth rate.
7.-Comparison of the proposed LP and MILP algorithms for selection of IIS to the brute-force algorithms available in GECKOmat and caffeine is lacking and is strongly recommended, for the sake of providing quantitative evaluation of approaches to the interested community. When referring to the LP approach in the benchmarking section, it is not clear which of the two methods presented for selection of IIS was used (optimization of elastic variables or elastic variables + objective). As growth rate is used as a constraint, this may suggest the former, but it is not explicitly said and also not clear for researchers starting to explore the subfield.
We agree that it is very illustrative and have included the results. To make the comparison clearer, Figure 1 was replaced with a table. The comment about the objective was clarified in the Methods.
8.-What does "a large constant K" (line 169) mean in the context of the MILP problem? How large is it? What was the rationale behind the chosen value? What is the impact of this parameter on the performance of the MILP-based approach?
We added a clarification: "1000, the default upper bound for a COBRA reaction flux, sufficient enough to block the flux through a protein pseudoexchange reaction".
9.-Table 1 is not informative enough of the modifications, condition parameters and assumptions used to model the conditions of origin for the different proteomic datasets.
A paragraph was added in the Methods explaining the conditions themselves and the reasoning about the modifications to the model.
"About the conditions, they consist on different strains constructed to modulate three substrate limitations: Ammonia limitation (A), Carbon limitation (C) and Ribosome limitation (R). As explained in Table 1, just one modification was done to the model, in the case of ammonia limitation, where the GLUDy reaction was knocked out. The rest of modifications were assumed to be accounted by proteomics data, since they refer to the modulation of the expression of either a protein or the whole proteome (R-limitation). The implementation can be found at S2 Files, proteomics_data_relaxations.ipynb."
10.-Provide statistical tests for the comparison of flux variability ranges across different layers of constraints and add significance arguments to the discussion of these results.
We have done as asked, running pairwise Student's t-tests and reporting them in the following table. The table shows that we cannot reject the null hypothesis that the "Proteomics" method and "Thermo + Proteomics" method fluxes are generated from the same underlying normal distribution under the usual p-value threshold of 0.05. In contrast, we can reject the null hypothesis for the "FBA" and "Proteomics" pair or the "Proteomics" and "Pool Constraint" pair, for instance.
However, we think that this kind of significance test would confuse the readers. First, the assumptions of a t-test do not hold: the widths of the flux ranges in a linear programming problem are correlated, whereas a t-test requires independent and identically distributed observations. Second, the meaning of a p-value in this case is how likely it would be to observe a t-statistic at least as extreme as the computed one if we were to generate a new sample of data. However, in this case, there is no such underlying sampling process: we are observing the full population of variables (not a sample), which would be exactly the same every time we run the FVA for a particular set of constraints.
Thus, we think that discussing the summary statistics displayed in the box-plot should be enough to discuss the differences and similarities between results.
-I would like to see the FVA comparison on full GEM (not E. coli core). Maybe also for yeast to confirm that these trends are more general.
We respectfully disagree. The point is that a naive layering of constraints (and its subsequent relaxation) may not make a more constrained model (thermodynamics + enzyme constraints), which is sufficiently shown for a reduced model that is easier to interpret.
-The Enzyme-constrained E. coli GEM has been relaxed in other publications; do the findings here align with previous results? Are the same enzymes relaxed in similar work on yeast or S. coelicolor?
We have not been able to access the relaxations/flexibilizations performed in Domenzain et al., 2022. The proteome flexibilization that they use is the same greedy relaxation algorithm that is also implemented in geckopy. We have added the results for this algorithm to compare with the rest. We do not know of other publications where this Enzyme-constrained E. coli GEM has been used.
-Line 118: Biomass components should sum up to 1 gDW, so the unit is actually just 1/h.
Fixed accordingly.
Added a dot to clarify it. It uses the same nomenclature as in the original publication (see Supplementary material Figure S2 of Balakrishnan et al., 2022).
-I would like to see the jupyter notebooks in the .ipynb format and not .html to be able to reproduce the results.
I agree with that and tried it, but the submission form does not allow for uploading ipynb extensions. I have added a link to a repository that contains the notebooks in jupyter format with the necessary data to run them (referenced in the text). Note that the relaxation notebook was modified to account for the comments of the other reviewer.
12.-Line 65: "Now by enforcing mass conservation over the network, Sv=0". The explanation is not as straightforward as mentioned here and has been extensively described in other specialized reviews. Better cite those and rephrase.
Added citation and rephrased.
11.-Line 76: "an update of an existing open-source enzyme-constraint software: geckopy 3.0". Please follow the recommendation in the major concerns and be more descriptive here; the presented geckopy is a software for incorporation of omics constraints into models with enzymatic parameters.
Rephrased to accommodate the comments.

Fixed.
13.-Line 78: recommended to call proteins as "pseudometabolites" in order to avoid confusion.
Agreed and added.
14.-Line 112: "v is the vector of reaction variables". More than 20 years of FBA-related papers have named this as a vector of reaction fluxes; it is recommended to indicate that the prediction outcome is the distribution of fluxes as a separate idea.
Fixed accordingly.
15.-Line 116: Provide references to the use of the biomass pseudoreaction or the ATP synthase as objective functions.
Added the reference.
16.-Line 154: "a LP problem", acronyms starting with L are usually referred to with "an". Noted.
17.-Line 159: "the total flux", not clear if the authors refer to the metabolic flux that those enzymes carry or if they mean total use of elastic variables instead.
Rephrased to "total value" so that it is less confusing.
18.-There is a gap in between lines 163 and 164 that does not clarify what the listed points are specifically referring to.
That was a formatting error from LaTeX to DOCX, we hope it is clearer now.
19.-Line 225: not clear with what is meant with "when optimizing biomass for all reactions" in this context.

Clarified.
20.-A one line explanation describing the thermodynamic solution will help to clarify this section.
Not sure which section they are referring to.

21.-Line 298:
The term "the number of conditions" is not clear when reading this and also figure 1 with its corresponding caption. What does it mean? Do the authors refer to the identifiers of the conditions/samples (e.g. sample 1, sample 2, …, sample 17)? Clarifying this will majorly help this section.
Figure one was removed in favor of the table to make it clearer.

22.-The average number of relaxed proteins across algorithms reported in lines 299-302 does not seem to correspond with the counts values shown in figure 1 (y-axis). Or do they refer to different variables?
Figure 1 was the number of samples (experimental conditions) where a given protein showed up in the IIS. This is different from the average number of proteins in the IIS per method. In any case, it was removed in favor of the table. Please note that the new table was recomputed and there is some slight numerical variability in the final numbers from run to run.
23.-Line 317: Here fig. 4 is referred to before introducing figure 3, therefore the order of these figures should be interchanged so that it reflects the sequence in the text.
Agreed, we changed it.
25 That is right, we changed it to g/gDw as recommended and annotated the units.
26.-Additionally, I recommend adding a color code or different markers to the data points in figure 4, indicating the modeled conditions, as the association between growth rates and experimental conditions is not provided anywhere else in the text or associated materials. This will help to clarify what the authors mean with expressions such as "protein constraints are less important when moved away from optimal conditions" (line 327).
We changed it to "optimal growth conditions" to make it clearer and added colors for the kind of limitation present in the sample.
27.-Line 334: the sentence "… samples that exceeded the growth rate of and what were capped …" is hard to understand, probably missing words between "for" and "and".
Added dashes to distinguish it.

Review #2
Unfortunately, I find that both the software itself (geckopy 3.0) and the manuscript are of insufficient comprehensiveness / quality to reach their potential value. First and foremost, I would have liked to see the ability in geckopy to actually make the ecGEMs, ideally including both the extraction of enzyme coefficients from BRENDA (or with DLKcat) and the conversion from a standard GEM to an ecGEM. In this way, geckopy would be a true alternative to GECKO in MATLAB.
The generation of the ecGEM was left out on purpose after discussing it with the authors of GECKOmat to concentrate that part in one place. This aligns with the kind of work that both packages focus on: while GECKOmat has more functionalities about the kcat values (generation pipeline, DLKcat), geckopy is focused on the experimental integration, i.e., the proteomics and metabolomics relaxations. That being said, there has been interest from the GECKO authors to take over geckopy and unify both in a single Python package for the next iteration.
That said, once you have an ecGEM, geckopy 3.0 seems useful for simulating the models with methods like FBA and FVA, and for integration of proteomics data, thermodynamics and metabolomics data. The manuscript itself seems to be hastily written; it is hard to get the main message, how the results relate to each other, and the importance of the developed software. Specific comments are given in bullet points below:
-The authors don't put this work into context, and leave out relevant work like ECMpy and MOMENT.
The introduction was elaborated to expand upon the context.
-The introduction reads strange: first the authors spend several lines on explaining FBA (not sure if this is necessary), then summarizes the paper, line 76-83, and then goes back to the GECKO-formulation.
We think that explaining FBA in the introduction makes sense from the point of view of the broad audience of Microbiology Spectrum that might not be familiar with this field (as pointed out by the other reviewer). We have elaborated the introduction further.
-In the latest version of GECKO, the stoichiometric coefficient of each enzyme is Mw/Kcat, not 1/Kcat. Why haven't the authors aligned their work with this formulation, which I believe is also used by MOMENT?
At the moment of writing, the latest version of GECKO (3.0) has not been published, so we thought it would be respectful to avoid including it here. Nonetheless, we have implemented support for reading and writing SBML documents with this encoding at the user option (see PR#10), which is not supported by GECKO 3.0 yet, as far as we know.
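To make the difference between the two encodings concrete, here is a minimal sketch in plain Python. It does not use the geckopy or GECKO API; the function names and example numbers are hypothetical, and the units (kcat in 1/s, flux in mmol/gDW/h, molecular weight in g/mmol, which is numerically the same as kDa) are assumptions chosen for illustration.

```python
SECONDS_PER_HOUR = 3600.0


def coeff_inverse_kcat(kcat_per_s):
    """Enzyme coefficient 1/kcat (in hours).

    Multiplying by a flux in mmol/gDW/h gives enzyme usage in mmol/gDW.
    """
    return 1.0 / (kcat_per_s * SECONDS_PER_HOUR)


def coeff_mw_over_kcat(kcat_per_s, mw_g_per_mmol):
    """Enzyme coefficient Mw/kcat.

    Multiplying by a flux in mmol/gDW/h gives enzyme usage in g/gDW.
    """
    return mw_g_per_mmol / (kcat_per_s * SECONDS_PER_HOUR)


# Hypothetical values, for illustration only.
kcat = 100.0  # 1/s
mw = 50.0     # g/mmol (50 kDa)
flux = 10.0   # mmol/gDW/h

usage_mmol = coeff_inverse_kcat(kcat) * flux   # mmol/gDW
usage_g = coeff_mw_over_kcat(kcat, mw) * flux  # g/gDW
```

Under these assumptions the two encodings differ only by the molecular weight factor, so `usage_g` equals `usage_mmol * mw`.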
-Despite multiple figures, it is not clear what's the recommended method for doing relaxation.
We have added a few lines discussing how the LP fashion gives smaller IIS and replaced Figure 1 with a table that better compares the results. We hope that makes it clearer. Another aspect that is already explained is the advantage of exploring the relaxation in the LP case. Nonetheless, it is not a black and white answer, as explained in the Methods section: "The different variants of formulations and objectives reflect different assumptions about the uncertainty of the experimental methods in place. Hence, if it is suspected that the uncertainty is uniformly distributed over all measurements, one of the elastic filtering methods should be chosen over the MILP problem. Correspondingly, if the a priori knowledge points to a reduced subset of the enzymes with high uncertainty, the MILP might be a better fit."
-The formulation of the relaxation linear programs seems to miss the constraint on reaction bounds?
We added it, thank you for noting it.
-The numbering of the equations is messy, without any space between # and the equation.
We apologize; these are artifacts from converting it from LaTeX to DOCX. We hope that it is now better formatted.
-Eq. 9, it is not clear that this optimizes the original objective, as this seems to minimize Z, while the original maximizes Z.
We have added a minus to Z to clarify it.
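The equivalence behind this sign change can be written in one line. Here $F$ is only a placeholder for the feasible set defined by the constraints of Eq. 9, which are not reproduced in this letter:

```latex
\max_{v \in F} Z(v) \;=\; -\min_{v \in F} \bigl(-Z(v)\bigr),
\qquad
\arg\max_{v \in F} Z(v) \;=\; \arg\min_{v \in F} \bigl(-Z(v)\bigr)
```

Minimizing $-Z$ therefore selects the same flux distributions as maximizing $Z$; only the sign of the reported optimum changes.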
-Line 190: two commas next to each other.
Fixed.
-I would like to see the FVA comparison on a full GEM (not E. coli core). Maybe also for yeast to confirm that these trends are more general.
We respectfully disagree. The point is that a naive layering of constraints (and its subsequent relaxation) may not make a more constrained model (thermodynamics + enzyme constraints), which is sufficiently shown for a reduced model that is easier to interpret.
-The enzyme-constrained E. coli GEM has been relaxed in other publications; do the findings here align with previous results? Are the same enzymes relaxed in similar work on yeast or S. coelicolor?
We have not been able to access the relaxations/flexibilizations performed in Domenzain et al., 2022. The proteome flexibilization that they use is the same greedy relaxation algorithm that is also implemented in geckopy. We have added the results for this algorithm to compare with the rest. We do not know of other publications where this enzyme-constrained E. coli GEM has been used.
-Line 118: Biomass components should sum up to 1 gDW, so the unit is actually just 1/h.
Fixed accordingly.
table (nan indicates that they are the same distribution):
Thank you for submitting your manuscript to Microbiology Spectrum. As you will see, your paper is very close to acceptance. Please modify the manuscript along the lines I have recommended. As these revisions are quite minor, I expect that you should be able to turn in the revised paper in less than 30 days, if not sooner. If your manuscript was reviewed, you will find the reviewers' comments below. When submitting the revised version of your paper, please provide (1) point-by-point responses to the issues raised by the reviewers as file type "Response to Reviewers," not in your cover letter, and (2) a PDF file that indicates the changes from the original submission (by highlighting or underlining the changes) as file type "Marked Up Manuscript - For Review Only". Please use this link to submit your revised manuscript. Detailed instructions on submitting your revised paper are below.
Thank you for the privilege of reviewing your work. Below you will find instructions from the Microbiology Spectrum editorial office and comments generated during the review. The ASM Journals program strives for constant improvement in our submission and publication process. Please tell us how we can improve your experience by taking this quick Author Survey.
• Manuscript: A .DOC version of the revised manuscript
• Figures: Editable, high-resolution, individual figure files are required at revision, TIFF or EPS files are preferred
was replaced with a table. The comment about the objective was clarified in the Methods.
table (nan indicates that they are the same distribution):
.-Current figure 4 shows growth rate and sum of relaxation values, but units are not reported for any of the axes. According to equation 8, I assume that the sum of relaxation values is presented in mmol/gDw (such as enzyme usages). It is recommended to convert the sum of relaxation values to mass units, by multiplying the contribution of every flexibilized enzyme by its molecular weight. In this way it is possible to assess what is the proportion of flexibilized data in comparison to the total protein content of the cell (global constraint) in terms of a conserved quantity.
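The conversion recommended in this comment can be sketched as follows. This is an illustration in plain Python rather than geckopy code: the protein identifiers, relaxation values, molecular weights and the total protein pool are all hypothetical, and molecular weights are assumed to be in g/mmol (numerically kDa) so that the result comes out in g/gDW.

```python
def relaxed_mass(relaxations_mmol, mw_g_per_mmol):
    """Sum of relaxation values in g/gDW.

    Each relaxation value (mmol/gDW) is multiplied by the molecular
    weight (g/mmol) of the corresponding protein before summing.
    """
    return sum(
        amount * mw_g_per_mmol[protein]
        for protein, amount in relaxations_mmol.items()
    )


# Hypothetical values, for illustration only.
relaxations = {"prot_A": 2.0e-5, "prot_B": 5.0e-6}  # mmol/gDW
weights = {"prot_A": 40.0, "prot_B": 120.0}         # g/mmol (kDa)
protein_pool = 0.5                                  # g/gDW, assumed total protein content

mass = relaxed_mass(relaxations, weights)  # g/gDW
fraction = mass / protein_pool             # share of the total protein pool
```

Expressed this way, the relaxed amount can be compared directly with the global protein constraint, which is the comparison the comment asks for.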