Application of a novel numerical simulation to biochemical reaction systems

Recent advancements in omics and single-cell analysis highlight the necessity of numerical methods for managing the complexity of biological data. This paper introduces a simulation program for biochemical reaction systems based on the natural number simulation (NNS) method. This novel approach ensures the equitable treatment of all molecular entities, such as DNA, proteins, H2O, and hydrogen ions (H+), in biological systems. Central to NNS is its use of stoichiometric formulas, simplifying the modeling process and facilitating efficient and accurate simulations of diverse biochemical reactions. The advantage of this method is its ability to manage all molecules uniformly, ensuring a balanced representation in simulations. Detailed in Python, NNS is adept at simulating various reactions, ranging from water ionization to Michaelis–Menten kinetics and complex gene-based systems, making it an effective tool for scientific and engineering research.


Introduction
In recent years, single-cell and multi-omics data analysis has attracted attention in the field of molecular cell biology (Lim et al., 2023;Massimino et al., 2023;Baysoy et al., 2024).In particular, analysis using machine learning models, such as deep learning, has been active (Lin et al., 2022;Gunawan, et al., 2023;Wagle et al., 2024).However, understanding stoichiometric reactions is still considered essential to understand gene expression and protein enzyme functions in detail.This paper proposes a simulation program for biochemical reaction systems based on the natural number simulation (NNS) method.NNS offers a simplified approach for modeling biochemical reactions stoichiometrically and performing computations of the time course of the reactions.This method excels in its straightforward formulation process and is particularly suitable for complex biological systems (Blinov et al., 2004;Browning et al., 2022;Huizing et al., 2022) because of the absence of complicated reaction formulas using mathematical equations.
Various models have been considered to represent complex biological systems, including ordinary differential equations and Gillespie's algorithm (Gillespie, 1977;Ullah and Wolkenhauer, 2011;Alon, 2019).For example, metabolic reaction systems continue to be vigorously analyzed in detail using differential equations (Khodayari and Maranas, 2016;Himeoka and Mitarai, 2022).Petri nets substantially contribute to analyzing biochemical reaction networks by integrating stochastic algorithms and/or non-parametric strategies.This progression has resulted in the emergence of specialized forms such as signaling Petri nets, large-scale metabolic models, and multilevel biological models using colored Petri nets, indicating their increasing utility in the analysis of complex systems (Ruths et al., 2008;Rohr, 2018;Brinkrolf et al., 2021;Shaikh et al., 2022;Liu et al., 2023).
However, ordinary differential equations require complex formulations of reaction rate equations.Stochastic simulations require detailed formulations of chemical reaction networks, which may limit their usefulness, particularly for large-scale simulations.Petri net-based simulations offer a comprehensive framework for formulating chemical reaction networks involving specific elements such as places, transitions, arcs, and markings.However, modeling actual biochemical reaction systems requires consideration of the definition of the number of tokens and their correspondence to the number of molecules.In addition, when introducing stochastic models, it is necessary to explicitly define the timing of firing.
The NNS method implemented in Python is very easy to model because the procedure for determining detailed time evolution is included in the computational algorithm.Requiring only the stoichiometric equation, its rate constants, and the initial number of molecules, the new algorithm, based on a probabilistic binomial distribution, immediately calculates the number of molecules after one calculation step for the elements in the model system.It is crucial to emphasize the necessity of precise rate constant determination to depict the system accurately.Nevertheless, the simplicity of the formulation has the advantage of allowing the optimization process to be easily performed.
Recent advancements in non-parametric analytical techniques also present an intriguing possibility for NNS (Ruths et al., 2008;Rohr, 2018).By meticulously setting the initial molecular counts, rate constants, and stoichiometric equations, an analysis comparable to those seen in studies of complex signaling networks may be achieved.This approach could potentially enhance the applicability and accuracy of NNS for modeling intricate biochemical systems.
Contrary to traditional methods, such as ordinary differential equations and Gillespie's algorithm, which might not optimally represent complex biological systems (Gillespie, 1977;Alon, 2019), NNS provides a more precise simulation even for a small number of molecules in DNA-related reactions.It uses stoichiometric equations with rate constants for specific and selective processes such as transcription and translation (Watson et al., 2014).Moreover, the binomial distribution in NNS facilitates calculating informational and entropic metrics, including Shannon's entropy (Shannon, 1948;Adriaans and van Benthem, 2008).Examples of these calculations are discussed later in the paper.

Methods
Our method calculates time-developing stoichiometric reactions using a binomial algorithm.Consider the general reaction as Equation 1: X 1 , X 2 , /, X s denote molecular elements before the reaction while Y 1 , Y 2 , /, Y t represent those after the reaction.The coefficients q i and r i are reaction orders.The subscripts "s" and "t" are natural numbers, with no limitation on their sizes.The rate constant k implies the probability of forming Y 1 , Y 2 , /, Y t .In the NNS approach, reaction orders are consistently natural numbers because they necessitate dealing with element counts as natural numbers at all times.Our methodology accommodates stoichiometric reactions without limiting specific chemical formulas, thereby ensuring the conservation of the atom number.
Initial element counts are necessary for computing the time evolution.X i _n and Y i _n are defined as the elemental number of X i and Y i ; then, the following calculation in Equations 2, 3 with Equations 4, 5 provide the decrement and increment quantities of the elements.
n int X i n q i min (4) R binomial (n, p) generates random numbers following the binomial distribution, returning natural numbers (Rohr, 2018).In this function, n denotes a trial number, each of which can be either a success or a failure (binary outcomes).The parameter p specifies the probability of success in each trial.The function provides the total number of successes in n trials.NumPy's random binomial (n, p) achieves this.The function returns one successful natural number under n, including zero.Stochastic simulation often uses these discrete numbers (Székely and Burrage, 2014; Gholami and Ilie, 2021).The trial number n is represented with the minimum integer value among X 1 _n/q 1 , X 2 _ n/q 2 , X 3 _n/q 3 , so on.The probability p is derived based on the rate constant k, the reaction order q i , a normalization parameter N, and with Xi n qi selected in Equation 4. In a calculation step, X i _n becomes X i _n + ΔX i _n and Y i _n becomes Y i _n + ΔY i _n.The rate constant k in NNS is semantically the same as the rate constant that appears in the stoichiometric equation defined by concentration.However, k in this study is defined as a constant that affects the stochastic results defined in Equation 5.The rate constant k, the number of elements, and the normalization constant N determine the probability p, as indicated by Equation 5.The k value influences the probability variation between 0 and 1 and the reaction rate, but it is not the same as the conventional rate constant.
Equations 4, 5 have some minor formulas.One is in the case of a one-element reaction as Equation 6: X is a molecular element, and a is an order.The decrement ΔX_n and increment ΔY i _n are derived from the following formulas as Equations 7, 8 with Equations 9, 10: n int X n a (9) where X_n is the number of the element X.
Another exceptional reaction type is as Equation 11: This expression indicates that element X is decomposed and disappears, so Equation 3 does not work.Thus, only X decreases with Equation 7with Equation 9, 10.
Another type of reaction is defined as a linear decreasing and increasing reaction for one element.One needs these reactions for the mathematical formulation of linear changes via molecular addition and subtraction.The linear decreasing case is defined similarly to Equation 11 as Equation 12: aX → 0 zero ( )by linear, with a rate constant k (12) Moreover, the increment of X is determined by the following Equation 13: where p is the same as in Equation 10.Subsequently, the linear increasing case is defined by the following reaction as Equation 14: 0 zero ( )→ bY by linear, with a rate constant k (14) The following Equation 15 also determines the increment of Y: where p is the same as in Equation 10.
A biological system has numerous incredible reactions (Nelson and Cox, 2021).One can easily define each reaction if one knows the elements before and after the reaction, their orders, and rate constants.
Executing an input file in Spyder, an Integrated Development Environment requires setting "fName = "inp_file.txt""in the main program (i.e., binomial_v016.py) and then utilizing the "Run" command.For command line execution, the command "$ python binomial_v016.pyinp_file.txt"yields similar outcomes to Spyder's run.Therefore, it is essential to position the input file in the program's directory or one level above it.
This file needs to define calculation time, elements, reactions, and plots (  S1, S2), including the above formulas used to calculate reaction increments.
The *ElementInOut statement implemented definition statements to express the inflow and outflow of molecules in and out of the system (Supplementary Table S3).The author developed the Python program used for this algorithm, and it is available under the MIT license on GitHub at https://github.com/taka-b/binomial/tree/binomial_v016_01 (Sato, 2023).This algorithm is publicly accessible and can be used by anyone without any charge.Figure 1A shows the calculation for schematic flow, while Figure 1B depicts the detailed calculation algorithm.The simplicity of the proposed procedure is demonstrated through six examples using their corresponding input files.

Results
Some example calculations are shown below.Note that the user only needs to prepare a text file for the calculations to obtain the results.The important point is to carefully formulate the stoichiometric equation and give appropriate rate constants.
The following results are obtained for calculations based on the text file shown in each figure.The start and end times are defined by *Time; for example, zero, 10,000, displays the results of calculations for every step from zero to 10,000.If you want to compare these results with actual experimental results, you will need units and rate constants that correspond to the experimental results.Here, however, the units of calculation are denoted in Steps, and figures show changes in the number of elements.

Simple reaction model
Consider a simple reaction with rate constant k on as Equation 16: where A and B represent protein molecules that irreversibly bind to form C.  1) defining one reaction r2_1_100 (details in Supplementary Table S1), where the calculation spans time zero to 10,000 in dimensionless units.In the actual calculation, each increase or decrease in the number of elements is calculated according to the above algorithm for each increase or decrease.The user must assign the appropriate time unit for a particular system.Here, for the sake of schematic calculation, the unit of time is used as the calculation unit Steps.NNS uses natural numbers, and as shown in Figure 2B, the y-axis represents the number of elements, guaranteeing an accurate number of elements.
In contrast, Supplementary Figure S1 introduces the inverse reaction r1_2_200 as Equation 17.
The rate constant k off differs from k on in Equation 16 due to the thermodynamic principle in the system (Craig et al., 2021).Supplementary Figure S1 shows the equilibrium numbers of A, B, and C.

Water ionization
NNS can perform water ionization.Water, H 2 O, usually dissociates slightly into proton ions (H + ) and hydroxyl ions (OH − ), representing a    dynamical equilibrium between dissociation and reassociation.Figures 3A, B are input files in this scenario.Suitable rate constants show this equilibrium in Figures 3C, D. Notably, NNS accounts for system size through normalization parameters.Although the input files present different initial water molecule counts, their rate constants remain consistent.These normalization parameters adjust the system size by changing the success probability, p, in Equation 5.If a pseudo pH is defined as Equation 18: The results from Equation 18 using Figures 3B, D simulating data have almost the same values, 6.999993 and 6.9999996, respectively.The exact rate constants for the dissociation of water and recombination of proton ions (H + ) and hydroxyl ions (OH − ) remain unknown.Therefore, the horizontal axis in Figures 3C, D is labeled in Steps, the unit of calculation.If the exact values of these rate constants are known, the appropriate unit of time should be determined accordingly.

Michaelis-Menten model
Michaelis-Menten kinetics (Gilbert et al., 2006) is applied using NNS.In Figure 4A, E is an enzyme, S is the substrate, ES is the enzyme-substrate complex, and P is the product (Craig et al., 2021).The corresponding scheme is as Equation 19: NNS defined three reactions in the input file.The total E number (E_n + ES_n) is constant, though E and ES fluctuate in time series (Figure 4B).Furthermore, S decreases with increasing P (Figure 4C).

Monod-Wyman-Changeux allosteric model
NNS can manage allosteric models (Goodey and Benkovic, 2008;Machado et al., 2015;Wodak et al., 2019).The allosteric transition of hemoglobin is one of the most interesting behaviors because of molecular adaptation in vertebrates (Eaton, 2022).Supplementary Figure S2A features an input file for the binding of oxygen to hemoglobin at different rate constants, 0.1, 0.5, 10, and 50.However, Supplementary Figure S2B uses a constant rate of 0.1.Figure 5A illustrates the allosteric effect; its increasing rate of oxyhemoglobin (Hb(O 2 ) 4 ) binding was higher than that observed in the no allosteric effect model of Figure 5B initially (Supplementary Figure S3A).The ratios of Hb(O 2 ) 4 to other complexes are also consistently higher than those observed in the non-allosteric model (Supplementary Figure S3B).

Feedback loop model
NNS supports the models of feedback loops in biological systems, which are common regulatory mechanisms (Ferrazzi  Feedback system model for one-gene DNA (Alon, 2019).Stimulus: protein molecules bind DNA; DNA: one gene; Ribonucleotide: some ribonucleotides; mRNA: messenger-RNA; Amino: some amino acids; Ribosome: ribosome for translation; and Protein: a protein made by the ribosome.DNA is transcribed into mRNA, which is then translated into Protein.The protein binds to DNA to deactivate and is modified into DNA_d (deactivate).(Reproduced from Alon, 2019, with permission from Chapman and Hall/CRC).et al., 2011;Alon, 2019).Figure 6 depicts a feedback loop where a stimulus activates DNA, but the resultant protein then deactivates DNA into DNA_d (deactivated DNA).Supplementary Figure S4's input file models ribonucleotide (Ribonucleotide) and amino acid (Amino) as singular types.The advantage of NNS is its ability to represent individual DNA molecules and deliver their specific properties.In Figure 7 the DNA is inactivated by the protein, mRNA production is stopped, and protein production is suppressed.
Figure 7 and Supplementary Figure S5 exhibit contrasting results with and without feedback, respectively, highlighting the protein's role in limiting DNA activation in feedback systems.

Feed-forward loop in a biological system
The feed-forward loop is a prevalent biological system (Kremling et al., 2008;Macía et al., 2009;Le and Kwon, 2011).Figure 8 depicts a schematic reaction process (Alon, 2019).Stimuli Sx and Sy ultimately promote protein pZ production.The system encompasses 10 elements, two input elements and six reactions, whose detailed functions are shown in the input file of Supplementary Figure S6.Using *elementInOut (Supplementary Table S3) and shifting the inflow timing of Sx and Sy as shown in Supplementary Figure S6 affects the behavior of the protein, as shown in Figure 9: Sy is inputted later than Sx, the production of Pz is delayed.On the other hand, in Supplementary Figures S7, S8, Sx and Sy are inputted simultaneously, there is no significant delay in Pz.  obtained from Equation 21 using Figure 7 and Supplementary Figure S4.Information on reaction and RE offers fundamental tools for dissecting reaction properties and understanding the complexity of biochemical systems.

FIGURE 1
FIGURE 1 Calculation flow and algorithm of NNS.(A) Workflow executed by Python programs.The upper left shows an input file (Supplementary Figure S2A for an example), and the lower left depicts the resulting graph (Figure 5 for an example).(B) Detailed NNS calculation algorithm of the calculation in the Calculation steps of NNS: (A).
FIGURE 2 (A) Input text and (B) calculation result of (A).

FIGURE 3
FIGURE 3 Ionization of water.H 2 O means water, H 2 O; H+ hydrogen ion H + ; and OH-hydroxide ion, OH − .(A) Input text with N = 2e19 and (B) with N = 2e24.(C)The calculation result of (A,D) that of (B).

FIGURE 7
FIGURE 7 Results of the feedback system of the input file are in Supplementary Figure S4.(A) RNA and Amino.(B) DNA and DNA_d.(C) mRNA and Protein.(D) Stimulus.

FIGURE 8 Feed
FIGURE 8 Feed-forward system (Alon, 2019).Sx and Sy: Stimuli; pX, pY, and pZ: proteins; pX* and pY*: activated proteins of pX and pY; DNA-Y and DNA-Z: DNA for each protein pY and pZ.(Reproduced from Alon, 2019, with permission from Chapman and Hall/CRC).

Table 1 )
. Currently, model systems are defined in text files.Currently, a Python program is available on GitHub that converts xml files that conform to Systems Biology Markup Language into text files for NNS.Users can perform calculations via NNS by setting the appropriate initial number of elements and rate constants.The *Reaction statement has numerous definitions (Supplementary Tables

TABLE 1
Standard input file style.The *Time, *Element, *Reaction, and *Plot are commands for calculation in an input file.
Figure 2A is an input file (details in Table