Signal execution modes emerge in biochemical reaction networks calibrated to experimental data

Summary Mathematical models of biomolecular networks are commonly used to study cellular processes; however, their usefulness to explain and predict dynamic behaviors is often questioned due to the unclear relationship between parameter uncertainty and network dynamics. In this work, we introduce PyDyNo (Python dynamic analysis of biochemical networks), a non-equilibrium reaction-flux based analysis to identify dominant reaction paths within a biochemical reaction network calibrated to experimental data. We first show, in a simplified apoptosis execution model, that despite the thousands of parameter vectors with equally good fits to experimental data, our framework identifies the dynamic differences between these parameter sets and outputs three dominant execution modes, which exhibit varying sensitivity to perturbations. We then apply our methodology to JAK2/STAT5 network in colony-forming unit-erythroid (CFU-E) cells and provide previously unrecognized mechanistic explanation for the survival responses of CFU-E cell population that would have been impossible to deduce with traditional protein-concentration based analyses.


Reducing execution mode uncertainty through parameter measurements
Given that the aEARM calibrated parameter vectors yield three execution modes (Figure 2C-D in the main text) with their respective probabilities, there is uncertainty about which execution mode is most representative of the cellular process.We then asked whether we could identify parameters that, if measured experimentally, would reduce the execution mode uncertainty.We hypothesize that identifying key parameters that inform execution mechanisms could guide experiments to improve our knowledge about network-driven signal processing.To measure the uncertainty of the execution modes, we used Shannon's entropy 1 formulated as  = − ∑ ( ! ) log " ( ! ) # !$% .As aEARM has three execution modes, the maximum entropy in the system is log " 3 = 1.58, which would signify that each execution mode has a 33% probability.Using the probabilities of the previously obtained execution modes (Figure 2C in the main text) and Shannon's formula, we calculated an entropy of 1.54 indicating a high uncertainty in the execution across all modes.To determine the most informative parameters that should be measured to reduce execution mode uncertainty, we used XGBOOST 2 , a gradient-boosted Machine Learning technique that can classify parameter vectors into their corresponding execution modes.We used the calibrated parameter sets as training data where each kinetic parameter (kf, kr, kc, i.e., forward, reverse, and catalysis reaction constants, respectively) is a feature, and the mode of execution is our target variable.
Feature importance analysis from the XGBOOST analysis shows that the model parameters kf6 and kf7 contribute the most to training loss reduction during the classification task (Figure S6A).As illustrated in Figure S6C, parameters kf6 and kf7 correspond to the binding rate of MOMP* to inactivated MOMP, and MOMP* to eC, respectively.These two parameters are part of the reactions where the signal flux is bifurcated in the network, indicating that their values play an important role in the definition of the execution modes.To show the differences in parameter values for each execution mode we plotted the values of the kf6 and kf7 parameters.As shown in Figure S6B the execution modes have different distributions of the kf6 and kf7 parameters with some overlap.To show how measuring the kinetic parameters, the binding rate of MOMP* to eC, reduces the uncertainty in the mode of signal execution, we simulated 100 measurements of the kf7 parameter, and calculated the entropy with the updated execution modes probabilities.As shown in Figure S6D, measurements in the range of -8.1 and -7.15,where there is no overlap between the parameter distributions, reduce the entropy to less than 0.05 which indicates that we know the execution mode with certainty.For measurements larger than -7.15 but lower than -5.37 the overlap between distributions starts to increase and with it the uncertainty increases as well, as indicated by the entropy.Then, for measurements larger than -5.37 the entropy starts to decrease.This indicates that in general measuring kinetic parameters reduces the entropy and thus the uncertainty in the execution mode with the exception where the overlap between distribution is maximum and increases the entropy.

Modes of signal execution in a detailed apoptosis model with increased parameter uncertainty
Based on our results with aEARM, we then asked how a larger model with higher parameter uncertainty would fare under the presented signal execution analysis.We shifted to a larger extrinsic apoptosis reaction model (EARMv2.0),which has been studied and characterized in previous work 3 .As illustrated in Figure S7A, EARMv2.0 is considerably larger than aEARM as the biochemical interactions are described with higher molecular resolution.In all, EARMv2.0 has 77 molecular species and 105 kinetic parameters.As described in Methods, we used PyDREAM to calibrate the model to published experimental data 4 .Although the calibration yielded parameter vectors that fit the experimental data indistinguishably well (Figure S8), we note that only 62 model parameters converged according to the Gelman-Rubin diagnostic (GR < 1.2) after two million iterations (see Table S5 and Figure S9).Distributions of 9 converged parameters are shown in Figure S7B.The remaining parameters exhibited GR values between 1.21 and 13.52 (Table S4 and Figure S10a, b, and c).From a Bayesian perspective, non-convergent parameters imply that the experimental data simply cannot constrain their values to a distribution and thus results in higher variability.As our analysis is focused on understanding execution modes in network-driven processes, a model with poorly identified parameters presents an opportunity to explore how signal execution could be best interpreted and understood in large model systems with high parameter uncertainty.
We followed the same procedure used in the main text to explore the execution modes in EARMv2.0 (See Methods for details).Our analysis shows that calibrating EARMv2.0 to experimental data constrains the signal flow to eleven execution modes that can be represented by 53 dominant subnetworks out of 2067 possible subnetworks.As shown in Figure S7C, the apoptosis execution signal could flow through any of these paths with varying degrees of probability, with Execution Mode 1 (EM1) exhibiting a probability of ~20% and the first four modes capturing ~50% of the signal probability, thus suggesting high path entropy as we have seen in previous work 5 .Supplemental videos show animations of signal flow for all execution modes in the context of EARMv2.0.
If one needs to compare EMs observed for aEARM and EARMv2.0,we see that there exists an overlap between the underlying highly probable dominant pathways although they visually seem different.The main difference in the EMs is that complex interactions in EARMv2.0 are coarse-grained into a single node in aEARM and thus the EMs appear simplified.For instance, the MOMP node in aEARM represents the mitochondrial outer membrane pore formation which is explicitly implemented in EARMv2.0 by the Bid to Smac pathways.Thus, the EMs obtained from aEARM and the EMs obtained from EARMv2.0 are comparable and comprise functional commonality.While the EARM models represent the same mechanism, their resolution and, hence, the underlying system of differential equations are different.Therefore, the functional commonality in the observed EMs may imply that the EMs are intrinsic features of the underlying system and independent of the specific way in which the model is implemented.Additionally, having more uncertain parameters, as well as a larger network size, results in more execution modes in EARMv2.0.Therefore, we believe the number of EMs would be related to the network size rather than the amount of data because as the size of the network increases, the number of possible pathways playing a role in the production (or consumption) of the target node might increase as well, which may result in the identification of more EMs.
Next, we tested whether each execution mode exhibits different responses to the same perturbation.We selected EM1 and EM2 for analysis as these modes exhibit the highest probability for signal execution.As illustrated in Figure S7D, the mBid interaction with Mcl1 is dominant in EM1.In contrast, the mBid interactions with both Mcl1 and Bcl2 are dominant in EM2, thus highlighting the importance of both antiapoptotic proteins to understand the signaling mechanisms during apoptosis execution when the cell is exposed to an apoptotic inducer.
We then performed two in silico experiments for EM1 and EM2: For the Bcl2 KD, we found that the median ToD in EM1 has a minor change from 10022.46 s to 10011.87 s, as expected because mBid activity is not significantly affected by Bcl2 in this mode.By contrast, in EM2, mBid activity is modulated by Mcl1 and Bcl2.Thus, a reduction in the initial protein levels of Bcl2 enables more mBid proteins to activate pro-apoptotic proteins and this leads to an increase in ToD to 9580 s (Figure S11A-upper panel).Taken together this data shows that distinct execution modes respond differently to the same perturbation and that their responses can be predicted based on the dominant reactions for a given execution mode.
To further emphasize the importance of transient dynamics on signal processing, we explored EM1 dynamic fingerprints and found that SMAC inhibition of XIAP occurs at later time points of the simulations (>8640 s).Therefore, we hypothesized that XIAP inhibition would be more effective earlier during signal execution.To test this, we added an XIAP inhibitor to EARMV2.0 at either 4000 s or 8000 s.As shown, when the inhibitor is added at the later time point, we observed a small reduction (Figure S11A lower panel) in the median ToD from 9943.65 s in the WT to 9380.44 s (Δt = 563.21s).In contrast, when the inhibitor is added at the earlier time point, when SMAC is not yet released from the mitochondria, the inhibitor binds to XIAP enabling C3 to cleave PARP and thereby reducing the median ToD to 6766.10 (Δt = 3177.55s).
As combination therapies have become important to combat drug resistance 6,7 , especially in cancers, we explored whether our analysis provided information about potential targets for cotreatment.As we previously mentioned, Mcl1 and XIAP are dominant antiapoptotic proteins in EM1, thus we hypothesized that inhibition of both proteins would yield a shorter ToD compared to only inhibiting XIAP.To test this, we added two drugs that independently inhibit XIAP and Mcl1 and obtained a ToD of 5951.11s representing a 12% reduction in the ToD compared to XIAP inhibition only (Figure S11A lower panel).Finally, to guide experiments that would identify the most likely execution mode out of the 11 execution modes obtained, we developed an XGBOOST model of execution mode estimation and performed feature importance analysis.As shown in Figure S11B, we found that the parameters controlling the kinetics of mBid binding to BcxL, and XIAP binding to C3 yield the most information about execution modes in EARMv2.0.Taken together, these results suggest that the analysis of signaling dynamics from uncertain parameters helps us identify dominant reactions that control signal flow in a network during signal processing and how these networks are more sensitive to the perturbations of those reactions.

Videos:
Video S1-S11: Dynamic visualization of apoptosis execution for the 11 execution modes found in EARMv2.0 obtained using PyViPR 9 , related to Supplemental Section 2. Nodes represent molecular species, and pie charts inside the nodes depict relative species concentration over time.Edges represent the interaction between species, and shades of red show the intensity of the reactions.
(i) a 50% knockdown (KD) of the antiapoptotic protein Mcl1 as well as (ii) a 50% knockdown of the antiapoptotic protein Bcl2.For the Mcl1 KD, we found that the EM1 median Time of Death (ToD) decreased from 10022.46 s (WT) to 8686.52 s (Figure S11A-upper panel).This is expected since mBid and Mcl1 interactions are dominant in this mode.By contrast the median ToD in EM2 decreased from 9943.65 s (WT) to 9335.85 s.This modest decrease in ToD can be attributed to the fact that although Mcl1 and mBid interactions are important in EM2, the dominance of Bcl2 compensates for the absence of Mcl1 and reduces the impact on ToD for the Mcl1 KD.

Figure S2 .
Figure S2.Negative log-posterior values of the posterior distribution of aEARM sampled by 3 chains using PyDREAM after a burn-in of 200,000 iterations, related to Figure 1C.

Figure S3 .
Figure S3.Probability of each of the unique parameter vectors sampled after burn-in in the PyDREAM calibration, related to Figure1C.To obtain the probability of each parameter set, the number of visits to a specific parameter vector was normalized by the total number of visits.

Figure S4 .
Figure S4.Algorithm to obtain dynamic fingerprints from a simulation, related to Figure 2 and STAR Methods.Briefly, the algorithm consists in building a bipartite graph from the model, and then simulating the model with a specific parameter vector.Next, the algorithm obtains the dominant reactions from a network and builds a subnetwork.This procedure is repeated for each simulated time point.

Figure S5 .
Figure S5.Dominant pathways over time versus the pSTAT5 abundance with respect to increasing Epo doses, related to Figure5.A, B, C, and D) Dominant pathways versus pSTAT5 trajectories when Epo = 0 U/ml, 0.032 U/ml, 0.8 U/ml, and 20 U/ml, respectively.E) Dominant pathways over time framed with the corresponding colors in A-D.

Figure S6 .
Figure S6.Parameter measurements reduce aEARM execution mode uncertainty, related to Supplemental Section 1. A) List of the 10 parameters that contribute the most to model prediction.Parameters with higher total gains, compared to another parameter, provide larger improvements to accuracy in model prediction.B) Parameter values of kf6 and kf7 grouped by the execution mode they belong to.A Gaussian kernel was used to estimate the density probability of parameter values in each execution mode.C) Schematic representation of the aEARM network.Kinetic parameters kf6 and kf7 and their corresponding reactions are highlighted in the network.D)Changes in the execution modes entropy after simulated measurements of kf7.

Figure S7 .
Figure S7.Modes of signal execution in the full Extrinsic Apoptosis Reaction Network, related to Supplemental Section 2. A) Network of the interactions between the proteins in the apoptosispathway.Proteins highlighted in green are nodes where the signal flux can be divided.The convention of Kitano 8 was followed.B) Marginal probability distributions of 9 individual kinetic parameters converged by the Gelman-Rubin diagnostic.C) Dynamic fingerprints organized by the execution modes they belong to.Each cluster plot is composed of horizontal lines that correspond to dynamic fingerprints, i.e. sequences of dominant subnetworks, and each subnetwork is assigned a different color.Execution modes are sorted from highest to lowest probability.D) Signal execution in Mode 1 (left) and Mode 2 (right) as defined by the most common subnetwork in each mode at t=7000s.

Figure S8 .
Figure S8.Fitness of EARMv2.0 to experimental data, related to Supplemental Section 2. Simulated trajectories of truncated Bid and Cleaved PARP calibrated to reproduce the experimental data.Red dots and bars indicate the mean and standard deviation of the experimental data and blue lines correspond to the simulated trajectories.

Figure S9 .
Figure S9.Negative log-posterior values of the posterior distribution of EARMv2.0 sampled by 4 chains using PyDREAM after a burn-in of 1 million iterations, related to Supplemental Section 2.

Figure S10 .
Figure S10.Marginal probability distributions of EARMv2.0 individual kinetic parameters, related to Supplemental Section 2. Distributions were recovered from the PyDREAM run by integrating all other dimensions.

Figure S11 .
Figure S11.In silico experiments to probe the different execution modes in EARMv2.0,related to Supplemental Section 2. A) Upper panel: Time to death distributions in the execution modes 1 and 2 in the "wild type" condition, after a 50% Mcl1 knockdown, and after a 50% knockdown of Bcl2.The boxplot inside the distributions shows the median, first quartile, and third quartile of the datasets.Execution modes 1 and 2 show substantial differences in their response to the knockdowns.Lower panel: Time to death distributions in the execution mode 1 after adding a drug that binds XIAP at t=8000 s, and at t=4000, and a drug that binds XIAP and Mcl1 at 4000 s.B) List of the 10 parameters in EARMv2.0 that contribute the most to the model prediction.

Table S2 . ODEs of aEARM, related to Figure 1A.
Only the equations for the individual species are provided, and the equations for the intermediate species are omitted.

Table S3 . Gelman-Rubin (GR) diagnostic values for all calibrated parameters in the aEARM model, related to Figure 1C.
Parameters with GR values lower than 1.2 are considered to be converged.

Table S5 . Gelman-Rubin (GR) diagnostic values for all calibrated parameters in the EARMv2.0 model, related to Supplemental Section 2.
Parameters with GR values lower than 1.2 are considered to be converged