Introduction

Progress in identifying mechanisms of human disease and new therapeutic approaches has been steady over the last 20 years, but limited in pace by several factors that govern discovery and innovation. These factors include choice of time-consuming experimental approaches based on phenotypes exhibited by gene knock-out animals, incremental exploration of steps in known signaling pathways without quantitative consideration of connectivity to important cellular outcomes, and single-target approaches. Quantitative systems analysis of cell signaling networks gives new perspective into mechanistic events that will lead to the development of novel experimental strategies and therapeutic concepts.1,2,7,14,16,24,26 This quantitative modeling approach promises a more thorough exploration of possible disease mechanisms. Individual groups have shown this with the construction of small (5–20 species) system-level models for selected protein signaling pathways, illustrating that this approach is critical for: (1) quantitatively determining the relative significance of two signaling events with regard to an important cell behavior16,26; (2) understanding the role of feedback loops in a signaling system3,16; and (3) testing current drugs and identifying new targets based on system-level properties.26 Even though these models are based on a deep body of experimental research and developed with well-defined methodology, this approach has not yet been broadly adopted and used by experimentalists. Despite the unique perspective gained from a systems approach, models are viewed as difficult to construct due to unavailability of quantitative parameter values10 and uncertainty regarding appropriate model boundaries.

Here, we illustrate how even small, simple computational models rapidly integrate the best available experimental data, providing new mechanistic insight and corresponding new experimental strategies and therapeutic concepts. We constructed eight small (4–25 species) ordinary differential equation (ODE)-based models (we term ‘modules’) (Fig. 1) relevant to addiction, Alzheimer’s disease (AD), cancer, diabetes (DM), heart disease, human immunodeficiency virus (HIV), malaria, and tuberculosis (TB) (Table 1). Together these diseases account for four of the six leading causes of disease-related death for adults in the United States (58% of all deaths per year (2.5 million people) CDC data, 2006), and four of the top ten leading causes of death for adults in low-income countries (26.5% of deaths per year (8 million people) WHO data, 2002). Because our models were small (constrained to 4–25 species), we were able to construct them over a short, 6-month time period. We built these small models under the assumption that a system-level analysis would guarantee new insight into the system, regardless of the amount of current information available and irrespective of the specific boundaries chosen for a given module. Each one was designed based on important questions in the current literature, validated where possible against published experimental data, and subjected to a sensitivity analysis to identify key parameters. Results were used to suggest new directions for future experimental research. Creation of eight models at once also allowed for comparative analysis of system-level properties of all eight signaling pathways. Altogether our results suggest new experimental strategies and therapeutic concepts relevant to these diseases, provide a basis for comparative study of system-level properties of cell signaling networks, and promote future broad adoption and widespread use of small-module computational analysis for writing the next generation of experimental roadmaps for known diseases and other important biological processes.

Figure 1
figure 1

Module schematics. (a) Addiction: Drug-induced dopaminergic PKA input stimulates ΔfosB accumulation in the brain to increase addictive behavior. (b) Alzheimer’s disease: Inherited alterations in PS1 and combine with elevated GSK3 activation to increase production of plaque components. (c) Cancer: Deregulation of Akt elevates formation of the MTORC1 complex to increase G1-phase cell cycle progression. (d) Diabetes: Altered availability of metabolic ligands alters the balance of transcriptional complexes involved in hepatic lipid oxidation and lipid synthesis. (e) Heart disease: Angiotensin both stimulates and inhibits fibrotic cardiac remodeling via AT1R and AT2R receptors, respectively. (f) HIV: HIV produces Vif to promote degradation of the innate immune response protein APOBEC3G and increase the release of infectious virus. (g) Malaria: Glycophosphatidylinositols (GPIs) from P. falciparum initiate an immune response and activation of NFκB. (h) Tuberculosis: M. tuberculosis produces ManLam and SapM, two virulence factors that interfere with host endosomal phagocytosis. (green dotted arrow) indicates enzymatic activation, (red dotted arrow) indicates enzymatic inhibition, (brown arrow) indicates degradation (null sign) or production, pink circles represent species altered in disease state, blue circles represent quantified output

Table 1 Representative findings and associated new experimental strategies and therapeutic concepts

Methods

Module Design and Implementation

We constructed each module with a workflow that is readily implemented for any disease and its associated signaling network. Details regarding the following design and implementation of specific modules can be found in Supporting Information. We use a detailed description of the diabetes module here as an illustrative example. First, we identified a major signaling pathway relevant to each of the eight major diseases and then designed a small (4–25 species) bounded module to generate insight into questions raised by the current experimental literature. Each module was constructed to quantify an important output as a function of upstream interactions known to be altered in the respective disease state (Fig. 1). For example, we designed our diabetes module around peroxisome proliferator-activated receptor alpha (PPARα) and liver X receptor alpha (LXRα), two nuclear receptors that act as nutrient sensors to activate genes involved in hepatic lipid oxidation and synthesis, respectively (Fig. 1d). The PPARα receptor is also the target of PPARα agonists currently used clinically to treat dyslipidemia and stimulate the lipid oxidation pathway. Our goal was to quantify transcriptional complex formation output as a function of known upstream protein binding interactions and metabolic ligands (free fatty acids (FFA), polyunsaturated fatty acids (PUFA), oxysterols, and glucose) that are altered in the diabetic state. Figure 1 depicts species altered in the disease state and quantified output for each module.

Overall our goal in the design of each module was to give insight into questions raised by the current experimental literature. In the diabetes module example: What is the main mechanism by which one pathway is able to suppress the other, counteracting pathway? Is it (1) PPARα and LXRα competition for the retinoid X receptor (RXR) partner required for formation of both types of transcriptional complexes9,30; or (2) the formation of the PPARα:LXRα complex?29,31 What is the role of the recently identified LXRα ligand glucose,20 which has an LXRα binding affinity that is five orders of magnitude weaker than any other known ligand? What is the most effective way to improve PPARα agonist drug design and stimulate transcriptional complex formation in the PPARα lipid oxidation pathway? Our module quantitatively integrates species concentration and binding affinity information to reveal information about these events that was not accessible with isolated experimental work.

Even though our model did not include all possible known interactions of PPARα and LXRα in the hepatocyte nucleus, this seven species module gave us new mechanistic ideas about a very important region of the signaling pathway and its quantitative role in an output known to be significant in diabetes. The small size and bounded nature of each of our modules empowered a better understanding of its limitations, and allowed for learning despite an incomplete set of quantitative parameter values.

Species interactions within each module were described using quantitative relationships, expressed as ordinary differential equations (ODEs). The system of ODEs for each module was defined using initial conditions, protein concentrations, binding affinities, and rate constants mined from published experimental work and solved using MatLab (Mathworks, Natwick, MA). The same foundation MatLab code was used to solve the system of ODEs and perform the sensitivity analysis (described below) for each of the modules. Unknown parameters were estimated as best as possible using a variety of methods including use of parameters for enzymes/substrates similar to those in the module, fitting of independent experimental data, and/or appropriate order-of-magnitude parameter estimates. For the diabetes module example, quantitative binding affinity values for most of the interactions were available in the literature, but time-dependent data (binding on and off rates) were not. Therefore, analysis was limited to steady-state conditions. Additionally, quantitative nucleoplasmic concentration values for RXR and LXRα were unavailable, so they were assumed to be equal to the nucleoplasmic concentration of PPARα. These values, as well as unknown parameter values initially estimated for all modules, were re-evaluated using the sensitivity analysis, described below.

Validation

To build confidence in our modules, we validated module output against independent experimental literature where possible. In our diabetes module, we compared predicted ligand:LXR:RXR heterodimer concentration after addition of oxysterol ligand to a dose response curve generated from independent experimental work11 (Fig. 2a). We also compared results to total predicted ligand:LXR:RXR heterodimers after overexpression of both PPARα and LXRα30 (Fig. 2b). In both cases, there was good agreement between the model predictions and published data, as was true for the other modules in the study. More details about the diabetes module validation simulation and other modules can be found in Supporting Information, in Figures S5A–C; S10A, B; S15, S32 A, B; S33A–D; S46A, B; S50A, B.

Figure 2
figure 2

Diabetes module validation. (a) In the diabetes module, total output concentration of ligand:LXRα:RXR heterodimers were calculated after addition different concentrations of 22(R)HC, an endogenous oxysterol (red). Module predictions were qualitatively similar to cell culture experiments in which a luciferase reporter system was used to measured activation of LXR response elements (LXREs) in under the same conditions (black; Janowski et al. 11). Module predictions (c) were also qualitatively similar to experiments that measured LXRE activity after overexpression of PPAR alpha and LXR (b)

Sensitivity Analysis and Module-Specific Simulations

A sensitivity analysis4,26 quantitatively determined the importance of individual parameter values on model output. This analysis was critical in our small-module approach as it allowed for a better understanding of whether or not the set of estimated parameters values was sufficient and also suggested additional informative module-specific simulations. We subjected each module to a sensitivity analysis in which steady-state module output was calculated after variation of all parameters, including individual protein concentrations, binding affinities, and rate constants over six orders of magnitude around the estimated physiological value. Information from the sensitivity analysis of each module was then used together with the known literature to direct other module-specific simulations. For example, in the diabetes module, when we evaluated the sensitivity of the different transcriptional complex outputs to concentration parameters, we found that PUFA:PPARα:RXR, FFA:PPARα:RXR, and oxyst:LXR:RXR heterodimers (Figs. 3a–3c) were exclusively sensitive to their respective ligand concentrations whereas glucose:LXR:RXR heterodimers were sensitive to both LXR and RXR receptor concentrations (Fig. 3d). This sensitivity occurs because other ligands are at a much lower nucleoplasmic concentration (low nM) compared to glucose (μM), though they have a much stronger binding affinity for their receptors than glucose does for LXRα (Table S5). The result directed further simulations (Figure S25) that generated the novel hypothesis that glucose:LXRα:RXR heterodimers are likely much more sensitive than other LXRα and PPARα heterodimers to feedback loops15,25,27,31 that alter receptor concentration and could act synergistically with other ligands to enhance or inhibit transcriptional complex formation (Result 7, Table 1). The fact that glucose:LXRα:RXR transcription complexes are sensitive to LXRα receptor concentration also turned out to be important in the context of PPARα agonist drug design, described below.

Figure 3
figure 3

Diabetes module sensitivity analysis on concentration parameters. FFA:PPARα:RXR (a), PUFA:PPARα:RXR (b), and oxyst:LXRα:RXR heterodimers (c) were all exclusively sensitive to their respective ligand concentrations while glucose:LXRα:RXR heterodimers (d) were most sensitive to LXRα, PPARα, and RXR concentration. This would make glucose:LXRα:RXR heterodimers more responsive than other heterodimers to reported autoregulatory feedback loops that alter the concentration of these receptors. For the full diabetes module sensitivity analysis including all parameters, see Figures S21 and S22

The complete diabetes module sensitivity analysis on ligand:PPARα:RXR heterodimers (Figure S29) led to the hypothesis that changing the nucleoplasmic concentration of PPARα agonist drugs would considerably alter ligand:PPARα:RXR heterodimer formation whereas changing drug binding affinity for PPARα has no effect. This non-intuitive result was explained by the low nucleoplasmic concentration of the drug and most endogenous ligands (nM) compared to a much higher nucleoplasmic concentration of the receptors (~300 μM, Table S5). Additional simulations confirmed the result (Figs. 4a–4d), and showed that this prediction held over a range of RXR (Figs. 4a, 4b) and LXR concentrations (Figs. 4c, 4d), which have not yet been measured in the experimental literature. Simulations also illustrated how PPARα agonism could inadvertently activate the counter-acting LXRα signaling pathway via disruption of PPARα:LXRα complexes, especially in diabetic conditions of high glucose (Fig. 4e). Elevated LXRα signaling in high glucose conditions occurred because, as mentioned above, glucose:LXR:RXR complexes were much more sensitive to LXRα concentration, and were therefore sensitive to an increase in free LXRα after PPARα was removed from PPAR:LXR complexes by PPAR agonism. This new concept (Result 8 in Table 1) was especially exciting in the context of a recent independent report that indicated that diabetic animals are resistant to PPARα agonists.23 As in the diabetes module example, results of the sensitivity analysis of each module directed additional simulations that resulted in potential new experimental and therapeutic directions that were previously unrecognized.

Figure 4
figure 4

Insight into PPARα agonist design. PPARα agonists are used clinically to stimulate the lipid oxidation pathway in patients with dyslipidemia. Our diabetes module predicted that changing agonist binding affinity over four orders of magnitude would not affect PPARα transcriptional complex formation (a, c), whereas changing nucleoplasmic concentration of the drug would have large effects (b, d). These predictions held true over a range of RXR (a, b) and LXR (c, d) concentrations, which have not yet been measured in experiments. Our diabetes module additionally illustrated that PPARα agonism and stimulation of the lipid oxidation pathway could also inadvertently stimulate the lipid synthesis pathway via activation of LXRα transcriptional complexes, especially in diabetic conditions of high glucose (e). This could be one explanation for recent reports that diabetic animals are resistant to PPARα agonist (Satapati et al. 23)

Comparative Meta-Analysis

Simultaneous construction of eight small models allowed for an unprecedented comparative analysis of system-level properties from all eight modules. For each module, we defined parameters as ‘sensitive’ if a 100% change in parameter value (normalized to baseline parameter value) resulted in a 50% change in at least one module output (normalized to baseline module output). We defined parameters as ‘hypersensitive’ if a 50% change in parameter value resulted in a 150% change in at least one module output. We used these definitions to look at correlations between module properties (size, parameter availability, feedback loops, redundant paths). We also investigated whether certain parameter types were more likely to be sensitive than others.

In order to examine the relative sensitivity rank of different parameter types within respective modules, we assigned each parameter a percentile sensitivity rank based on the calculated sensitivity (described above) and total number of parameters in the respective module. We used a Wilcoxon Rank Sum test to determine whether there were significant differences between the average percentile rank of different parameter types within our modules. Specifically we evaluated whether the percentile rank of concentration parameters was significantly higher than the percentile rank of binding parameters.

In order to determine whether trends we observed were also present in existing published models, we compared some of our results to similar models in the published literature.5,13,16,24 We chose these models based on the fact that they were ODE-based, focused on protein signaling, contained both binding affinity parameters and concentration parameters, employed some type of sensitivity analysis to include individual perturbation of all concentration and all binding affinity parameters, and parameter sensitivity ranks in overall top 15% sensitive were clearly discernable. As our modules and findings focused on mammalian signaling, we did not include models of bacterial or yeast signaling in this analysis. For each existing model, we recorded total concentration parameters, total binding affinity parameters, and total of each type that were reported in the top 15% most sensitive for a given model. For our models, literature models, and both combined we computed proportion of concentration parameters in the top 15% most sensitive of each model and proportion of binding affinity parameters in the top 15% most sensitive of each model. We included concentration density parameters (#species/area) as concentration parameters, and we did not include production or degradation terms. We used standard error for proportion to determine whether a significantly higher proportion of concentration parameters appeared in the top 15% of their respective modules than did binding affinity parameters.

Three of our modules (HIV, malaria, TB) focused on host–pathogen interactions in infectious diseases. We used the percentile sensitivity rank of parameters in our modules to determine whether parameters related to pathogen virulence factors (Vif, GPI, ManLAM) exhibited different sensitivities in our modules compared to host parameters.

Overall Results

Module Findings Result in New Experimental Roadmaps for the Study of Elements of Eight Human Diseases

Representative findings and associated new experimental strategies and therapeutic concepts for each module are listed in Table 1, with a complete description of module construction and findings in Supporting Information. Module findings based on system-level properties of cell signaling generated new experimental roadmaps for the study and treatment of each of these diseases. The finding from our diabetes module that PPARα transcriptional complexes were sensitive to nucleoplasmic concentration of PPARα agonists but not to agonist binding affinity for PPARα recommends a shift in focus from current PPARα agonist design goals of controlling agonist binding affinity, to a new design goal of controlling agonist nucleoplasmic concentration (Table 1, Finding 8). The module gave the additional new insight that PPARα agonism and stimulation of the lipid oxidation pathway could also activate the counteracting LXRα lipid synthesis pathway in high glucose conditions, providing a new potential mechanism for PPARα agonist resistance in diabetic patients. Key results from each of the eight modules are summarized in Table 1 (see Supporting Information for all results) and illustrate how this approach generates novel, system-level insight to guide future experimental research and treatment of human disease. Specifically, system-level analysis of these small modules was able to quantitatively evaluate whether an experimentally measured pathway alteration in a disease could be responsible for pathological output (Findings 1, 4), help clarify the role of specific feedback and feed-forward events in disease pathology (Findings 6, 7, 10), and evaluate current therapeutic strategies and uncover new ones based on system-level insight (Findings 3, 5, 8, 9, 12, 15). New therapeutic strategies based on these findings were very specific and highly actionable. Several are highlighted below.

Bolstering Innate Resistance to Retroviruses in HIV: Helping APOBEC3G Overcome HIV–Vif

The HIV module (Fig. 1f) focused on the recently discovered innate mammalian defense to retroviruses via the cytidine deaminase apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3G (APOBEC3G), and its circumvention by the HIV-encoded protein viral infectivity factor (Vif). The module was used to answer two questions: can this innate anti-retroviral immunity be boosted to overcome HIV infection? If so, how best might this be achieved? One major finding of this module was that innate A3G production above 1 μM/h effectively shuts down HIV production, while a proposed antibody to Vif12 is only predicted to be effective at high production rates (due to an excess of Vif available) (Figs. 5a, 5b). This limited efficacy happens because only a small amount of Vif is required to sequester APOBEC3G and thus a high level of Vif blockade is required to reduce infectious HIV release rate, compared to a relatively lower amount of APOBEC3G required to inactivate the virus. The finding directly resulted in the new hypothesis that interventions to increase APOBEC3G may be more effective than targeting Vif. The new therapeutic concept held true regardless of the exact value of baseline APOBEC3G production (pA3G), which has not yet been quantitatively measured (as reported in the experimental literature) and was classified as an unknown parameter in this module. With the finding that increased production of APOBEC3G could reduce the release of infectious virus, the HIV module imparted both an innovative new therapeutic hypothesis as well as an accompanying quantitative hypothesis for its mechanism of action.

Figure 5
figure 5

Representative module results generate new therapeutic concepts. One result from the HIV module suggested that a proposed antibody to Vif would only be effective at reducing release of infectious HIV at very high production rates (a), due to an excess of Vif available. In contrast, increasing production of the innate immune response protein APOBEC3g (A3G) would be effective over a range of production rates (b). The exact quantitative value of baseline A3G production was an unknown parameter in this module, but the result is true over a large range of estimated values. One result from our heart disease module was that AT2R agonists (green) would not be effective in reducing production of MMPs and cardiac fibrosis compared to ACE inhibitors that reduce production of angiotensin (red, used clinically), and AT1R antagonists (blue, currently in clinical trials) (c, d). This non-intuitive finding was due to limited availability of AT2 receptors and saturation of downstream AT2R phosphatases

Angiotensin and Angiotensin Receptors in Cardiac Fibrosis

Our heart module (Fig. 1e) quantitatively explored the ability of angiotensin to both stimulate (via angiotensin 1 (AT1R) receptors) and inhibit (via angiotensin 2 (AT2R) receptors) the production of matrix metalloproteinases involved in cardiac fibrosis (Fig. 1e). Treatments targeted at Ang II signaling are effective in curbing the effects of heart disease28 and currently target either the production of Ang II (ACE inhibitors) or Ang II association with the AT1Rs (AT1R antagonists). Agonism of AT2Rs has also been considered as a potential therapeutic target, but it is unclear if AT2R signaling is anti-fibrotic under physiological conditions. Moreover, there is no quantitative evidence to compare AT2R-targeted drugs with current therapies in use or development. To determine if AT2R agonists may be a meaningful treatment for fibrotic remodeling, we compared the responses of our model to simulated ACE inhibitors (presently used in the clinic), AT1R antagonists (in clinical trials now), and AT2R agonists (proposed in the literature). Our module found that of the three types of therapies simulated, ACE inhibitors had the most robust result, reducing MMP2 and MMP3 expression to nearly basal rates, followed by AT1R antagonists which also reduced MMP expression. Surprisingly, AT2R agonists showed little effect in blocking MMP expression (Figs. 5c, 5d). This counterintuitive result was explained by the observation that the activities of the phosphatases activated by AT2R are largely saturated at basal Ang II levels (Figure S35). However, this saturation may be overcome by increasing the total number of available AT2Rs (Figures S34A, S34B, and S34C). Taken together, these results suggested that while AT2R signaling is anti-fibrotic, AT2R agonists would have limited effect in treating fibrotic remodeling. This directly results in the therapeutic concept that therapies designed to limit Ang II signaling should focus more on either inhibiting Ang II production or increasing AT2R availability than either blocking AT1R activity or stimulating AT2R activity (Table 1, Finding 9).

Novel Mechanistic Insight Relevant to Current Therapeutic Targets

The small-module approach was additionally validated by the predictive capabilities of several modules that were able to independently identify current drug targets as sensitive species. For example, a 2009 Alzheimer’s disease study found glycogen synthase kinase-3 (GSK3) inhibitors to be beneficial to animals with the disease but detrimental to controls.8 Our Alzheimer’s module independently identified GSK3 concentration as a sensitive parameter and strong determinant of phosphorylated tau (p-tau) (Figure S11A), a major component of neuritic plaques. In cancer, interest in protein phosphatase 2A (PP2A) as a drug target has recently arisen.22 Our cancer module independently predicted PP2A concentration as a sensitive parameter in elevated mammalian target of rapamycin (mTOR) (Figure S16), which is believed to involved in pathological cell cycle control observed in cancer.

These modules also generated novel, specific mechanistic insight into these previously identified drug targets. For GSK3 in Alzheimer’s disease, our module revealed that GSK3 concentration, as opposed to kinetic activation by Akt, has the most dramatic effect on p-tau levels (Figure S11A, C). This result suggests that drugs aimed at controlling GSK3 concentration may be more useful in reducing p-tau levels than the above mentioned GSK3 inhibitors which block GSK3 kinetic activity. Our Alzheimer’s module also predicted that no GSK3-associated parameters had any effect on amyloid beta 42/amyloid beta 40 ratio (another major component of neuritic plaques) (Figure S11A, top panel), suggesting that GSK3-related therapy alone may not be sufficient to reduce plaque formation. For PP2A as a cancer target, the additional insight provided by the model was that mTOR negative feedback could compensate for PP2A dysregulation (Figure S17). Both of these results illustrate how systems analysis of signaling pathways facilitates improved mechanistic understanding of current therapies and their targets.

Comparative Meta-Analysis of System-Level Properties of All Eight Signaling Modules

The comparative analysis in our study demonstrated that the ability to gain insight into therapeutic targets based on the sensitivity of module output to individual species is preserved across models of different sizes and in varying states of parameter availability. Pearson correlation and Spearman correlation tests indicated no correlation between the module size and the overall percentage of sensitive parameters (Fig. 6c, R 2 = 0.025, p > 0.05 for both tests), and no correlation between percentage of parameters available in the literature and overall percentage of sensitive parameters (Fig. 6d, R 2 = 0.009, p > 0.05 for both tests). Interestingly, however, certain network properties were a stronger determinant of module sensitivity to parameter perturbation. For example, modules lacking both redundant pathways and negative feedback loops had a much higher percentage of sensitive parameters (69 and 72% for HIV and heart modules, respectively), than all other modules (<30%, Fig. 6e). The comparative analysis confirms that for small-module systems analysis, size and parameter availability do not constrain discovery based on sensitive species.

Figure 6
figure 6

Comparative meta-analysis across all eight modules. Modules varied in size and parameter availability (a). Module output was much more sensitive to species production/degradation parameters than to parameters for protein binding interactions (b). Percentage of sensitive parameters for each module was not correlated to module size (c, Pearson correlation and Spearman correlation p > 0.05 in both) or to parameter availability in the literature (d, Pearson correlation and Spearman correlation, p > 0.05 in both) but modules with no redundant paths or negative feedback loops (HIV and heart disease) had twice the percentage of sensitive parameters (69 and 72%, respectively) than all other modules (<28%), % sensitive parameters is proportional to circle diameter for each module (e)

In the interest of uncovering molecular targets of high therapeutic potential, we categorized sensitive and hypersensitive parameters by type and found that module outputs were least sensitive to protein binding interaction parameters (7% of all binding parameters were sensitive, 0% were hypersensitive, Fig. 6b) and often hypersensitive to species concentration and production/degradation parameters (12% were sensitive, 28% hypersensitive, Fig. 6b). To examine the finding in more detail, we ranked each parameter within its module and computed a percentile ranking for each based on sensitivity and the total number of parameters in the respective module (Fig. 7a). Since all of our modules contained concentration and binding affinity parameters (only some contained production/degradation parameters), we chose to focus on these two parameter types for further analysis. We found that percentile rank of concentration parameters was significantly higher than the percentile rank of binding affinity parameters (Fig. 7a, Wilcoxon Rank Sum, two-tailed test, p < 0.05). Intriguingly, this analysis also revealed that a large number of concentration parameters appeared in the top 15% most sensitive parameters of each module, while relatively few binding affinity parameters were in this category (Fig. 7a, red box).

Figure 7
figure 7

Higher percentile sensitivity rank for concentration parameters compared to binding affinity parameters. The average percentile sensitivity rank of concentration parameters in our module was significantly higher than percentile sensitivity rank of binding affinity parameters (a, Wilcoxon Rank Sum test, p < 0.05). A large number of concentration parameters in our modules were in the top 15th percentile sensitivity in their respective modules, while only a small number of binding affinity parameters in (a) (red box). Other ODE-based protein signaling models5,13,16,24 in the published literature also had a significantly higher proportion of concentration parameters in the top 15% sensitivity percentile rank compared to binding parameters (b). *Significantly higher proportion compared to binding affinity proportion, 99.7% confidence level using standard error for proportion

When we examined the published literature to determine whether this phenomena occurred in other small models of protein signaling, we found that in our models, in four other similar published models,5,13,16,24 and in the combined data, there was a significantly higher proportion of concentration parameters than binding parameters in the top 15% most sensitive for each model (Fig. 7b, * indicates significance with 99.5% confidence using standard error for proportion).

Altogether, this indicates that detailed, quantitative experimental measurements of protein concentrations may be more useful in determining targets of high therapeutic potential than measurements of protein binding affinities. This meta-analysis across our eight networks and four published models also suggests that there may be selective pressure for fine-tuned control and more regulatory steps to govern sensitive cell processes such as concentration, while there is less pressure and fewer regulatory steps for control over more robust processes, such as protein binding interactions.

When we evaluated host–pathogen interactions in our infectious disease modules (HIV, malaria, and TB) we found that concentration parameters associated with pathogen virulence factors had a significantly higher percentile sensitivity rank that all other concentration parameters in our modules (Fig. 8, Student’s t test, p < 0.01). We found no noticeable differences in sensitivity rank associated with pathogen kinetic parameters compared to other kinetic parameters. This has interesting implications in quantitatively understanding host–pathogen interactions in infectious disease, discussed below.

Figure 8
figure 8

Host networks in infectious disease modules are highly sensitive to pathogen virulence factor concentration/production. Pathogen virulence factor concentration/production parameters (Vif, GPI, ManLAM) in our infectious disease modules (HIV, malaria, TB) had a significantly higher percentile sensitivity rank than all other concentration parameters in our modules (*Significance compared to concentration parameters with Student’s t test, p > 0.01)

Discussion

Systems analysis of protein signaling pathways has not yet been broadly adopted and used to enhance most experimental research. While many individual studies have illustrated the value of small-module systems analysis,1,2,7,14,16,24,26 we show here that it can be performed rapidly to generate new ideas to guide experimental research. Our unprecedented comparative meta-analysis across eight real biological signaling networks serves as a basis for future work toward more comprehensive understanding and control over system-level properties of biological signaling networks.

Small, simple models can be highly informative. All eight of our models were limited to 4–25 species and based exclusively on ordinary differential equations, yet they were still able to reveal new mechanistic insight relevant to major human diseases. The small size of each module was advantageous in that it allowed for a better understanding of module limitations and new insight, even though an average of 44% of our module parameters were unknown. Validation of module output against independent experimental data and a sensitivity analysis were useful in determining whether estimations for unknown parameter values were appropriate. The small size also enabled rapid analysis and generation of new ideas from existing experimental research. We believe that this approach fills a significant unmet need between broad genome-wide analytical work on properties of very large networks and highly detailed work that focuses on individual signaling pathways.

We argue that small models, constrained to a limited number of species, will generate important new insight even when some quantitative parameter values are unknown. Our modules employ a variety of methods for estimating parameters, including fitting of experimental data, use of parameters for similar enzymes/substrates to those in the module, appropriate order-of magnitude estimates, and even Monte Carlo sampling. These approaches, combined with a sensitivity analysis, enable discovery despite unknown parameters. In many cases, we found that information could be gained in recognizing system-level interactions that were not as dependent on exact parameter values. In the HIV module, the prediction that additional production of A3G would be more effective than an antibody to Vif held true regardless of the precise value of A3G production, which has not yet been experimentally measured (Figs. 5a, 5b). We also illustrate that the diabetes modules predictions regarding PPARα agonist drug design are not dependent on the precise values of nucleoplasmic RXR (Figs. 4a, 4b) or LXR (Figs. 4c, 4d) concentrations, which are currently unknown. In the most extreme case that all parameters needed for a small model are unknown, a sensitivity analysis still provides new, significant information about the network by highlighting key parameters based on network topology alone.

Focused quantitative systems analysis of small modules from protein signaling pathways not only accelerates new experimental discovery, but also provides concomitant new mechanistic explanations for these discoveries. This approach complements time-consuming experiments that correlate related signaling events. With an understanding of sensitive nodes in a pathway gained from a sensitivity analysis, models can identify which experiments will be most insightful and prioritize experiments that will be most likely to succeed. In addition to addressing specific questions relevant to the current literature, we found that all of our modules produced new ideas and insight that were not obvious during the module construction process. This new perspective will drive more innovative experimental research.

Small models of protein signaling pathways can enhance all areas of drug development from identification of novel targets, to understanding the mechanistic basis of current therapies, and validation and testing of targets in clinical trials. As more and more disease-specific genetic data is collected from human populations, small-module analysis will allow for rapid identification and a quantitative understanding of mechanisms by which genetic alterations might interact with epigenetic factors to alter disease-related signaling pathways. Genetic information from individual patients could also be used to create personalized modules and move towards more personalized therapy.

Increased use of system-level models to study protein signaling will allow for discovery of unifying quantitative principles that govern protein signaling. By comparing system-level properties in our eight modules, we discovered the increased sensitivity of protein signaling to concentration parameters, and were able to confirm this using data from other published models. Intriguingly, we also discovered the extreme sensitivity of host network to pathogen virulence concentration parameters. This suggests that host immune response networks may have developed over time to be fine-tuned to concentrations of pathogen virulence factors. In turn, pathogens may have developed over time to produce a high enough concentration of virulence factor to perturb host signaling networks. Insights such as these will be valuable as the next major steps are taken to understand and treat complex infectious diseases.

Several species (PP2A, PI3K, and Akt) appeared in multiple modules. Interestingly, we found that PP2A concentration was a key sensitive parameter in all four modules in which it appeared. It was classified it as “sensitive” in the Alzheimer’s, heart, and addiction modules, and “hypersensitive” in the cancer module. However, a Student’s t test to compare ranks of PP2A concentration sensitivity to other concentration sensitivities indicated no significant differences because one of the outputs in the Alzheimer’s module was not sensitive to PP2A. In Alzheimer’s and addiction modules, two kinetic parameters associated with PP2A were classified as sensitive. No kinetic parameters associated with PP2A were sensitive in cancer and heart modules. Several other parameters (Akt, PI3K) that appeared in multiple modules were classified as sensitive parameters in some, but not others. Altogether, this data from our modules indicates that common species exhibit different sensitivities depending on the network structure. The best evidence for this is the fact that one output of the Alzheimer’s module is extremely sensitive to PP2A, but the other output is not. In addition, some kinetic parameters associated with PP2A are sensitive, while others are not.

One notable comparative analysis by another group6 has used 17 protein signaling models in the literature to show that parameter sensitivities in these models have eigenvalues evenly distributed over several decades, termed a ‘sloppy’ spectrum of parameter sensitivities. Though not equivalent to the analysis performed in (6), the distribution of sensitivities (Figure S1) from our pooled models largely agrees with this data. Comparing our Figure S1 to Figure 1 in Ref. 6 shows that in both cases, most of the models are robust, insensitive to most parameter changes. In fact, ~70% of the parameters in our models were not ‘sensitive’ or ‘hypersensitive’. Since eigenvalues depict how strongly the systems are carried along by their eigenvectors, we would expect to find few eigenvalues for parameters in our models (corresponding to the 10% of parameters >350% sensitivity in our Figure S1) that are ‘large’ with the remainder being comparatively small. We therefore expect to see a skewed distribution to smaller relative eigenvalue size (e.g., toward 10−6), similar to that depicted in Fig. 1b from Gutenkunst et al. 6 We would also expect to see more ‘sloppy’ distribution (spanning multiple decades) of parameter sensitivities if we plotted our data on a log scale (as in (6)), as opposed to a linear scale.

By documenting and revealing system-level properties of actual biological signaling networks, our small-module approach also adds to and benefits from theoretical studies of synthetic networks. One recent theoretical study computationally searched all possible three-noded enzymatic signaling networks and a large associated kinetic parameter space to identify core topologies that are involved in biological adaptation.18 Findings from our comparative meta-analysis may enhance and direct future work in this area. The fact that module outputs in our study were sensitive to concentration and production/degradation parameters suggest that it is critical for future theoretical work to consider these parameter spaces in addition to those associated with enzyme kinetics.18 Insight gained from the theoretical approaches such as Ma et al. 18 in turn complement small-module analysis. Intriguingly, we identified an incoherent feed-forward loop (IFFL), one of the core network topologies discovered to perform adaptation, as one of the main motifs driving angiotensin-stimulated MMP production in our heart disease module. The identification deepens system-level understanding of the angiotensin signaling pathway, and will lead to more effective study and treatment of cardiac fibrosis and heart disease.

This work helps to define the role of small-module computational models in the overall taxonomy of computational systems biology, a growing field which also includes the complementary areas of genome-wide analysis of biochemical networks,17 more extensive pathway models in single cells,19 and integrative systems bioengineering models that link multiple scales and incorporate multiple cells or organisms.21 We show that small-module computational biology can be performed rapidly in a reproducible workflow with research teams of appropriate expertise, can delineate substantial new questions for experimental, therapeutic and/or computational study, and thus can produce useful new knowledge relevant to human disease.

Supplemental Methods

A detailed description of the construction and results of each module can be found in Supporting Information.