Models from experiments: combinatorial drug perturbations of cancer cells

We present a novel method for deriving network models from molecular profiles of perturbed cellular systems. The network models aim to predict quantitative outcomes of combinatorial perturbations, such as drug pair treatments or multiple genetic alterations. Mathematically, we represent the system by a set of nodes, representing molecular concentrations or cellular processes, a perturbation vector and an interaction matrix. After perturbation, the system evolves in time according to differential equations with built-in nonlinearity, similar to Hopfield networks, capable of representing epistasis and saturation effects. For a particular set of experiments, we derive the interaction matrix by minimizing a composite error function, aiming at accuracy of prediction and simplicity of network structure. To evaluate the predictive potential of the method, we performed 21 drug pair treatment experiments in a human breast cancer cell line (MCF7) with observation of phospho-proteins and cell cycle markers. The best derived network model rediscovered known interactions and contained interesting predictions. Possible applications include the discovery of regulatory interactions, the design of targeted combination therapies and the engineering of molecular biological networks.


Introduction
Our ability to measure increasingly complete and accurate molecular profiles of living cells motivates new quantitative approaches to cell biology. For example, a key aim of systems biology is to relate changes in molecular behavior to phenotypic consequences. To achieve this aim, computational models of cellular processes are extremely useful, if not essential. Computational models can be used for the analysis of experimental data, for the prediction of outcomes of unseen experiments and for planning interventions designed to modify system behavior. We have developed a particular approach to constructing, optimizing and applying computational models of cellular processes, which we call Combinatorial Perturbation-based Interaction Analysis (CoPIA). The key ingredients of the approach are combinatorial intervention, molecular observation at multiple points, model construction in terms of nonlinear differential equations, optimization of model parameters with simplicity constraints and experimental validation.

The power of combinatorial perturbation
In molecular biology, a targeted perturbation typically inhibits or activates function of biomolecules, e.g. as a result of drug action, small RNA interference, genetic or epigenetic change (Figure 1). In a single experiment, targeted perturbations can be applied either singly or in combination. Combined perturbation by several agents can be much more informative than that by a single agent, as its effects typically reveal downstream epistasis within the system, such as non-additive synergistic or antagonistic interactions. In addition, a large number of independently informative experiments can be performed if in each experiment a different small set of, e.g. two or three, perturbants is chosen from a larger repertoire. Thus, combinatorial perturbations are potentially powerful investigational tools for extracting information about pathways of molecular interactions in cells (such as A inactivates B, or X and Y are in the same pathway) (Avery and Wasserman, 1992;Kaufman et al, 2005;Kelley and Ideker, 2005;Segre et al, 2005;Yeh et al, 2006;Lehár et al, 2007). Combinatorial perturbations can also be powerful application tools when Figure 1 Combinatorial perturbation and multiple input-multiple output (MIMO) models. Upper left: intuitive view of perturbations and their points of action. Small inhibitory RNAs alter gene expression; natural protein ligands and small compounds act, e.g., on receptors, transporters or enzymes. Genetic alterations have diverse functional effects. Perturbations can be natural or investigational. Observations (readouts) typically focus on a phenotype of interest, such as growth or differentiation, and on observation points that are both practical and informative, such as transcripts, protein levels or covalent modifications, e.g. phosphorylation. Upper right: MIMO model. All key system variables are represented as real number variables, the combinatorial perturbations u i as well as their targets, internal variables, observation points and phenotypic outputs y i . Inputs (upper layer, squares) affect the different dynamical variables of the system (circles), some of which can be observed (lower layer, squares). The model represents a processing system that relates the input to the output through the interacting dynamical variables. Representation of coupled perturbations (nonlinear effects) is a key requirement of the modeling method. When rate equations are linear (lower left), perturbation effects will combine additively. However, in a well-parameterized system with nonlinear transfer functions (lower right), epistasis effects will arise, and downstream effects depend on pathway organization. Below: uses of a derived MIMO model. When inputs and outputs are known, a model can be used to infer the internal mechanism of the system (Interpretation). When the inputs and the system are known, the model can be used to predict a response (Prediction). When the system and the desired output are known, the model can serve to design optimal modes of control (Control). rationally designed to achieve desired effects. For example, combination of targeted drugs is considered a promising strategy to improve treatment efficacy, reduce off-target effects and/or prevent evolution drug resistance (Borisy et al, 2003;Keith et al, 2005;Komarova and Wodarz, 2005;Chou, 2006).
With recent advances in molecular technologies-e.g., targeted perturbation by small molecules, full-genome libraries of small RNAs, highly specific antibody assays, massive parallelization and imaging techniques-there is intense interest in the investigational power of multiple perturbation experiments in a variety of biological systems. The inherent complexity of such experiments raises significant challenges in data analysis and an acute need for improving modeling approaches capable of capturing effects such as time-dependent responses, feedback effects and nonlinear couplings.

Deriving system models from combinatorial perturbation experiments
Computer simulation of pre-defined pathways can be used to predict epistasis effects and explore how pathway organization shapes the perturbation response (Omholt et al, 2000;Segre et al, 2005;Lehár et al, 2007). In many situations however, observational data are provided but the pathway is unknown or only partially known. To solve this problem, our computational modeling approach enables users to construct a complete differential equation model for a system from combinatorial perturbation experiments. In the context of this paper, the system of interest is defined by a particular type of cell, its environment, a time interval of observation and a phenotypic change, such as cell death or growth. The system is further characterized by its points of intervention, such as drug targets, and the points of observation, such as the phosphorylation state of proteins involved in signaling processes ( Figure 1). To represent such a system mathematically, we choose network models in which nodes represent molecular concentrations or levels of activity and edges reflect the influence of one node on the time derivative of another. The time evolution of the system is modeled by linear differential equations, modified by a nonlinear transfer function to reflect properties of the system that are not explicitly modeled ( Figure 1). We present efficient optimization algorithms to find models that achieve maximum agreement between observation and prediction. Our algorithm is based on a combination of a gradient descent method (to set dynamical parameters) and a Monte Carlo process (to explore alternative network connectivities). We make a software implementation of CoPIA available as platform-independent software (http://cbio.mskcc.org/copia).
Testing the predictive power of derived system models We perform combinatorial perturbation experiments in an MCF7 breast cancer cell line to test the modeling framework in the steady-state limit. In this test, we demonstrate how observation of the effects of drug pair perturbations can be exploited to deduce a network model of signaling and phenotype control (reverse engineering of pathways). We use observed molecular state and growth phenotype responses to build predictive models and use these to explain the perturbation-phenotype relationship in terms of coupling between proteins in the EGFR/MAPK and PI3K/AKT pathways. Without using known pathway biology, the resulting model reproduces known regulatory couplings and negative feedback regulation downstream of EGFR and PI3K/AKT/mTOR, and makes predictions about possible roles of PKC-d and eIF4E in the control of MAPK signaling and G1 arrest in MCF cells.
We conclude that CoPIA may be of interest as a broadly applicable tool to construct models, discover regulatory interactions and predict cellular responses. For instance, researchers can measure a set of protein phosphorylation responses to drug combinations and use the method to automatically construct network models that predict the response to novel drug combinations. Application of this methodology to time-dependent experimental observations would extend this predictive capability to the regimen of timedependent, rationally designed combinatorial therapy.

Multiple input-multiple output models
State space representation is commonly used in mathematical modeling of input-output behavior in natural systems. In this representation, the time behavior of the system state is described by a first-order differential equation where the vector y(t) represents state variables (the activities of the system's components), the vector u(t) represents perturbations (external influences on the components) and f is a linear or nonlinear transfer function (de Jong, 2002). For example, y(t) can be the abundances of specific mRNAs or proteins, whereas u(t) can be the concentrations of different chemical compounds to which the cells are exposed ( Figure 1). In essence, state space models relate a system's input to its output. State space models with multiple inputs-outputs (that is, y and u have more than one coordinate) are called multiple input-multiple output (MIMO) models.

Linear MIMO models
When f is a linear function of y and u, the above model is called a linear MIMO model. The mathematical properties of linear MIMO models are well known (Ljung, 1986) and such models have been applied to many biological problems, for example, the construction of transcriptional network models (Tegner et al, 2003;Xiong et al, 2004;di Bernardo et al, 2005). Nevertheless, linear models have a limitation in that they can only model uncoupled perturbation effects (linear dose-response relationships), whereas nonlinear effects (coupled perturbation effects) are ignored (Figure 1; 'Model representation'). As a result, linear MIMO models are unable to capture important phenomena that are known to occur in cellular systems, such as saturation effects, switch-like effects and nonlinear interaction phenomena such as genetic epistasis and pharmacological synergism.

Nonlinear MIMO models
To overcome this limitation, we construct nonlinear MIMO models capable of representing coupled perturbation effects. Previously, other authors have observed that complex gene knockout effects, including epistasis effects, can be predicted in metabolic flux networks where bounds on the reaction rates are introduced (Fell and Small, 1986;Edwards and Palsson, 2000;Segre et al, 2005;Deutscher et al, 2006). Similarly, metabolic systems with Michaelis-Menten kinetics or transcriptional networks with bounds on transcription rates will exhibit epistasis behavior (Omholt et al, 2000;Lehár et al 2007). In the particular case of the MIMO model, we expect more biologically realistic behavior if one replaces the linear transfer function f with a nonlinear transfer function f that imposes bounds on the rates of change of the system. Accordingly, we propose the class of models In this class of models, the matrix w ij represents the interactions between the molecules and processes represented by the state variables of the system. (Intuitively, the matrix elements w ij can be thought of as a map of the system, in which w ij 40 means 'node j activates node i', whereas w ij o0 corresponds to inhibition.) Furthermore, a i 40 represents the tendency of the system to return to the initial state (y i ¼0); b i 40 are constants and f i is a transfer function capable of capturing both switch-like behavior and bounded reaction rates. Examples of such functions include sigmoid functions, piece-wise linear approximations of sigmoids or biochemically motivated approximations such as the Hill or Michaelis-Menten equations (Materials and methods).

Application of nonlinear MIMO models to combinatorial perturbation experiments
We developed computer algorithms to infer nonlinear models of the above type from experimental data, as specified by the best-performing values of the coupling parameters w ij and other parameters. As detailed in Materials and methods, the current implementation of our approach consists of the following steps. First, the system of interest is subjected to a set of independent single or multiple target perturbation experiments; and, for each perturbation vector (time-independent instance of u), a readout vector (steady-state instance of y) is recorded. Second, we infer a nonlinear model that best reproduces the experimental data (Materials and methods). Specifically, we rely on parameter estimation techniques for feedback systems to find a model that minimizes a quadratic error term between observed and predicted readouts, subject to simplicity constraints on the number of interactions in the system. Third, the fitted model can be used to predict the system's response to unseen perturbations (for example, combinations of drugs), and to gain new insight into the system's architecture.

Dual drug perturbation experiments in MCF7 breast cancer cells
To directly test the power of the approach, we performed an independent experimental study in MCF7 human breast carcinoma cells. As perturbants of the system, we chose compounds targeting EGFR (ZD1839), mTOR (rapamycin), MEK (PD0325901), PKC-d (rottlerin), PI3 kinase (LY294002) and IGF1R (A12 anti-IGF1R inhibitory antibody). As relevant readouts of molecular and phenotypic responses, we chose phosphoprotein levels of seven regulators of survival, proliferation and protein synthesis (p-AKT-S473, p-ERK-T202/Y204, p-MEK-S217/ S221, p-eIF4E-S209, p-c-RAF-S289/S296/S301, p-P70S6K-S371 and pS6-S235/S236) as well as flow cytometric observation of two phenotypic processes (cell cycle arrest and apoptosis) ( Figure 2). Inhibitors were administered singly and in pairs, followed by EGF stimulation. When recording responses of protein phosphorylation, we used the average response at 5 and 30 min as the surrogate for steady-state values. To build models, we represented the state of each of the above perturbation targets (signaling proteins), as well as each of the readouts, by one state variable y i . We then used the proposed optimization procedure (Materials and methods) to estimate the coupling parameters w ij and other parameters, resulting in predictive models of response in terms of these system variables.

Quantitative prediction of system response
We first assessed the predictive power of the derived models using leave-one-out cross-validation, in which one pair perturbation is left out of the analysis and then its effect predicted from information gained from all other perturbations. The resulting predictions were reasonably accurate for the nine different readouts. The best prediction was obtained for p-S6 phospho-protein levels (cross-validation error CV¼0.02, Pearson correlation r¼0.96) and the weakest for the G1 arrest phenotype (CV¼0.07, r¼0.45) (Figure 2 and  Supplementary Table 1). We directly compared the performance of our modeling approach to one using a corresponding set of linear differential equations with the same optimization procedure. By comparison, predictions using the nonlinear approach agreed better with experimental observations for eight of the nine readouts. Using the nonlinear modeling approach, the prediction error was lower by up to 50% with correspondingly better correlation between predictions and experimental observations (Supplementary Table 1). Thus, we conclude that our method is capable of deriving reasonably accurate network models for the input-output behavior of MCF7 cells with respect to the readouts used.

Detection of key regulatory mechanisms without prior knowledge
From a set of perturbation experiments, how can one deduce the logical network structure of activating and inhibiting interactions between the key molecular components, similar to the familiar pathway diagrams in publications summarizing a set of molecular biological experiments? Here, we use the derived network models with the smallest global error (E total ¼E SSQ þ lE STRUCT , Materials and methods) to infer causal connectivity diagrams. The inference is based on the assumption that interactions in sufficiently simple models that fit experimental observations, called 'good' models, represent an underlying causal relationship between system components modeled by the system variables y i . Such a relationship can be either an indirect regulatory effect or a direct physical interaction that would be observable in vitro with purified components. Using our Monte Carlo algorithm, we generated a population of 450 good models from the MCF7 dual drug perturbation experiments. From these, we assessed the statistical significance of the individual interactions both in terms of a posterior probability (which is obtained directly from the Monte Carlo process, see Materials and methods) and a 90% confidence interval constructed by boot-strapping simulations (Table I). We now discuss the connectivity of the best model, i.e. the one with the smallest error (schema in Figure 3, explicit equations in Materials and methods) relative to the known biology of regulatory pathways in the MCF7 breast cancer cell line.

Figure 2
Breast cancer cells as a multiple input-multiple output system. To generate data for model construction, we treated human MCF7 breast tumor cell lines with one natural ligand (epidermal growth factor (EGF)) and six inhibitors, singly and in combination. The treatment protocol used EGF, an IGF1 receptor inhibitory antibody (A12) and inhibitors of the signaling molecules EGFR, PI3K, mTOR, PKC-d and MEK. The inhibitors are ZD1839, LY294002, rapamycin, rottlerin and PD0325901. In the perturbation matrix (top panel, columns¼experiments, rows¼perturbations), a blue box indicates the presence of a particular perturbation and white indicates absence. For each treatment, we used western blots to detect levels of the proteins phospho-AKT, phospho-ERK, phospho-MEK, phospho-eIF4E, phospho-c-RAF, phospho-p70S6K and phospho pS6. We used a FACS-based assay to quantify apoptosis (measured as the 'sub-G1 fraction,' Materials and methods) and G1 arrest (measured as the G1 fraction). Here, representative flow histograms depicting cell cycle distribution in MCF7 cultures treated with or without drug are shown (one experiment is shown, see Supplementary information for all measurements). Evaluation of predictive power: After model construction based on these experiments, we see good agreement between experimental observation of the response of the MCF7 cell line to the 21 different perturbations (top, columns¼experiments, rows¼readouts) and the model prediction (bottom). Statistical assessment is in Supplementary Table 1. For each readout, we quantify the system's response by the phenotypic index defined as log relative response in treated versus untreated cells. For some drug combinations, the phenotypic readout increases as a result of perturbation (orange), for others it decreases (blue).

Interpretation of derived network structure
In comparing the inferred connectivity with mechanisms known to occur in MCF7 cells (Table I), two caveats are important. (1) The logical nodes in our models are defined precisely as the perturbed and observed molecular species, i.e. the targets of drug perturbation and the targets of specific observed antibody reactions, and may not be exactly identical to a single molecular species. For example,'EGFR' refers to the direct target(s) of activation by EGF and of inhibition by the drug ZD1839, and these two are assumed to be identical. (2) The models make no reference to unperturbed or unobserved nodes, e.g. whereas p-AKT is in the network model, the unphosphorylated AKT is not. With these caveats in mind, one can use the models both for confirmation and prediction of interactions. Of the 23 interactions in the best model, 14 had a posterior probability in the range of 20-99% (Table I). Of these, several statistically robust interactions clearly confirm canonical pathway structures. (i) The MAPK cascade downstream of the EGF receptor is detected as a chain of interactions between EGFR, MEK and ERK (Figure 3 and Table I). (ii) The negative feedback regulation of MAPK signaling is captured as negative interaction from ERK to EGFR, and as a moderately significant self-inhibition of MEK (see Discussion). (iii) PI3Kdependent signaling and the tendency for MCF7 cells to be dependent on AKT activation for survival are detected as interactions between PI3K, AKT and the apoptosis phenotype.
(iv) The model inference that apoptosis is controlled by p-AKT, but not p-ERK, is in agreement with previous results in MCF7 cells (Simstein et al, 2003;DeFeo-Jones et al, 2005). (v) mTOR downstream signaling is detected as interactions between mTOR, p70S6K and ribosomal S6 protein (Mingo-Sion et al, 2005). The derivation of these expected interactions from a small set of perturbation experiments, without prior pathway   The interaction matrix w ij from a set of good models can be used to infer regulatory interactions (squares¼inputs; circles¼internal system variables and other observables). Positive w ij means activation and negative w ij means inhibition of the target. Interestingly, some of the interactions are well known in MCF7 cells (green arcs) and others constitute predictions (orange arcs). See Table I for functional comments on interactions. No underlying pathway model was used-the network is a straightforward interpretation of the optimized model parameters w ij . The EGFR-MEK-ERK and PI3K-AKT, canonical pathways are identified. Also, note the detection of selfinhibitory interactions in MEK/ERK signaling, identification of eIF4E and AKT as direct regulators of apoptosis and G1 arrest.
knowledge, underscores the non-trivial value of the model building approach and provides some confidence in the concrete predictions of logical regulatory interactions for MCF7 cells (Table I), which are discussed below.

Discussion
In summary, our evaluation in breast cancer cells supports two main conclusions. First, our approach to model construction can be used to build reasonably accurate quantitative predictors of pathway responses to combinatorial drug perturbation in MCF7 cells. Second, the quality of the deduced interaction network suggests that well-parameterized nonlinear MIMO models are interpretable in terms of a network of (direct and/or indirect) regulatory interactions. The inference of network structure is surprisingly effective: the logical network diagram in Figure 3 was derived de novo based on only 21 experiments, using non-temporal data and only nine experimental readouts and accurately reflects important known regulatory interactions. This bodes well for future applications in which the amount of readout data can easily be an order of magnitude greater. In addition to yielding details of intermolecular coupling, the method is sufficiently general to allow predictive modeling of causal relationships between biomolecular events and cellular phenotypic consequences, such as growth or cell cycle arrest. The method lends itself to multi-level modeling in the sense that molecular, mesoscopic and macroscopic events can be modeled in a single framework once appropriate state variables y i are defined.

Software and technical aspects of implementation
We aim to put these tools into the hands of both computational and experimental biologists for widespread use and are providing a software distribution of CoPIA in the supplement. When applying the method in practice, three crucial technical details are important. A user has to choose (i) which system properties to represent by dynamical variables; (ii) a specific form for the transfer function f; and (iii) protocol and parameter values for the Monte Carlo simulation, or for a similar exploration of solution space. The key parameters include l, which enforces network sparsity to avoid overfitting, and T, the temperature parameter, which fine-tunes the extent of non-optimal exploration of network space. In Materials and methods, we provide guidelines for these choices.

Complementarity to response surface models and epistasis clustering
In a recent interesting work, Lehár et al (2007) used drug pairs to perturb signaling pathways in cancer cells, and provided an interpretation framework based on traditional pharmacological models for two-drug response surfaces. Drug targets in the PI3K and MAPK pathways were characterized by correlating 'synergy profiles,' demonstrating a link between network connectivity and drug pair response. Such synergy profiles, in turn, can be thought of as a generalization of the epistasis matrix used by Segre et al (2005) as a basis for functional clustering of genes. The approach proposed here is different in the sense that it performs a global optimization that aims to find a fully parameterized model for the entire system. Such models, in turn, can be used for additional purposes such as making predictions of system responses, or making connectivity information explicit as pathway diagrams. Preliminary data suggest that CoPIA models can be used to interpret or predict response surface data, as a function of drug concentrations, as an alternative to the approach of Lehár et al, e.g. to reduce experimental cost (S Nelander, unpublished data). Finally, the differential equation CoPIA models can be easily represented in standard systems biology formats, such as BioModels (Le Novère et al, 2006) and be used with a number of tools for model visualization, numerical simulation or analytical characterization.

Relationship to neural models and Hopfield networks
The nonlinear representation proposed here, or related neural models, has been used in biological contexts such as transcriptional network modeling (Marnellos and Mjolsness, 1998;D'haeseleer et al, 2000;Omholt et al, 2000;Vohradsky, 2001;Li et al, 2004;Bonneau et al, 2006;Hart et al, 2006), in synthetic biology (Kim et al, 2005(Kim et al, , 2006 and for problems such as approximation of inorganic chemical reactions (Shenvi et al, 2004), but not for general cellular processes and/or drug perturbations. In addition, CoPIA models are similar, but not identical, to Hopfield networks, a formalism introduced to study computation in physical systems (Hopfield, 1982). To further motivate this class of models in representing biological systems, we propose an extended effort to theoretically and empirically analyze how well biochemical reactions can be approximated by neural functions, e.g. reactions involved in DNA switches (Kim et al, 2005).

Confirmed and predicted regulatory interactions in MCF7 cells
In our analysis, we detected self-inhibitory feedback loops downstream of the EGF receptor. This is compatible with the observation that receptor activation of MAPK signaling frequently leads to rapid feedback inhibition, for instance by induced expression of inhibitory proteins (such as Sprouty (Kim and Bar-Sagi, 2004) or MAPK phosphatases), or inhibition of RAF by direct phosphorylation (Dougherty et al, 2005). In our experiments, we are not able to identify the full complexity of the feedback loops, as we did not perturb nodes such as ERK or RAF-1 or other proteins and used a short EGF stimulation time. Additional predictions, such as (i) eIF4E acting as a downstream effector of ERK, as well as (ii) PKC-d counteracting the G1 arrest phenotype, are supported by results in other cell types (Waskiewicz et al, 1997). Furthermore, the model predicts a mutually inhibitory interplay between eIF4E activation by phosphorylation and G1 arrest, consistent with the established role of eIF4E as a potent oncogene and a master activator of a 'regulon' of cell cycle activator genes (Culjkovic et al, 2006). However, the predicted increase in p-RAF by PKC-d is paradoxical: the observed phosphorylation sites on c-Raf (S289/S296/S301) are regarded as inhibitory, which seems inconsistent with the facts that PKC-d can activate MAPK signaling in a RAF-dependent way (Jackson and Foster, 2004). Our prediction might suggest an unknown direct effect mechanism, or an indirect effect that is not captured in the present analysis. Finally, three less interpretable and therefore interesting or potentially problematic features of the network in Figure 3 are (i) the selfactivation of ERK; (ii) the activating arrow between apoptosis and G1 arrest and, (iii) the fact that RAF is not placed between EGFR and MEK, as in the usual representation of this pathway. Overall, a number of predictions can be used to design experiments to validate or refute the model predictions.

Future challenges
There are a number of future challenges and opportunities to apply the method to important problems and to increase its power. A key challenge is to use the method to extend known pathways, by combining exploratory perturbation experiments with the richness of biological knowledge in pathway databases. This can be achieved by adding a priori known nodes y j into the formalism and introducing a bias in the network search that favors solutions compatible with prior knowledge. To deal with off-target effects of perturbations and incompletely known drug-target specificity, we propose a variant algorithm in which drug-target couplings are parameters that are determined by optimization. Such a variant can be used in target identification for interesting drugs, e.g. compounds that have a desirable effect but for which the target is not yet known. To maximize the information value of experiments, we propose to develop algorithms for the design of experiments, e.g. based on the change of outcomes with respect to particular parameters (King et al, 2004;Vatcheva et al, 2006). We see tremendous opportunities in new types of experiments. To generate more comprehensive and more informative perturbations of a larger set of cellular components, one can use combinatorial RNA interference (Friedman and Perrimon, 2006;Sahin et al, 2007). To generate readout richer by one or two orders of magnitude, one can use mass spectrometry of protein and phospho-protein levels (Mann et al, 2002). The CoPIA method can be generalized to go beyond the steady-state approximation and explicitly model the time behavior of system components by minimizing the error function for a set of time series experiments.

From models to therapies
The proposed combinatorial perturbation approach to cell biology, CoPIA, presents a well-specified experimental-computational procedure to construct predictive models for perturbation responses in malignant cells. We suggest use of such models to optimize therapeutic protocols, especially by designing interventions using a combination of targeted compounds administered in an optimal time sequence. Our method constitutes a concrete step toward the active development of network-oriented pharmacology.

Materials and methods
Computational methods

Phenotype prediction
The nonlinear MIMO model for combinatorial perturbation in cellular systems is introduced in the Results section (equation (2)). When this system is propagated through time, it will generally converge to a stable, fixed point (Pineda, 1987). We interpret this fixed point as the phenotypic response to the perturbation u. To calculate the fixed point given in a model, we used standard numerical integration methods (ode15s (Mathworks Inc.) and DLSODE (Hindmarsh, 1993)). As the class of models studied here can in principle have more than one solution to the steady-state equation (Smits et al, 2006), we used the convention-for practical purposes-to start each predictive simulation from the unperturbed, wild-type steady state y¼0.

Overview of model fitting algorithm
The procedure used to find parameter values (for the a i 's, b i 's and the w ij 's) from experimental data is outlined below. As an overall approach, we minimize a global error function that combines the requirements of data fit and simplicity. The error function is defined as where E SSQ is the residual sum of squares error, which measures the difference between the model's predicted values and the corresponding observational values for the subset of variables that are observed.
The term E STRUCT is a penalty term that measures the complexity of the network and l is a tuning parameter that needs to be chosen; for l¼0 no emphasis is put on the model structure and increasingly sparse (uncomplicated) models are obtained for increasing values of l. We used the l 0 -norm of the regulatory matrix w to define E STRUCT as where 0 0 ¼0. The l 0 -norm is a common approach to enforce sparse solutions in many machine-learning applications (Weston et al, 2002). In principle, other norms can be used, such as the l 1 norm (Yeung et al, 2002).
Inner loop: minimization of E SSQ using a gradient descent algorithm Assume a MIMO system with N dynamical variables y 1 ,y 2 , y, y N , of which a subset O of the variables can be observed experimentally. A perturbation experiment is described by the pair (u, Y), where u¼(u 1 , y, u N ) is the perturbation treatment and Y¼{Y i |iAO} is the experimental observation. As a mathematical model for the relationship between the perturbation u and the experimentally observed response Y, we use the dynamical system described in the Results section (equation (2)). Let Ỹ denote the steady state of this dynamical system under the perturbation u. We then define the sum of squares error for a single experiment as We consider a fixed network structure, where some w ij 's are fixed to zero. To describe the structure, we define a matrix U such that w ij can adopt a non-zero value if U ij ¼1 and w ij is zero if U ij ¼0.
Given N, (u, Y) and U, we want to find parameters a i 's, b i 's and the non-zero w ij 's that minimize the error E SSQ . For the special case where l¼0, a¼1, b¼1, Pineda (1987) described a gradient descent procedure, based on solving a set of differential equations in which the weights w ij are updated following the gradient descent rule Here, Z is a (small) number that sets the convergence speed, and t is a 'pseudo-time' that increases as the fitting procedure progresses. We use the update equations derived in D'haeseleer et al (2000) to extend to an arbitrary a and b. The computation formula to minimize E SSQ thus becomes: In these equations, z is an error propagation variable introduced for computational purposes (Pineda, 1987). To fit the model for a single (u, Y) pair, we integrated these equations (DLSODE or ode15s) with initial value 0 for w and 1 for a and b. The parameters were not subjected to constraints such as lower and upper bounds. Solutions for different stimulus-response pairs were combined using online learning with momentum described in Duda et al (2000).
Outer loop: minimization of E TOTAL with an l-zero penalty using stochastic search We used a Markov Chain Monte Carlo approach (Ewens and Grant, 2005) to minimize E TOTAL , and hence find the optimal model defined by the network structure U and parameter values for a, b and non-zero w's.
In the algorithm, a set of models are maintained and a particular model survives to the next iteration with probability proportional to e ÀE total =T (the Boltzmann factor, where T denotes the temperature of the search). Hence, low-error models are more likely to be propagated to next iteration. The temperature is typically high in the beginning of the search and low in the end.
The algorithm is outlined as follows: 1. Initialize with U current ¼U start . Here, subindexes of U (U current , U start , U 1 , U 2 , y) refer to different realizations of the U matrix (as opposed to U matrix elements. As U start , we use a N Â N matrix of zeros. 2. Generate a set S¼{Ũ 1 , y, Ũ k } of structures that are variations of U current . For simplicity, we consider every structure that differs from Ucurrent by one edge. 3. Estimate the parameters for each structure Ũ 1 , y, Ũ k using the variant of Pineda's algorithm presented above. Record the corresponding sum-of-square errors E 1 , y, E k.
4. Calculate the total error for each topology as E j 0 ¼E j þ l P U j . 5. Use a decision rule R to select one of the alternate topologies, U selected. 6. Update the current topology, Ũ current 'Ũ selected , potentially update T, and repeat from step 2.
As decision rule R, we randomly select topology U j with probability Under certain assumptions (the number of neighbors k is the same for every topology U, neighbor is a mutual relationship, and all possible topologies can be reached in a finite number of steps), the above Markov chain will have a stationary probability distribution in which the probability for a certain topology is proportional to its Boltzmann factor Ewens and Grant (2005). For a sufficiently low temperature T, the algorithm will converge to a probability optimum/error minimum.

Bootstrapping confidence intervals
For a given model structure U, we used re-sampling of residuals to generate boot-strapped confidence intervals for the model parameters. First, the model was fitted using structure U and the original data, and residuals were calculated as the best model fit minus the original data. A total of 200 'new' data sets was then constructed by adding randomly drawn residuals to each measurement (using residuals for the corresponding experimental readout, i.e. p-MEK residuals were added to p-MEK values and so on). For each such re-sampled data set, a model was fitted using the structure U. Subsequently, confidence intervals for each coupling parameter w ij were calculated as percentiles 5-95% across the 200 data sets.

Data preprocessing and parameter choices
The relationship between the model variable y i , a corresponding experimental observation Y i and an experimental reference point Y ref or Y max is defined by a mapping function. In our evaluation in breast cancer cells, we used the log relative change defined as The transfer functions f i should be chosen such that the interval spanned by the experimental data corresponds to the target domain of the function. We found it useful to standardize data to the interval [À1, þ 1] and then to choose the sigmoid function accordingly. As the reference ('wild-type') value Y ref , we used the untreated controls. As only one concentration level was used for every drug (chosen to be around the ED 90 ), we represented perturbation as u i ¼1 if the drug was added, and u i ¼0 otherwise. We used f i ¼tanh(y i ) as the sigmoid (suitable as it maps to the interval [À1, þ 1], another function with this target domain, f i ¼2/p tan À1 (cy i /2), gave very similar results).
Immunoblotting MCF7 cells were grown in 100 mm dishes, and starved for 20 h in PBS. They were then treated with indicated concentrations of inhibitors (details see Cell culture and reagents) or vehicle (DMSO) for 1 h, followed by adding EGF into the media (final EGF concentration was 100 ng/ml). After EGF stimulation for 5 or 30 min in the presence of drugs or DMSO, western blots were performed by harvesting MCF7