Edinburgh Explorer Zero, one and two-switch models of gene regulation

We compare a hierarchy of three stochastic models in gene regulation. In each case, genes produce mRNA molecules which in turn produce protein. The simplest model, as described by Thattai and Van Oudenaarden (Proc. Nat. Acad. Sci., 2001), assumes that a gene is always active, and uses a ﬁrst-order chemical kinetics framework in the continuous-time, discrete-space Markov jump (Gillespie) setting. The second model, proposed by Raser and O’Shea (Science, 2004), generalizes the ﬁrst by allowing the gene to switch randomly between active and inactive states. Our third model accounts for other eﬀects, such as the binding/unbinding of a transcription factor, by using two independent on/oﬀ switches operating in AND mode. We focus ﬁrst on the noise strength, which has been deﬁned in the biological literature as the ratio of the variance to the mean at steady state. We show that the steady state variance in the mRNA and protein for the three models can either increase or decrease when switches are incorporated, depending on the rate constants and initial conditions. Despite this, we also ﬁnd that the overall noise strength is always greater when switches are added, in the sense that one or two switches are always noisier than none. On the other hand, moving from one to two switches may either increase or decrease the noise strength. Moreover, the steady state values may not reﬂect the relative noise levels in the transient phase. We then look at a hybrid version of the two-switch model that uses stochastic diﬀerential equations to describe the evolution of mRNA and protein. This is a simple example of a multiscale modelling approach that allows for cheaper numerical simulations. Although the underlying chemical kinetics appears to be second order, we show that it is possible to analyse the ﬁrst and second moments of the mRNA and protein levels by applying a generalized version of Ito’s lemma. We ﬁnd that the hybrid model matches the moments of underlying Markov jump model for all time. By contrast, further simplifying the model by removing the diﬀusion in order to obtain an ordinary diﬀerential equation driven by a switch causes the mRNA and protein variances to be underestimated. the type of multi-scale approximation that is commonly used to make simulations more tractable. This leads to a stochastic diﬀerential equation driven by a Markov chain, and we show that a generalized version of Ito’s lemma can be used to analyse ﬁrst and second moments.


Introduction
Gene expression is a fundamental biological process that attracts a great deal of attention from both experimental and theoretical scientists. Because some important components are present at very low copy numbers, mathematical models typically involve discrete-valued state variables and have a stochastic nature [10,14,15,16,17].
The noise strength or Fano factor is often used to summarise the level of fluctuations observed in a system; for a random variable X, this is simply the ratio of variance to mean: .
Typically, the steady state noise strength in the mRNA or protein level may be of interest. Experimental or computer simulation-based measurements can then be recorded for different parameter regimes in order to understand which sources contribute to enhancing and suppressing intrinsic noise [15,16,17].
In this work we are concerned with the way that intrinsic noise depends on the choice of mathematical model. We look at this issue in two senses.
First, we consider a hierarchy of three continuous-time discrete-space gene regulation models of increasing complexity, where either zero, one or two switches affect the activity of the transcription process. In this case we are able to derive explicit expressions for the first and second moments of the mRNA and protein at steady state and make clear statements about whether switches increase or decrease the noise strength. Second, we look at a simple case of a hybrid version of the two-switch model based on the type of multi-scale approximation that is commonly used to make simulations more tractable. This leads to a stochastic differential equation driven by a Markov chain, and we show that a generalized version of Ito's lemma can be used to analyse first and second moments.
The article is organised as follows. The next section motivates and describes the models. In section 3 we derive an ordinary differential equation (ODE) for the first and second moments and correlations in the two-switch case. In section 4 we use this ODE along with existing results in order to compare the steady state noise strengths of the models. Section 5.2 looks at the multiscale model where mRNA and protein levels are assumed to be relatively abundant and diffusion is used to describe the stochasticity. In section 6 we summarize our findings and point to future work. Figure 1 illustrates a simple schematic of the process by which mRNA is created through transcription and protein is then created through translation. In this setting, as used, for example, in [17], an underlying gene is assumed to be creating mRNA at a constant rate. In the language of chemical kinetics, this gives a first order reaction network [3] that can be in interpreted as a Markov jump process, where ∅ → mRNA represents production from a source, mRNA → Protein represents catalytic production, and mRNA → ∅ and Protein → ∅ represent degradation.

Gene Regulation Model
In Figure 2, we follow [15] by supposing that the gene is not always available to create mRNA, but rather switches between an active state and an inactive state. The switch operates independently of the mRNA and protein levels, and we may regard Active ↔ Inactive as reversible isometric reactions.
The biological mechanisms through which a gene is activated and deactivated are, of course, extremely complicated, and Figure 2 presents a very simplified view. Quoting from the Wikipedia website http://en.wikipedia.org/wiki/Transcription factor: "In the field of molecular biology, a transcription factor (sometimes called a sequence-specific DNA binding factor) is a protein that binds to specific DNA sequences and thereby controls the transfer (or transcription) of genetic information from DNA to RNA. Transcription factors perform this function alone or with other proteins in a complex, by promoting (as an activator), or blocking (as a repressor) the recruitment of RNA polymerase (the enzyme which performs the transcription of genetic information from DNA to RNA) to specific genes." and "Transcription factors may be activated (or deactivated) through their signal sensing domain by a number of mechanisms. . . " Figure 1: Zero-switch gene regulation diagram. This motivates the diagram in Figure 3, where gene activity is controled by a pair of independent switches in AND mode. We may imagine that the gene is active only when a transcription factor (TF) is bound and this TF has become active. Either unbinding or deactivation of the TF will cause the rate at which the gene produces mRNA to drop to zero. Although we will use the bound/unbound active/inactive terminology throughout this work, we mention that the model could be motivated from other mechanisms, for example [1, Figure 2] describe a circumstance where two separate "activators" must operate in tandem for transcription to occur. The AND operation in Figure 3 could be regarded as a second order (or bimolecular reaction)-the rate at which mRNA is produced depends on the product of two {0, 1} valued species. Generally, second order reaction networks are not amenable to analysis; for example closed form ordinary differential equations cannot be derived for their moments. However, we will show in this work that the special structure of this network allows analysis to be performed, both in the discrete-space Markov jump setting, and in the case where a hybrid stochastic differential equation (SDE) is used.
To simplify the language, we will say that Figures 1, 2 and 3 represent the zero, one and two-switch models, respectively.

Moments for Two Switches
Moment analysis for the zero and one-switch models has already appeared in the literature [9,11]. In this section, we focus on the new two-switch model. Interpreting Figure 3 as a discrete-space, continuous-time Markov jump process, we may introduce scalar processes A(t) and B(t) to record the activation and binding of the TF: at time t, • A(t) = 1 if the TF is active and A(t) = 0 if the TF is inactive, • B(t) = 1 if the TF is bound and B(t) = 0 if the TF is unbound.
Given the rate constants u A , d A , u B , d B , we may characterise these processes by Now let M(t) denote the level of mRNA at time t. Since mRNA is produced with rate constant u R only when the TF is bound and active, we have This takes the form of a second order reaction-the rate of production of the species M(t) depends on the product of the levels of "species" A(t) and B(t). In general, second order systems are not amenable to analysis [6, Section 2.7.B], but we will show in this section that the special form of this system can be exploited. To do this, we introduce artificial species Y i (t) for i = 1, 2, 3, 4, each of which takes values in {0, 1}. These are defined according to • Y 3 (t) = 1 ⇐⇒ A(t) = 1 and B(t) = 0, and • Y 4 (t) = 1 ⇐⇒ A(t) = 1 and B(t) = 1.
We note that 4 i=1 Y i (t) = 1 for all time t. Letting P denote the protein level, we may write the overall system in the form This system now has the form of first-order reaction network. In the terminology of [3], reactions (2)-(9) are of conversion type, (10) and (11) are of catalytic production type, and (12) and (13) are of degradation type. So, we may use the general results in [3] to obtain a closed system of ODEs that express the evolution of the first and second moments and correlations. In the notation of [3], we have we can eliminate some redundancy in order to obtain ODEs for the means and second moments We are now in a position to compare the noise strengths of the three models.

One-Switch versus Zero-Switch
The one-switch model in Figure 2 may be interpreted as the first order reaction system where D and D ⋆ denote the inactive and active states of the gene, respectively.
Here, the initial condition for the gene must be either D(0) = 1 and D ⋆ (0) = 0 (inactive) or D(0) = 0 and D ⋆ (0) = 1 (active), and D(t) + D ⋆ (t) ≡ 1 for all time. A closed, stable linear system of ODEs describing the evolution of the first and second moments and correlations can be found in [11]. Using M (t) and P (t) to denote the mRNA and protein levels for this system (28)-(33), in order avoid confusion with M(t) and P (t) from the two-switch model, we find that the steady state moments have the form The zero-switch case in Figure 1 can be written as the reaction network where D(t) ≡ 1 for all time. A linear ODE system for the first and second moments is given in [9]. Letting M ⊕ (t) and P ⊕ (t) denote the mRNA and protein levels for this model, we find the steady state values Comparing the steady state mRNA and protein variances from the two models, we find that Now recalling the definition of noise strength in (1), some further manipulation of (34)-(37) and (43)-(46) shows that . (49) It is clear that the right hand sides in (48) and (49)    repeats this experiment using rate constants u C = 0.1, d C = 0.1, u R = 10, d R = 5, u P = 10, and d P = 0.1 from [15] with the same initial conditions as in Figure 4. For these rate constants and initial conditions, the difference between noise strengths for mRNA and protein remains positive for all time.

Two-Switch versus Zero-Switch
Equations (14)-(27) give a stable, linear ODE system for the moments of the two-switch model (2)- (13). Solving for the steady state and comparing with the result for the zero-switch model (39)-(42), we find that the difference between mRNA noise strengths is which is clearly positive. The corresponding difference for the protein levels, ns[P ] − ns[P ⊕ ], is too complicated to display, but is also positive for all parameter values.
To illustrate the results, in Figure 6, using the same rate constants and initial conditions as Figure 4, except u A = 0.1,    1, we show the difference between noise strengths for mRNA and protein. The values at steady state are 0.524 and 7.215 for mRNA and protein, respectively, but we see that the difference between noise strengths changes sign over time. Figure 7 shows an example where the difference between noise strengths is positive for all time. Here, the difference at steady state is 1.425 and 43.294 for mRNA and protein, respectively. In this case we used the same rate constants and initial conditions as Figure 5 together with u A = 0.1, d A = 0.1, u B = 0.1, d B = 0.1 and the deterministic initial condition Y 4 (0) = 1.
In summary, like the one-switch model, the two-switch model always gives greater noise strengths at steady state than the zero-switch model, but not generally for all time.

One-Switch versus Two-Switch
Using the results from the previous subsections, we can characterise the difference in noise strengths of mRNA and protein between the two-switch model (2)- (13) and one-switch model (28)-(33). The expressions are too complicated to display, but a key fact is that they contain both negative and positive terms, and their overall sign depends on the model parameters.
To illustrate this, in Figure 8 we use the same rate constants and initial conditions as Figure 6, with u C = u A and d C = d A . We see that the     state differences are −0.022 and −0.314 for mRNA and protein, respectively. On the other hand, Figure 9 shows a case where the differences are positive for all time. Here we used the same rate constants and initial conditions as in Figure  7 together with u C = u A and d C = d A . In this case, the steady state values are 0.463 and 9.985 for mRNA and protein, respectively. Figure 10 shows that the differences in both mRNA and protein can change sign . Here we used the same rate constants and initial conditions as in Figure 9, The differences at steady state are −0.064 and −63.833 for mRNA and protein, respectively.

Hybrid Moments
The Markov jump framework gives an accurate microscale-level picture of chemical kinetics, and sample paths can be computed using the well-known Gillespie algorithm [4,5]. However, for large or complex systems involving one or more fast reactions, such simulations may be prohibitively expensive. This has motivated the use of multiscale models that sacrifice some of the discreteness or stochasticity by using deterministic or continuous-valued variables. These modeling approximations can be justified when certain species are present in large numbers.
A simple approach to multiscale modeling and simulation is to model certain   reactions as SDEs, also known as Chemical Langevin equations (CLEs) or diffusion approximations [7,8]. Alternatively, standard ODEs can be used. When a mixture of these approaches is combined, we obtain a hybrid system. Paszek [14] argued that in gene regulation systems it may be reasonable to treat either protein or both mRNA and protein levels as being continuous-valued. In that reference ODEs were used, but it has been shown in [9,11] that the SDE setting is generally better at recovering the moments of the underlying exact model.
Our aim in this section is therefore to study hybrid versions of the new twoswitch model (2)- (13). We will show that, as in the Markov jump setting, although it appears to be a second order reaction network, moments of the hybrid SDE model can be analysed.

Hybrid Diffusion Moments
In order to describe the two-switch model as a hybrid SDE, we let r(t) be a Markov switch with state space S = {1, 2, 3, 4} and let γ ij denote the transition rate for the switch from state i to j. Hence, for i = j, Here, • state 1 corresponds to A(t) = 0 and B(t) = 0, • state 2 corresponds to A(t) = 0 and B(t) = 1, • state 3 corresponds to A(t) = 1 and B(t) = 0, • state 4 corresponds to A(t) = 1 and B(t) = 1.
For this switch, we move from state 1 to 2 when the TF binds, so we have Similarly, we move from state 1 to 3 when the TF activates, so that Continuing this manner, we obtain the transition rates in Table 1. Now, let g(r(t)) be any function such that g(r(t)) = 1 when r(t) = 4, 0 otherwise. Table 1: Transition rates γ ij for switch r(t).
We may then express the two-switch model as Now we look at a hybrid model based on (50)-(53) where the effect of the TF is modeled as a Markov jump process r(t), and the evolution of the mRNA and protein levels, M(t) and P (t), is modeled with the CLE regime. This gives rise to Ito SDEs driven by an independent switch, of the form We use M ⋆ (t) and P ⋆ (t) to distinguish this process from the mRNA and protein levels, M(t) and P (t), arising from the full CME regime, while W i , i = 1, 2, 3, 4, are mutually independent Brownian motions that are also independent of r(t). We refer to [13] for a thorough treatment of the theory and numerics of SDEs driven by switches. Taking expectations, we find immediately that Since g(r(t)) ≡ Y 4 (t), comparing (56) and (57) with (17) and (18), we see that this hybrid diffusion regime gives the same first moments as the full Markov jump model . Applying the generalised Ito lemma [13] in (54) and (55), we find where "mart." denotes a martingale. Therefore, Now, considering the case where r(t) = 4, we have that γ r,

Hybrid ODE Moments
In this subsection we consider the case where, as in subsection 5.1, the binding/unbinding and activation/deactivation of the TF is modeled as a Markov jump process r(t), but now the evolution of the mRNA and protein levels are modeled with a simple ODE arising from the law of mass action. This general approach was proposed for the zero-switch model in [14], where steady state analysis was performed, and further time-dependent results were given in [9,11] for the zero and one-switch models. Our aim here is to look briefly at the two-switch model. In this case, we have two ODEs driven by an independent switch, of the form We use M (t) and P (t) to denote the continuous-valued stochastic process that represent the mRNA and protein levels in this regime.
Applying the generalised Ito lemma [13], we find Therefore,  (27) and (67) allow us to conclude that E[ P 2 (t)] < E[P 2 (t)]. In summary, for any set of non-zero rate constants, the hybrid ODE model underestimates the second moments of the mRNA and protein and the mRNAprotein correlation, for all time.

Summary
Some surprisingly simple stochastic models based on Markov jump processes have been successful at describing the level of intrinsic noise in gene regulation activities inside the cell. These models allow the important, and measurable, noise strength to be characterised in terms of the relative rates of transcription, translation and degradation.
In this work, we introduced a more general model that attempts to account more accurately for the indirect control exerted by a transcription factor. This new model does not fit naturally into the framework of first order reaction networks, but we showed that its noise strength is amenable to analysis.
Regarding this model as the two-switch successor to previously studied oneswitch and zero-switch versions, we were able to show the intuitively reasonable results that, given a set of rate constants, • the one and two-switch models always have greater steady state mRNA and protein noise strengths than the underlying zero-switch model, although the variances may be smaller.
So incorporating transcription factor effects in this way leads to a prediction of larger intrinsic noise. However, somewhat less intuitively, • before equilibrium is reached, the noise strengths of the one and two-switch models may be less than that of the zero-switch model.

Also,
• the two-switch model may be more or less noisy than the one-switch model, depending on the rate constants.
We also analysed hybrid SDE and ODE approximations to the two-switch model and showed that it is necessary to retain the diffusion term in order to avoid underestimating the mRNA and protein variances and correlation.
There are, of course, many ways in which gene regulation models can be extended, and we are currently considering the case of auto-regulation, where a protein can affect its own transcription rate [12]. In the hybrid SDE setting this will give rise to the more technically demanding regime of state-dependent Markov switching.