Amplitude distribution of stochastic oscillations in biochemical networks due to intrinsic noise

Intrinsic noise is a common phenomenon in biochemical reaction networks and may affect the occurrence and amplitude of sustained oscillations in the states of the network. To evaluate properties of such oscillations in the time domain, it is usually required to conduct long-term stochastic simulations, using for example the Gillespie algorithm. In this paper, we present a new method to compute the amplitude distribution of the oscillations without the need for long-term stochastic simulations. Through the derivation of the method, we also gain insight into the structural features underlying the stochastic oscillations. The method is applicable to a wide class of non-linear stochastic differential equations that exhibit stochastic oscillations. The application is exemplified for the MAPK cascade, a fundamental element of several biochemical signalling pathways. This example shows that the proposed method can accurately predict the amplitude distribution for the stochastic oscillations even when using further computational approximations.
PACS Codes: 87.10.Mn, 87.18.Tt, 87.18.Vf
MSC Codes: 92B05, 60G10, 65C30


Introduction
Oscillations are a widely occurring phenomenon in the dynamics of biological systems. On the intracellular level, oscillations occur for example in the activity of various genes or signalling proteins. To gain more insight into the processes related to these oscillations, mathematical models for the underlying biochemical and genetic networks are commonly constructed. Such models have proven helpful in connecting the underlying biochemistry with the temporal characteristics of emerging oscillations and the associated biological function. Examples for this type of results include the intracellular circadian clock [1] or the developmental process of somitogenesis [2].
In dynamical models for biological systems, oscillations are due to limit cycles or more complex attractors in deterministic frameworks, or may emerge from various stochastic effects in stochastic frameworks. Concerning the biological function, one would not expect that it makes a significant difference whether oscillations are due to a deterministic limit cycle or arise from stochastic effects. For the biological function, rather the temporal characteristics of oscillations are relevant, such as their frequency or amplitude. For deterministic limit cycle oscillations, these characteristics are easily computed by a numerical simulation of the model. For stochastic models, computing the temporal characteristics of oscillations is much more involved. Two approaches are common for solving stochastic systems. In the first approach, the so called chemical master equation (CME) is used [3]. The solution to the CME (or approximations thereof) yields the probabilities for each of the possible configurations of the system to be attained at a given point in time. However, since the temporal correlations of these probabilities are usually not obtained from the CME, the temporal characteristics of oscillations are not directly accessible from such a solution. In the second approach, a large number of realisations of the stochastic process are computed. These realisations can then be used to compute various temporal characteristics of the system, in particular oscillation amplitude and frequency distributions. Yet, the computation of a sufficiently large number of realisations typically entails a large computational effort.
In this paper, we develop a new method to compute the amplitude distribution for systems exhibiting stochastic oscillations. We thereby focus on systems where the deterministic part has a weakly stable equilibrium point (EP), typically with damped oscillations, and the stochastic effects induce sustained oscillations around this EP. In such systems, we can distinguish two mechanisms by which stochastic oscillations occur. For the first mechanism, the system needs to have the property that a certain small perturbation away from the equilibrium point leads to a large excursion in the state space before the system returns to the proximity of the EP [4]. If the noise is strong enough to reliably generate such a perturbation, but does not disturb the system during the round trip too much, we recognize regular oscillations with a well defined amplitude and frequency. This effect is called Coherence Resonance [5] or Stochastic Resonance [6, p. 149 ff.]. For the second mechanism, the deterministic part of the system already gives rise to damped oscillations, and the noise just serves to sustain the oscillations [7,8]. In this case, the density distribution of realizations in the state space will typically be similar to a Gaussian-like distribution with a maximum at the EP, while individual trajectories show pronounced oscillations with a frequency close to the frequency of the deterministic part. The amplitudes of this type of oscillations increase with the noise power, in contrast to the other type of systems, which mainly exhibit oscillations of a fixed amplitude independent of the noise power.
For the second mechanism, it is of interest to compute the amplitude distribution of the stochastic oscillations. Several methods to solve this problem have been developed previously. In [9] the average change of a stochastic Lyapunov function was determined. By setting this average change to zero one may identify an orbit around which a stochastic realization fluctuates, and thereby estimate a typical amplitude. In the context of this article the results of Kuske et al. [7,8] are of particular interest. They apply a multi-scale ansatz to separate the properties of the oscillations mainly determined by the deterministic part of the analyzed system (the frequency) from those determined largely by the stochastic part (the amplitude). This makes it possible to calculate both the amplitude distribution and the mean frequency of the oscillations. However, this method has the disadvantage of requiring the systems under consideration to be almost linear and the frequency of the oscillations not to be noticeably disturbed by the stochastic effects.
In this article we focus on biochemical systems that can be modeled as a set of interconnected, possibly nonlinear, stochastic differential equations (SDEs). With the help of the Fokker-Planck equation (see e.g. [3, p. 193 ff.] or [10, p. 120 ff.]), we calculate the stationary density distribution in the state space. We provide a theorem that allows the amplitude distribution to be calculated with only the knowledge of the density distribution and the corresponding SDE. The theorem covers not only linear systems, but also a wide class of non-linear systems, and therefore makes it possible to analyze not only systems whose oscillations are of low amplitude compared to average concentrations but also those with intermediate and high amplitudes. Although stochastic oscillations are mainly a two-dimensional effect, we show that they also occur in higher order systems and give an example of how to analyze them in that case. As far as we are aware, this paper introduces for the first time a method to analyze analytically higher order, possibly non-linear systems that are affected by intermediate to high amounts of noise and thereby show stochastic oscillations.
The paper is structured as follows. To give an easily accessible introduction to the material, we first present our results on the calculation of the amplitude distribution of stochastic oscillations and afterwards state the underlying assumptions and theorems. Then we explain the application of our theory by calculating the amplitude distribution of a simple example, the damped harmonic oscillator in the presence of additive noise, to get a first insight into the reasons for stochastic oscillations. We want to remark that this example simplifies, due to its structure, the necessary calculations significantly. As a more realistic example, we discuss oscillations in the MAP kinase cascade with an incorporated negative feedback with limited amounts of entities of each molecular species. For this system, the stationary probability distribution can be estimated by a linear approximation, or it can be computed numerically. We compare the amplitude distributions predicted by our method, based on these two approaches, to an "experimental" amplitude distribution obtained from a long-term stochastic simulation.

Amplitude Distribution of the Stochastic Oscillations
In this article we develop a method to derive the amplitude distribution of stochastic oscillations from the knowledge of the stationary density distribution of a stochastic differential equation (SDE). Here we first give a short outline of our results for which we present the corresponding theorems and proofs in the next sections.
Biochemical networks with stochastic dynamics can be modelled by the Langevin equation [11]. For the derivation of the theoretical results, we consider a two-dimensional Langevin equation having an equilibrium point at zero:
$$\dot{x} = f(x) + \Sigma(x)\,\Gamma \tag{1}$$
with the state vector $x \in \mathbb{R}^2$, the system's dynamics $f = (f_1, f_2)^T$, where $f_1$ and $f_2$ are smooth functions ($C^\infty$), $\Sigma$ a 2-by-2 matrix of smooth functions and $\Gamma = (\Gamma_1, \Gamma_2)^T$, where $\Gamma_1$ and $\Gamma_2$ are uncorrelated, statistically independent Gaussian white noise with zero mean and variance of one.
Let us denote the stationary density distribution of the system (1) by $P(x)$, and the amplitude distribution for the oscillations by $P_A(c)$, where $c$ denotes the oscillation amplitude. Under certain assumptions (see following sections) the amplitude distribution $P_A(c)$ then satisfies
$$P_A(c) \propto P(\hat{x}(c))\,\nu(\hat{x}(c)) \tag{2}$$
with $\nu(x) = \|f(x)\|$ the average speed at state $x$ and $\hat{x}(c)$ given by
$$\hat{x}(c) = \arg\max_{x \in A} P(x).$$
Hereby $A$ is defined as $A = \{x \in \mathbb{R}^2 \mid \langle x | e \rangle = c\}$ and $e$ is the unit vector in the direction where the amplitude is measured.
The product on the right hand side of (2) represents the steady state flux of reactions through a certain state $\hat{x}$. Intuitively speaking, $\hat{x}$ is chosen so that most of the realizations going through $\hat{x}$ will also reach their maximal value in a small neighborhood of $\hat{x}$, so that under certain assumptions (see following sections) it is justifiable to approximate the amplitude of such a realization by the value reached at $\hat{x}$.
The apparently complicated definition of $\hat{x}$ is a result of the freedom of choice in which direction $e$ the amplitude is measured. In a biological system it is obviously important to distinguish whether one measures the oscillations of the concentration of species A or of species B. Sometimes it is experimentally not possible to distinguish between two species.
To normalize the probability distribution $P_A$ of the amplitude, we have to divide equation (2) by its integral over all amplitudes:
$$P_A(c) = \frac{P(\hat{x}(c))\,\nu(\hat{x}(c))}{\int_0^\infty P(\hat{x}(c'))\,\nu(\hat{x}(c'))\,dc'}.$$
Simple models may have additional properties allowing us to simplify (2). For linear systems the speed $\nu(\hat{x}(c))$ is directly proportional to $\|\hat{x}(c)\|$, simplifying (2) to
$$P_A(c) \propto \|\hat{x}(c)\|\,P(\hat{x}(c)) \tag{5}$$
with $\hat{x}$ given as above.
We want to emphasize here that (2) allows the amplitude distribution to be calculated analytically or numerically from the knowledge of certain properties of the SDE describing the biological system alone, without the necessity to run numerical simulations. The formula yields a good approximation for a wide class of nonlinear problems and can be used to calculate the amplitude distribution even for certain systems having more than two dimensions, as we demonstrate in the second example of this paper.
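To make the recipe concrete, the following is a minimal Python sketch of how the proportionality between $P_A(c)$, the density and the speed at $\hat{x}(c)$ could be evaluated on a grid, assuming the amplitude is measured along $e = (1, 0)^T$. The function name, the grids, and the Gaussian density and linear drift used for illustration are assumptions made for this sketch and are not taken from the paper.

```python
import math

def amplitude_distribution(P, f, c_values, x2_grid):
    """Approximate P_A(c) on c_values via P_A(c) ~ P(x_hat(c)) * ||f(x_hat(c))||.

    P: callable (x1, x2) -> stationary density
    f: callable (x1, x2) -> (f1, f2), the deterministic drift
    x_hat(c) is taken as the grid point on the line x1 = c that maximises P.
    """
    weights = []
    for c in c_values:
        # x_hat(c): maximiser of P on the line <x|e> = c with e = (1, 0)
        x2_hat = max(x2_grid, key=lambda x2: P(c, x2))
        f1, f2 = f(c, x2_hat)
        weights.append(P(c, x2_hat) * math.hypot(f1, f2))
    # normalise with the trapezoidal rule so that the result integrates to one
    area = sum(0.5 * (weights[i] + weights[i + 1]) * (c_values[i + 1] - c_values[i])
               for i in range(len(c_values) - 1))
    return [w / area for w in weights]


# Illustration with assumed inputs: isotropic Gaussian density, linear damped rotation
s, k = 10.0, 0.01
var = s ** 2 / (2 * k)
P = lambda x1, x2: math.exp(-(x1 ** 2 + x2 ** 2) / (2 * var)) / (2 * math.pi * var)
f = lambda x1, x2: (-k * x1 + x2, -x1 - k * x2)

c_values = [0.5 * i for i in range(1, 1001)]    # amplitudes 0.5 ... 500
x2_grid = [0.5 * j for j in range(-400, 401)]   # candidate x2 values
p_a = amplitude_distribution(P, f, c_values, x2_grid)
```

The grid search for $\hat{x}(c)$ is the simplest possible choice; for a density that is only available numerically, a one-dimensional optimiser along the line $x_1 = c$ would serve the same purpose.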

Derivation of the Results
To derive the results presented in the preceding section we analyze nonlinear SDE systems of order two as given by (1), with an asymptotically stable equilibrium point and damped oscillations in the deterministic part. We restrict our analysis to two-dimensional systems, because many higher dimensional systems can be reduced to two dimensions for the purpose of analysing stochastic oscillations. In the derivation we assume w.l.o.g. that we want to compute the amplitude of the oscillations in the positive direction of $x_1$, i.e. $e = (1, 0)^T$, and that the equilibrium point of the deterministic formulation ($\dot{x} = f(x)$) of (1) is at the origin. Both assumptions can easily be satisfied for an arbitrary system by an appropriate coordinate transformation.
We establish an angular phase relationship of the state vector x(t) with respect to a fixed reference vector in the state space [12]. An oscillation period of the system (1) is defined as the time during which the angle between the state vector and the fixed reference vector changes by 2π. Further, the amplitude of an oscillation is defined as the maximal value of x 1 during the corresponding oscillation period.
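The definition of period and amplitude above translates directly into a procedure for extracting amplitudes from a sampled trajectory. The following Python sketch is one possible implementation, assuming the trajectory is sampled finely enough that the state vector turns by less than $\pi$ between consecutive samples; the function name is hypothetical.

```python
import math

def extract_amplitudes(trajectory):
    """Return the maximum of x1 within each oscillation period.

    trajectory: sequence of (x1, x2) states sampled at small time steps.
    A period ends whenever the accumulated phase angle has changed by 2*pi.
    """
    amplitudes = []
    x1_prev, x2_prev = trajectory[0]
    phase = 0.0              # accumulated angle since the start of the current period
    current_max = x1_prev
    for x1, x2 in trajectory[1:]:
        # signed angle between consecutive state vectors, in (-pi, pi]
        cross = x1_prev * x2 - x2_prev * x1
        dot = x1_prev * x1 + x2_prev * x2
        phase += math.atan2(cross, dot)
        current_max = max(current_max, x1)
        if abs(phase) >= 2 * math.pi:
            amplitudes.append(current_max)
            phase -= math.copysign(2 * math.pi, phase)
            current_max = x1
        x1_prev, x2_prev = x1, x2
    return amplitudes
```

Accumulating the angle increments between consecutive states is equivalent to tracking the angle to a fixed reference vector, but avoids handling the jump of the angle at $\pm\pi$.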
We make the following assumptions on the system (1):
1. There exists a stationary density distribution $P: \mathbb{R}^2 \to \mathbb{R}_+ : x \mapsto P(x)$ for (1) which is sufficiently smooth in $x$. We demand the curvature of the level curves of this distribution to exist and not to change its sign. For the computation of the amplitude distribution $P_A(c)$, $P(x)$ may be computed analytically, numerically, or may be obtained by the long term limit of a measurement.
2. The deterministic formulation ($\dot{x} = f(x)$) of (1), which is at the state $\bar{x} := x(t_0)$ at time $t = t_0$ with the probability of the state $\bar{x}$ being $P(\bar{x}) =: \bar{P}$, will approximately evolve tangential to the level curve of the probability distribution. Thus for small $\Delta t$ the probability of the state at time $t_0 + \Delta t$ will not have changed significantly ($P(x(t_0 + \Delta t)) \approx \bar{P}$). This assumption can be easily checked by calculating the Lie derivative of $P$ with respect to $f$:
$$L_f P(x) = \langle \nabla P(x) \,|\, f(x) \rangle = \frac{\partial P}{\partial x_1}(x)\, f_1(x) + \frac{\partial P}{\partial x_2}(x)\, f_2(x),$$
which has to be small for almost all $x \in \mathbb{R}^2$ having a non-negligible probability $P(x)$.
3. In addition we require that the average speed $\nu = \|f(x)\|$ of (1) does not vanish for almost all $x \in \mathbb{R}^2$ having a non-negligible probability $P(x)$.
Assumption 3 is obviously not satisfied in a small area around the EP, where for most systems $\nu$ becomes very small. However, when the system is very close to the EP, the motion of most realizations can be approximated by a random walk. Thus, during the time when the system is in the small neighborhood of the EP, it is not justifiable to speak about oscillations until the system leaves the neighborhood again. Away from the EP, $\|f(x)\|$ is significantly bigger than zero for most biologically relevant systems, so that Assumption 3 is fulfilled.
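Assumption 2 can be checked numerically once $P$ and $f$ are available as functions. The sketch below estimates the Lie derivative by central differences and reports the cosine of the angle between $\nabla P$ and $f$, which vanishes when the flow is exactly tangential to the level curve. The function names, the step size `h`, and the Gaussian density and linear drift used in the example are assumptions for illustration only.

```python
import math

def lie_derivative_terms(P, f, x, h=1e-4):
    """Central-difference estimate of L_f P(x) = <grad P(x) | f(x)>.

    Returns (L_f P, ||grad P||, ||f||) at the state x = (x1, x2).
    """
    x1, x2 = x
    dP_dx1 = (P(x1 + h, x2) - P(x1 - h, x2)) / (2 * h)
    dP_dx2 = (P(x1, x2 + h) - P(x1, x2 - h)) / (2 * h)
    f1, f2 = f(x1, x2)
    return dP_dx1 * f1 + dP_dx2 * f2, math.hypot(dP_dx1, dP_dx2), math.hypot(f1, f2)

def tangency_defect(P, f, x):
    """|cos| of the angle between grad P and f; 0 means f is tangential to the level curve."""
    lie, grad_norm, speed = lie_derivative_terms(P, f, x)
    return abs(lie) / (grad_norm * speed)


# Example with assumed inputs: isotropic Gaussian density, weakly damped rotation
s, k = 10.0, 0.01
var = s ** 2 / (2 * k)
P = lambda x1, x2: math.exp(-(x1 ** 2 + x2 ** 2) / (2 * var)) / (2 * math.pi * var)
f = lambda x1, x2: (-k * x1 + x2, -x1 - k * x2)
defect = tangency_defect(P, f, (50.0, 20.0))
```

For this example the defect equals $k/\sqrt{1 + k^2} \approx 0.01$ at every state away from the origin, so the flow is tangential to the level curves to within about one percent.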
With the help of these assumptions we are able to formulate a theorem for the calculation of the amplitude distribution of stochastic oscillations. For the formulation of the theorem let us define the set
$$\hat{X} := \Big\{ \hat{x} \in \mathbb{R}^2 \;\Big|\; \frac{\partial P}{\partial x_2}(\hat{x}) = 0 \Big\},$$
i.e. the unit vector $e_{x_1}$ in the direction of $x_1$ is orthogonal to the tangent on the level curve of the probability distribution in $\hat{x}$ (see Figure 1). Because of Assumption 1 there exists exactly one state $\hat{x} \in \hat{X}$ for every amplitude $c$ satisfying $\hat{x}_1 = c$.
Furthermore we define the following variables, which can be calculated with the knowledge of the system (1) and the stationary probability distribution $P$, each evaluated at the states $\hat{x} \in \hat{X}$: the variable $\nu$, which can be thought of as the deterministic speed; $L$, the standard deviation of the Lie derivative of $P$ with respect to the right hand side of (1); $a$, the derivative of $P$ in the direction of $x_1$; and the curvature of $P$ with respect to $x_2$. The variables are illustrated in Figure 2.
With the help of these definitions we can formulate the following two lemmas and a theorem, which characterizes the amplitude of the stochastic oscillations.

Lemma 1
Assume the Assumptions 1-3 are satisfied. Let $\Psi_f$ be a realization of (1) being at the state $\hat{x} \in \hat{X}$ at time $t_0$. Then the amplitude of $\Psi_f$ during the current oscillation will lie with a probability of 70.8% in the set $[\hat{x}_1, \hat{x}_1 + \delta x_1]$ and with a probability of 95.5% in the set $[\hat{x}_1, \hat{x}_1 + 2^{2/3}\,\delta x_1]$.

Lemma 2
Assume the Assumptions 1-3 are satisfied. Then the net flux density $\phi(x)$ of realizations at the state $x \in \mathbb{R}^2$ has an absolute value proportional to the product of the probability $P(x)$ and the average speed $\nu(x)$ of (1) and is tangential to the level curve of the density distribution $P$ at this point.

Lemma 2 is derived by considering the flux of realizations through an infinitesimally small region around the state $x$, as illustrated in Figure 3.

Theorem 1 (Distribution of amplitudes)
Under the Assumptions 1-3 and if $\delta x_1$ as defined in Lemma 1 is small, the probability $P_A(c)$ of an amplitude $c$ of the stochastic oscillations of (1) in the direction $x_1$ is proportional to the absolute value of the net flux density $\|\phi(\hat{x})\|$ at the state $\hat{x} \in \hat{X}$ with $\hat{x}_1 = c$.

Our results for the calculation of the amplitude distribution $P_A(c)$ presented in the previous section (2) directly follow from the application of Theorem 1. The proofs of the respective lemmas and theorem can be found in the appendix.

Remarks
• To check if the assumptions necessary for the application of Theorem 1 are fulfilled, one can calculate $\delta x_1$ according to the formula in Lemma 1. If $\delta x_1$ is significantly smaller than $\hat{x}_1$ for almost all $\hat{x} \in \hat{X}$ having a non-negligible probability $P(\hat{x})$, Theorem 1 may be applied. For a general stochastic differential equation, it is hard to indicate exactly how much smaller $\delta x_1$ must be compared to $\hat{x}_1$. This depends strongly on the structure of the system and also on the accuracy required for the solution. Nevertheless, as a rule of thumb, the theorem gives good approximations if $\delta x_1 / \hat{x}_1 < 0.2$. It is not necessary that this is true for every $\hat{x} \in \hat{X}$, but the calculated density of the amplitude may differ from the real one for the amplitudes $c \in \mathbb{R}$ for which the corresponding $\delta x_1$ is not small enough. The accuracy of the amplitude distribution for other amplitudes is not affected (see proof).
• It may be possible to relax the assumption that the curvature of the level curves does not change its sign globally. Intuitively it seems to be sufficient for most systems that the curvature does not change its sign locally around the states $\hat{x} \in \hat{X}$, and that the amplitude in one oscillation period reaches its maximum in the area around $\hat{x}$. However, this has to be shown explicitly for the system of interest.
• If the Lie derivative $L_f P$ of the density distribution $P$ is not small enough, the net flux density $\phi$ can no longer be approximated as tangential to the level curves of $P$. Depending on the size of $L_f P$ this may lead to a significant bias in the calculation of the amplitude distribution for certain systems. In this case, the derivation of the amplitude distribution would be more involved, and would also depend on the term $\frac{\partial}{\partial x_2}\chi$, starting from an appropriate modification of (66) in the appendix.

The Damped Harmonic Oscillator in the Presence of Additive Gaussian White Noise

Equations and Properties of the System
In this section we give a motivational example for the application of Theorem 1 by discussing the behavior of the damped harmonic oscillator in the presence of additive Gaussian white noise. The oscillator satisfies the following Langevin equation:
$$\dot{x} = A x + s\,\Gamma, \tag{9}$$
$$A = \begin{pmatrix} -k & 1 \\ -1 & -k \end{pmatrix}, \tag{10}$$
with $\Gamma = (\Gamma_1, \Gamma_2)^T$, where $\Gamma_i$, $i \in \{1, 2\}$, are uncorrelated, statistically independent Gaussian white noise with zero mean and variance of one, $s \in \mathbb{R}_+$ and $k \in \mathbb{R}$.
It is easy to show that the deterministic system ($s = 0$) is asymptotically stable for $k > 0$. The eigenvalues of the system matrix are $\lambda_{1,2} = -k \pm i$, therefore the deterministic system will show damped oscillations upon perturbation from its steady state.
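For readers who want to reproduce realizations of this oscillator, the following Python sketch samples the process at fixed intervals using the exact update of a linear SDE instead of an Euler scheme, which would noticeably distort the weak damping unless the step size is very small. It assumes the drift matrix $A = \begin{pmatrix} -k & 1 \\ -1 & -k \end{pmatrix}$ (the sense of rotation is an assumption and does not influence the statistics), $k > 0$, and white noise interpreted as the derivative of a standard Wiener process; the function name and default arguments are hypothetical.

```python
import math
import random

def simulate_oscillator(k, s, dt, n_steps, x0=(0.0, 0.0), seed=0):
    """Sample the noisy damped oscillator at intervals dt using its exact linear update.

    Assumed drift matrix: A = [[-k, 1], [-1, -k]] with k > 0; noise matrix s * I.
    """
    rng = random.Random(seed)
    decay = math.exp(-k * dt)
    cos_dt, sin_dt = math.cos(dt), math.sin(dt)
    # standard deviation (per component) of the noise accumulated over one step
    noise_std = s * math.sqrt((1.0 - math.exp(-2.0 * k * dt)) / (2.0 * k))
    x1, x2 = x0
    path = [(x1, x2)]
    for _ in range(n_steps):
        # exp(A*dt) is a rotation by the angle dt combined with the decay exp(-k*dt)
        y1 = decay * (cos_dt * x1 + sin_dt * x2)
        y2 = decay * (-sin_dt * x1 + cos_dt * x2)
        x1 = y1 + noise_std * rng.gauss(0.0, 1.0)
        x2 = y2 + noise_std * rng.gauss(0.0, 1.0)
        path.append((x1, x2))
    return path
```

With `s = 0` the function reproduces the damped deterministic spiral; with `s > 0` the sampled states fluctuate around the equilibrium point and show sustained oscillations.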

Calculation of the Density Distribution
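The time evolution of the probability density $P(x, t)$ of the system (1) is described by the Fokker-Planck equation
$$\frac{\partial P(x,t)}{\partial t} = -\sum_{i} \frac{\partial}{\partial x_i}\big[\hat{A}_i(x)\,P(x,t)\big] + \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial x_i\,\partial x_j}\big[B_{i,j}(x)\,P(x,t)\big], \tag{11}$$
with $\hat{A}_i(x)$ and $B_{i,j}(x) > 0$ real functions describing the influence of the system dynamics and the noise on the distribution. The diffusion term $B$ may be calculated to
$$B = \Sigma\,\Sigma^T$$
with $\Sigma$ the matrix of the noise terms of (1). We call (11) linear if $\hat{A}_i(x)$ is a linear function in $x$ for all $i$ and $B_{i,j}(x)$ is constant for all $i, j$. Then (11) can be simplified to
$$\frac{\partial P(x,t)}{\partial t} = -\sum_{i,j} A_{i,j}\,\frac{\partial}{\partial x_i}\big[x_j\,P(x,t)\big] + \frac{1}{2}\sum_{i,j} B_{i,j}\,\frac{\partial^2 P(x,t)}{\partial x_i\,\partial x_j}. \tag{13}$$
For the system (9), $A_{i,j}$ are given by the system matrix $A$ from (10), and $B_{i,j}$ are the elements of the matrix $B = s^2 I_2$, where $I_2$ is the second order identity matrix. To get the stationary density distribution, we set $\partial P / \partial t = 0$. Following [3], the stationary density distribution for the linear Fokker-Planck equation (13) is given by
$$P(x) = \frac{1}{2\pi\sqrt{\det \Xi}}\,\exp\Big(-\frac{1}{2}\,(x - \langle x \rangle)^T\,\Xi^{-1}\,(x - \langle x \rangle)\Big)$$
with $\langle x \rangle$ the mean value of $x$ and $\Xi$ the matrix of the second moments of $P$, which is the solution of the equation $A\Xi + \Xi A^T + B = 0$. In system (9), $\langle x \rangle = 0$ and $\Xi$ may be calculated to $\Xi = \frac{s^2}{2k} I_2$. The state of the system is therefore Gaussian distributed with the maximum of the density being at the EP.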
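The value of $\Xi$ for the oscillator can be verified by hand. The short derivation below is a sketch that assumes the system matrix has the form $A = -k I_2 + J$ with an antisymmetric matrix $J$ (which is what the eigenvalues $-k \pm i$ together with an isotropic stationary density suggest; the explicit entries of $J$ are an assumption) and $B = s^2 I_2$:

```latex
% Assumption: A = -k I_2 + J with J^T = -J, e.g. J = (0 1; -1 0); B = s^2 I_2
\begin{align*}
A\,\Xi + \Xi A^{T} + B &= 0, \qquad \text{ansatz } \Xi = \xi I_2 \\
\xi\,(A + A^{T}) + s^{2} I_2 &= \xi\,(-2k\,I_2) + s^{2} I_2 = 0 \\
\Rightarrow\quad \xi &= \frac{s^{2}}{2k}, \qquad \Xi = \frac{s^{2}}{2k}\, I_2 .
\end{align*}
% For s = 10, k = 0.01: \xi = 100 / 0.02 = 5000.
```

The antisymmetric part of $A$ cancels in $A + A^T$, which is why the rotation frequency does not enter the stationary covariance.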

Determination of the Amplitude Distribution
To get the amplitude distribution $P_A(c)$ of the oscillations of the harmonic oscillator (9), we apply Theorem 1 after checking the assumptions made in the theorem. Notice that due to the simple structure of this first example and its symmetry, we are always measuring the oscillations in a direction tangential to a principal axis of the distribution. Therefore $\hat{x}(c) = (c, 0)^T$ for all $c$. Because of this it is easy to calculate the necessary variables from their definitions. For $s = 10$ and $k = 0.01$, the resulting $\delta x_1$ becomes small compared to $\hat{x}_1$ for $\hat{x}_1 \geq 20$, so our approximation should at least hold for every amplitude greater than or equal to 20. The results will nevertheless show that we even get good estimates for amplitudes much smaller than 20.
Due to the linearity of the oscillator (9) and the symmetry of the harmonic oscillator, we can determine the amplitude distribution by utilizing formula (5) as
$$P_A(c) = \frac{2k}{s^2}\,c\,\exp\Big(-\frac{k\,c^2}{s^2}\Big).$$
For $s = 10$ and $k = 0.01$ the predicted amplitude distribution $P_A$ is plotted over the amplitude $c$ in Figure 4. For comparison we also display the amplitude distribution of the stochastic oscillations as measured in a long term simulation run.
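As a worked check of the linear formula, the steps from the Gaussian density to a normalized amplitude distribution can be written out explicitly. The derivation below is a sketch under the assumptions that $P$ is Gaussian with covariance $\Xi = \frac{s^2}{2k} I_2$ and that $\hat{x}(c) = (c, 0)^T$; it also gives the most probable amplitude for the parameter values used in the text.

```latex
% Assumptions: P Gaussian with covariance \Xi = \frac{s^2}{2k} I_2, \hat{x}(c) = (c, 0)^T
\begin{align*}
P_A(c) &\propto \|\hat{x}(c)\|\,P(\hat{x}(c)) \propto c\,\exp\!\Big(-\frac{k c^{2}}{s^{2}}\Big), \\
\int_0^\infty c\,\exp\!\Big(-\frac{k c^{2}}{s^{2}}\Big)\,dc &= \frac{s^{2}}{2k}
\;\Rightarrow\; P_A(c) = \frac{2k}{s^{2}}\,c\,\exp\!\Big(-\frac{k c^{2}}{s^{2}}\Big), \\
\frac{dP_A}{dc} = 0 \;&\Rightarrow\; c_{\max} = \frac{s}{\sqrt{2k}} .
\end{align*}
% For s = 10, k = 0.01: c_max = 10 / \sqrt{0.02} \approx 70.7.
```

The result is a Rayleigh distribution with scale parameter $s/\sqrt{2k}$, so the typical amplitude grows linearly with the noise strength $s$, in line with the second mechanism described in the introduction.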
This easy example was discussed to give an insight into the reasons for stochastic oscillations. In the next section, we show the practical applicability of our algorithm by predicting the amplitude distribution of a complex biochemical system and therewith give an example of a more biologically relevant application for the results of the paper.

Oscillations in the MAP Kinase Signaling Cascade
In the following section we apply the algorithm developed in this paper to a realistic example from biochemical signal transduction. A frequent module in many eucaryotic cells from yeast to mammals is the mitogen activated protein (MAP) kinase signaling cascade. MAPK cascades are typically activated by extracellular stimuli such as growth factors, and regulate the activity of various genes, thereby provoking a cellular response to the applied stimulus. Many important cellular functions such as differentiation, proliferation and death are controlled by MAPK cascades [14]. MAPK cascades consist of three layers of kinases, where each kinase phosphorylates and thereby activates the kinase on the next layer, as shown in Figure 5. The kinases are named MAPKKK, MAPKK, and MAPK, in the order of activation, and the phosphorylated, active forms are denoted with a star. The basic structure may be complemented by additional feedback interconnections, giving rise to deterministic limit cycle oscillations [15]. More recently, oscillations in the MAPK cascade have been determined experimentally in yeast cells, where the oscillatory activity of the MAPK controls periodic changes in cell shape during the mating process [16].

The Deterministic Model
For this example, we use a basic ODE model of the MAP kinase signaling cascade. The model contains a negative feedback interconnection from the last kinase MAPK to the activation of the first kinase MAPKKK [15]. We use a simpler model compared to the one in [15], and therefore sustained oscillations do not occur in the deterministic version of the model considered here. Denote by $x_1$, $x_2$ and $x_3$ the concentrations of MAPKKK*, MAPKK* and MAPK*, respectively. Using three conservation relations for the total amounts of the three kinases (for MAPK, $x_3^T = 300$), we get the deterministic description (19) of the system using mass balancing with Michaelis-Menten reaction kinetics. The parameter sets for the phosphorylation ($k_{ij}$) and for the dephosphorylation ($p_{kl}$) are given in Table 1.
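The biochemically relevant equilibrium point $x_0$ of the system and the eigenvalues (22a)-(22b) of the Jacobian at $x_0$ are computed from (19). All eigenvalues have negative real parts, and thus the deterministic system is locally asymptotically stable.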

The Stochastic Model
For validation purposes and to get a first insight into the system's dynamics, we performed stochastic simulations of the biochemical network model of the MAPK cascade as described by (19). We used the stochastic simulation software Dizzy by the Institute for Systems Biology [17], which is able to do stochastic simulations of models not only with mass action kinetics, but also with arbitrary kinetics such as Michaelis-Menten kinetics, which are used in our model. For the simulations we assumed a cell volume of $V = 1$ pl, so that a concentration $x$ of an enzyme corresponds to an absolute number of molecules of $N = x\,N_A\,V$, where $N_A$ is the Avogadro constant. We provide a typical plot of the oscillations of $x_1$, $x_2$ and $x_3$ against time in Figure 6. As can be seen from the plot, the oscillations do not vanish as the deterministic calculations suggested, but are sustained oscillations with a well-defined amplitude range. In the following we will show that these oscillations can be analyzed with the help of Theorem 1.
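To illustrate what such a stochastic simulation does, the following Python sketch implements the direct method of the Gillespie algorithm for a single species with arbitrary propensity functions. It is a generic illustration and not the implementation used in Dizzy; the function name, the toy phosphorylation cycle and all parameter values are hypothetical and are not the MAPK model of this paper.

```python
import math
import random

def gillespie_direct(propensities, stoichiometry, n0, t_end, seed=0):
    """Direct-method stochastic simulation for a single species.

    propensities: list of callables n -> reaction propensity (events per unit time)
    stoichiometry: list of integer changes in n, one per reaction
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        rates = [a(n) for a in propensities]
        total = sum(rates)
        if total <= 0.0:
            break
        t += -math.log(1.0 - rng.random()) / total   # exponential waiting time
        if t > t_end:
            break
        # choose the reaction with probability proportional to its propensity
        threshold, cumulative = rng.random() * total, 0.0
        for rate, change in zip(rates, stoichiometry):
            cumulative += rate
            if threshold < cumulative:
                n += change
                break
        times.append(t)
        counts.append(n)
    return times, counts


# Toy example (hypothetical values): one kinase pool of N_TOTAL molecules,
# n of which are phosphorylated, with Michaelis-Menten type propensities.
N_TOTAL = 300
phosphorylation = lambda n: 50.0 * (N_TOTAL - n) / (100.0 + (N_TOTAL - n))
dephosphorylation = lambda n: 50.0 * n / (100.0 + n)
times, counts = gillespie_direct([phosphorylation, dephosphorylation], [+1, -1],
                                 n0=0, t_end=100.0)
```

Because the propensities vanish at $n = N_{total}$ and $n = 0$ respectively, the molecule count cannot leave the admissible range, which mirrors the constraint that the amount of each species cannot become negative.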
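As discussed in [11] it is possible, under certain assumptions, to describe a system of biochemical reactions by a set of stochastic differential equations, giving rise to the so-called Langevin approximation. These assumptions are, roughly speaking, fulfilled if the amount of entities of each species does not get too small, as in our system. The Langevin approximation of the MAPK cascade model (Figure 5) is given by
$$\dot{x}_i = v_{i,f}(x) - v_{i,b}(x) + \frac{1}{\sqrt{N_A V}}\Big(\sqrt{v_{i,f}(x)}\;\Gamma_{i,f} - \sqrt{v_{i,b}(x)}\;\Gamma_{i,b}\Big), \quad i \in \{1, 2, 3\}, \tag{24}$$
where $v_{i,f}$ and $v_{i,b}$ denote the phosphorylation and dephosphorylation rates of the $i$-th kinase, with $\Gamma_{i,f}$ and $\Gamma_{j,b}$, $i, j \in \{1, 2, 3\}$, uncorrelated, statistically independent Gaussian white noise with zero mean and a variance of one, $N_A$ the Avogadro constant and $V$ the cell volume. This system corresponds to the deterministic one with an additional noise part added for every reaction, with a standard deviation corresponding to the square root of the magnitude of the reaction rate divided by the cell volume. The two independent white noise terms acting on each state can be combined into a single one, leading to the following equations:
$$\dot{x}_i = v_{i,f}(x) - v_{i,b}(x) + \sqrt{\frac{v_{i,f}(x) + v_{i,b}(x)}{N_A V}}\;\Gamma_i, \quad i \in \{1, 2, 3\}. \tag{25}$$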

Transformation and Model Reduction
The theoretical results in this paper have been developed for two-dimensional systems only. In higher order systems stochastic oscillations may appear on a two-dimensional manifold in the state space. Such a system can be reduced to an order two system by computing the slow manifold, making use of a time scale separation [18]. Following this approach, we first transform the system (25) into the form
$$\dot{z}_1 = A_{fast}\,z_1 + \psi_{fast}(z) + j_{fast}(z)\,\Gamma, \qquad \dot{z}_{slow} = A_{slow}\,z_{slow} + \psi_{slow}(z) + j_{slow}(z)\,\Gamma,$$
with $z_1 \in \mathbb{R}$ the coordinate of the fast and $z_{slow} = (z_2, z_3)^T$ the coordinates of the slow manifold, $\psi_{fast}$ and $\psi_{slow} = (\psi_{slow,1}, \psi_{slow,2})^T$ vectors of polynomials of order two and higher, $j_{fast}$ and $j_{slow}$ matrices of the noise strength and $\Gamma = (\Gamma_1, \Gamma_2, \Gamma_3)^T$ the noise vector of system (25).
The transformation needs to be done in such a way that the absolute values of the real parts of the eigenvalues of $A_{fast}$ are much larger than those of $A_{slow}$. To this end, we compute the eigenvectors $e_1$, $e_2$ and $e_3$ corresponding to the eigenvalues (22a)-(22b). The two eigenvectors with non-zero imaginary parts, $e_2$ and $e_3$, both correspond to eigenvalues having real parts with small absolute values. Therefore it is straightforward to define the desired transformation as
$$z = T^{-1}(x - x_0)$$
with the transformation matrix $T = (e_1, e_2 + e_3, (e_2 - e_3)\cdot i)$. We can get the reaction rates of the transformed system by calculating
$$v_z(z) = T^{-1}\,v(x_0 + T z), \tag{29}$$
where $v$ denotes the right hand side of the deterministic system (19).
Because all of the eigenvalues (22a)-(22b) have non-zero real parts, the Hartman-Grobman theorem states that there exists a local transformation $z = H(\zeta)$ such that the system in $\zeta$-coordinates obeys the linear differential equation
$$\dot{\zeta} = \begin{pmatrix} A_{fast} & 0 \\ 0 & A_{slow} \end{pmatrix} \zeta.$$
Because in the $\zeta$-system there exists a slow manifold spanned by the eigenvectors corresponding to the two eigenvalues with real parts having low absolute values, in the $z$-system there exists a slow manifold, too (see [19]). We now search for the function $z_1 = m(z_2, z_3)$ which describes the dependence of the states on the manifold. We approximate this function with a truncated Taylor series expansion:
$$m(z_2, z_3) \approx \sum_{j,k} a_{j,k}\,z_2^{\,j}\,z_3^{\,k}. \tag{31}$$
We substitute (31) in $\dot{z}_1 = v_{z,1}$ and can thereby calculate the coefficients $a_{j,k}$ (see Table 2). The resulting manifold in the original $x$-coordinates is shown in Figure 7. The figure also includes an example trajectory. As can be seen, this trajectory first converges exponentially fast to an $\varepsilon$-neighborhood of the slow manifold and afterwards moves on it towards the EP.
The slow manifold is attractive enough so that realizations of the stochastic system (25) will stay close to it. Due to this, it is sufficient to just take the oscillations on the slow manifold into account and therewith simplify the problem to two dimensions. We substitute the formula of the slow manifold (31) in the differential equations for z 2 and z 3 (29) and get a two dimensional reduced description of the system to which Theorem 1 can be applied.

Calculation of the Density Distribution
As a first approach to obtain the stationary density distribution P (x), we make use of a linear approximation of the reduced system (32) around the EP. Taking this approach allows us to evaluate how well our method works with such an approximation, where the stationary density distribution can be obtained with minimal computational effort. For other systems, or if a high precision of the result is required, it could however also be necessary to solve the nonlinear Fokker-Planck equations numerically.
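The linear approximation of the reduced system (32) is given by
$$\dot{\eta} = A_{slow}\,\eta + \Sigma_\eta\,\Gamma$$
with $\eta = (z_2, z_3)^T$. The constant noise matrix $\Sigma_\eta$ follows from the coordinate transformation $T$ applied to the original system. Following the same approach as in the previous example, the stationary density distribution is obtained as a Gaussian distribution whose covariance matrix solves the corresponding Lyapunov equation, evaluated for a cell with a volume of $V = 1$ pl (see Figure 8).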

Determination of the Amplitude Distribution
With the preliminary work of the preceding sections we are now able to calculate the amplitude distribution according to Theorem 1. We consider the states $x_c = (c, a, b)^T$ with $a, b \in \mathbb{R}$ not yet specified. These states are transformed into the $z$-coordinates by
$$z_c = T^{-1}(x_c - x_0).$$
The probability of the amplitude $c$ can now be calculated with formula (2), evaluated in the $z$-coordinates, under the constraint that $z_c$ has to lie on the slow manifold. This means that $z_c$ has to satisfy (31), which gives us a dependency of $b$ on $a$. This dependency can be obtained numerically by solving a convex optimization problem.
The resulting amplitude distributions in each of the three original coordinate directions $x_i$, $i \in \{1, 2, 3\}$, are shown in Figure 9. For comparison, the figure also contains measured amplitude distributions, which are the results of a long term simulation via the Direct Gillespie approach with Dizzy with a simulation over 900000 steps.
As can be seen in Figure 9, the simulation results fit the calculated predictions very well. However, it seems that we tend to underestimate the amplitude of the oscillations by a small amount. This can be explained as a result of Lemma 1, which states that the amplitude of a realization going through the state $\hat{x}$ lies with a high probability in $[\hat{x}_1, \hat{x}_1 + \delta x_1]$, whereas we estimate its value by the lower end $\hat{x}_1$ of this interval (see appendix). The tendency to underestimate the amplitude seen in Figure 9 seems to be a direct result of neglecting the value of $\delta x_1$. Further discussion of this point can be found in the conclusions.
We also want to mention that we compare the results obtained from our method to stochastic Gillespie simulations, and not to realizations of the Langevin equation (24). The reason for this is that Gillespie simulations better predict the behavior of biochemical networks and are thus the method of choice. However, in the derivation of the amplitude distribution we approximated the stochasticity of the system using white noise terms. The two ways of describing the intrinsic noise of the system may under certain circumstances lead to different results, which may be a further explanation of the small bias between the measured and the theoretically predicted amplitude distribution in Figure 9. Furthermore, some bias may be explained by the linearization of the system around its steady state.

Numerical Approach
Sometimes it is no longer adequate to analyze the linearized system, and the original nonlinear system has to be analyzed instead. Although there may be special cases in which the probability distribution $P$ of a system of nonlinear stochastic differential equations can be computed analytically, this is not possible in general. Consequently, the solution has to be obtained numerically. To illustrate the numerical approach, we analyze the same system as in the preceding section, as defined in (24), except that the cell volume is changed to V = 0.017 pl, which corresponds to 1/60 of the original cell volume. Because the variance of the disturbances depends strongly on the number of molecules of the different protein species, the shape of the solution changes so dramatically that analyzing the linearized system no longer gives good results.
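To give a feeling for the size of this effect, the following sketch converts a concentration into a molecule count for both cell volumes and evaluates the relative fluctuation $1/\sqrt{N}$ that a Poisson-distributed copy number would show. The concentration of 100 nM is a hypothetical value and is not taken from the model in (24), and the Poisson scaling is an assumption used only as a rough guide.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole

def copy_number(conc_nM, volume_pl):
    """Expected number of molecules for a concentration in nM and a volume in pl."""
    return conc_nM * 1e-9 * volume_pl * 1e-12 * AVOGADRO

def relative_noise(n_molecules):
    """Relative fluctuation 1/sqrt(N) of a Poisson-distributed copy number."""
    return 1.0 / math.sqrt(n_molecules)

V_SMALL = 0.017             # pl, reduced cell volume from the text
V_ORIGINAL = 60 * V_SMALL   # pl, using the 1/60 ratio stated in the text
CONC = 100.0                # nM, hypothetical concentration for illustration

for volume in (V_ORIGINAL, V_SMALL):
    n = copy_number(CONC, volume)
    print(f"V = {volume:.3f} pl: N = {n:.0f}, relative noise = {relative_noise(n):.2%}")

noise_ratio = relative_noise(copy_number(CONC, V_SMALL)) / relative_noise(
    copy_number(CONC, V_ORIGINAL)
)
```

Under these assumptions the relative fluctuations grow by a factor of $\sqrt{60} \approx 7.7$ when the volume is reduced, independently of the chosen concentration.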
To calculate the density distribution of the nonlinear system we have to solve the Fokker-Planck equation (11) for the nonlinear system. We therefore applied the algorithm developed in [20]. The shape of the density distribution, which is non-Gaussian, is shown in Figure 10. From the density distribution $P(x)$, we can compute the amplitude distribution $P_A(c)$ with the method proposed in this paper. In this example, we analyze the amplitude distribution $P_A(c)$ of $x_2$ for peak amplitudes below the equilibrium point value, i.e. in the direction $e = (0, -1, 0)^T$, because the effect of the nonlinearities is even stronger there than in the opposite direction. The reason for this is simple: the number of molecules of each species cannot become negative. As can be seen in Figure 11, the amplitude distribution obtained by applying Theorem 1 fits the results of a long-term simulation quite well, while the prediction from the linearized system deviates strongly from them.

Conclusion
We have introduced a method to determine the amplitude distribution of a wide class of linear and nonlinear stochastic systems given by (1) which display sustained stochastic oscillations. The method is applicable to systems for which a stationary density distribution exists and can be computed either analytically or numerically.
The method is based on computing the flux density of realizations at the states where the tangent to the level curve of the density distribution is normal to the direction in which the oscillations are measured. We showed that under certain conditions this flux density is directly proportional to the probability that an oscillation with the corresponding amplitude occurs. Our results can be used in the analysis of systems influenced by strong internal or external noise, as they are often found in biophysical problems.
As already discussed at the end of the second example, for certain systems the calculated amplitude distribution may contain a small bias, depending on the exact structure of the system and on how well the assumptions necessary for the application of our method are satisfied. For the wide class of nonlinear systems analyzed, we obtained the result that realizations at the state $\hat{x}$ will most likely have an amplitude lying in $[\hat{x}_1, \hat{x}_1 + \delta x_1]$, as stated in Lemma 1. By requiring $\delta x_1$ to be small, we justified approximating the amplitude by $\hat{x}_1$. However, if for specific systems the distribution in $[\hat{x}_1, \hat{x}_1 + \delta x_1]$ can be calculated or estimated, it would be possible to reduce the bias for systems where $\delta x_1$ is small but not negligible. Such an extension would lead to a more refined approximation of the amplitude distribution.
Figure 11 Nonlinear amplitude distribution of the oscillations of the MAP kinase signaling cascade in the negative $x_2$-direction. The solid curve corresponds to the measured data, the dashed red curve is the prediction according to the calculations in this section, and the dash-dotted green curve is the outcome of calculating the amplitude distribution of the linearized system.
Stochastic oscillations may occur in biological systems not only as a disturbing side effect, but also in a constructive manner [21,22]. This is not because stochastic oscillations have great benefits over more "traditional" types of oscillations like deterministic limit cycles, but due to the fact that there seems to be no reason for a preference of deterministic limit cycles. In biophysical systems, the type of oscillations we study in this paper often occurs at parameter values in the vicinity of a Hopf bifurcation in the deterministic part of the model [23]. This is important in the field of robustness analysis of biological networks [24], because stochastic oscillations can possibly improve the robustness of oscillations in a network. In this respect, our method to compute the oscillation amplitude may be helpful in order to characterize a parameter region in which either deterministic or stochastic oscillations of a comparable amplitude occur robustly.

Appendix
Proof for Lemma 1 In this section we give a short proof for Lemma 1. To this end, we consider the level curve of the density distribution passing through the state $\hat{x} \in \hat{X}$, i.e. the set of states $x$ with $P(x) = P(\hat{x})$.
In the following we argue that there exists a $\delta x_1$ such that, with high probability, an arbitrarily chosen realization of (1) that is at the state $\hat{x}$ at time $t$ will not reach a state $x$ with $x_1 > \hat{x}_1 + \delta x_1$ before the next oscillation. The measured amplitude will therefore lie in $[\hat{x}_1, \hat{x}_1 + \delta x_1]$ with $\delta x_1$ positive and small, but not yet further determined (see Figure 2).
We first calculate the Taylor series expansion of $P$ around $\hat{x}$ up to order two, which gives equation (42). If we stay on a level curve, it must hold that $\delta P = 0$, and $\delta x_1$ has to be a function of $\delta x_2$ locally around $\hat{x}$. This function is approximated with another Taylor series expansion up to order two as
$$\delta x_1 = k_1 \, \delta x_2 + k_2 \, \delta x_2^2 \qquad (43)$$
with $k_1$, $k_2$ yet unknown constants. Substituting (43) in (42) gives equation (44). Because the coefficients of $\delta x_2$ and $\delta x_2^2$ must vanish independently for (44) to hold, it follows that $k_1 = 0$ and $k_2 = -\frac{\kappa}{2a}$, giving
$$\delta x_1 = -\frac{\kappa}{2a} \, \delta x_2^2 .$$
This means that we may approximate a level curve of the density distribution locally by a parabola.
For small deviations and small times $t$ it is possible to make some approximations for (1). First, we may approximate $\Sigma(x)$ in an area around $\hat{x}$ by the constant $\hat{\Sigma}$. The norm of the first row of $\hat{\Sigma}$ can be rewritten in terms of $L$ and $a$, which is easily validated using the definitions of $L$ and $a$ given in equations (8). We furthermore approximate the second element of $f(x)$ in an area around $\hat{x}$ by the constant $\nu$. Putting everything together, we get a one-dimensional Wiener process (49) for the movement of (1) in the direction of $x_2$ for small times $\Delta t$, with $\Delta x(t) = x(t) - \hat{x}$. We approximate the average displacement in the direction of $x_1$ by combining the average displacement of this Wiener process, $\langle \Delta x_2(t) \rangle = \nu t$, with the parabola approximation of the level curve:
$$\langle \Delta x_1(t) \rangle \approx -\frac{\kappa \nu^2}{2a} \, t^2 .$$
Because the stochastic part of (49) may be neglected (see Assumption 4), we get a linear one-dimensional stochastic equation (51) for the movement of (1) in the direction of $x_1$ for small times $t$. It can be shown [10, p. 129 ff.] that the variance of (51) evolves as
$$\mathrm{var}(\Delta x_1(t)) \approx L^2 a^2 t . \qquad (52)$$
We may think of the solution of (51) as a growing Gaussian distribution moving along the trajectory of the deterministic system (see Figure 2). We now determine the maximal value of $\Delta x_1(t)$ that a realization of (1) starting at $\hat{x}$ may attain if it stays within one standard deviation $\sqrt{\mathrm{var}(\Delta x_1(t))}$ of the average displacement. This maximal value is time dependent: it grows in the beginning due to the stochastic part of (51), but afterwards shrinks due to the deterministic part. It reaches its maximum when the time derivatives of the mean value and of the standard deviation have the same absolute value, which determines $\delta x_1$.
For a Gaussian distribution, 68.3% of all realizations stay within one standard deviation of the average displacement. This means that only about 15.9% of the realizations exceed the average displacement by more than one standard deviation and thus reach a state with $x_1 > \hat{x}_1 + \delta x_1$ after passing $\hat{x}$. Because it is also possible for a realization to have its maximal value in the direction of $x_1$ before passing $\hat{x}$, the overall probability for a realization to have an amplitude lower than $\hat{x}_1 + \delta x_1$ is approximately 68.3%. In the same way it can be shown that 95.5% of all realizations have an amplitude lower than $\hat{x}_1 + 2\sqrt[3]{2}\,\delta x_1$.
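The maximization step of this argument can be checked numerically. The sketch below assumes, in line with the reasoning above, that the mean displacement in the direction of $x_1$ decreases as $-\alpha t^2$ and that the standard deviation grows as $\beta \sqrt{t}$. The symbols $\alpha$ and $\beta$, the function names and the numerical values are placeholders introduced here and do not appear in the paper. Setting the time derivative of the envelope $-\alpha t^2 + n \beta \sqrt{t}$ to zero gives $t^* = (n\beta / 4\alpha)^{2/3}$ and a peak value of $\tfrac{3}{4} n \beta \sqrt{t^*}$, where $n$ is the number of standard deviations.

```python
import numpy as np

def envelope_peak(alpha, beta, n_sigma=1.0):
    """Closed-form maximum of -alpha*t**2 + n_sigma*beta*sqrt(t) over t >= 0."""
    t_star = (n_sigma * beta / (4.0 * alpha)) ** (2.0 / 3.0)
    peak = 0.75 * n_sigma * beta * np.sqrt(t_star)
    return t_star, peak

def envelope_peak_numeric(alpha, beta, n_sigma=1.0, n_grid=200001):
    """Same maximum, located by brute force on a fine time grid."""
    t_star, _ = envelope_peak(alpha, beta, n_sigma)
    t = np.linspace(0.0, 4.0 * t_star, n_grid)
    envelope = -alpha * t**2 + n_sigma * beta * np.sqrt(t)
    i = np.argmax(envelope)
    return t[i], envelope[i]

ALPHA, BETA = 2.0, 0.5   # hypothetical coefficients for illustration
t1, delta_1sigma = envelope_peak(ALPHA, BETA, n_sigma=1.0)
t2, delta_2sigma = envelope_peak(ALPHA, BETA, n_sigma=2.0)
t1_num, delta_1sigma_num = envelope_peak_numeric(ALPHA, BETA, n_sigma=1.0)
scaling = delta_2sigma / delta_1sigma
print(f"one-sigma peak: {delta_1sigma:.6f} at t = {t1:.6f}")
print(f"two-sigma / one-sigma peak ratio: {scaling:.6f}")
```

Under these assumptions the two-standard-deviation bound is larger than the one-standard-deviation bound by a factor of $2^{4/3} = 2\sqrt[3]{2}$, independently of the values chosen for $\alpha$ and $\beta$.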
Proof for Lemma 2 Assume a small unit cell centered at $x$, with one pair of opposite edges of length $dr$, such that the level curve of $P$ through $x$ passes through the centers of the edges of length $dr$ (see Figure 3). If the unit cell is small, we may assume the density distribution $P$ and the values $f(x)$ and $\Sigma(x)$ of (1) to be constant inside the unit cell. Due to the small size of the Lie derivative of $P$ along $f$, the movement normal to the level curve is dominated by the stochastic part of (1). The net fluxes $j_1$ and $j_3$ through the edges 1 and 3 vanish, and therefore the overall flux is tangential to the level curve. The net flux $j_2$ ($j_4$) through edge 2 (edge 4) can be determined by integration.
Proof for Theorem 1 Because the density distribution $P$ is smooth and the curvature of its level curves does not change its sign, there exists exactly one state $\hat{x} \in \hat{X}$ with $\hat{x}_1 = c$ for every $c \in (0, \infty)$. $\hat{X}$ is a smooth simple curve in the state space which can be parameterized by $c$. Let $n_0(\hat{x})$ denote the normalized normal vector on $\hat{X}$ at $\hat{x}$. We may determine the probability $P_A(c)$ of the amplitude $c \in (0, \infty)$ by multiplying the flux density of realizations $\phi(\hat{x})$ with $\Omega$