Heavy-tailed distributions in a stochastic gene autoregulation model

Synthesis of gene products in bursts of multiple molecular copies is an important source of gene expression variability. This paper studies large deviations in a Markovian drift-jump process that combines exponentially distributed bursts with deterministic degradation. Large deviations occur as a cumulative effect of many bursts (as in diffusion) or, if the model includes negative feedback in burst size, in a single big jump. The latter possibility requires a modification in the WKB solution in the tail region. The main result of the paper is the construction, via a modified WKB scheme, of matched asymptotic approximations to the stationary distribution of the drift-jump process. The stationary distribution possesses a heavier tail than predicted by a routine application of the scheme.


Introduction
Bursty production of gene products (mRNA or protein molecules) makes an important contribution to the overall gene expression noise [1][2][3][4]. Bursts can be modelled as instantaneous jumps of a random process. Burst sizes have been suggested to follow geometric (in a discrete process) or exponential (in a continuous process) distributions [5,6]; we focus on the latter. Production of gene products is balanced by their degradation and/or dilution. Combining randomly timed and sized production bursts with deterministic decay leads to a Markovian drift-jump model of gene expression [7][8][9][10]. More fine-grained models of gene expression are based on a purely discrete [11][12][13][14][15] or a hybrid discrete-continuous state space [16][17][18][19][20]. The drift-jump model can be derived from the fine-grained processes using formal limit procedures [21][22][23][24][25][26].
In its basic formulation, the drift-jump model for gene expression admits a gamma stationary distribution [27]. The model possesses an explicit stationary distribution also in the presence of a Hill-type feedback in burst frequency [28]. Such regulation can result from common transcriptional control mechanisms [29]. In addition to feedback in burst frequency, there is evidence of feedback mechanisms that act on burst size or protein stability [30][31][32]. As a specific example of regulation of burst size, the RNA binding protein Puf3 destabilises the mRNA (hence shortening bursts of translation) of COX17 [33]; a synthetic gene encoding for the Puf3 protein while containing untranslated regions of the COX17 gene implements the desired feedback loop [34]. The explicit stationary solution to the drift-jump model has been extended to the case of feedback in protein stability [35]. However, in case of feedback in burst size, an explicit solution is unavailable, save for the special case of Michaelis-Menten-type response [36].
The near-deterministic regime of frequent and small bursts can be analysed using the Wentzel-Kramers-Brillouin (WKB) method; the WKB-approximate solutions closely agree with numerically obtained exact distributions even at moderate noise conditions [37]. Bursty production has been formulated and analysed with the WKB method also in the discrete state space [38][39][40][41][42]. Similar approaches have earlier been used in queueing systems [43,44]. The standard WKB-type/diffusion-like results are guaranteed to apply for jump-size distributions with super-exponentially decaying tails [45]. Contrastingly, in the sub-exponential case, large deviations are driven by single big jumps [46,47]. The exponential case can combine both phenomena for random walks: the Cramer/WKBtype result applies in a region of sample space called the Cramer zone, while single big jumps contribute to deviations beyond the Cramer zone [48,49].
In this paper, the standard WKB-type approach will be shown to be suitable for the drift-jump gene expression model with positive feedback in burst size. If the feedback is negative, the WKB-approach will be shown to apply below a certain threshold (referred to, by analogy with random walks, as the Cramer zone), whereas beyond the threshold (referred to as the tail zone) single big jumps contribute to large deviations. Matched asymptotic approximations to the stationary distribution in the Cramer zone, in the tail zone, and on their boundary will be constructed using a singular perturbation approach [50][51][52][53].
The structure of the paper is as follows. Section 2 formulates the model. Section 3 presents the standard WKB approximation scheme. The core of the paper is section 4, in which the modified WKB scheme is given. The boundary layer is treated in section 5. The asymptotic results are cross-validated by simulations in section 6. The paper is concluded in section 7.

Model formulation
The drift-jump gene-expression model is a Markov process with piecewise continuous sample paths ( figure 1, left). The state x of the process represents the concentration of a gene product (say a protein, for concreteness). The discontinuities in the sample path are the production bursts. Between bursts, the protein concentration decays deterministically with rate γ(x), i.e. as perẋ = −γ(x). Bursts occur with state-dependent frequency (propensity) ε −1 α(x). Burst sizes are drawn from an exponential distribution with rate parameter ε −1 ν(x), in which x is the state of the process immediately before the burst; the reciprocal ε/ν(x) of the rate parameter gives the mean burst size. Decreasing the noise strength ε makes bursts more frequent and smaller. The functions α(x), ν(x), and γ(x) can implement feedback in burst frequency, burst size, and protein stability ( figure 1, right).
The probability density function p(x, t) of being at state x at time t satisfies the integro-differential equation In the conservation equation (1), J = J(x, t) gives the flux of probability across a reference state x at time t. By (2), it consists of a negative local flux due to deterministic decay and a positive non-local flux due to stochastic bursts. The non-local term integrates, over all states y < x, the probability ε −1 p(y, t)α(y) that a burst occurs multiplied by the exponential probability that the burst goes beyond the reference state x. Estimating the integral in (2) by the Laplace method [54] as ε → 0, we obtain J ∼ (α(x)/ν(x) − γ(x))p(x, t), which is the probability flux of a purely deterministic process Equation (3) is the deterministic limit of (1) and (2) (sometimes also referred to as the fluid limit or the law-of-large-numbers limit). Retaining a further term in the asymptotic expansion of the non-local term leads to an ad-hoc drift-diffusion approximation to the drift-jump process [55]. Such truncations exhibit different ε → 0 asymptotics than the original problem [56]. Equating the flux in (2) to zero, we obtain a Volterra integral master equation for the stationary distribution. If ν(x) is constant (no feedback in burst size), then the integral kernel in (4) is both separable and convolution-type [57], and the equation can be solved by differentiation [35] or the Laplace transform [7]. If ν(x) is non-constant, the kernel is neither separable nor convolution-type, and a general solution seems to be unavailable. Multiplying a solution p(x) to (4) by a constant gives another solution. Below, we derive asymptotic approximations to a solution with an arbitrary choice of the multiplication constant, which does not necessarily integrate to one. Normalisation is performed before these approximations are cross-validated by kinetic Monte-Carlo simulations in section 6.

Standard WKB scheme
We seek an approximate solution to (4) in the WKB form where a regular dependence of the prefactor on ε is postulated. The function Φ(x) in (5) is referred to as the (quasi)potential. The ansatz (5) and (6) is equivalent to the frequently encountered alternative form p(x ; ε) = exp(−ε −1 (Φ 0 (x) + εΦ 1 (x) + ε 2 Φ 2 (x) + · · ·)) of the WKB expansion [58]; the correspondence between the terms is given by Inserting (5) into (4) gives where Differentiating (8) with respect to y and setting y = x gives the relations which tie up the local behaviour of Ψ(x, y) near the boundary y = x and that of the (yet unknown) potential. Provided that the dominant contribution to the integral on the right-hand side of (7) comes from an O(ε)-wide neighbourhood of the right boundary. Estimating the integral in (7) by the Laplace method and cancelling the common exponential term gives Inserting (6) and (9) into (11), and collecting O(1) terms, yields the potential while collecting O(ε) terms determines the prefactor (appendix A) The constants of integration in the indefinite integrals in (12) and (13) add up to the normalisation constant in the probability distribution (5) and can be chosen arbitrarily.
The weak point of this section is the assumption (10). Combining (8) and (12), we see that If ν(x) is decreasing (positive feedback case), then ∂ y Ψ(x, y) < 0 for y x, which confirms (10) post hoc. If ν(x) is constant (no feedback in burst size), then r 0 (x)exp(−Φ(x)/ε) given by (12) and (13) is the exact solution to (4) [35]. The case of an increasing ν(x) (negative feedback in burst size) is the subject of the rest of the paper; it requires a modification in the WKB scheme.

Modified WKB scheme
Since the derivation of (12) involved a local estimate of the integral in the Volterra master equation (7), we refer to the function Φ(x) as the local potential. We assume that it satisfies the conditions An example of a local potential which satisfies (15) is given by the parametric choice The constancy of the burst frequency and the linearity of the decay rate in (16) means that feedback occurs only in the burst size. The coefficient m can be interpreted as the number of protein molecules that need to cooperate to repress a production burst.
Although the analysis of this section is performed for a general parametric choice, all graphical examples pertain to the choice (16). Specialised calculations for (16) can be found in appendix B. We define the modified potential bỹ where Ψ(x, y) is given by (8) and (12). Since Ψ(x, x) = Φ(x), we immediately obtaiñ Φ(x) Φ(x). Graphical examination shows that there is a critical point x * such that Φ(x) = Φ(x) for x x * , whereas for x x * the graph ofΦ(x) is the envelope of rays x → Ψ(x, y) parametrised by y (figure 2, left). The region x < x * will be referred to as the Cramer zone, and the region x > x * as the tail zone. The critical point x * is the supremum of all x for which (10) holds. It follows that at x = x * , an internal minimiser y = y * < x * of (17) exists in addition to the boundary minimiser y = x * , so that Equations (18) determine the critical pair (x * , y * ) uniquely (figure 2, right).
In the tail zone, where y m (x) is the internal minimiser, which satisfies Note that y m (x * ) = y * and y m (x) < 0. The potential derivative satisfies The first equality in (21), which states that the envelope is tangential to the ray that forms it, is a simple example of an envelope theorem [59].
The local potential and the envelope of rays intersect at x = x * transversally with slopes Φ (x * ) =Φ (x − * ) >Φ (x + * ): the modified potential has a discontinuous derivative at x = x * . Convexity of Φ(x) and concavity ofΦ(x) imply Φ (x) >Φ (x) for x > x * . These comparisons, together with (12) and (21), lead to the inequality which will be important in the subsequent analysis.
If we look for a solution p(x ; ε) to (4) in a form that is logarithmically equivalent to exp(−Φ(x)/ε), then the integrand on the right-hand side is logarithmically equivalent to exp(−Ψ(x, y)/ε), wherẽ For x > x * , we have (appendix C) Notably,Ψ(x, y)-as function of y ∈ (0, x]-is minimised both on the right boundary and internally, whereas Ψ(x, y) is minimised only internally for x > x * . The integral on the right-hand of (4) side will be logarithmically equivalent to exp(−min y∈(0,x]Ψ (x, y)/ε) = exp(−Φ(x)/ε), which is the asymptotics postulated for the solution. The use of the modified WKB potential (17) thus leads to a desired balance between the sides, at least to a logarithmic precision, of the master equation (4). Important contributions to the integral term in (4) come from the neighbourhoods of the minimisers y = y m (x) and y = x ofΨ(x, y) for a fixed x > x * . The function is locally parabolic near the internal minimiser y = y m (x) < x * , but it is locally linear near the boundary minimiser y = x > x * . By the Laplace method [54], an O(ε 1/2 ) neighbourhood of the parabolic minimiser, but only an O(ε) neighbourhood of the linear minimiser, contributes. The internal minimiser lies in the Cramer zone, where the standard procedure of section 3 yields with the prefactor defined by (13). In order to balance the contributions of the internal and boundary minimisers, we compensate at the level of prefactor, seeking the solution outside the Cramer zone in the form of Inserting the WKB expansions (25) and (26) into the Volterra master equation (4), we find that for a δ ε 1/2 we have Estimating the integrals by the Laplace method, cancelling the common exponential term, and collecting at the leading order, we obtain Differentiating (23) with respect to y and using (21) gives We set y = x into (29) and insert the result into (28), arriving at Inequality (22) ensures that the denominator in (30) is positive.
In the next section, we complete the approximation scheme by constructing an inner solution in a neighbourhood of the Cramer boundary x = x * that matches (25) to the left and (26) to the right.

Boundary layer
The discontinuity in the potential derivative and the mismatch of prefactor magnitudes in (25) and (26) suggest the presence of a boundary layer near x = x * . We define the inner variable ξ via the transformation where the constant κ > 0 will be specified later. Qualitatively, as x increases towards x * , the integral in the Volterra equation (7) begins to feel the 'ghost' of the internal minimum of Ψ(x * , y) at y = y * ; the local approximation scheme of section 3 breaks down before x * is reached.
The inner solution is sought to be proportional to a regular function of the inner variable: Inserting (32) into the Volterra master equation (4) and seeking a dominant balance between terms gives (appendix D) Collecting the leading-order terms in the expansion of (4), the unknown f(ξ) is found to satisfy an inhomogeneous Volterra equation (appendix D). Unlike the original equation (4) for p(x ; ε), the Volterra equation for f(ξ) has a separable kernel, and is readily turned into a differential equation with a general solution (appendix D) The constant multiplying the particular solution in (34) is found by the method of undetermined coefficients; the constant of integration A multiplying the homogeneous solution remains undetermined at this stage.
The ξ → ∞ asymptotics of (32) agree with the x → x * behaviour of the tail-zone WKB solution (26) with arbitrary choices of the integration constant A and the constant κ in the offset of the boundary layer (31) (appendix E). In order to determine the two constants, the ξ → −∞ asymptotics of (32) need to be matched to the x → x * behaviour of the Cramer-zone WKB solution (25). By inequality (22) with y m (x * ) = y * , the first term in (34) dominates as ξ → −∞, so that in the overlap of the inner solution and the outer (Cramer-zone WKB) solution. On the other hand, inserting (31) into the outer solution (25) gives Comparing (36) to (37) yields inequality (22) thereby guarantees that κ > 0 as advertised at the beginning of the boundary-layer analysis. Equations (38) complete the inner solution and thus the asymptotic analysis of (4).

Numerical solution
Before being compared to a numerical solution, the asymptotic solutions are normalised by which is calculated by numerical quadrature.
For the numerical solution, sample paths x i (t), i = 1, . . . , N , 0 t T , subject to x(0) = x 0 are generated using the exact stochastic simulation algorithm (see appendix F). The solution is constructed by the histogram method from the dataset of final-time values {x i (T )} i=1,...,N . Specifically, we divide an interval [0, x max ] into n equally sized bins, count the number of data in each bin, and divide the counts by Nx max /n so as to normalise into a probability density. The histogram estimate is close to the exact solution p(x ; ε) to the Volterra master equation (4) if the number of samples N is large (so that the statistical error is small) and the simulation end time T is large (so that the process equilibrates to steady state). Figure 3 compares the three matched asymptotic approximations to the numerical solution for selected values of the noise strength ε. Decreasing ε leads to a close agreement between the numerical solution and the asymptotic approximations in their respective regions of validity (figure 3, top panels). As ε decreases further (figure 3, bottom panels), the Cramer-boundary and tail behaviour become exponentially improbable, and cannot be reliably estimated from a feasible number (say a billion) of samples. Nevertheless, the chosen examples demonstrate that the naive solution, which extends (25) outside the Cramer zone, underestimates the tail of the stationary distribution, whereas the alternative approximations provide an adequate description.

Conclusion
This paper provides matched asymptotic approximations to the stationary distribution of a drift-jump model for stochastic gene expression. The analysis revolves around the estimation of the integral term in the Volterra master equation (4). The integral term represents the flux of probability due to production bursts through a reference state x. In the Cramer region (x < x * ), the flux consists solely from local contributions (y ≈ x), whereas in the tail region (x > x * ), a contribution comes also from within the interval. The latter corresponds to the 'single big jumps' advertised in the abstract.
Negative feedback in burst size is a prerequisite for the singular behaviour in question. Conceptually, in the presence of negative feedback in burst size, it is 'cheaper' to hunker down and then take a giant leap, than to climb up with tiny steps. The result is thus in agreement with the broad principle that any large deviation occurs in the least unlikely of all the unlikely ways [60].
While the power-law non-linearity is the principal example of this paper, the popular Hill function can be reduced to the power non-linearity by an explicit transformation [61]. For feedback responses that do not satisfy the constraints introduced in section 4, one expects that multiple disconnected Cramer zones may exist, in which the potential is evolved locally, and which are interspersed by zones in which the potential is formed by an envelope of rays.
It is also expected that the current methods can be applied to the discrete framework, if this is extended so as to include feedback in burst size, with burst sizes drawn from the geometric distribution [5]. More widely, the current results can be pertinent to other fields, e.g. to the Takács equation for the amount of unfinished work in an M/M/1 queue, or other jump processes with jump measures with exponential tails [44].
Earlier studies argue that the subtleties that arise with feedback in burst size are an artefact of delay [36,37]. Indeed, the memoryless property of the exponential distribution of burst sizes implies a lack of control at the infinitesimal timescale of burst growth. In light of this argument, the results contribute to the understanding of the interplay between bursting and delay [63][64][65][66][67].

Appendix B. The power-law example
For the choices (16), the Cramer-zone potential (12) and prefactor (13) are given by The rays (8), whose envelope constitutes the modified potential, are given by The derivative of the rays with respect to the envelope parameter y is given by (14), which is The critical pair (x * , y * ) is the solution of the non-linear system (18), which here takes the form The author solved (47) iteratively, starting from an initial guess x = 1 + 1/m and y = 1, using Broyden's first Jacobian approximation method [68]. In order to find the internal minimiser y = y m (x) of Ψ(x, y), we are required by (20) to solve for a given x x * the equation The author did this in two steps: first, he found x i corresponding to the values y i = y * (1 − i/I), i = 0, 1, . . . , I − 1, by substitution into (48); second, he used a cubic spline to interpolate y = y m (x) between y i−1 = y m (x i−1 ) and y i = y m (x i ), i = 1, . . . , I − 1. The modified WKB potential and prefactor are calculated by substituting the spline of y = y m (x) into (19) and (30). The formulae (30) and (35) for the tail-zone WKB and boundary-layer solutions evaluate ∂ 2 y Ψ(x, y) at the internal minimiser y = y m (x). Differentiating (46) gives At y = y m (x) the first derivative (46) of Ψ(x, y) vanishes, and the first term in (49) simplifies to (m − 1)y −2 ; therefore, Note that y m (x) < y * < 1, so that (50) is positive, as required in (30) and (35).