Stochastic gene expression conditioned on large deviations

The intrinsic stochasticity of gene expression can give rise to large fluctuations and rare events that drive phenotypic variation in a population of genetically identical cells. Characterizing the fluctuations that give rise to such rare events motivates the analysis of large deviations in stochastic models of gene expression. Recent developments in non-equilibrium statistical mechanics have led to a framework for analyzing Markovian processes conditioned on rare events and for representing such processes by conditioning-free driven Markovian processes. We use this framework, in combination with approaches based on queueing theory, to analyze a general class of stochastic models of gene expression. Modeling gene expression as a Batch Markovian Arrival Process (BMAP), we derive exact analytical results quantifying large deviations of time-integrated random variables such as promoter activity fluctuations. We find that the conditioning-free driven process can also be represented by a BMAP that has the same form as the original process, but with renormalized parameters. The results obtained can be used to quantify the likelihood of large deviations, to characterize system fluctuations conditional on rare events and to identify combinations of model parameters that can give rise to dynamical phase transitions in system dynamics.

In stochastic systems, it is often of great interest to characterize the fluctuations that give rise to rare events. For example, a recurring theme in current biological research is rare events leading to phenotypic variation in clonal cells [1]. Prominent examples include latency in HIV-1 viral infections [2], sporulation in bacteria [3], and reversible drug tolerance [4] in subpopulations of cancer cells. In several cases, the corresponding rare phenotypic transition is primarily driven by the intrinsic stochasticity of gene expression. These observations provide strong motivation for analyzing rare large deviations in stochastic models of gene expression [5,6].
The development of a framework for analyzing rare events in stochastic gene expression needs to take into account multiple factors. Single-cell experiments indicate complex mechanisms underlying bursting (i.e. long periods of inactivity punctuated by shorter periods of activity) in gene expression [7], motivating the study of general stochastic models with multiple promoter states [8,9]. Furthermore, the random variables whose rare fluctuations are of interest can be diverse, ranging from promoter activity fluctuations [3] to the fraction of time spent in specific promoter states [2]. These observations motivate the analysis of a general class of gene expression models that can accommodate arbitrary complexity in promoter dynamics and bursting. The development of an analytical framework for rare events in such models can be used to address several questions of current interest: (1) How do combinations of underlying model parameters control the likelihood of rare events? (2) How can we characterize fluctuations in the system conditioned on the occurrence of a rare event? (3) Can we determine the changes in dynamical model parameters that mimic the effects of rare fluctuations?
Recent developments in nonequilibrium statistical mechanics using large deviation theory provide a framework for addressing such issues [10,11,12]. In particular, considerable interest has focused on deriving conditioning-free Markov processes, termed driven processes, whose statistics reproduce the fluctuations of the original Markov process conditioned on the occurrence of a rare event [13,14]. So far, nontrivial examples which explicitly characterize the driven process for infinite dimensional systems have been limited. Furthermore, it is of interest to determine systems for which the stochastic generator for the driven process has the same structure as the original process [15]. In the following, we combine large deviation theory framework with tools from queueing theory [16,17], to obtain analytical formulas for the statistics of rare events in a general class of stochastic models of gene expression. We find that the conditioning-free Markov process that generates rare fluctuations generically maintains the same structure as the original Markov process.
Model.-We consider gene expression from a promoter with N internal states, labeled i = 1, . . . , N , as illustrated in figure 1. The promoter makes random transitions among its N states switching from j → i with rate α ij . In each state, bursts of gene expression leading to the production of mRNAs occur with rates k i , and burst sizes n drawn from a state-dependent distribution b i (n). In the limit that protein degradation rates are much smaller than mRNA degradation rates, a widely-used approximation involves assuming that proteins are created in random instantaneous bursts from each mRNA [18,19,20,21]. Given the validity of this 'bursty protein synthesis' approximation, the model considered in figure 1 can also be used to represent gene expression at the level of proteins.
The statistical state of the system is specified by a vector whose elements are the probabilities p i t (M ) for the promoter to be in state i at time t having produced a total of M mRNAs. The dynamics is an example of what is known in queueing theory as a Batch Markovian Arrival Process (BMAP) [22], whose evolution is specified by the master equatioṅ where the N × N matrix D 0 is the transition rate matrix for intra-promoter transitions, whereas D n (n ≥ 1) is an N × N diagonal matrix whose nonzero elements specify the rate of creating n mRNA in a burst: The diagonal elements D ii 0 enforce probability conservation. While the representation of the dynamics in terms of the matrices D n is convenient, the full dynamics occur on the infinite-dimensional space spanned by the states (i, M ). Thus, the generator matrix for the master equation (1) is infinite dimensional, which is a formulation of the dynamics we will need in the following.
Finally, note that if we focus on promoter dynamics alone, the state of the system is specified by the vector p t = M p t (M ). The corresponding master equation has the generator D = n D n . In this case, the dynamics will be referred to as promoter-only dynamics, whereas the dynamics in (1) is the full system dynamics. In the following, unless otherwise stated, we will assume that the matrix D is irreducible.
Large deviations.-Large deviation theory for Markov processes specifies the relative likelihood to observe large fluctuations in trajectory observables [12,14]. We will be interested in the fluctuations of the class of trajectory random variables where the sums extend over all transitions along a random realization of the BMAP, with each transition weighted by the parameters g and the time spent in each state weighted by f . Notable examples are the promoter activity, that is the number of mRNAs/proteins produced up to time t (f i = 0 and g n ij = n δ ij ), or the total time spent in promoter state i (f j = δ ij and g n ij = 0). Now, the law of large numbers tells us that for large t, the rate a = A t /t, will approach its average valueā = lim t→∞ A t /t with near unit probability. However, rare large deviations from this mean value are possible, and can be quantified, because the probability density of a satisfies a large deviation principle for large t [11,23], with large deviation rate function I(a). One can view this as an extension of the central limit theorem, quantifying not just the Gaussian fluctuations about the typical value (given by the minimum of I), but also the relative likelihood of rare fluctuations. The theory of large deviations establishes that one way of obtaining the rate function I(a) is by evaluating the (scaled) cumulant generating function (SCGF) [23,11] where · t denotes an average over full system trajectories of duration t. The rate function can then be recovered using the Legendre-Fenchel transform [23,11] The function I(a) (ψ(λ)) is the nonequilibrium analog of the entropy (free energy) in equilibrium systems. It is important to note that (7) only gives the convex hull of I, that is the smallest convex set that encompasses I [11]. When I is not strictly convex, that is having multiple local minima or having a linear part, the SCGF ψ is not differentiable everywhere, and is characterized by a nonanalyticity such as a kink. The appearance of such a structure in ψ is called a dynamical phase transition [11,24,25] in analogy with the nonanalytic behavior of the free energy characterizing phase transitions in equilibrium statistical mechanics.
A key insight from large deviation theory is that ψ(λ) can be obtained from the largest eigenvalue of a modified or twisted generator matrix (cf. (3)) with elements [13,14] D ij To obtain ψ from the largest eigenvalue ofD(λ), we have to diagonalize an infinitedimensional matrix. As we will show, the structure of BMAPs greatly simplifies this calculation, and reduces the computation to finding the dominant eigenvalue of an Ndimensional matrix, where N is the number of promoter states. The basic idea is to investigate the generating function G t (λ) = e −λAt t for A t , from which, in the long-time limit, the SCGF can be obtained as ψ(λ) = − lim t→∞ (1/t) ln G t (λ) (by definition (6)). We can express G t (λ) in terms of the joint probability at time t to be at (i, M ) having accumulated A of the trajectory observable, p t (M, A): where we have introduced the state-dependent generating function G t (M ; λ) = dA p t (M, A)e −λA . The key insight from large deviation theory is that G t (M ; λ) is the solution of the twisted dynamics [13] and its long-time behavior is controlled by the largest eigenvalue of its generator, the infinite-dimensional matrixD. The structure of BMAPs allows us to simplify this calculation significantly. To this end, let us solve for the long-time dynamics of G t by introducing Γ t (z; λ) = M z M G t (M ; λ), which from (11) evolves according to the simplified N -dimensional linear equationΓ whereD(z; λ) = n z nD n (λ). Let us defineD(λ) =D(1; λ) = nD n (λ). Notably, D(λ) is the N ×N generator for promoter-only twisted dynamics. The largest eigenvalue ofD(λ) controls the long-time evolution of Γ t (z = 1, λ) and thus gives us the SCGF. Indeed, denoting the largest eigenvalue ofD(λ) by −ψ(λ) (anticipating the conclusion), and substituting the solution of (12) into (10) gives us While the generator for the full system dynamics is infinite dimensional, the SCGF is obtained from the dominant eigenvalue of the N -dimensional matrixD(λ) for a model with N promoter states, a substantial simplification.
Conditioning on rare fluctuations.-Suppose we want to know what typical trajectories of the system look like, conditioned on observing a (possibly rare) rate a for the random variable A t . Recent research in nonequilibrium statistical mechanics has shown that such information is encoded in the left eigenvectorL(λ) corresponding to the dominant eigenvalue ψ(λ) of the twisted generatorD(λ) [14,25,26]. Again, the structure of BMAPs leads to a significant simplification in determining this left eigenvector. Denotingl λ as the left eigenvector corresponding to ψ forD(λ), we have that the infinite-dimensional left eigenvectorL(λ) = {l(λ), . . . ,l(λ)} is formed from repeated blocks ofl(λ), due to the repeated block structure of D (3). Using this in combination with recent results for the driven process, we find that the conditioningfree Markov process that reproduces the statistics of the original process conditioned on a has a generator with the same structure as (3), with the corresponding transition rate matrices given by [14,26] with λ * = −I (a). This surprising fact crucially depends on the class of random variables A t considered in (4), which count promoter transitions with a weight that is independent of the number of mRNAs. Furthermore, given that the random variable A t is derived solely from dynamics related to the production of mRNAs/proteins, the results for the driven process are independent of the decay dynamics for mRNAs/proteins. In particular, we note that the results for parameters defining the driven model will be the same for the case where the decay rate is µ = 0 (which does not have a well-defined steady-state) and for the case for finite µ (which does have a well-defined steady-state). By analyzing the equations determining the left eigenvectorl λ we find, remarkably, that the driven process corresponds to another BMAP with modified rates and renormalized burst distributioñ The above results, in combination with the equations determining the SCGF ψ(λ) and rate function I(a), constitute an analytical framework for characterizing rare events in a general class of stochastic models of gene expression.
Applications.-In the following, we will focus on applications of the preceding theoretical framework to the case of activity fluctuations (i.e. f i = 0 and g n ij = nδ ij ). Let us consider rare events corresponding to low activity a from a given promoter with a typical activityā. This correspond to a fold-reduction in the mean activity of r = a/ā. Letψ λ * with λ * = −I (a) denote the SCGF for the driven process. Then, by construction a =ψ λ * (0). As a result, the fold-change r can be expressed in terms of the driven process as, However, we can flip this equation around and consider it as an equation for λ * in terms of the fold change r. Once λ * is obtained by solving this equation, the driven process for a given fold-change r is completely determined. The driven process represents the most likely way the rare event in activity fluctuations occurs and thus is an ideal choice for generating distributions that can be used in importance sampling. Recent work, using a control theory perspective, has also shown that the driven process is the optimal way to to achieve the fold-reduction in activity while minimizing the 'distance', in the sense of relative entropy, from the original process [27,26]. Thus explicit characterization of the driven process is also critical in designing regulatory strategies for reducing promoter activity (thereby reducing mean mRNA/protein levels) that give rise to processes that are closest to the original process.
To illustrate the theoretical framework outlined above, we consider a widely used model of gene expression at the protein level [18,19,20]: geometrically distributed bursts (with mean b) of proteins arriving according to a Poisson process with rate k m . In this case, the burst generating function is g b (z) = z b−z(b−1) and correspondingly, we obtain that the SCGF ψ(λ) = k m − k m g b (e −λ ). For a rare event corresponding to a r-fold reduction in mean activity, using (17) we see that the corresponding value of λ * is obtained by solving Interestingly, we find that the renormalized burst distribution for the driven process continues to be a geometric distribution with the corresponding mean given bỹ This indicates that the optimal way to reduce activity involves arrival of geometric bursts with a reduced mean and a modified arrival rate. Interestingly, in previous work [28], we have shown that post-transcriptional regulation (e.g by miRNAs) can alter bursts of protein expression by reducing the mean but keeping the burst distribution geometric.
These results indicate that that optimal control (in the sense discussed above) can be achieved using a combination of transcriptional regulation and post-transcriptional regulation by miRNAs.
As a second illustration, we consider the two-state promoter model [29] depicted in figure 2 with burst size one. The SCGF for mRNA activity is readily obtained by diagonalizing the twisted transition rate matrix: Remarkably, ψ develops a nonanalyticity or kink in either the α → 0 or β → 0 limit. Specifically for β → 0, we plot in figure 2 which is nonanalytic for α < k m . The observed kink in the SCGF is a signature of a dynamical phase transition, akin to the multiphase behavior observed in photon counting statistics of a quantum two-level system [30]. Here, the phase transition occurs because, as the activity level corresponding to the large deviation is changed, there is a qualitative change in the fluctuations that give rise to the rare event. In particular, below a critical activity level, these fluctuations involve long sojourns in the 0 (OFF) state (with no activity) that correspond to a nonvanishing finite fraction of the total time t. We do note, however, that the nonanalytic behavior only arises when the matrix for promoteronly dynamics D becomes reducible. A detailed characterization of fluctuations in this limit will be presented in future work. In conclusion, we have presented a framework for the quantitative analysis of the probability of large deviations and for characterizing system fluctuations conditional on rare events during gene expression. The framework developed provides explicit analytical formulae determining the driven process for a general class of stochastic models corresponding to BMAPs. Our results demonstrate that the driven process corresponding to a BMAP is another BMAP with renormalized parameters. This property may also be present in a related class of renewal processes, which, while distinct, share a similar structure in the dynamics [31]. Since BMAPs are used to model a wide range of applications in science and engineering (e.g. computer and communications networks [22]), the results derived are expected to have diverse applications. In the context of models of gene expression, the results derived can be used to: (1) determine the probability of rare events corresponding to a broad class of random variables for general promoter models with arbitrary mRNA/protein burst distributions; (2) directly simulate rare system trajectories corresponding to a large deviation in the random variable of interest and to predict the corresponding mRNA/protein distributions conditioned on the rare event; (3) optimally control gene expression to achieve, for example, a specific fold-regulation while minimizing deviation from the unregulated process; and (4) determine regions in parameter space corresponding to nonanalytic behavior of the SCGF ψ(λ). The results obtained have thus opened new avenues for analyzing rare events in gene expression and have multiple applications ranging from importance sampling to optimal strategies for cellular regulation of gene expression.