EFFECTS OF ELONGATION DELAY IN TRANSCRIPTION DYNAMICS

In the transcription process, elongation delay is induced by the movement of RNA polymerases (RNAP) along the DNA sequence, and can result in changes in the transcription dynamics. This paper studies the transcription dynamics that involved the elongation delay and effects of cell division and DNA replication. The stochastic process of gene expression is modeled with delay chemical master equation with periodic coefficients, and is studied numerically through the stochastic simulation algorithm with delay. We show that the average transcription level approaches to a periodic dynamics over cell cycles at homeostasis, and the elongation delay can reduce the transcription level and increase the transcription noise. Moreover, the transcription elongation can induce bimodal distribution of mRNA levels that can be measured by the techniques of flow cytometry.

1. Introduction.Transcription is the first step of gene expression, through which genetic information encoded in DNA is transcribed to mRNA by RNA polymerases (RNAP).Transcription begins with the binding of RNAP to the promoter in DNA sequence, followed by the movement of the transcription bubble along the DNA sequence in which RNAP creates an RNA copy through base pairing complementarity with the DNA template, and finally the nascent RNA is released when RNAP reaches the termination sequence [27].The elongation is frequently interrupt by sequence-specific pauses that are thought to play important roles in this process and therefore cause transcription delays in gene expression [19,41].Elongation pauses allow precise controls in the transcription regulation and the rate of RNA chain synthesis [3,4,22,35,51].The typical rate of transcription elongation varies 1432 XUAN ZHANG, HUIQIN JIN, ZUOQIN YANG AND JINZHI LEI over a broad range of 10 − 100 nucleotides per second [10,14,26,32,46].The elongation delays significantly contribute to the timing mechanisms during development [45], and is tightly controlled by the rate of translation [32].Such a cooperation mechanism ensures that transcription is always adjusted to translational needs at different genes and under various growth conditions [32].
Gene expression is an inherently stochastic process due to the low copy number of DNA and mRNA molecules [5,7,13,16,28,31].The stochasticity of gene expression and other molecular processes contributes to heterogeneity in populations of isogenetic cells, for example, in phenotypic expression and differentiation pathway selections [1,8,43,49].Noise in gene expression can induce switching and coordinate cellular decision during developmental patterning [56].
The stochasticity in gene expression has been extensively studied in recent years, both experimentally and theoretically.Real-time kinetics of gene expression at single molecule level can be tracked by fluorescent proteins and microscopy techniques [7,50], and observations have shown obvious discrete transcriptional events [7,10,16,31].The processes of gene expression that associate with transitions between promoter states, production of mRNA and protein molecules and degradation of the molecules can be modeled with chemical master equations.The statistical properties of gene expressions, including promoter dynamics, average expression levels, noise strength and distributions have been well documented by both analysis and stochastic simulations [15,20,29,30,40,52,54,55]. Extrinsic noise in gene expression arise when there are perturbations in the growth condition.In this situation, both intrinsic and extrinsic noises are important for stochastic gene expression [13,44], and fluctuations in the kinetic parameters have to be considered in modeling the gene expression process [23,39].
Noise and delays are important parts of normal operating constraints in gene regulation networks.Recently, many studies have focused on the effects of elongation delay to the stochastic dynamics in gene networks [36,45,21].Stochastic gene expression models with delays in transcription have been studied by many researches [11,38,58,37] (refer [36] for a review).Stochastic simulations showed a linear correlation between the variance and mean of protein numbers [58].Ribeiro studied a delayed stochastic model of transcription at single nucleotide level that includes the promoter complex formation, pausing, arresting, disincorporation and editing, pyrophosphorolysis, and premature termination [34,37].Simulations based on their model showed that mRNA production exhibits bursting and pulses dynamics, and the elongation dynamics can shape the bursting transcription [11,34].In additional to studies of single-gene expression, delay and stochasticity are also incorporated into gene regulation networks, and the introducing of delay into gene network can produce significant dynamics such as oscillations and bi-directional switches [6,9,24,42,48].
Despite extensive studies of delay and stochastic gene expression, to the best of our knowledge, it remains unclear how elongation delay affects the stochastic properties of single gene expression.In this paper we study the effects of elongation delay to transcription kinetics of RNA production, both analytically and numerically.Main results in this study show that elongation delay can induce decreases in the transcription level and increase the noise strength.Furthermore, long elongation delay can induce bimodal distributions in RNA numbers even for a constitutive expression gene, and thus provides a novel mechanism of having bimodal distribution in gene expression.
2. Model and methods.Figure 1 shows the model of transcriptional process studied in this paper, which refers the computational models proposed in [37,57,58].In the model, transcription begins with the binding of RNAP to the promoter.RNAP engages in abortive initiation prior to the productive initiation, during which RNAP synthesizes the first ∼ 8 − 15 nucleotides of RNA and form an RNAPpromoter initial transcribing complex, and then escapes to enter the synthesis of RNA product as an RNAP-DNA transcription elongation complex [17].Abortive initiation is common for both eukaryotes and prokaryotes and a critical stage in order to clear the promoter and to release signals to the next initiation after the abortive initiation [12].During the process of RNA synthesis, the RNAP-DNA transcription elongation complex moves along the DNA sequence from 5' to 3', during which the double helix DNA molecule splits into two strands of unpaired DNA nucleotides, and one strand of the DNA is used as a template for RNA synthesis.RNAP adds matching RNA nucleotides that are paired with complementary DNA nucleotides.Soon after RNAP initiates transcription, the nascent RNA is modified by the addition of a "cap" structure at 5' [27].RNA transcription can involve multiple RNAPs on a single DNA template and multiple rounds of transcription.Upon reaching the end of a gene, RNAP stops transcription ("termination"), the newly synthesized RNA is cleaved and the polyadenosine tail is added to the 3' end [33].Here the dashed arrow τ means a process of time τ .Here we consider the situation of constitutive gene expression in which the promoter is always active and ready to bind to RNAP with a rate λ 1 .The abortive initiation starts once an RNAP is bound to the promoter to form a complex RNAP-Promoter.As a result of abortive initiation, RNAP escapes from the promoter to form an RNAP-DNA transcription elongation complex, and at the same time releases the promoter with a rate λ 2 .Once the RNAP-DNA transcription elongation complex is formed, the elongation process procedures automatically until the terminator is reached (at time τ of the elongation process).Termination of transcription occurs immediately at the terminator where both RNA product and free RANP are released.Hence, the RNA production rate is determined by the formation rate of elongation complex [37].RNA products degrade at a rate δ 2 .Here we assume that the number of total number RNAP molecule is large enough so that the number of free RNAP is approximately unchanged during the process.Hence, the system state is given by the numbers of bounded RNAP-Promoter complex (X 1 ) and RNA molecules (X 2 ).
The effect of cell cycle is also considered in our model (refer [44]).Let T the period of one cell cycle, the genome doubles at time (chosen arbitrarily) t d = 0.4T into a cell cycle.The promoter is reset to free state right after cell division, and non-DNA molecules are divided into two cells randomly with binomial partitioning.One daughter cell is followed at each division.These processes are basic assumptions in following analysis and simulations.
The simulations are performed using the stochastic simulation algorithm with delay [6].

Results.
3.1.Chemical master equation with delay and analytical results.Let X = (X 1 , X 2 ) the system state, with X 1 the number of RNAP-Promoter complex and X 2 the number of RNA products.Let P (x, t; x 0 , t 0 ) the probability of X(t) = x given X(t 0 ) = x 0 .When t = 0 (the starting point of a cell cycle), the promoter is free (X 1 (0) = 0).The number of DNA copies is given by a step function as Time evolution of P (x, t|x 0 , 0) within one cell cycle (0 < t < T ) is formulated by a chemical master equation ( [47], also see Appendix A) Here H(s) is a Heaviside function In experiments, we are mainly interested at the average transcription level of a group of cells, which can be represented by the average level over one cell cycle.
Let the average numbers From the master equation (3.2), and note that we obtain a delay differential equation for (3.4) The equation (3.4) gives the equation for average numbers x 1 (t) and x 2 (t) at the first cell cycle.Now, we assume that the promoter is free at t = 0 (x 0 1 = 0) and is reset to free state after cell division, and the mRNA molecules are divided into two cells randomly with binomial partitioning so that on average lim t→kT + x 2 (t) = 1 2 lim t→kT − x 2 (t).Therewith, it is easy to extend the above arguments to any cell cycles, and we can obtain a delay differential equation for x i (t) with discontinuous boundary condition for each cell cycle: The equation (3.5) gives the dynamics of the average transcription level.In the following we show that any solution of (3.5) converges to a limit cycle at the stationary state.Theorem 3.1.For any initial condition x 2 (0) = x 0 2 , the solution of (3.5) converges to a periodic solution when t → +∞.
Proof.First, it is easy to see that the solution x 1 (t) is periodic, and From the equation for x 2 (t), we obtain a transformation from x 2 (kT + ) to x 2 ((k + 1)T + ) that gives a Poincaré mapping of the solution over one cell cycle: Since e −δ2T /2 < 1, the transformation (3.7) is a contractive mapping and has a unique fix point Since the transformation (3.7) is contractive, the solution x 2 (t) tends to a periodic solution that is given by the solution of at each cell cycle.The theorem is proved.
From Theorem 3.1, we only need to consider the periodic solution of (3.5) while we are interested at stationary dynamics.In one cell cycle, the periodic solution satisfies following boundary value problem Explicitly, the solution of (3.8) is given by where The above solutions give the stationary transcription dynamics (see Fig. 2 for stochastic simulations).

3.2.
Elongation delay can reduce the transcription level.The question of how the elongation delay changes the transcription dynamics and transcription level is important for understanding the role of elongation to transcription process.When there is no transcription elongation, RNAs are produced at a constant rate λ 2 and destroyed in a first-order reaction with a rate δ 2 , with a kick in the RNA number due to genes doubling at t d = 0.4T in each cell cycle [44].After a sufficient number of divisions, the RNA number tends to a periodic behavior (Fig. 2, τ = 0, also refer [44]).When transcription elongation is included, we see a three-phases dynamics of transcription: a decrease in the RNA number right after cell division with a rate δ 2 during which RNA is not produced due to the lag time τ of transcription elongation, followed by an increasing phase when the transcription elongation is completed, and finally a kick in RNA number due to gene doubling (Fig. 2).The two increase phases start at time t = τ and t = t d + τ in each cell cycle, respectively.We note that there is a delay τ for the second increasing phase that is induced by gene doubling, and  Figure 2. Dynamics of average RNA number with transcription delay τ various from 0 to 30min.There is only one copy of the gene on the chromosome.Gene replication occurs every t d = 0.4T into the cell cycle and is marked with shadow regions.Binomial partitioning of mRNA molecules are includes and one daughter cell is followed at each division.Parameters used are: λ 1 = 0.05 s −1 , λ 2 = 0.12 s −1 , δ 2 = 0.005 s −1 , T = 60 min.The mean values are calculated from 100 independent sample trajectories.this increasing phase may not appear if the transcription elongation is long enough so that t d + τ > T .
As a direct consequence of the decreasing phase in the RNA number during the transcription elongation, the average RNA number exponentially decreases with a rate δ 2 shortly after cell division.The RNA number reaches the minimum value and starts to increase when the transcriptional process is complete.Thus, it is possible to have extreme low transcription level in the case of long elongation delay.To analytically examine the effect of elongation delay on the transcription level, we define the average transcription level over one cell cycle M as A tedious calculation shows that if the average transcription level linearly depends on the elongation time as (see Appendix B for details) Hence, the average transcription level linearly decreases with respect to the elongation delay (Fig. 3A).

Elongation delay can increase the transcription noise.
To investigate how the elongation delay affects the transcription noise, we define the stationary fluctuations of RNA number by the coefficient of variance η as (3.15) Here • means the ensemble average and is defined as •P (x 1 , x 2 , t; x 0 1 , x 0 2 , 0).
Thus, from the master equation (3.2), we obtain (see Appendix C for details), when (3.13) is satisfied, Stochastic simulations show good agreement with the above analytical result (Fig. 3B).From (3.16), the stationary fluctuation is inverse proportional to the RNA number, which is similar to the situation with no elongation delay [23].Thus, increasing the elongation delay τ can increase the transcription noise through the decreasing of transcription level.Finally, define the Fano factor F as

Elongation delay can induce bimodal distribution of RNA levels.
Techniques such as flow cytometry are often used to measure the variability among a group of cells.The experimental results, integrated with discrete stochastic models, can provide a sensitive "fingerprint" to explore fundamental equations of gene regulation [25].The experiment data generated by flow cytometers often give a histogram of cell distributions.Thus, when applying flow cytometry to explore stochastic gene expression, we ask how transcriptional controls can affect the distributions of RNA and protein levels, whether different types of cell-to-cell variability can be induced by changing the transcription elongation, and how the RNA distributions depend on the chemical reaction rates?To answer these questions with our discrete model, we randomly chose control parameters and for each set of the parameters we numerically investigated the resulting distribution of RNA from a group of cells.We note that cells analyzed in flow cytometry can be at different phases of a cell cycle so that the generated data correspond to quantities at different time points along a trajectory in model simulations.Thus, experimental data of a single gene expression by flow cytometry can be represented by the distribution of mRNA levels along a trajectory in model simulations.
The results are given by Fig. 4. When the binding of RNAP to promoter is fast, the system gives continuous production and unimodal RNA distributions (class I).When the binding of RNAP to promoter is slow, the system gives occasional RNA bursts and long distribution tails (class III).Class II distribution is induced by large elongation delay that the RNA number decreases to a very low level after cell division.When the binding of RNAP to promoter is fast and with long elongation delay, we see bimodal distributions (class II) with clearly delineated continuous production together with RNA degradation during the early transcription elongation.These three distribution classes can be characterized by two non-dimensional parameters λ 1 /δ 2 and τ /T as shown at Fig. 4A.Hence, the binding efficiency and elongation delay are important in interpreting the experimental data for RNA number distributions obtained by techniques of flow cytometry.It is obvious from Fig. 4 that, based on the model and parameters used in our simulations, nonzero transcription delay is required for the bimodal distribution.
Here we note that concentration rather than absolute molecule number is more closely related to measurements such as flow cytometry, while the distribution in Fig. 4 is given for RNA numbers.Nevertheless, the above classification of distributions is not changed when we calculate the concentration distribution by considering growing cell volumes (Fig. 4B insets).In addition, Fig. 4A indicates that greater τ /T means more likely to have bimodal distributions.However, Fig. 3A shows that when the elongation delay is large, the average RNA number approaches zero and hence the probability with positive RNA number becomes very small.Therefore, the bimodal distribution is not obvious for large τ /T , and is likely to be seen only in certain range of τ /T .
An analytic study of gene expression dynamics have shown that a constitutive gene expression (two-stage model) only has class I or class III distribution, while regulated gene expression (three-stage model) can have bimodal distribution (class II) [25,40].Complex promoter structure (i.e., promoter dynamics and transition pattern) can also lead to bimodal or multimodal transcriptions [54,55].Our results show that constitutive gene expression with elongation delay can also induce bimodal distribution by slow down the RNA production.This suggests a new mechanisms while we explain the experimental observation of bimodal distribution in RNA number.We should note that according to Fig. 4A, the ratio τ /T needs to be at least 1/6 in order to induce bimodal distribution.This value is significantly larger than the values for both bacteria and mammalian cells.In bacteria, the average time to transcribe a gene is 1min and the minimum cell generation time is around 30min, and in mammalian cells, average time to transcribe gene is 20min and the minimum cell generation time is 20hr [2,10].4. Discussion.In this paper, we have studied the transcription dynamics with elongation delay through chemical master equation and stochastic simulation with delay.The transcription dynamics including transcription initiation and elongation, RNA production and degradation were considered in this study.To study the long time behavior, cell division and DNA replication were also included in the model, which yield a time dependent periodic coefficient master equation with transcription delay in the mathematical model.The effects of elongation delay to the stochasticity of transcription dynamics were studied both analytically and numerically.
Our results showed that there are serval effects of the elongation delay in gene expression.The transcription elongation induces a decrease phase in RNA number after cell division, hence decreases the average transcription over a cell cycle and increases the intrinsic noise.Both analysis and simulation showed that the average transcription linearly decreases with the elongation delay, and the stationary fluctuation (square of the coefficient of variance) inverse proportional to the elongation delay.Transcription delay can also regulate the shape of RNA distribution, and provides a new mechanism of inducing bimodal distribution in the RNA numbers.
Transcription delay is known to affect the stochastic dynamics in gene networks.The current study quantitatively clarifies the effects of elongation delay to stochasticity of single gene expression.The results and methods here are valuable for understanding the stochastic dynamics in gene network with elongation delay.The and Thus, (B.1) yields 3) then e −δ2(T −τ ) ≈ 0, e −(λ1+λ2)(T −τ ) ≈ 0, and hence Thus, (B.1) yields Hence, the approximation (3.14) is proved.
Appendix C. Proof of (3.16).To calculate the stationary fluctuation, we define A tedious calculation gives Similar to Theorem 3.1, the solutions of (C.3) approaches to a periodic solution when t → +∞.Hence, at the stationary state, the variance of mRNA number at one cell cycle is given by the solution of equation From (C.9), we obtain (C.12) The relation (3.16) is proved.

Figure 1 .
Figure 1.Illustration of a model of transcription including promoter initiation, RNAP elongation along the DNA sequence, and termination of the RNA synthesis.See the text for details.

Figure 3 .
Figure 3. Effects of elongation delay on transcription dynamics.(A) Dependences of RNA number with the elongation delay.Markers are obtained from stochastic simulation, and the solid line is given by (3.14).(B) Dependences of the coefficient of variance (CV) for RNA number with the elongation delay.Markers are obtained from stochastic simulation, and the solid line is given by (3.16).Parameters are same as Fig. 2.
) then(3.16)  indicates F ≈ 1, and hence the transcription delay would not change the Fano factor.

Figure 4 .
Figure 4. Effects of transcriptional delay on RNA distributions.(A) Three classes (I,II,III) distributions on RNA numbers in the parameter space characterized by the RNAP binding efficiency (λ 1 /δ 2 ) and elongation delay (τ /T ).Green dots for class I distribution, blue dots for class II distribution, and red dots for class III distribution.(B) Distributions of RNA number with parameters taken from regions I, II and III in (A).Insets show the distributions in concentrations by assuming linear growth of cell volume from an initial value until cell division and halves upon division.