Regulatory Mechanisms for Transcriptional Bursting Revealed by an Event-Based Model

Gene transcription often occurs in discrete bursts, and it can be difficult to deduce the underlying regulatory mechanisms for transcriptional bursting with limited experimental data. Here, we categorize numerous states of single eukaryotic genes and identify 6 essential transcriptional events, each comprising a series of state transitions; transcriptional bursting is characterized as a sequence of 4 events, capable of being organized in various configurations, in addition to the beginning and ending events. By associating transcriptional kinetics with mean durations and recurrence probabilities of the events, we unravel how transcriptional bursting is modulated by various regulators including transcription factors. Through analytical derivation and numerical simulation, this study reveals key state transitions contributing to transcriptional sensitivity and specificity, typical characteristics of burst profiles, global constraints on intrinsic transcriptional noise, major regulatory modes in individual genes and across the genome, and requirements for fast gene induction upon stimulation. It is illustrated how biochemical reactions on different time scales are modulated to separately shape the durations and ordering of the events. Our results suggest that transcriptional patterns are essentially controlled by a shared set of transcriptional events occurring under specific promoter architectures and regulatory modes, the number of which is actually limited.


Introduction
Gene transcription is a complicated dynamic process integrating regulatory signals with genetic information.mRNAs can be produced at constant rates or in short intervals followed by long periods of inactivity, termed transcriptional bursting [1][2][3].Although burst traces are qualitatively similar across organisms, the duration of the burst cycle varies remarkably, ranging from minutes to hours [4].Does this great variability reflect diverse molecular mechanisms or stem from a shared set of highly adjustable molecular events?Similarly, molecular processes involved in transcription typically last from milliseconds to hours [3][4][5].How can they be affected by regulators to underlie transcriptional bursting?How can bursting kinetics be modulated across such a wide range of time scales?These topics are fundamental to comprehending the regulation of gene transcription.
Live-cell fluorescence measurements have yielded large amounts of single-cell data [3,6,7].Durations of transcriptionally active and inactive states and burst size were extracted to characterize transcriptional bursting and expression noise.To theoretically interpret them, phenomenological models have been built upon independent effective processes [8][9][10][11][12][13][14], enabling the association between regulatory signals and experimental observables.By incorporating more transcriptional details such as multistep initiation and multistep degradation of mRNA, newly developed models gain more insights into regulatory mechanisms [15][16][17][18][19][20][21][22].To unravel gene regulation and promoter configurations through the lens of input-output relationships, however, precise knowledge of biochemical reaction mechanisms is essential; yet, such knowledge is often unavailable and must be inferred on the basis of assumptions.This could result in discrepancy between model predictions and experimental data, ignorance of global constraint on transcriptional noise, and inaccurate mapping between rate-limiting steps and effective processes under limiting conditions when deducing the underlying regulatory mechanism [23,24].A different modeling framework is required.
On the basis of a set of common biochemical reactions essential to mRNA synthesis, a generic model has been developed, which is compatible with those simplified models and fills in their gaps [25][26][27].Nevertheless, there always exist additional gene-specific reactions; some common reactions actually comprise multiple steps, and the transition process often occurs through distinct paths, leaving a memory in the transcriptional path [9,28].While these details could be incorporated to construct more intricate models, we may develop a simpler and more computationally efficient method using queuing theory, which is widely leveraged to understand the behavior of systems with queues.Indeed, queuing models [28][29][30] were proposed to analyze empirical data on mRNA copy numbers and waiting-time distributions of biochemical reactions while neglecting the underlying molecular processes.To elucidate the molecular mechanism of gene regulation, we may focus on essential transcriptional events, with each comprising multiple molecular processes, and simplify the transcription into a sequence of events.
Gene transcription is modulated by various regulatory factors, such as transcription factors (TFs) that bind DNA to activate transcription and modifiers and remodelers that induce changes to histone markers and nucleosome arrays to overcome structural barriers [31].They may exert a wide influence on transcription by altering the duration and directionality of molecular processes [32][33][34].To unravel the roles for regulators in gene expression, it is essential to differentiate their impacts on rate-limiting steps and probability flows of event transitions in transcription cycle.
Here, we first propose a network of first-order Markovian biochemical reactions.Because transcriptional bursting is composed of inactive and active phases involving 3 major stages, i.e., chromatin opening for promoter access, assembly of the preinitiation complex (PIC), and mRNA production, all gene states can be divided into 5 sets to separate the stages, with functional state transitions classified into 9 categories.A series of state transitions is combined into an event, and a 6-event model is developed for transcriptional bursting.This model is generic and requires no free parameters.Transcriptional commonality is reflected in the sequential order of the 4 events, while gene specificity is embodied in the duration distributions and ordering of the events.The kinetic features of transcription, such as the mean mRNA number, burst size, burst frequency, and duration of the (in)active phase, are interconnected and functions of 6 model parameters depending on the regulator concentration (i.e., 4 average event durations and 2 repetition probabilities).
On the basis of this event model, we clarify typical features of and intrinsic constraints on transcriptional bursting, illuminate how the regulatory mechanism can be inferred by fitting experimental data at the single-gene and genome levels, and explore the transient transcriptional dynamics upon stimulation.Both numerical and analytical results are presented to provide an integrated view of how reactions on different time scales are modulated to underpin specific burst profiles.The pivotal regulatory modes and their functional implications are elaborated in detail.

Establishment of the event model
Although gene transcription exhibits heterogeneity, its common features allow for the development of a generic model for transcriptional dynamics.To reveal the role for regulatory factors in transcriptional modulation, we need establish a model based on transcriptional events, which is both accurate and simple enough for theoretical analysis.
We first proposed a network model of inducible eukaryotic transcription (Fig. 1A).The transcription begins with activators binding to the promoter/enhancer [35], which promotes the recruitment of cofactors, chromatin remodeling [34,36] and histone modifications [31,37].General TFs (GTFs; including TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) and RNA polymerase II (Pol II) are then recruited to the core promoter, forming the PIC.The enhancer-bound activator and the PIC are connected by the mediator complex [38,39]  ).mRNAs are synthesized in the active phase (A, red area) comprising a series of E P , while no mRNA is produced in the inactive phase (I, gray area) consisting of sequences of E S1 , E S2 , and E S3 .(C) Gene transcription is simulated based on the distribution functions f X (X = P, S1, S2, S3), p 1 and p 2 , which depend on the regulator occupancy rate (O)."B" denotes regulators bound and "U" regulator unbound.The left panel schematically shows f X for different O values.When O rises from 0 to 1, (f XU , p 1U , p 2U ) turns into (f XB , p 1B , p 2B ).(D) Eight basic regulatory modes in which regulator binding can promote transcription by decreasing the mean event duration τ X and p 2 or by increasing p 1 .Only 1 of the 6 bursting parameters is adjustable in each mode.The first "M" in the leftmost column stands for modulation, while "RF" stands for regulatory factor.The black arrows signify that RF binding accelerates the events or increases the event flows (J 1 + or J 2

−
), whereas the black lines with a flat head signify that RF binding decreases the event flows (J 1 − or J 2 + ).
via enhancer-promoter proximity.The PIC turns into the open complex after the DNA template strand is positioned into the active center cleft of Pol II.With the preparations completed, Pol II gets away into elongation to synthesize nucleotide chains [40], while the remaining scaffold complex (composed of activators, TFIIA, TFIID, TFIIE, TFIIH, and mediator) on the promoter facilitates transcriptional reinitiation.During early transcript synthesis, the early elongation complex (EEC) retains a measurable tendency to undergo backtracking until Pol II pauses about 30-base-pair downstream of the transcription start site (i.e., promoter-proximal pausing) [41] and then is released for productive elongation, restricting the frequency of transcriptional initiation [42].
Each node in the network represents a gene state, and the transition between adjacent states is a Markov process (Fig. 1A).The same reactions may occur at diverse sites around the promoter, and some reactions are not exclusive with each other, leading to various transition paths between 2 states.All states are divided into 5 sets (Table 1), separately responsible for maintaining chromatin inaccessibility (C), chromatin remodeling and histone modification when the core promoter is a nucleosome-free region (F), PIC assembly (A), initiation of mRNA synthesis by Pol II and promoter escape (I), and promoter-proximal elongation and pausing of Pol II (E).When a Pol II enters productive elongation (P), the gene state returns to set I, and another Pol II is recruited for transcription reinitiation.Each state can transition to a state in the same or adjacent set.
This network model is too complicated to be analyzed easily.To simplify it, we exploited queueing theory [28,43] to turn the network model into an event-based model.First, 9 types of functional transitions are specified: C F , F C , F A , A F , A I , I A , I E , E I , and E I P , each comprising a series of reactions with the same function (Table 2).X X′ (X, X′ ∈ C, F, A, I, or E) represents the process where the initial state stems from set X and all state transitions occur in X until the last state belongs to X′. E I P involves Pol II released into productive elongation, such that a nascent full-length transcript can be produced, whereas E I does not.The time scales of (I E , E I , E I P ), (F A , A I , I A , A F ), and (C F , F C ) are in the order of seconds to minutes, minutes, and minutes to hours, respectively.
Second, we defined 6 events, a detailed description of which is presented in Fig. S1.E P refers to a series of I E and E I transitions plus the ending transition E I P , resulting in synthesis of mRNA transcripts.E S1 comprises a series of I E and E I plus I P , responsible for disassembly of the scaffold complex.E S2 consists of a series of A F and F A plus A I , enabling assembly of the PIC.E S3 comprises a series of A F , F C , C F , and F A , maintaining the inaccessibility of GTFs to the core promoter and delaying the entry into E S2 .As the beginning event of the entire transcriptional process, E begin comprises a series of C F and F C plus F A , contributing to chromatin opening; as the ending event, E end is composed of a series of A F and F A plus F C , resulting in termination of transcription.Notably, each event is not fixed in composition but amenable to modulation; E begin and E end are governed by the initial and final states in set C, respectively.A single transcriptional burst undergoes 2 stages: The inactive phase refers to no mRNA production, including E S1 , E S2 , and E S3 , while the active phase corresponds to mRNA synthesis, comprising a sequence of E P .
Last, we developed an event-based model for transcription (Fig. 1B).E begin first appears, and then E S2 follows, or E S3 occurs once or repetitively until E S2 arises.Subsequently, different paths lead to the (repetitive) occurrence of E P , with a burst of mRNAs generated.After bursting, E S1 emerges, then E S2 appears with or without E S3 preceding, and a new burst may ensue.Notably, there may exist a refractory period in E S1 where immediate reactivation of transcription is prohibited.The above process repeats until E S1 and E end appear successively, with the gene returning to the silent state.Thus, the transcription is characterized by a sequence of ordered events.The process of mRNA splicing, nuclear export, and degradation of mature mRNA is simplified as an effective Poisson process with rate constant 1/τ m .τ m is set to 5 min unless specified otherwise.The biochemical master equation describing mRNA production is presented in Text S1.
Given the stochasticity in transcription, we focused on the distribution function f i of the duration of event i (i = P, S1, S2, S3) in steady state with the mean τ i (Text S2 and Figs.S2 and  S3).The probability of repeated occurrence of E P (E S3 ) is p 1 (p 2 ) (Fig. S4).In general, f P , f S1 , f S2 , and f S3 are primarily controlled by Pol II-dependent reactions, disassembly of the scaffold complex, GTF-dependent reactions, and nucleosome-associated reactions near the core promoter, respectively.f i is an exponential or a unimodal function in most cases [11,44].τ i is mainly determined by rate-limiting steps, whose durations in a burst are the main constituent of the burst period.p 1 is elevated via accelerating such reactions as recruitment, phosphorylation, and release of Pol II, which promote the transition of A → I → E → P, while p 2 is boosted by accelerating reactions that promote Once f i , p 1 and p 2 were given or derived from experimental data, numerical simulation was performed to depict mRNA production using the Gillespie algorithm (Text S3).The default setting of f i , p 1 , and p 2 is denoted as (f iU , p 1U , p 2U ), corresponding to the situation without regulators binding to cognate sites, and is determined by the gene specificity [45] and local environment; it changes into (f iB , p 1B , p 2B ) with bound regulators (Fig. 1C).At various regulator concentrations, its rate of occupancy at the regulatory site varies between 0 and 1, and the resulting (f i , p 1 , p 2 ) lies between (f iU , p 1U , p 2U ) and (f iB , p 1B , p 2B ).That is, the regulation of transcription is mapped to changes in f i , p 1 , and p 2. We found that the main conclusions do not rely heavily on the concrete form of f i provided that it is exponential or unimodal (Figs.S5 and S6).All the results presented here are based on the exponential distributions.Since an exponential distribution is determined by its mean value, the roles for regulators in gene transcription are classified on the basis of their influence on τ i , p 1 , and p 2 .There exist 255 (2 8 − 1) regulatory modes, and Fig. 1D illuminates 8 basic modes, under each of which regulators affect τ P , τ S1 , τ S2 , τ S3 , p 1 (through ) alone.Compared with previous models where independent quantities characterized experimental observables, the current model focuses on a sequence of ordered events and is more concerned with the impact of biochemical reactions on the event durations and transitions.Thus, the quantities such as the transcription rate constant (1/τ P ), the duration of the active phase [τ A = τ P / (1 − p 1 )] and of the inactive phase (τ I = τ S /p 1 ) are all correlated (see below), which will have important implications as shown later.

Regulatory modes promoting transcriptional sensitivity and specificity
Gene transcription is mediated by a multitude of regulatory factors, such as TFs and cofactors.Without loss of generality, they are assumed to function by binding their cognate sites to affect biochemical reactions.For the general multibinding site scenario, the occupancy rate of regulators can be expressed as , where [R] denotes the regulator concentration, K d is a constant, and Hill coefficient n H could be a noninteger.As shown in Texts S4 and S5, if [R ′ ] and K ′ are separately substituted for [R] n H and K n H d , either parts of original analytical expressions are identical to those in the case of n H = 1, or relationships between specific quantities can be preserved.Consequently, the occupancy rate can simply be written as [R] , corresponding to the simplest single-binding site case, for the purpose of simplifying the exploration of gene regulation.
As mentioned above, bursting parameters can be modulated to various extents under 255 regulatory modes.In mode X, the mean transcription rate  in steady state is approximated as where Ω X refers to [R] at which υ is halfway between its maximum and minimum, β X is associated with the basal transcription Reactions that make the core promoter nucleosome-free to promote the recruitment of GTFs, e.g., removal of nucleosomes at the promoter, altering the accessibility of DNA on the surface of nucleosomes by remodelers, replacement with certain histone variants, or destabilization of internucleosome contacts by acetylation of lysine residues.Reactions that change the microenvironment when the promoter is occluded by canonical nucleosomes or other macromolecules different from GTFs, e.g., formation or disruption of topologically associating domain boundaries and changes in distribution modes and interactions of chromatin marks [36,75].
Reactions that change the microenvironment when the promoter is a nuclosome-free region [75].
The core promoter is wrapped around histone octamers to form canonical nucleosomes or occupied by other macromolecules [36].
A F Reactions that change the microenvironment when the promoter is occupied by GTFs; enhancer-promoter proximity (hubs or looping) [75].
Reactions that change the microenvironment when the PIC scaffold is formed without the EEC on the DNA template; enhancer-promoter proximity [75,81].

I E
Stabilizing the initially transcribing complex to form the EEC via promoter escape, e.g., the B-finger of TFIIB inserts into the polymerase active site to complete escape commitment [81].

E I
Reactions that change the microenvironment when the EEC begins elongation; enhanc erpromoter proximity [75,81].
Transcript synthesis; recovery of the elongation competency of arrested Pol II via TFIIS; release of paused Pol II to enter productive elongation [81].
without bound regulators, and n is a fitting number (see Text S6 for details).K d /Ω X reflects the sensitivity of transcription to changes in [R] (Fig. 2A); high sensitivity allows for efficient regulation of transcription [46].
n measures how adjustable the mean transcription rate is.
Regulators actually bind to both cognate and nontarget sites, and the latter may also induce mRNA production.To quantify the difference between the 2 scenarios, we introduced the specificity S defined as the ratio of the average expression resulting from specific binding of regulators to that resulting from nonspecific binding.S is a unimodal function of [R] for nonzero β (Fig. 2B); its maximum S m approximately equals F υ , given that cognate binding has a much higher affinity than nontarget binding (see Text S6 for details).Collectively, K d /Ω X and S m provide crucial information about transcriptional regulation.
Among 8 basic regulatory modes, Ω X < K d is realizable for X = MEE/EA/FI/CC/FA/IE, whereas Ω X > K d for X = MIA/FC (Fig. 1D and Table 3).In the former, the transcription is activated by reducing τ P , τ S1 , τ S2 , τ S3 , p 2 (via J 2 − ) and increasing p 1 (via J 1 + ), which are achieved mainly by accelerating E I P and I E , I A and E I , A I and F A , F A and C F , F A and A I , and I E and E I P , respectively.In the latter, transcription is activated by increasing p 1 (via J 1 − ) and reducing p 2 (via J 2 + ), which can be acquired by slowing down I A and E I and A F and F C , respectively.Therefore, with Ω X < K d , the transcriptional rate reaches its maximum before the regulator occupancy saturates, while the regulator binding promotes transcription by speeding up some state transitions, which involve the recruitment of TFs, enzymes, and Pol II and are part of  3).

the regulation of E I P
, I E , A I , F A , or C F may be engaged more frequently to gain high transcriptional sensitivity (Fig. 2C).
The type of regulatory mode determines the trend in S m changing with Ω X .For Ω X < K d , S m drops with increasing Ω X /K d (Fig. 2D); S m varies sharply around Ω X /K d = 1 under MIE (MEE) for τ AU < τ IU (τ AU > τ IU ).The maximum of S m is achieved via MIE for τ AU < τ IU and via MEE otherwise, suggesting that regulating E I P or I E is a desirable mode.For Ω X > K d , S m rises linearly with increasing Ω X /K d (on a log-log scale); obtaining a large S m through MIA/FC requires that the binding sites should always be occupied, which is biologically implausible.
According to the event model, we directly infer that the regulation of E I P , I E , A I , F C , or C F can enhance the transcriptional sensitivity while promoting mRNA synthesis.The mode modulating E I P and/or I E has a marked role in enlarging the transcriptional adjustability and reducing the crosstalk effects resulting from noncognate regulator binding.

Major regulatory modes underlying the observed transcriptional noise
Transcription is a complex stochastic process; for simplicity, we only explored intrinsic noise in transcription, neglecting variations in parameters between cells or across time (i.e., extrinsic noise).The relationship between the mean (<m>) and variance (σ m 2 ) of mRNA copy number or Fano factor (F = σ m 2 /<m>) is a key representation of transcriptional regulation.<m>, σ m 2 , and F in steady state take the following forms (see Text S7 for details): where τ m is the lifetime of mRNA, τ A (τ I ) is the mean duration of the active (inactive) phase, and τ S is the mean interval between E S1 and E S2 : . The Fano factor is governed by the mean burst size b [the number of mRNAs per burst; b = 1/(1 − p 1 )], c = τ A /τ m , and d = τ I /τ m .A small burst size, a plateau-like bursting profile, or a short inactive phase can contribute to reducing F.
Figure 3A displays how F varies with <m> over a wide range of parameter values.For each curve, only one of the 6 parameters is altered, while the others are fixed at default values; we also performed simulations by directly setting τ S rather than τ S1 , τ S2 , τ S3 , and p 2 and assumed τ S to obey an exponential distribution.When y is varied, the corresponding F is denoted as F y .As Fτ s differs slightly from F x (x = τ S1 , τ S2 , τ S3 , or p 2 ) (Fig. 3A, inset), only the curve for Fτ s is presented.With τ P regulated alone, the mean height of burst profiles {h ≈ τ m (1 ]. "B" and "U" denote the cases with and without bound regulators, respectively.No "U" emerges if regulator binding has no influence.ε X (X = EE, EA, FI, CC, IE, IA, FC, FA) is the ratio of the corresponding parameter with bound regulators to that without bound regulators.
than burst size, varies prominently.At small τ P , F rises sharply with increasing <m> due to little degradation of mRNA within a short active period.At large τ P , the transcription rate is small such that a nascent mRNA has a high probability of being degraded before new mRNA production, and the active phase is much longer than the inactive phase, i.e., the transcription is nearly a Poisson process (F = 1).With τ S modulated alone, b and τ A show little change, whereas τ I varies markedly, such that F drops toward one with increasing <m>.F varies slightly at small <m> but drops markedly at large <m> due to a continuous reduction in With p 1 altered alone, b rises with increasing <m>, and h rises toward saturation; thus, F first rises because of an increase in b and then drops because of saturation in h and plateau-like bursting.Strikingly, all numerical results agree perfectly with the analytical expressions.
The dependence of F on <m> is more complicated when 2 or more parameters vary concurrently.Notably, the curves for Fτ P and Fτ S constitute the boundaries, and the others lie in between and exhibit unique features (Fig. 3B and Fig. S7).Providing that the regulator binding promotes transcription, F rises markedly at small <m> together with a notable change in h or varies slightly when h changes little.If F falls at intermediate or large <m>, then at least τ S or p 1 is regulated, and <m> tends toward τ m /τ P when F approaches 1; specifically, F drops rapidly when τ P is sufficiently large or changes little.Together, the regulation of τ P or p 1 is indispensable for a rise in (5)  F, and its rapid rise is accompanied by sharp bursting; the regulation of τ S or p 1 is essential to a drop in F, and its fast fall is associated with plateau-like bursting.Similarly, the numerical and analytic results match exactly.Indeed, the fitting curves to experimental data resemble the typical curves above.Activated by the Bicoid (Bcd) protein, major gap genes in early Drosophila embryo are expressed at distinct levels along the anterior-posterior axis of its body due to the Bcd concentration gradient [47], sharing a similar trend in F versus <m> (i.e., F first rises and then drops toward 1 with increasing <m>) (Fig. 3C; see also the right top panel in Fig. 3B).All the data points can be fitted by a single curve with the event model (see parameter inference in Text S4).For each gene, F is greater than 3.8 at small <m> and falls rapidly after the maximum, indicating that τ P is nearly consistent across body locations.With τ P fixed, there should exist the upper bounds, <m> < τ m /τ P and F < 1 + τ m /τ P (see Text S7 for details).As these genes display similar limits in <m> and F and have similar τ m values [48], τ P (and the frequency of transcription initiation) should also be comparable among them, which implies that variations in Bcd concentration have a minor impact on the major rate-limiting steps in Pol II recruitment and promoterproximal pausing.The increase in Bcd concentration may primarily facilitate the activation of Pol II and its entry into productive elongation.
Another example involves the expression of CDKN1A activated by the tumor suppressor p53 in MCF7 cell lines [49].F varies slightly at small <m>, rises to a peak, and then drops gradually (Fig. 3D); to fit the data with the event model, p 1 , τ S , and τ P vary concurrently with small p 1U and τ P (see also the right bottom panel in Fig. 3B).At small p 1U , the gene state tends to leave set E via E I rather than E I P or leave set I via I A rather than I E , associated with slow reactions in recruiting, stabilizing, or releasing Pol II.At small τ P , fast reactions are required before Pol II enters productive elongation.Thus, p53 may prefer to regulate CDKN1A expression via speeding up E I P through cellular signaling.
Further analysis of other experimental data reveals the diversity in transcriptional regulation.When the ctgf expression is mediated by transforming growth factor-β (TGF-β), the transcription rate rises with increasing the TGF-β concentration, and τ A is nearly fixed [50], which requires that τ P is inversely proportional to b.This further implies that the likelihood of Pol II entering productive elongation is prominently higher than its probability of escaping from the core promoter, i.e., the modulation lies in E I P rather than I E .When glucagonlike peptide 1 (GLP-1)/Notch induces sygl-1 expression in Caenorhabditis elegans, the burst size is nearly irrelevant to the concentration of GLP-1/Notch ligand [51], i.e., τ P varies markedly but b (p 1 ) is fixed, implying that the early elongation of Pol II until promoter-proximal pausing can be modulated.Obviously, different regulations of τ P and p 1 are gene-specific, reflecting which steps are rate-limiting and which reactions are being adjusted in E P .
Together, the event model enables the deduction of transcriptional modulatory mechanisms.The results above suggest that the changing trend of burst profiles is influenced by the adjustable range of Pol II-dependent reaction rates (e.g., recruitment and promoter-proximal elongation of Pol II).In the 3 limiting cases where the transcription occurs with fixed τ A , b, or τ P , the burst height, burst duration, and area under the curve are modulated, respectively.

Global constraints on transcriptional bursting
The results above unravel an important regulatory mode involving the concurrent modulation of p 1 and τ S .Under this mode, the mean mRNA number (<m> = υτ m ) and burst size b rise toward saturation with increasing [R], while the burst frequency f first rises to a peak and then drops toward saturation, corresponding to τ A < τ I and τ A > τ I , respectively (Fig. 4A).Ω m , Ω b , and Ω f refer to the operating points where m, b, and f reach half of their maximum values, respectively.Because , their sensitivity to changes in mRNA number, to dissect the contribution of size and frequency modulation to mRNA production.S f and S b drop and rise respectively with increasing <m> (Fig. 4B, bottom), indicating that the modulation of burst frequency and size predominates in distinct regimes and a combination of both mechanisms emerges over intermediate ranges.A similar relationship between f(b) and <m> was observed across gap genes in early Drosophila embryos [47] (Fig. 4C), where their expression levels are graded along the anterior-posterior axis, responsible for body segmentation.
With high-throughput experimental data available, we calculated the mean mRNA numbers and Fano factors for 9196 genes in fibroblasts [52]; to exclude the influence of extrinsic noise, we preprocessed the data using the method in [53].Each mRNA count is normalized by dividing it by the total count in each sample and scaled by the median count across all samples.Each data point in Fig. 4D denotes <m> and F for one gene, and the red curve is a fitted trendline for all data.There exist lower bounds on Fano factor and burst size at each expression level; meanwhile, 2 m = 2 m ∕<m> 2 fluctuates around the mean for large <m> (Fig. 4D, inset), while the total noise intensity in the original data from [52] remains nearly constant at high expression levels [54].The lower bound on F is nearly irrelevant to <m> at low expression but rises prominently with increasing <m> at high expression.That is, higher expression is accompanied by increased burstiness.This phenomenon has been widely observed in mammalian cells [44,[55][56][57][58], fly embryos [59][60][61], and even Escherichia coli [62,63], suggesting that transcriptional regulation may obey a universal principle across the genome, i.e., high-level expression is associated with the bursting with large size.
This global constraint results from the correlations among bursting parameters.Given that ⟨m⟩ = , τ p is ex pressed as a function of p 1 , τ S , and <m>, and, thus, , where τ S0 is the minimum of τ S and p 10 is the maximum of p 1 .F min is greater than one and rises with increasing <m> at high expression, as illustrated by the black curve in Fig. 4E, which is similar in trend to the black one in Fig. 4D (to justify that this global restriction is intrinsic to the transcriptional machinery, few constraints are imposed on parameter values; otherwise, if τ I > τ A were guaranteed, the 2 curves would match).A cluster of red curves is plotted in Fig. 4E, each beginning with the same parameters and ending with a point on the black line.Each curve represents the F-<m> curve for some gene, similar to that shown in Fig. 3B.
To recapitulate the lower bound on F at high expression levels in Fig. 4D, we systematically altered the values of p 1 , τ S , and τ P and calculated the mean mRNA number and Fano factor in the case of τ A < τ I .The scatter plot of F versus m has similarities to that in Fig. 4D (Fig. 4F).The global constraint becomes more striking when there exists a longer refractory period in the inactive phase or a higher transcription rate, such as in the case of large τ S and small τ P , leading to sharp bursting peaks and enhancing the discernibility of burst profiles.By contrast, high-level expression is not necessarily associated with large Fano factor in the telegraph model and other phenomenological models (inset of Fig. 4F and Fig. S9), which arises from the independence of bursting parameters.Collectively, the global constraint reflects an inherent feature of transcriptional bursting.

Transient dynamics reflect the underlying regulatory mode
After showing the typical features of transcriptional bursting in steady state, we turned to explore the transcriptional response to a step rise in regulator concentration.A new steady state will be reached in diverse manners because of different regulatory modes and composition of the inactive phase.With the initial state completely silent (i.e., p 1 → 0, p 2 → 1, or τ i → ∞ and no mRNA production), the mean probability P A (t) of gene activation and mean mRNA number are determined as follows (see Text S8 for details): where f A (t A ), f I (t I ), and f I_init (t I_init ) separately denote the distribution functions of the duration of the active (inactive) phase (6) The red lines show Fano factor versus <m>, and the black line shows the lower limit of Fano factor at large <m> with τ S ≥ 1.The default parameters are the same as in (A).(F) Fano factor versus <m> with the event model or telegraph model (inset).Ten thousand sets of bursting parameters are used in each panel, where the duration of a burst varies between 1 and 1,000 min, the transcription rate is from 0.1 to 100/min and τ A < τ I .The refractory period is set to 1 (blue), 5 (red), or 10 min (yellow).
in the new steady state and the time taken to enter the active phase (Fig. 5A, top), with A and f I (t I ) ≈ Γ(t I , α I , θ I ) (Gamma distribution), while * represents convolution.Depending on the initial condition, f I_init can assume different forms (Fig. S10).f first = f A * f I_init represents the distribution function of the first burst duration t first , which is the time from stimulus onset to completion of mRNA synthesis in the first active phase, and can be expressed as Γ(t first , α f , θ f ).f T = f A * f I denotes the distribution function of burst duration in steady state (t ss ).The mean and coefficient of variability (CV) of t first are τ first and CV first , respectively.The peak and steady-state values of P A are separately P AP and P Ass = τ A /T (T is the mean burst period) (Fig. 5A, middle); <m> p and <m> SS = bτ m /T denote the mean peak and steady-state values of mRNA count, respectively (Fig. 5A, bottom).CV Z (Z = P, S1, S2, S3, S) refers to the CV for τ Z in steady state.
Figure 5B shows the typical dynamics of P A (t): P A either overshoots P Ass or monotonically rises toward P Ass .Without P A overshooting, <m> p cannot exceed <m> ss , and its dynamics also markedly depend on τ m (Fig. 5C).To determine the condition for overshoot, we introduced H P = (P AP − P Ass )/P Ass and H m = (<m> p − <m> ss )/<m> ss .P A (<m>) must overshoot for , δ P = 0, and δ m = τ m .H P is governed by τ first /T and CV first ; the area below the red curve of h p = 1/T corresponds to H p > 0 (Fig. 5D).H m also relies on τ m /T, and the phase plane of h m = 1/T separates H m = 0 from H m > 0 (Fig. 5E).Strikingly, overshoot appears at small values of τ first /T, CV first , and τ m /T.On the other hand, the time taken to reach half of P Ass (<m> ss ) is denoted as t GA (t re ); t re is affected by mRNA production and degradation.If the rising phase of <m>(t) is governed by mRNA production over the time scale of τ m due to strong correlation of bursts among individual simulations, t re can be less than τ m ln2; otherwise, the dynamics of <m> are controlled by mRNA degradation, leading to t re > τ m ln2.Thus, decreasing τ first /T and CV first contributes to reducing t re ; changing T/τ m has a dual effect, and exclusively increasing it leads to a first drop and then rise in t re (Fig. 5F).
In general, τ first /T and CV first are modulated by the initial state distribution and constrained by the promoter architecture.When the gene state immediately enters the active phase upon stimulus onset due to regulation of Pol II-dependent reactions, τ first /T and CV first are separately close to τ A /T and the CV of the active phase duration (C A , which is often ~1), whereas τ first /T and CV first separately approach 1 and the CV of t ss (i.e., CV ss ) when the state just leaves the active phase.If there are multiple rate-limiting steps in burst cycle before stimulus onset [11,44], the initial state is mostly distributed around those states, usually leading to τ A /T < τ first /T < 1 and small CV first , whereas τ first /T ~ 1 and CV first ~ 1 only if there is one rate-limiting step.The number of rate-limiting steps is governed by the promoter architecture [11,64,65] and is associated with the magnitude of p 2 .For large p 2 , the sum of all durations of E S3 within a burst cycle often accounts for the majority of the burst period, and its distribution is approximately exponential, implying that large p 2 is often associated with only one rate-limiting step (i.e., nucleosome clearance) (see Text S2).By contrast, formation of the scaffold complex and pause of Pol II are the major rate-limiting steps for small p 2 .For instance, the nucleosome occupancy rate is rather low at target genes of white collar complex in Neurospora crassa, which is linked with small p 2 and relatively large τ S1 + τ S2 , leading to the observable overshoot [46].Collectively, analyzing the transient response also provides insight into the modulatory mechanism and promoter architecture.Notably, only a finite number of scenarios emerge from the comparison of transient transcriptional responses across different regulatory modes.When the gene is activated from the silent state to the same steady state (with identical f A and f I ), for example, 4 scenarios appear upon examining the ordering of t re values across 6 basic regulatory modes (Fig. 6A).These relationships primarily depend on the composition of the inactive phase, i.e., the values of p 2 , τ S2 /τ S1 , and τ S3 /τ S1 (Fig. 6B).Nevertheless, the fastest response always occurs via MEE, as paused Pol II is poised to enter productive elongation upon an excitatory signal [66]; the second fastest response takes place via MFI, as the transcription machinery is ready for assembly on the promoter (Figs.S11 and S12).Under combinatory modes, a faster response can be induced to reach the same <m> ss when τ P is regulated (Fig. 6C).Therefore, accelerating the reactions involving Pol II can elicit fast transcription from a completely silent state.
Moreover, it may be beneficial for an inducible gene to respond fast to stimuli and generate a broad range of outputs.The rapid response largely results from the overshoot caused by regulating τ P via MEE, while regulating τ P in the case of τ A > τ I or regulating p 1 and τ S in the case of τ A < τ I elicits a broad range in <m>.Thus, integrating MEE with MX (X = IE, EA, FI, CC, or FA) allows for a fast response and highly tunable expression (Fig. 6C).In this sense, regulating E I P and I E is conducive to inducible genes [67].

Discussion
The event model showcases several strengths when compared with previous phenomenological models.First, biochemical reactions involved in transcription are accurately mapped to the events in our model, and parameter values can be directly inferred from experimental data.Second, the event model retains the correlations among the bursting variables [transcription rate constant, burst size, and duration of the (in)active phase].This naturally imposes constraints on burst profiles, guaranteeing bursty transcription in highly expressed genes [52,68,69] and enabling the modulation of burst frequency and burst size to separately predominate at low and high expression levels [47,57,70].Third, 9 kinds of functional state transitions are distinctly separated in time scales from seconds to hours, and the durations and occurrence probabilities of events are independently modulated.These enable transcriptional bursting to manifest across a broad spectrum of temporal scales, exhibiting diverse dynamic patterns.With these features, the event model could have wide applications.
In principle, we can deduce the changes in (τ P , τ S , p 1 ) through the correlations among observables and further reveal the promoter architecture and molecular regulatory mechanisms.For example, E P consists of sequential stages (initiation, promoter clearance, elongation, and promoter-proximal pausing).The tendency of Pol II to continually synthesize mRNAs and the duration of promoter-proximal pause determine p 1 and τ P , respectively.For a group of gap genes in early Drosophila embryos [47], τ P is nearly fixed, but p 1 is regulated, suggesting that regulators strongly affect the stability of the transcription complex and weakly affect the motion of Pol II along DNA templates, e.g., modulating the phosphorylation of Pol II C-terminal domain.In the regulation of ctgf expression by TGF-β1 [50], the duration of the active phase varies only slightly, while the transcription rate constant rises with increasing the TGF-β1 concentration, suggesting that TGF-β1 affects the ratio of forward to backward rates of Pol II movement along DNA templates during the stalling and reverting of the EEC.
In the regulation of sygl-1 expression in C. elegans, the burst size remains relatively constant, while τ A experiences changes [51], suggesting that GLP-1/Notch influences the duration of a highly irreversible stage, e.g., early elongation of Pol II until promoter-proximal pausing.Together, although the major stages in the event model are similar across genes, the reversibility and rate limits of those stages govern the regulatory modes, reflected in the changes to the durations and ordering of the events.Cells are exposed to various fluctuations, and some have to rapidly respond and adapt to new environments via transcription [67].For inducible genes with rapid transcriptional activation [66], it is desirable to trigger highly tunable expression and allow transcription overshoot to lift the limitation of mRNA lifetime on response speed.This can be achieved when τ P and/or p 1 (with τ A > τ I ) or τ S and/or p 1 (with τ A < τ I ) are regulated.Meanwhile, transcriptional overshoot can be evoked when Pol II is poised for transcription, or there exist multiple rate-limiting steps, e.g., a refractory period and PIC recruitment during the inactive phase.The genes are indeed prone to overshoot and respond fast in the presence of a refractory period [46], helpful for the rapid and accurate transmission of information.On the contrary, eukaryotic genes with TATA box often burst with a large size and seldom overshoot [11], conducive to filtering environmental fluctuations, and the TATA box seems to be correlated with lack of a refractory period.
The event model engages various types of regulatory factors; nevertheless, only the concentration of one regulator was altered, while the concentrations of others were kept constant in simulations.When transcriptional modulation involves coordinating multiple regulators and the relationships between their concentrations are known, it is possible to elucidate the combinatorial control of bursting parameters.We could determine the dependence of bursting parameters on the concentration of one of the involved regulators and generate <m>-F curves under diverse conditions.Note that the conclusions drawn in Figs.3A, 4D to F, and 5 rely solely on the values of τ P , τ S1 , τ S2 , τ S3 , p 1 , and p 2 and are not influenced by the singleregulator assumption.In Figs.3C and D and 4A to C, the experimental data were collected when only one kind of regulator was monitored.If the concentrations of multiple regulators change simultaneously, the trends depicted in Fig. 3B may undergo changes, although the <m>-F τP and <m>-F τS curves still constitute the boundaries.
Transcriptional noise can be divided into intrinsic and extrinsic noise [71].Here, the intrinsic noise is mainly manifested in the stochasticity in event orders, event durations, and mRNA degradation.Treating mRNA degradation as a one-step process is a simplification; however, it was reported that, in a broad range of parameter space, models assuming either multistep or one-step degradation of mRNA yield indistinguishable mRNA count distributions [16].The current work did not explore extrinsic noise, which is influenced by various factors including cell volume, gene copies, DNA replication, and cell cycle progression [72,73].For example, it has been shown that the presence of 2 uncorrelated alleles halves the noise intensity [56].It was reported that extrinsic noise typically leads to a more dispersed mRNA count distribution, such as an increased probability of low numbers or a longer tail, and that extrinsic noise contributes substantially to the total noise at high expression levels, leading to a higher noise plateau [11].It was also demonstrated that parameter variability among cells should be taken into account to fully interpret the experimental data [54].Furthermore, when analyzing data obtained through single-cell measurements, technological noise is also unavoidable [53].All these suggest that the event model has to be extended to incorporate more sources of noise, and it would be interesting to dissect the contributions of intrinsic and extrinsic noise.
The current work explored the generic case where the dwell times of regulators are much shorter than the time taken to switch between the inactive and active phases, regulator binding promotes transcription with monotonic gene-regulatory functions, and only one gene locus is involved [5,34,35,74].More advanced models could be developed to probe the gene-specific transcriptional activity.If the concentration of regulators varies periodically (e.g., the pulsing of p53 and nuclear factor κB levels), how transcriptional bursting is modulated to reliably code signals could be the focus of further study.
In summary, the event model, built upon a small set of adjustable functional events, offers a clear and straightforward description of transcriptional progression.This approach facilitates the investigation of both shared and specific mechanisms for gene regulation.Modulation can occur at any stage of a burst cycle, but the underlying regulatory modes are limited in number because of intrinsic correlations among transcriptional events.Our results suggest that large Fano factor is required for transcriptional bursting at high expression levels and the transitions E I P and I E , involved in the recruitment, pause and release, or elongation of Pol II, are pivotal regulatory targets for enhancing the transcriptional sensitivity, anti-crosstalk, adaptability, and responsiveness.Combining experimental data with the event model allows for deducing the molecular processes that predominantly govern transcriptional activity.

Methods
The development of the event model is an important part of this work, and the model is the basis of the whole study.Thus, we provided the details on building the model in the "Establishment of the event model" section.Texts S1 to S8 further present the derivation of formula and explanations.We made a detailed reference to the Supplementary Materials in the text.
) of mRNA numbers for different distributions of f P and f S .

Fig. 1 . 1 + , J 1 − , J 2 +, and J 2 −
Fig. 1.Model of the transcriptional cycle and transcriptional regulation.(A) Schematic of biochemical reactions underlying the gene states (top) and state transitions (middle; large circles denote gene states, while small circles denote omitted states).According to whether the core promoter is open or activated, all states are divided into 5 subsets (C, F, A, I, and E; bottom).(B) Flow-process diagram of the event-based model simplified from (A).There are 6 non-Markovian events (rectangles) and 2 checkpoints (diamonds)."Y" and "N" separately represent whether the criterion is satisfied or not; r i and r j are random numbers from the uniform distribution on the unit interval.Transcription begins with E begin , goes through E S2 , E P , E S1 , and E S3 , and ends with E end .The ordering of the events is determined by p 1 and p 2 , which are decided by the event flows (J 1 + , J 1 − , J 2 + , and J 2 −

Fig. 2 .
Fig. 2. Sensitivity to changes in regulator concentration ([R]) and resistance against crosstalk.(A) Schematic of the mean regulator occupancy rate (top) and steady-state transcription rate (bottom) versus the normalized concentration of regulators.K d denotes the equilibrium dissociation constant for regulator binding, and K d /Ω X reflects the transcriptional sensitivity.(B) Schematic of the mean transcription rate (top) and specificity S (bottom) versus [R]/K d .The equilibrium dissociation constant is K d and 100K d , respectively, for cognate and nontarget binding.(C) Regulation of E I P , I E , A I , F A , or C F promotes the gene's sensitivity to the regulator when its binding facilitates state transitions.(D) S m versus Ω X /K d under different regulatory modes.The default parameters are τ PU = 1 min, τ S1U = 5 min, τ S2U =10 min, τ S3U = 20 min, p 2U = 0.5, and p 1U = 0.5, leading to τ AU < τ IU (top); τ PU = 10 min, τ S1U = 2 min, τ S2U =5 min, τ S3U = 10 min, p 2U = 0.5, and p 1U = 0.9, leading to τ AU > τ IU (bottom).τ PB , τ S1B , τ S2B , τ S3B , p 1B , and p 2B are decided by Ω X /K d and regulatory modes (see Table3).
or C F is modulated, and <m> = bfτ m , Ω f < Ω m < Ω b generally holds true, and thus f, <m>, and b reach their respective maximum values successively (see Text S4 and Fig. S8 for details).Furthermore, f is sensitive to changes in <m> at low expression, while b is insensitive, and vice versa at high expression (Fig. 4B, top) .We also used S b = d ln b d ln m and S f = d ln f d ln m (S b + S f = 1; S b and

Fig. 4 .
Fig. 4. Modulation of transcriptional bursting.Time is in units of minutes.(A) Mean mRNA count <m>, burst size b, and burst frequency f, normalized by their respective maximum values, versus the normalized regulator concentration [R].The regulator binding affects p 1 and τ S ; τ P = 0.2, p 1B = 0.995, p 1U = 0.5, τ SU = 250, τ SB = 5 and n = 1.(B) Burst size and frequency versus the mean mRNA number (top); S b and S f versus <m> (bottom).The dashed line marks the case of τ I = τ A .The black arrows point to <m> with S f = S b .The parameters are the same as in (A).The curves in (A) and (B) are plotted on the basis of analytic expressions.(C) A fit to the experimental data (symbols) in [47] using the event model with τ P = 0.139, p 1U = 0.401, p 1B = 0.995, τ SU = 498, and τ SB = 0.694 (solid line).(D) Experimental data on the mean expression level and Fano factor in [52].Each point is from an individual gene with the shadow showing the range.The red fitting curve is obtained by Gaussian regression.The black curve shows the lower bound on Fano factor at large <m>.The inset shows 2 m = 2 m ∕<m> 2 versus <m>, with the red dashed line representing the mean.(E) The red lines show Fano factor versus <m>, and the black line shows the lower limit of Fano factor at large <m> with τ S ≥ 1.The default parameters are the same as in (A).(F) Fano factor versus <m> with the event model or telegraph model (inset).Ten thousand sets of bursting parameters are used in each panel, where the duration of a burst varies between 1 and 1,000 min, the transcription rate is from 0.1 to 100/min and τ A < τ I .The refractory period is set to 1 (blue), 5 (red), or 10 min (yellow).

Fig. 5 .
Fig. 5. Response patterns upon stimulation.Time is in units of minutes.(A) Schematic of time courses of [R], the mRNA number (black; red bars represent Pol II entering productive elongation), the probability of gene activation, and the mean mRNA count (from top to bottom).The panel in the second row also marks the durations of the active phase (t A ), inactive phase (t I ), first inactive phase (t I_init ), first burst (t first ), and a burst in steady state (t ss ).(B) Time courses of P A with τ A = 10, α I = 2, and θ I = 10.The α and θ of f first are 3 and 5 (τ first = 15, CV first 2 = 1/3), 3 and 10 (τ first = 30, CV first 2 = 1/3), 3 and 15 (τ first = 45, CV first 2 = 1/3), 5 and 6 (τ first = 30, CV first 2 = 1/5), and 2 and 15 (τ first = 30, CV first 2 = 1/2), respectively.The data points are from simulation, while the solid curves are from analytical expression (Eq.H2 in Text S8).(C) Time courses of the mean mRNA number.f A and f I are the same as in (B); the α and θ of f first are 3 and 5 (τ first = 15, CV first 2 = 1/3) for τ m = 1, 10, or 20, and P A overshoots.In the inset, the α and θ of f first are 2 and 15 (τ first = 30, CV first 2 = 1/2) for τ m = 1, 10, or 20, and P A does not overshoot.<m>(t) is normalized by <m> ss .The data points are from simulation, while the solid curves are from Eq. H5. (D) Color-coded H P controlled by τ first /T and CV first .The black points signify that P A does not overshoot with h P T < 1, while the red curve corresponds to h P T = 1.Small values of τ first /T and CV first lead to large H P .(E) Color-coded H m controlled by τ first /T, CV first , and T/τ m .The black points denote that m does not overshoot with h m T < 1, while the red surface represents h m T = 1.(F) Color-coded t re controlled by τ first /T, CV first , and T/τ m .In (D) to (F), the initial state is completely silent, and τ A ∈ (0.1,10), τ I ∈ (0.1,100), CV P = 1, CV i ∈ (0.1,1) (i = S1, S2, S3), and τ I_init = xτ I with x ∈ (0.1, 10).

Table 1 .
Gene state sets.

Table 2 .
Reactions in functional transitions.