Stochastic models of transcription: From single molecules to single cells

doi:10.1016/j.ymeth.2013.03.026

Methods

Volume 62, Issue 1, 15 July 2013, Pages 13-25

Abstract

Genes in prokaryotic and eukaryotic cells are typically regulated by complex promoters containing multiple binding sites for a variety of transcription factors, leading to a specific functional dependence between regulatory inputs and transcriptional outputs. With increasing regularity, the transcriptional outputs from different promoters are being measured in quantitative detail in single-cell experiments, thus providing the impetus for the development of quantitative models of transcription. We describe recent progress in developing models of transcriptional regulation that incorporate, to different degrees, the complexity of multi-state promoter dynamics and its effect on the transcriptional outputs of single cells. The goal of these models is to predict the statistical properties of transcriptional outputs and characterize their variability in time and across a population of cells, as a function of the input concentrations of transcription factors. The interplay between mathematical models of different regulatory mechanisms and quantitative biophysical experiments holds the promise of elucidating the molecular-scale mechanisms of transcriptional regulation in cells, from bacteria to higher eukaryotes.

Introduction

Transcriptional regulation is a complex biochemical process that often involves multiple transcription factors that bind to multiple sites on regulatory DNA in response to intracellular or extracellular signals. When bound to these regulatory DNA sequences, transcription factors either inhibit or enhance transcription through interactions with RNA Polymerase (RNAP) and other transcription factors. Most regulatory sequences, which we refer to as “promoters”, contain several operator sequences (“operators”), each of which can often be recognized with different affinities by more than one type of transcription factor. Even bacterial promoters, which are often considered to be simple in comparison with their eukaryotic counterparts, can exist in a surprisingly large number of regulatory states. For instance, the P_RM promoter of phage lambda in Escherichia coli is regulated by two different transcription factors, which bind to two distal sets of three operators that can be brought together by looping out the intervening DNA. As a result, the number of regulatory states (each of which corresponds to a specific combination of occupancies of the different operators) of the P_RM promoter is 128 [1]. Eukaryotic promoters can be even more complex, also involving nucleosomes that compete with (and may be removed by) transcription factors [2]. Furthermore, eukaryotic promoters are often epigenetically regulated via histone modifications [3], [4], [5], in addition to the more conventional regulation by transcription factors. This may lead to very complex promoter dynamics, which may also involve a separation of timescales between genetic and epigenetic regulation [3]. 

Given the complexity of most promoters, quantitative mathematical models play an important role in testing molecular-scale mechanisms of transcriptional regulation. They help connect biochemical models of regulation, proposed in response to in vitro experiments with purified components, with quantitative gene expression measurements in vivo.

The first generation of such models was developed in response to experiments where the transcriptional response from a population of cells was measured as a function of intracellular concentrations of transcription factors [6], [7], [8], [9]. These models helped connect specific promoter architectures (characterized by the arrangement of transcription factor binding sites in the promoter region) with their input–output functions, i.e., the amount of transcripts produced as a function of the input concentrations of all the transcription factors involved.

More recently, technological developments such as the use of fluorescent proteins as reporter genes have made it possible to extend these studies to single cells. Given that genes are present at very low copy numbers (typically 1–2 copies per cell), as are transcription factors (as few as 5–10 copies per cell), transcription in single cells is an inherently stochastic process. This stochasticity leads to random fluctuations in the transcriptional output of single cells. As a result, the outcome of these single-cell transcription experiments contains more information than the population average of gene expression reported in bulk experiments. The whole distribution, rather than just the average mRNA and/or protein number per cell in a population, can be measured [10], [11]. In addition, by tracking the number of mRNA and protein molecules as a function of time in single cells, these experiments reveal many aspects of the dynamics of transcription that are obscured by bulk experiments [12], [13]. Notably, direct monitoring of transcriptional dynamics in live cells has demonstrated that transcription may occur in bursts, both in bacteria such as E. coli [12], [14], and in eukaryotic cells such as Dictyostelium [15].

The class of models developed in response to bulk experiments, the so-called “thermodynamic models”, focused on computing the steady state occupancies of the different operators by the transcription factors [6], [7], [8], [9], [16], [17]. For a specific promoter architecture, these models can be used to predict the equilibrium probability of each promoter state and therefore the average transcriptional output. Even though these models have been very useful for computing average gene expression levels in steady state (see for instance [9], [18], [19], [20], [21]), they have nothing to say about the dynamics of gene regulation, i.e., which promoter states are kinetically connected, and how often the promoter makes transitions from one state to another. To address these questions, a new class of stochastic models of gene regulation has been developed during the last decade [22], [23], [24], [25], [26], which is specifically tailored to deal with transcription from arbitrarily complex promoters at the single-cell level. Here we give an overview of these models, the equations that emerge from them and how they can be used to address single-cell experiments, and we discuss their limitations.
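As a concrete illustration of the thermodynamic approach, the sketch below computes equilibrium state probabilities for a hypothetical simple-repression promoter with three mutually exclusive states (empty, RNAP bound, repressor bound). The architecture, the function name `promoter_state_probabilities`, and the parameters `rnap`, `repressor`, `k_p` and `k_r` (concentrations and dissociation constants in the same units) are assumptions chosen for illustration; they are not taken from the models reviewed here.

```python
def promoter_state_probabilities(rnap, repressor, k_p, k_r):
    """Equilibrium probabilities of three mutually exclusive promoter states.

    Assumed toy architecture: RNAP and a repressor compete for overlapping
    sites, so the promoter is empty, RNAP-bound, or repressor-bound.
    Each state gets a statistical weight (concentration / dissociation
    constant), and probabilities are weights normalized by their sum.
    """
    weights = {
        "empty": 1.0,
        "rnap_bound": rnap / k_p,
        "repressor_bound": repressor / k_r,
    }
    partition_function = sum(weights.values())
    return {state: w / partition_function for state, w in weights.items()}


# Example: with no repressor and rnap == k_p, RNAP occupies the promoter
# half of the time; adding repressor lowers that occupancy.
probs = promoter_state_probabilities(rnap=1.0, repressor=1.0, k_p=1.0, k_r=1.0)
print(probs["rnap_bound"])
```

In this picture the average transcriptional output is taken to be proportional to the RNAP-bound probability, which is why such a calculation predicts mean expression but carries no information about transition rates between states.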

The task of predicting the distribution of mRNA or protein copy number across a population of cells, or the distribution of times between transcription initiation events in a single cell, is significantly more challenging than determining the mean expression level. While the latter involves a straightforward application of equilibrium statistical mechanics, the former requires formulation of stochastic differential equations, or chemical master equations, which are often not tractable analytically, except in certain limits. In spite of this, several theoretical approaches have been developed that provide insights about how promoter dynamics affects stochasticity of gene expression [27], [28], [29] and how promoter architecture affects the transcriptional output from single cells [18], [23], [24], [26], [30].
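When a master equation cannot be solved analytically, its statistics can still be estimated by direct stochastic simulation. The following is a minimal sketch of a Gillespie-type simulation for the simplest multi-state case, a promoter that switches between an inactive and an active state and produces mRNA only when active. The two-state model, the function name `simulate_two_state_promoter`, and the rate names `k_on`, `k_off`, `r` and `gamma` are illustrative assumptions, not the specific formulation used in the works cited above.

```python
import random


def simulate_two_state_promoter(k_on, k_off, r, gamma, t_end, seed=0):
    """Return the time-averaged mRNA copy number from one stochastic trajectory.

    Reactions (assumed two-state model):
      OFF -> ON at rate k_on, ON -> OFF at rate k_off,
      ON  -> ON + mRNA at rate r, mRNA -> nothing at rate gamma per molecule.
    """
    rng = random.Random(seed)
    t, active, m = 0.0, False, 0
    area = 0.0  # running integral of m(t) dt
    while t < t_end:
        rates = [
            k_off if active else k_on,  # promoter switching
            r if active else 0.0,       # transcription initiation
            gamma * m,                  # mRNA degradation
        ]
        total = sum(rates)
        dt = rng.expovariate(total)     # waiting time to the next reaction
        if t + dt > t_end:
            area += m * (t_end - t)
            break
        area += m * dt
        t += dt
        u = rng.random() * total        # choose which reaction fires
        if u < rates[0]:
            active = not active
        elif u < rates[0] + rates[1]:
            m += 1
        else:
            m -= 1
    return area / t_end


mean_mrna = simulate_two_state_promoter(k_on=1.0, k_off=1.0, r=20.0, gamma=1.0, t_end=2000.0)
print(mean_mrna)  # should lie close to r * k_on / ((k_on + k_off) * gamma) = 10
```

Recording the full trajectory instead of the running average would give access to the whole copy-number distribution, at the cost of more bookkeeping.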

While the focus of this paper is the relation between promoter architecture and noise in gene expression, other sources of heterogeneity in gene expression at the single cell level have been analyzed as well, such as fluctuations in transcription factor concentration, diffusion of transcription factors to the promoter, the presence of transcriptional feedback through self-regulation, or fluctuations in the global cellular state [31], [32], [33], [34]. We consider these sources of fluctuations in the transcriptional output of cells, as well as the mathematical methods that have been used to describe them, in Section 4.

This paper is organized as follows: First, we discuss methods for computing the probability distributions of mRNA and protein copy-number from stochastic models of transcription. These distributions can be measured in experiments that count the mRNA or protein content of a single cell in a population (Fig. 1A). We demonstrate that these distributions are significantly affected by the dynamics of transcription and we discuss how information about the dynamics can be extracted from experiments. In the second part of this review we focus on methods to compute the distribution of times between subsequent transcriptional events, another measurable quantity in single-cell transcription experiments; see Fig. 1B. Just as before, our goal is to illustrate how a complex molecular model of promoter dynamics, in which the promoter can exist in any number of states, can be associated with an equation that in turn can be connected to experimental data.

We believe that this dialog between experimental data and mathematical modeling, where quantitative data is used to test model predictions, and models are further refined based on comparisons with experimental outcomes, is essential in order to drive progress in quantitative understanding of how gene regulatory function is determined by the sequence of regulatory DNA. However, it is important to be aware of the assumptions and limitations inherent to any equation that is formulated in response to experiments on single cells, which are far more complex than the models. In the third part of this paper, we discuss these limitations, and highlight areas in which further theoretical and experimental developments are needed.

Section snippets

Steady state distributions of mRNA and protein copy number

Recent experimental [10], [12], [35], [36], [37] and theoretical [22], [23], [28], [29], [30] studies have demonstrated that the distribution of mRNA copy number per cell, and its moments, can be dramatically affected by the underlying dynamics of the promoter controlling transcription of the mRNA being measured. Even qualitative features of the distribution, such as whether it is bimodal or unimodal, are determined by the detailed properties of promoter dynamics, such as the values of the
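To make the dependence of the moments on promoter kinetics concrete, the sketch below evaluates the steady-state mean and Fano factor (variance divided by mean) of mRNA copy number for a two-state promoter, using the standard closed-form results for that model. The restriction to two states, the function name `two_state_mrna_moments`, and the rate symbols `k_on`, `k_off`, `r` and `gamma` are assumptions made for illustration; more complex promoter architectures require the more general treatment.

```python
def two_state_mrna_moments(k_on, k_off, r, gamma):
    """Steady-state mean and Fano factor of mRNA for a two-state promoter.

    Assumed model: the promoter switches OFF -> ON at rate k_on and
    ON -> OFF at rate k_off, transcribes at rate r while ON, and each
    mRNA is degraded at rate gamma.
    """
    switching = k_on + k_off
    mean = (k_on / switching) * r / gamma
    fano = 1.0 + r * k_off / (switching * (switching + gamma))
    return mean, fano


# Slow switching inflates the Fano factor well above the Poisson value of 1,
# while a promoter that never switches off (k_off = 0) gives exactly 1.
print(two_state_mrna_moments(k_on=1.0, k_off=1.0, r=20.0, gamma=1.0))
print(two_state_mrna_moments(k_on=1.0, k_off=0.0, r=20.0, gamma=1.0))
```

Two promoters with the same mean can therefore differ strongly in their noise, which is one reason the full distribution is more informative than the population average.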

Distribution of times between subsequent transcription events

Experiments that reveal the dynamics of transcription initiation at promoters can also reveal the molecular mechanisms of transcription regulation [59], [60]. Several such experiments, in which the synthesis of new mRNA molecules was visualized in live cells with single-molecule resolution, have been carried out both in bacteria and in eukaryotic cells [12], [15], [61], [62]. These experiments have demonstrated that transcription can occur in bursts, and a typical output from such an experiment is
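As an illustration of how a promoter model maps onto the distribution of times between initiation events, the sketch below samples those intervals for a two-state promoter and compares them with the mean and squared coefficient of variation that follow from that model. The two-state assumption, the function names `sample_initiation_intervals` and `interval_statistics`, and the rates `k_on`, `k_off` and `r` are hypothetical choices for this example; the analytic expressions apply only to this simple case.

```python
import random


def sample_initiation_intervals(k_on, k_off, r, n, seed=0):
    """Sample n times between successive initiation events.

    Assumed model: initiation occurs at rate r while the promoter is ON,
    and the promoter is still ON immediately after each initiation.
    """
    rng = random.Random(seed)
    intervals = []
    for _ in range(n):
        t = 0.0
        while True:
            # ON state: initiation (rate r) races against switching OFF (rate k_off).
            t += rng.expovariate(r + k_off)
            if rng.random() < r / (r + k_off):
                break                      # initiation happened first
            t += rng.expovariate(k_on)     # time spent OFF before returning to ON
        intervals.append(t)
    return intervals


def interval_statistics(k_on, k_off, r):
    """Mean and squared coefficient of variation for the two-state model."""
    switching = k_on + k_off
    mean = switching / (r * k_on)
    cv2 = 1.0 + 2.0 * r * k_off / switching**2
    return mean, cv2


samples = sample_initiation_intervals(k_on=1.0, k_off=1.0, r=20.0, n=20000)
print(sum(samples) / len(samples), interval_statistics(1.0, 1.0, 20.0))
```

A squared coefficient of variation above 1 signals bursty initiation, whereas a promoter that is always active gives exponentially distributed intervals with a value of exactly 1.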

Discussion and outlook

As with any quantitative model, especially one attempting to describe processes within a living cell, it is important to understand the limitations of the chemical master equation description of transcription presented here. Particular care has to be taken when using mathematical models in conjunction with data in order to test specific hypotheses about biological mechanisms. Models are most informative when there is a discrepancy between the model predictions and experimental data.

Acknowledgements

We are indebted to Rob Phillips, Hernan Garcia, and Jeff Gelles for numerous discussions which have shaped our understanding of transcriptional regulation, and to the NSF for financial support via grants DMR-0706458 and DMR-1206146.

References (77)

A. Halme et al., Cell (2004)
L. Weinberger et al., Mol. Cell (2012)
M.A. Shea et al., J. Mol. Biol. (1985)
L. Bintu et al., Curr. Opin. Genet. Dev. (2005)
L. Bintu et al., Curr. Opin. Genet. Dev. (2005)
I. Golding et al., Cell (2005)
J.R. Chubb et al., Curr. Biol. (2006)
H.G. Garcia et al., Trends Cell Biol. (2010)
H. Boeger et al., Cell (2008)
M.L. Simpson et al., J. Theor. Biol. (2004)
J. Peccoud et al., Theor. Popul. Biol. (1995)
T.B. Kepler et al., Biophys. J. (2001)
J. Rausenberger et al., Biophys. J. (2008)
A.M. Walczak et al., Biophys. J. (2009)
D. Kennell et al., J. Mol. Biol. (1977)
J. Zhang et al., Biophys. J. (2012)
L.J. Friedman et al., Cell (2012)
J.S. van Zon et al., Biophys. J. (2006)
P.S. Swain, J. Mol. Biol. (2004)
P.R. Cook, J. Mol. Biol. (2010)
H.G. Garcia et al., Cell Rep. (2012)
J.M.G. Vilar et al., Bioinformatics (2010)
G. Hornung et al., Genome Res. (2012)
L.M. Octavio et al., PLoS Genet. (2009)
G.K. Ackers et al., Proc. Natl. Acad. Sci. USA (1982)
A. Raj et al., PLoS Biol. (2006)
G.-W. Li et al., Nature (2011)
L. So et al., Nat. Genet. (2011)
T.T. Le et al., Proc. Natl. Acad. Sci. USA (2005)
L. Saiz et al., Nucleic Acids Res. (2008)
T. Kuhlman et al., Proc. Natl. Acad. Sci. USA (2007)
H.G. Garcia et al., Proc. Natl. Acad. Sci. USA (2011)
J. Gertz et al., Nature (2009)
E. Segal et al., Nat. Rev. Genet. (2009)
Á. Sánchez et al., Proc. Natl. Acad. Sci. USA (2008)
A. Coulon et al., BMC Syst. Biol. (2010)
T. Höfer et al., Genome Inform. (2005)
J. Paulsson, Nature (2004)

¹: These authors contributed equally to this work.
