Quantifying how post-transcriptional noise and gene copy number variation bias transcriptional parameter inference from mRNA distributions

Transcriptional rates are often estimated by fitting the distribution of mature mRNA numbers measured using smFISH (single molecule fluorescence in situ hybridization) with the distribution predicted by the telegraph model of gene expression, which defines two promoter states of activity and inactivity. However, fluctuations in mature mRNA numbers are strongly affected by processes downstream of transcription. In addition, the telegraph model assumes one gene copy but in experiments, cells may have two gene copies as cells replicate their genome during the cell cycle. While it is often presumed that post-transcriptional noise and gene copy number variation affect transcriptional parameter estimation, the size of the error introduced remains unclear. To address this issue, here we measure both mature and nascent mRNA distributions of GAL10 in yeast cells using smFISH and classify each cell according to its cell cycle phase. We infer transcriptional parameters from mature and nascent mRNA distributions, with and without accounting for cell cycle phase and compare the results to live-cell transcription measurements of the same gene. We find that: (i) correcting for cell cycle dynamics decreases the promoter switching rates and the initiation rate, and increases the fraction of time spent in the active state, as well as the burst size; (ii) additional correction for post-transcriptional noise leads to further increases in the burst size and to a large reduction in the errors in parameter estimation. Furthermore, we outline how to correctly adjust for measurement noise in smFISH due to uncertainty in transcription site localisation when introns cannot be labelled. Simulations with parameters estimated from nascent smFISH data, which is corrected for cell cycle phases and measurement noise, leads to autocorrelation functions that agree with those obtained from live-cell imaging.


Introduction
Transcription in single cells occurs in stochastic bursts (Suter et al., 2011;Larsson et al., 2019). Although the first observation of bursting occurred more than 40 years ago (McKnight and Miller, 1977), the precise mechanisms behind this phenomenon are still under active investigation (Nicolas et al., 2017;Tunnacliffe and Chubb, 2020). The direct measurement of the dynamic properties of bursting employs live-cell imaging approaches, which allow visualization of bursts as they occur in living cells (Donovan et al., 2019). However, in practice, such live-cell measurements are challenging because they are low-throughput and require genome-editing (Brouwer et al., 2020;Lenstra and Larson, 2016). To circumvent this, one can exploit the fact that bursting creates heterogeneity in a population. In this case, it is relatively straightforward to obtain a steady-state distribution of the number of mRNAs per cell from smFISH or single-cell sequencing experiments. These distributions have been used to infer dynamics by comparison to theoretical models. The simplest mathematical model describing bursting is the telegraph (or two-state) model (Peccoud and Ycart, 1995;Raj et al., 2006). In this model, promoters switch between an active and inactive state, where initiation occurs during the active promoter state. The model makes the further simplifying assumption that the gene copy number is one and that all the reactions are effectively first-order. The mRNA in this model can be interpreted as cellular (mature) mRNA since its removal via various decay pathways in the cytoplasm is known to follow single-exponential (first-order) decay kinetics in eukaryotic cells (Wang et al., 2002;Herzog et al., 2017). The solution of the telegraph model for the steady-state distribution of mRNA numbers has been fitted to experimental mature mRNA number distributions to estimate the transcriptional parameters (Raj et al., 2006;Kim and Marioni, 2013;Suter et al., 2011;Larsson et al., 2019).
However, the reliability of the estimates of transcriptional parameters from mRNA distributions is questionable because the noise in mature mRNA (and consequently the shape of the mRNA distribution) is affected by a wide variety of factors. Recent extensions of the telegraph model have carefully investigated how mRNA fluctuations are influenced by the number of promoter states (Zhou and Zhang, 2012;Ham et al., 2020b), polymerase dynamics , cell-to-cell variability in the rate parameter values (Dattani and Barahona, 2017;Ham et al., 2020a), replication and binomial partitioning due to cell division , nuclear export (Singh and Bokes, 2012) and cell cycle duration variability (Perez-Carrasco et al., 2020). One way to avoid noise from various posttranscriptional sources is to measure distributions of nascent mRNA rather than mature mRNA, and then fit these to the distributions predicted by an appropriate mathematical model. A nascent mRNA (Zenklusen et al., 2008;Larson et al., 2009) is an mRNA that is being actively transcribed, that is it is still tethered to an RNA polymerase II (Pol II) moving along a gene during transcriptional elongation. Fluctuations in nascent mRNA numbers thus directly reflect the process of transcription. Because nascent mRNA removal is not first-order, an extension of the telegraph model has been developed (the delay telegraph model) .
However, nascent mRNA data still suffers from other sources of noise due to cell-to-cell variability. For example in an asynchronous population of dividing cells, cells can have either one or two gene copies. In the absence of a molecular mechanism that compensates for the increase in gene copy number upon replication, cells with two gene copies which cannot be spatially resolved will have a different distribution of nascent mRNA numbers (one with higher mean) than cells with one gene copy. The importance of the cell cycle is illustrated by the finding (Zopf et al., 2013) that noisy transcription from the synthetic TetO promoter in S. cerevisiae is dominated by its dependence on the cell cycle. The estimation of transcriptional parameters from nascent mRNA data for pre-and post-replication phases of the cell cycle has, to the best of our knowledge, only been reported in Skinner et al., 2016. Interestingly, all the studies that estimate transcriptional parameters from nascent mRNA data Xu et al., 2015;Zoller et al., 2018;Senecal et al., 2014;Fritzsch et al., 2018) do not compare them with transcriptional parameters estimated from cellular (mature) mRNA data measured in the same experiment. Similarly, a quantitative comparison between inference from cellcycle-specific data and data which contains information from all cell cycle phases is lacking. Likely, this is because it is considered evident that quantifying fluctuations earlier in the gene expression process and adjusted for the cell-cycle will improve estimates. However, nascent mRNA distributions are technically more challenging to acquire than mature mRNA distributions; and inference from nascent mRNA distributions is substantially more complex . Thus, it still needs to be shown Fu, Patel et  that acquiring nascent mRNA data is a necessity from a parameter inference point of view, i.e. that it leads to significantly different and more robust estimates than using mature mRNA data. We also note that current studies report parameter inference from organisms where it is possible to label introns to identify mRNA located at the transcription site. This is not possible in many yeast genes and other microorganisms, and in these cases it is unclear how to correct parameter estimates for uncertainty in the transcription site location.
In this paper, we sought to understand the precise impact of post-transcriptional noise and cellto-cell variability on the accuracy of transcriptional parameters inferred from mature mRNA data. The fitting algorithms (for mature and nascent mRNA data) were first tested on simulated data, where limitations of the algorithms were uncovered in accurately estimating the transcriptional parameters in certain regions of parameter space. The algorithms were then applied to four independent experimental data sets, each measuring GAL10 mature and nascent mRNA data from smFISH in galactose-induced budding yeast, conditional on the stage of the cell cycle (G1 or G2) for thousands of cells. Comparison of the transcriptional parameter estimates allowed us to separate the influence of ignoring cell cycle variability from that of post-transcriptional noise (mature vs nascent mRNA data). We found that only fitting of nascent cell-cycle data, corrected for measurement noise (due to uncertainty in the transcription site location), provided good agreement with measurements from live-cell data. Cell-cycle specific analysis also revealed that upon transition from G1 to G2, yeast cells show dosage compensation by reducing burst frequency, similar to mammalian cells (Padovan-Merhar et al., 2015). Our systematic comparison highlights the challenges of obtaining kinetic information from static data, and provides insight into potential biases when inferring transcriptional parameters from smFISH distributions.

Results
Inference from mature mRNA data vs inference from nascent mRNA data: testing inference accuracy using synthetic data To understand the accuracy of the inference algorithms from nascent and mature mRNA data, in various regions of parameter space, (i) we generated synthetic data using stochastic simulations with certain known values of the parameters; (ii) applied the inference algorithms to estimate the parameters from the synthetic data; (iii) compared the true and inferred kinetic parameter values.
The generation of synthetic mature mRNA data (mature mRNA measurements in each of 10 4 cells) using stochastic simulations of the telegraph model ( Figure 1a) is described in Methods Sections Mathematical model and Generation of synthetic mature mRNA data. The inference algorithm is described in detail in Methods Section Steps of the algorithm to estimate parameters from mature mRNA data. It is based on a maximization of the likelihood of observing the single cell mature mRNA numbers measured in a population of cells. The likelihood of observing a certain number of mature mRNA numbers from a given cell is given by evaluating the telegraph model's steady-state mature mRNA count probability distribution.
For nascent RNA data, we used stochastic simulations of the delay telegraph model (Figure 1b) to generate the position of bound Pol II molecules from which we constructed the synthetic smFISH signal in each of 10 4 cells (Methods Section Generation of synthetic nascent mRNA data). An inference algorithm estimates the parameters, based on a maximization of the likelihood of observing the single cell total fluorescence intensity measured in a population of cells (Methods Section Steps of the algorithm to estimate parameters from nascent mRNA data). Note that the likelihood of observing a certain fluorescence signal intensity from a cell is given by extension of the delay telegraph model (but not directly by the delay telegraph model itself) to account for the smFISH probe positions.
This extension takes into account that the experimental fluorescence data used in this manuscript was acquired from smFISH of PP7-GAL10 in budding yeast, where probes were hybridized to the PP7 sequences. Because the PP7 sequences are positioned at the 5' of the GAL10 gene, the fluorescence intensity of a single mRNA on the DNA locus resembles a trapezoidal pulse (see Figure 1 for an illustration). As the Pol II molecule travels through the 14 repeats of the PP7 loops, the fluorescence intensity increases as the fluorescent probes binds to the nascent mRNA (this is the linear part of the trapezoidal pulse). However, once all 14 loops on the nascent mRNA are bound by the fluorescent probes, the intensity of a single mRNA reaches maximal intensity and the plot plateaus as the RNA elongates through the GAL10 gene body before termination and release. The total fluorescent signal density function is hence given by where p(s|k) is the density function of the signal s given there are k bound Pol II molecules and P(k; θ) is the steady-state solution of the delay telegraph model giving the probability of observing k bound Pol II molecules for the parameter set θ . In Methods Section Mathematical model, we show how p (s|k) can be approximately calculated for the trapezoidal pulse. Hence Equation (1) represents the extension of the delay telegraph model to predict the smFISH fluorescent signal of the transcription site. Note that both of these inference algorithms were used to infer the promoter switching and initiation rate parameters. The degradation rate and the elongation time were not estimated but assumed to be known. The inference and synthetic data generation procedures are summarised and illustrated in Figure 1d.  Note that the length of the nascent mRNA tail increases until Pol II terminates at the end of the gene. As Pol II travels through the 14 repeats of the PP7 loops, the intensity of the mRNA increases due to fluorescent probe binding to the mRNA; intensity saturates as Pol II enters the GAL10 gene body.
(d) Illustration of the algorithms to generate synthetic data and to perform inference from mature and nascent mRNA data. The green boxes are only applicable for the inference of the fluorescence signal intensity of nascent mRNAs; note that in nascent mRNA inference, the "RNA number" in the The accuracy of inference was first calculated as the mean of the relative error in the estimated parameters σ off , σon , and ρ (for its definition see Methods, Equation (6)); note that this error measures deviations from the known ground truth values. Figure 2a shows, by means of a 3D scatter plot, the ratio of the mean relative error from nascent mRNA data (using delay telegraph model) and the mean relative error from mature mRNA data (using the telegraph model) for 789 independent parameter sets sampled on a grid (for each of these sets, we simulated 10 4 cells). The overall bluish hue of the plot suggested that the mean relative error from nascent mRNA data was typically less than the error from mature mRNA data. This was confirmed in Figure 2b where the same data was plotted but now as a function of the fraction of ON time (defined as f ON = σon/(σ off + σon) ). Out of 789 parameter sets, for 483 of them ( ≈ 61% ) the inference accuracy was higher when using nascent mRNA data.
Thus far, we have implicitly assumed that fluctuations in both nascent and mature mRNA are due to transcriptional bursting. However, it is clear that mature mRNA data exhibit a higher degree of noise due to post-transcriptional processing. For example, it has been shown that transcriptional noise is typically amplified during mRNA nuclear export (Hansen et al., 2018). In addition, cell-to-cell variation in the number of nuclear pore complexes has recently been identified as the source of heterogeneity in nuclear export rates within isogenic yeast populations (Durrieu et al., 2022). To take into Figure 2. Accuracy of the inferred kinetic parameters from synthetic mature and nascent mRNA data using the telegraph and delay telegraph model, respectively. (a) 3D scatter plot showing the ratio of the mean relative error from nascent mRNA data (using delay telegraph model MRE delay ) and the mean relative error from mature mRNA data (using the telegraph model MRE tele ) for 789 independent parameter sets sampled on a grid. Red data points indicate parameter sets with lower relative errors for mature data compared to nascent data, blue datapoints indicate parameter sets with lower relative error for nascent data compared to mature data (b) Same data as (a) but shown as a function of the fraction of ON time, f ON . For ≈ 61% of the parameters, the inference accuracy is higher when using nascent mRNA data. (c) Sampling from the same parameter space, we then add log-normal distributed noise (size 5%) to the initiation rate ρ (see text for details) to mimic external noise due to post-transcriptional processing that is only present in mature mRNA. Log 10 of the ratio of the median relative error (MRE) using perturbed mature mRNA data against Log 10 MRE using nascent mRNA data is shown as a function of the true fraction of ON time, f ON . For ≈ 64% of the parameters, the inference accuracy is higher when using nascent mRNA data. (d) The median relative error of each transcriptional parameter as a function of the fraction of ON time, using synthetic nascent mRNA, synthetic mature mRNA data and synthetic mature mRNA with external noise. Inference from nascent data is generally more accurate than using mature mRNA data. account these additional noise sources, which we call external noise, we added noise to the initiation rate ρ in the telegraph model since this rate implicitly models all processes between the synthesis of the transcript and the appearance of mature mRNA in the cytoplasm. Specifically, for each of the 789 parameter sets previously used, we changed ρ to ρ ′ where the latter is a log-normal distributed random variable such that its mean is ρ and its standard deviation is equal to 0.05 of the mean (5% external noise). Note that this implies that at the time of measurement, each cell in the population had a different value of the initiation rate. Simulations with this perturbed set of parameters led to a synthetic mature mRNA data set from which we re-inferred parameters using the telegraph model. In Figure 2c we show the ratio of mean relative error from nascent mRNA data and the mean relative error from perturbed mature mRNA data as a function of the fraction of ON time, f ON . The percentage of parameters where nascent mRNA is more accurate is slightly increased compared to the data without noise (64% versus 61% of the parameters; compare Figure 2c and Figure 2b). However, the addition of even more noise (10% external noise added to the initiation rate) increases the inference accuracy for 91% of the parameter sets when the nascent mRNA data is used (Appendix 1 and Appendix 1- figure 1).
To obtain more insight into the accuracy of the individual parameters, we next plotted the median relative error of transcriptional parameters σ off , σon, ρ , burst size and the inferred fraction of ON time, as a function of the true fraction of ON time (Figure 2d). We compared the results using synthetic nascent mRNA, synthetic mature mRNA data and synthetic mature mRNA with 5% external noise. The median of the relative error for each transcriptional parameter (as given by the second equation of Equation 8) was obtained for the subset of the 789 parameter sets for which the true fraction of ON time f ON falls into the interval [x − 0.05, x + 0.05] where x = 0.1, 0.2, . . . , 0.9 . From the plots, the following can be deduced: (i) the errors in σon (the burst frequency), σ off and the burst size ρ/σ off tend to increase with f ON while the rest of the parameters ( ρ and the estimated value of f ON ) decrease; (ii) for small f ON , the best estimated parameters are the burst frequency and size while for large f ON , it was ρ and the estimated value of f ON . The worst estimated parameter was σ off , independent of the value of f ON ; (iii) the addition of external noise to mature mRNA data had a small impact on inference for small f ON ; in contrast, for large f ON the noise appreciably increased the relative error in σ off and to a lesser extent the error in the other parameters too.
Additionally, in Appendices 1 and 2 we show that (i) independent of the accuracy of parameter estimation, the best fit distributions accurately matched the ground truth distributions (Appendix 1 and Appendix 1-figure 2); (ii) the parameters ordered by relative error were in agreement with the parameters ordered by sample variability (Appendix 1 and Appendix 1-table 1) and by profile likelihood error (Kreutz et al., 2013) (Appendix 1, Appendix 1-tables 2 and 3). Since from experimental data, only the sample variability and the profile likelihood error are available, it follows that the results of our synthetic data study in Figure 2 based on relative error from the ground truth have wide practical applicability; (iii) stochastic perturbation of the mature or nascent mRNA data (due to errors in the measurement of the number of spots and the fluorescent intensity) had little effect on the inference quality, unless the gene spent a large proportion of time in the OFF state (Appendix 1-tables 4 and 5); (iv) if one utilized the conventional telegraph model to fit the nascent data generated by the delay telegraph model, it was possible to obtain a distribution fitting as good as the delay telegraph model but with low-fidelity parameter estimation (Appendix 2, Appendix 2- figure 1 and Appendix 2table 1). Analytically, the telegraph model is only an accurate approximation of the delay telegraph model when the promoter switching timescales are much longer than the time spent by Pol II on a gene or the off switching rates are very small such that gene expression is nearly constitutive.
In summary, by means of synthetic experiments, we have clarified how the accuracy of the parameter inference strongly depends on the type of data (nascent or mature mRNA) and the fraction of time spent in the ON state (which determines the mode of gene expression).

Applications to experimental yeast mRNA data
Now that we have introduced the inference algorithms and tested them thoroughly using synthetic data, we applied the algorithms to experimental data (see Method Section Experimental data acquisition and processing for details of the data acquisition). Note that in what follows, delay telegraph model refers to the extended delay telegraph model that accounts for the smFISH probe positions that was used to predict the smFISH fluorescent signal of the transcription site. Inference from mature mRNA data: experimental artifacts We have four independent datasets from which we determined mRNA count and nascent RNA distributions. Figure 3a shows an example cell with mature single RNAs in the cytoplasm, and a bright nuclear spot representing the site of nascent transcription. Spots and cell outlines were identified using automated pipelines. Importantly, to obtain an accurate estimation of transcriptional parameters, the experimental input distributions of mRNA count and nascent RNAs require high accuracy. We therefore first determined how technical artifacts in the analysis affects the inference estimates. First, if the number of mRNA transcripts per cell is high, accurate determination of the number of transcripts may be challenging, as transcripts may overlap. To determine if this occurred in our datasets, we analyzed the distributions of intensities of the cytoplasmic spots, which revealed unimodal distributions where ∼90% of the detected spots fell in the range 0.5× median -1.5× median (Figure 4a). We therefore concluded that overlapping spots are not a large confounder in our data. In fact, in our experiments, the number of detected mature mRNA transcripts per cell was lower than expected, based on the number of nascent transcripts (compare Figure 3 with Figure 4). This discrepancy between nascent and mature transcripts likely arises because the addition of the PP7 loops to the GAL10 RNA destabilizes the RNA, resulting in faster mRNA turnover compared to most endogenous RNAs (Miller et al., 2011;Wang et al., 2002;Holstege et al., 1998;Geisberg et al., 2014). Previously, both shorter and longer mRNA half-lives from the addition of stem loops have been observed, which may be caused because changes in the 5' UTR length or sequence affect its recognition by the mRNA degradation machinery (Heinrich et al., 2017;Tutucci et al., 2018;Garcia and Parker, 2015). In our case, we note that such high turnover should aid transcriptional parameter estimates, as it closely reflects transcriptional activity.
A second possible source of error is cell segmentation. To test how cell segmentation errors contribute to the mature mRNA distribution and the transcriptional bursting estimates, we compared two independent segmentation tools, where segmentation 1 often resulted in missed spots (Figure 3b), resulting in an underestimation of the mean mRNA count and of the variance (compare Figure 3b and c). We inferred the transcriptional parameters using the algorithm described in Methods Section Steps of the algorithm to estimate parameters from mature mRNA data. In the absence of an experimental measurement of the degradation rate, we could only estimate the three transcriptional parameters normalised by d . The best fits of dataset 1 are shown in (Figure 3b and c) and the transcriptional parameters (for all four datasets) are summarized in (Figure 3e). Note that the estimated parameters for all four datasets, using both segmentations, are shown in Appendix 3-table 1 and the associated best fit distributions in Appendix 3-figure 1a.
Notably the segmentation algorithms led to similar estimates for the burst frequency but considerably different estimates for the rest of the parameters. In particular segmentation 1 suggested that burst expression is infrequent (≈20% of the time) whereas segmentation 2 was consistent with burst expression occurring half of the time. Given that accurate cell segmentation remains challenging, this analysis illustrates that parameter estimation from mature mRNA counts may be affected by technical errors. For the remainder of the mature mRNA analysis, we have used only segmentation 2 data.
Lastly, it may be challenging to distinguish the nascent transcription site from a mature RNA, especially if few nascent RNAs are being produced. Either one can decide to include all cellular spots in the total mRNA count, including the transcription site, with the result that the number of mature transcripts is overestimated with one RNA for cells which show a transcription site. Or conversely, one can decide to exclude the transcription site by subtracting one spot from each cell, with the result that the number of mature mRNAs may be underestimated by one RNA for cells that are transcriptionally silent. To understand how this choice affects the accuracy of parameter inference, we compared both options in (Figure 3c, d and e), where seg2 included all spots, and seg2-TS excluded transcription sites (by subtracting 1 from each cell). The estimated parameters for all four datasets are shown in Appendix 3-table 1 and the associated best fit distributions in Appendix 3-figure 1a. Although the mean was lower when transcription sites were excluded, all the parameters except the burst frequency σon were within the error, indicating that the choice of whether or not to include the transcription site in the mature mRNA count had a small influence on parameter estimation. For the remainder of the analysis, we included all spots, and counted the transcription site as one RNA.   The DAPI and Cy3 signals were used to determine the nuclear and cellular mask, respectively. Detected and fitted spots are indicated in green. Mature RNA count distribution (pink) for segmentation method 1 with a best fit obtained from the telegraph model (gray curve). Scale bar is 5 μm(c-d) The DAPI and Cy3 signals were used to determine the nuclear and cellular mask using a second independent segmentation tool (segmentation 2). Mature RNA count distribution (gray and cyan) with/without counting the transcription site (TS) for segmentation method 2 with a best fit obtained from the telegraph model (gray curves). (e) Bar graphs of inferred transcriptional parameters (merged mature RNA data) from fitting the distributions of the two segmentation methods ('seg1' and 'seg2') as well as the distribution of mature RNAs only ('seg2 -TS' which indicates the exclusion of one spot in each cell that represents the transcription site). The burst size was computed as ρ/σ off and the fraction of ON time as σon/(σon + σ off ) . Error bars indicate standard deviation computed over the four datasets. (f) Distribution of the integrated DAPI intensity for each cell. Cyan line represents a Gaussian bimodal fit with highlighted regions indicating the intensity-based classification of G1 and G2 cells. Distributions of the mature RNA count for all cells (merged) and cell-cycle classified cells (G1 cells and G2 cells). (g) Tables and bar graphs of inferred parameters for merged and cell-cycle-specific data. Note that the transcriptional parameters σon, σ off , ρ are normalised by the degradation rate and hence dimensionless. For the cell-cycle-specific data, parameters were inferred per gene copy.  ACF residuals m e r g e d G 1 G 2 m e r g e d G 1 G 2 m e r g e d G 1 G 2 m e r g e d G 1 G 2 m e r g e d G 1 G 2 Figure 4. Inference from the normalized nascent mRNA distributions for merged and cell-cycle specific data. (a) Normalized nascent mRNA distributions of merged cell-cycle data were obtained by normalizing the signal intensity of the transcription site (defined as the brightest spot in the cell) by the median signal intensity of the cytoplasmic spots (shown in orange and zoom-in depicted in the inset). In all 4 datasets, approximately 90% of the detected cytoplasmic spots fell in the range 0.5× median -1.5× median (grey bargraph). Black line in normalized distribution on the right represents best fit with delay telegraph model. (b) Nascent RNA distributions for cell-cycle-specific data. Black lines represent best fits with delay telegraph model. (c) Bar graphs comparing the transcriptional parameters, burst size, fraction of ON time and Fano factor for cell-cycle-specific and merged data. Error bars indicate standard deviation of the four datasets. (d) Normalized ACF plots of cell-cycle-specific and merged data. The ACF plots are generated by stochastic simulations using estimated parameters from merged and cell-cycle specific nascent mRNA data for each of the four data sets; these were compared with the ACF measured directly using live-cell data in Donovan et al., 2019 (green line). (e) The sum of squared ACF residuals of merged and cell-cycle-specific data from each dataset (this is the sum of squared deviations between the measured and estimated normalised ACF where the sum was calculated over all time points).
Inference from mature mRNA data: merged versus cell-cycle specific The above analysis was performed using the merged data from all cells, irrespective of their position in the cell cycle. The inferred parameters of all four datasets are shown in Figure 3g (grey). To understand the effect of the cell cycle on these parameter estimates, we compared this inference with cell-cycle-specific data. We used the integrated nuclear DAPI intensity as a measure for DNA content to classify cells into G1 or G2 cells (Figure 3f (left)) to obtain separate mature mRNA distributions for G1 and G2 cells.
To infer the transcriptional parameters from mature mRNA data of cells in G1, the inference protocol remained the same. However for cells in the G2 stage, this protocol needed to be altered since G2 cells have two gene copies, whereas the solution of the telegraph model assumes one gene copy. Assuming the transcriptional activities of the two gene copies are independent, the distribution of the total molecule number is the convolution of the molecule number (obtained from the telegraph model) with itself for mature mRNA data. This convolved distribution was used in steps (ii) and (iii) of the inference algorithm in Methods Section Steps of the algorithm to estimate parameters from mature mRNA data. A difference between our method of estimating parameters in G2 from that in the literature  is that we do not assume that the burst frequency is the only parameter that changes upon replication, and we estimated all transcription parameters simultaneously.
Note that the independence of gene copy transcription has been verified for genes in some eukaryotic cells  where the two copies can be easily resolved. For yeast data, as we are analyzing in this paper, it is generally not possible to resolve the two copies of the allele in G2 because they are within the diffraction limit. However, in the absence of experimental evidence, the independence assumption is the simplest reasonable assumption that we could make (see later for a relaxation of this assumption).
For both G1 and G2 cells, we performed inference for cell-cycle specific mature mRNA data, the results of which are shown in Figure 3f (centre and right) and Figure 3g -see Appendix 3-table 2 for the confidence intervals of the estimates calculated using profile likelihood. As expected, the mean number of mRNAs in G2 cells was larger than that in G1 cells. For both merged and cell-cycle specific data, the parameters ordered by increasing variability of the estimates from independent samples (the standard deviation divided by the mean) were: ρ , f ON , σ ON , burst size and σ OFF , and the same order was predicted by the relative error (from ground truth values) from our synthetic experiments (compare with f ON = 0.50 and f ON = 0.80 in the middle and right panels of Figure 2d) and by sample variability (Appendix 1). In Appendix 3 and Appendix 3-table 3 we show that the relaxation of the assumption of independence between the allele copies in G2 (by instead assuming perfect state correlation of the two alleles) had practically no influence on the inference of the two best estimated parameters ( ρ , f ON ).
A comparison of the two types of data predicted different behaviour (Figure 3g bottom): merged data indicated behaviour consistent with the gene being ON half of the time and small burst sizes, while cell-cycle-specific data implied the gene is ON ≈80% of the time with large burst sizes. We note that the burst sizes have considerable sample variability, exemplifying burst size estimates of transcriptional parameters from mature mRNA distributions have to be treated with caution. Nevertheless, in line with this high fraction ON and large burst size, which start to approach constitutive expression, the variation introduced by the transcription kinetics is relatively modest with Fano factors not far from one: 2.43 ± 0.21 for merged data and 1.75 ± 0.45 for cell-cycle data (the slightly higher value for merged data likely was due to heterogeneity stemming from varying gene copy numbers per cell).
Comparing the mean rates between the G1 and G2 phases, we found that σ off , σon , ρ decreased while f ON and the burst size increased upon replication. However, taking into account the variability in estimates across the four datasets, the only two parameters which were well-separated between the two phases were σon and ρ . These two parameters decreased by 65% and 21%, respectively, which suggests that upon replication, there are mechanisms at play which reduce the expression of each copy to partially compensate for the doubling of the gene copy number (gene dosage compensation) .
In conclusion, what is particularly surprising in our analysis is the differences in the inference results using merged and cell-cycle specific data: the former suggests the gene spends only half of its time in the ON state while the latter implies the gene is mostly in its ON state.
Inference from nascent mRNA data: cell cycle effects, experimental artifacts and comparison with mature mRNA inference Cell-cycle-specific versus merged data To determine the number of nascent transcripts at the transcription site, we selected the brightest spot from each nucleus and normalized its intensity to the median intensity of the cytoplasmic spots. As the distribution of intensities of the cytoplasmic mRNAs followed a narrow unimodal distribution, its median likely represents the intensity of a single RNA (orange distribution in the central panel of Figure 4a). The inference of transcriptional parameters using the merged data was done using the algorithm described in Methods Section Steps of the algorithm to estimate parameters from nascent mRNA data.
Similar to above, to account for two gene copies in G2 cells, we assumed that the transcriptional activities of the two gene copies are independent. The distribution of the total fluorescent signal from both gene copies was the convolution of the signal distribution (obtained from the extended delay telegraph model, i.e. Equation (1)) with itself. This convolved distribution was then used in steps (ii) and (iii) of the inference algorithm.
The inference of transcriptional parameters from nascent RNA data was done using a fixed elongation time, which was measured previously at a related galactose-responsive gene (GAL3) at 65 bp/s (Donovan et al., 2019). Since the total transcript length is 3062 bp (see Figure 1c), the elongation time ( τ in our model) is ≈ 47.1 s ≈ 0.785 min . The fixed elongation rate enabled us to infer the absolute values of the three transcriptional parameters σ off , σon and ρ .
Best fits of the extended delay telegraph model to the distribution of signal intensity of nascent mRNAs at the transcription site are shown in Figure 4a and b for dataset 1; for the other datasets see Appendix 4-figure 1. The corresponding estimates of the transcriptional parameters are shown in Appendix 4-table 1 and also illustrated by bar charts in Figure 4c. The confidence intervals of the transcriptional parameters (computed using the profile likelihood method) are shown in Appendix 4-table 2.
Comparing this estimation with that from mature mRNA, we observed that in both cases f ON ≈ 0.5 for merged data and in the range 0.7 − 0.8 for cell-cycle-specific data. Also in both cases, the Fano factors of merged data were larger than those of cell-cycle-specific data. Hence, we are confident that not accounting for the cell cycle phase leads to an over-estimation of the time spent in the OFF state and of the Fano factor. In addition, comparing the burst sizes in Figure 3g and Appendix 4table 1, we found that not taking into account post-transcriptional noise (by using mature mRNA data) led to an lower estimation of the burst size (2.6-fold, 2.6-fold, and 1.1-fold lower for inference from merged, G1 and G2 data, respectively). We note that it would be useful to directly compare the absolute estimates of the other transcriptional parameters from mature and nascent mRNA data. However, this was not possible because the telegraph model only estimates the switching rates and the initiation rate scaled by the degradation rate, and the latter is unknown. On the other hand, the estimates from nascent data were rates multiplied by the average elongation time, which is known and hence the absolute rates can be estimated from nascent mRNA data only. The only quantities that could be directly compared were the burst size and the fraction of ON time, since these are both non-dimensional.
Comparing the variability of the parameter estimates, we found that ρ and f ON were the parameters with the smallest variability across samples for the nascent data, as for inference from mature data. However, the inferred parameter variability across samples was on average about 2.5-fold lower for nascent data compared to mature mRNA data (this was obtained by computing the standard deviation divided by the mean for each parameter and then averaging over all parameters and over merged, G1 and G2 data). Likely this is because nascent data does not suffer from post-transcriptional noise. Indeed, synthetic experiments suggested that the errors in parameter inference using nascent data are often less than those in mature data when f ON ≈ 0.80 (Figure 2d). In summary, we have more confidence in the parameter estimates from nascent data, in particular those from cell-cycle separated data.
To further investigate the hypothesis that estimates from cell-cycle-specific data are more accurate than merged data, we compared the estimates from merged and cell-cycle-specific data to previous live-cell transcription measurements of the same gene (Donovan et al., 2019). Because live-cell traces and simulated traces with the estimated transcriptional parameters are difficult to compare directly, we instead compared their normalized autocorrelation functions (ACFs). Specifically, the live-cell traces displayed cell-to-cell variation in overall fluorescent intensities arising from differences in the PP7 coat protein expression level, precluding a direct comparison of the live-cell intensities with the smFISH distributions. The normalized ACFs are normalized per trace and thus can be used to directly compare the kinetics. For this, we fed the parameter estimates to the SSA to generate synthetic livecell data and then calculated the corresponding ACF (Appendix 5). We found that the estimates from cell-cycle-specific data produced ACFs that match the live-cell data closer than that from the merged data ( Figure 4d). This was also clear from the sum of squared residuals which for each dataset was smaller for the ACF computed using the cell-cycle-specific estimates rather than those from merged data (Figure 4e).
Using nascent data, we also reinvestigated the hypothesis that the gene exhibits dosage compensation. Comparing the mean rates between the G1 and G2 phases, we found that σ off , σon , ρ , f ON decreased while the burst size increased upon replication. However, taking into account the variability in estimates across the four datasets, the only two parameters which were cleanly separated between the two phases were σon and f ON . These two decreased by 41% and 5%, respectively. These results had some similarity to those deduced from cell-cycle separated mature mRNA data (the decrease of σon ) but they also displayed differences. Namely, from mature mRNA data it was predicted that ρ decreased upon replication while from nascent data we predicted that ρ did not change and it was rather f ON that decreased by a small degree. The decrease of the burst frequency σon after replication has also been reported for some genes in mammalian cells Padovan-Merhar et al., 2015), indicating that this could be a general mechanism for gene dosage compensation. Our results are consistent with a population-based ChIP-seq study (Voichek et al., 2016) that showed DNA dosage compensation after replication in budding yeast. We note that our single-cell analysis only revealed partial dosage compensation, where the mean signal intensity of nascent mRNAs in G2 is not the same as in G1, but 1.7-fold higher in G2 than in G1 (Figure 4c).

Correcting for experimental artefacts
Although inference on cell cycle separated data outperformed inference on merged data, we noticed that the corresponding best fit distributions did not match well to the experimental signal distributions in the lower bins ( Figure 4b and Appendix 4- figure 1). In all cases, the experimental distributions showed high intensities in bins 1, 2, and 3, which was likely an artifact of the experimental data acquisition system. Since we defined the transcription site as the brightest spot, this implies that in the absence of a transcription site, a mature transcript can be misclassified as a nascent transcript. We therefore investigated two methods to correct for this, the 'rejection' method and the 'fusion' method.
The rejection method removed all data associated with the first k bins of the experimentally obtained histogram of fluorescent intensities ( Figure 5a shows the fits for dataset 1; for the other datasets see Appendix 4-figure 2). We found that the parameter estimates varied strongly when the number of bins from which data was rejected ( k ) was changed (Figure 5b; see also Appendix 4table 3). Although the distributions fit well to the experimental histograms (Appendix 4-figure 1), comparison with the live-cell normalized ACF indicated that the estimates actually became worse than non-curated estimates, with a higher sum of squared residuals (Figure 5c and d). The rejection method therefore does not produce reliable estimates.
Next, we considered another data curation method which we call the fusion method. This works by setting to zero all fluorescent intensities in a cell population which were below a certain threshold. In other words, we fused or combined the first k bins of the experimentally obtained histogram of fluorescent intensities, thereby taking into account that the true intensity of bin 0 was artificially distributed over some of the first bins. Figure 6 and Appendix 4-table 4 show that the fusion method led to estimates that varied little with k which enhanced our degree of confidence in them (note that k = 1 is the same as the uncurated data). The peak at the zero bin for both G1 and G2 was better captured using the fusion method than using non-curated data (compare Figure 4b and Appendix 4-figure 1, with Figure 6b). Comparison to the autocorrelation function of the live-cell data shows that correction with the fusion method also led to improved transcriptional estimates, as indicated by a reduction in the sum of the squared residuals for all four data sets ( Figure 6c).
Overall, we conclude that for inferring parameters from the smFISH data, the optimal method is to use nascent cell-cycle-specific data, corrected by the fusion method. The optimally inferred parameters for the four data sets in our study are those given in Appendix 4-figure 2d. The profile likelihood estimates of the 95% confidence intervals of these parameters are shown in Appendix 4-table 5. Note that in line with our synthetic data study in Figure 2, the parameters suffering from the least sample variability were f ON and ρ . The rest of the parameters ( σ off , σon and burst size) suffered more sample variability because the fraction of ON time was high; however since their standard deviation divided by the mean (computed over the four datasets) was not high (in the range of 10-20%), they still can be regarded as useful estimates. Note also that the previous prediction that gene dosage compensation involves regulation of the burst frequency did not change upon correction of the nascent data using the fusion method. All these results were deduced assuming that the two copies in G2 are independent from each other. Inferring rates under the opposite assumption of perfectly synchronized copies (Appendix 4-table 6) gave very similar estimates for ρ and f ON (to be expected since according to the synthetic data study, these two are the most robustly estimated parameters for genes spending most of their time in the active state) but different estimates for the rest of the parameters. While such perfect synchronization of alleles is unlikely, some degree of synchronization is plausible and further improvement of the transcriptional parameters in the G2 phase will require its precise experimental quantification.

Discussion
In this study, we compared the reliability of transcriptional parameter inference from mature and nascent mRNA distributions, with and without taking into account the cell cycle phase. Although these distributions come from the same experiment, we found that the different fits produced very different parameter estimates, ranging from small bursts to very large bursts. Comparison to live-cell data revealed that the optimal inference method is to use nascent mRNA data that is separated by cell cycle.
Our findings illustrate the risk of inferring transcriptional parameters from fitting of mRNA distributions. First of all, as we have shown, these fits are sensitive to the segmentation method which can lead to large errors in the estimates. Secondly, the most common method of parameter inference in the literature is fitting of mature mRNA distributions that are not separated by cell cycle Raj et al., 2006;Zenklusen et al., 2008). Obtaining such distributions is straightforward using methods such as smFISH, where one can directly count the number of mRNAs per cell. Additionally, with the advance of single-cell mRNA sequencing technologies, it is possible to obtain mRNA distributions for many genes simultaneously and it is tempting to use these to estimate bursting behaviour across the genome (Kim and Marioni, 2013;Larsson et al., 2019). However, our comparisons on the same dataset show that the values obtained from mature mRNA fits (using merged data) can be   significantly different from the optimal values (using nascent cell-cycle separated data corrected using the fusion method), with underestimation of the burst sizes of almost 10-fold and underestimation of the active fraction of more than 1.5-fold. These results indicate that parameter inference from merged mature mRNA data should be treated with caution. There were smaller differences between the burst size and the active fraction inferred from cell-cycle separated mature and nascent data (only these two can be directly compared because these are non-dimensional); however the relative errors in the estimates (computed over the four datasets) were more than twofold higher for mature data likely due to post-transcriptional noise which nascent data is free from. It is more common to fit mature distributions rather than nascent distributions because nascent distributions are technically more challenging to obtain. As nascent single-cell sequencing methods are still in the early phase (Hendriks et al., 2019), the only method available so far for nascent measurements is smFISH (Patel et al., 2021). In such smFISH experiments, intronic probes can be used to specifically label nascent RNA, although there may be some effects of splicing kinetics on the distribution (Wan et al., 2021). If introns are not present, like for most yeast genes, one can use exonic probes instead (Zenklusen et al., 2008). Since exonic probes label both nascent and mature mRNA transcript, it may be challenging to identify the nascent transcription site unambiguously, especially at lower transcription levels. We show in this manuscript that the fusion method can correct for this bias by combining bins below k RNAs, which results in an improvement of the parameter estimates.
Our analysis also emphasizes the importance of separately analyzing G1 and G2 cells . It is important to note that for cell-cycle-specific analysis, experimental adjustments or cell-cycle synchronized cultures are not required. Although asynchronous cultures consist of a mix G1, S and G2 cells, the integrated DNA intensity of the nucleus of each cell, for example from a DAPI signal, can be used to separate these cells by cell cycle phase in silico Roukos et al., 2015). As most smFISH experiments already include a DNA-labelled channel, adding an extra analysis step should in principle not limit the incorporation of this step in future smFISH fitting procedures.
Even with our optimal fitting strategy, there is a residual error of the simulated ACF and the measured ACF from live-cell measurements. This difference may be the result of different experimental biases of the two measurements. For example, live-cell measurements have a detection threshold below which RNAs may not be detected. In addition, live-cell measurements include cells in S phase, which are not analyzed in smFISH. There could also be differences in the exact percentage of G1 and G2 cells, or other noise sources between live-cell and smFISH experiments. Alternatively, the fit may be imperfect because there might be parameter sets, others than the ones which our inference algorithm found, which provide an accurate fit of the nascent mRNA distribution and perhaps an even better fit to the ACF than we found. We cannot exclude this possibility because we estimated f ON to be 0.7 − 0.8 and using synthetic data we showed that the accuracy of some parameters ( σon, σ off and the burst size) deteriorated as f ON approached 1 (Figure 2d). Another factor which could explain the residual error between the simulated ACF and the measured ACF is that perhaps the two-state model may be too simplistic to cover the true promoter states in living cells and may therefore not be able to describe the true in vivo kinetics. The promoter may switch between more than 2 states, or there may be sources of extrinsic noise other than the cell cycle that contribute to the heterogeneity. Previous studies have for example identified extrinsic noise on the elongation rate (Fritzsch et al., 2018). However, these more complex transcription models also have more parameters, which in practice often means that very few will be identifiable with the current set of experimental observations. To fit these models, one requires temporal data on the transcription kinetics (Fritzsch et al., 2018;Rodriguez et al., 2019), or simultaneous measurements of various sources of extrinsic noise, such as singlecell transcription factor concentration and RNA polymerase number measurements, cellular volume, local cell crowding, etc, which are often not available in standard smFISH experiments (Battich et al., 2015;Foreman and Wollman, 2020). Nevertheless, given that there is no explicit time component in smFISH data, the closeness of the simulated ACF to the measured ACF provides confidence we are close to the real values.
The optimal parameter set ( Figure 6d) indicates long ON promoter times of 75 s, during which almost 50 RNAs are produced in a burst. Large burst sizes (>70) have been previously reported for mouse embryonic stem cells , mouse hepatocytes Bahar Halpern et al., 2015and human fibroblasts Larsson et al., 2019. The large burst size and high active fraction of 0.78 suggests that GAL10 expression is reaching its limit of maximal expression, which may not be surprising as it is already one of the most highly expressed genes in yeast. It is also interesting to note that the ON time of 75 s is longer than the residence time of a single transcript (47 s), which means that RNA polymerases in the beginning of a burst have already left the locus before the burst has finished.
The optimal parameter set ( Figure 6d) also indicates partial gene dosage compensation. Specifically the burst frequency per gene copy ( σon ) in the G2 phase is 0.66 that in the G1 phase; the other transcriptional rates are not significantly different between the two cell cycle phases. The fold change in the burst frequency per gene copy was previously estimated for the Oct4 and Nanog genes to be 0.63 and 0.71 respectively, in mouse embryonic stem cells . The similarity of our estimate of the fold change to those previously measured could be explained by the results of a recent study (Jia et al., 2021); using a detailed model of gene expression, it was shown that in the absence of a dependence of the initiation rate on cell volume, gene dosage compensation optimally leads to approximate mRNA concentration homeostasis when the fold change in the burst frequency upon DNA replication is √ 2/2 ≈ 0.71 . In conclusion, obtaining kinetic information from static distributions can introduce biases. However, we show that it is possible to obtain reasonable estimates that agree with live-cell measurements, if one infers parameters from nascent mRNA distributions that are accounted for cell cycle phase.

Methods
Inference from mature mRNA data

Mathematical model
The steady-state solution of the telegraph model of gene expression (Peccoud and Ycart, 1995) gives mature mRNA distributions. The reaction steps in this model are illustrated in Figure 1a. Next we describe the generation of synthetic mature mRNA data and the algorithm used to infer parameters from this data.

Generation of synthetic mature mRNA data
We generate parameter sets on an equidistant mesh grid laid over the space: where the units are inverse minute. Furthermore we apply a constraint on the effective transcription rate ρ = ρσon σon+σoff < 100.
In each of the three dimensions of the parameter space, we take 10 points that are equidistant, leading to a total of 1000 parameter sets which reduce to 789 after the effective transcription rate constraint is enforced. We additionally fix the degradation rate d = 1 min -1 . Note that we choose not to vary the degradation rate (as we did for the other three parameters) since it is not possible to infer all four rates simultaneously -this is because the steady-state solution of the telegraph model is a function of the non-dimensional parameter ratios ρ/d, σ off /d and σon/d (Raj et al., 2006).
Once a set of parameters is chosen, we use the stochastic simulation algorithm (SSA Gillespie, 2007) to simulate the telegraph model reactions in Figure 1a and generate 10 4 samples of synthetic data. Note that each sample mimicks a single cell measurement of mature mRNA.
Steps of the algorithm to estimate parameters from mature mRNA data The inference procedure consists of the following steps: (i) select a set of random transcriptional parameters; (ii) use the solution of the telegraph model to calculate the probability of observing the number of mature mRNA measured for each cell; (iii) evaluate the likelihood function for the observed data; (iv) iterate the procedure until the negative log-likelihood is minimized; (v) the set of parameters that accomplishes the latter provides the best point-estimate of the parameters of the telegraph model that describes the measured mature mRNA fluctuations.
The degradation rate is fixed to d = 1 min -1 .
Step (ii) can be obtained either by computing the distribution from the analytical solution (Peccoud and Ycart, 1995 or by using the finite state projection (FSP) method Munsky and Khammash, 2006). Here, for the sake of computational efficiency, we use the FSP method to compute the probability distribution of mature mRNA numbers.
For step (iii) we calculate the likelihood of observing the data given a chosen parameter set θ where P(n j ; θ) is the probability distribution of mature mRNA numbers obtained from step (ii) given a parameter set θ , n j is the total number of mature mRNA from cell j and N cell is the total number of cells.
The minimization of the negative log-likelihood is equivalent to maximizing the likelihood. Note the optimization algorithm is terminated when the number of iterations is larger than 10 4 ; this number is chosen because we have found that invariably after this number of iterations, the likelihood has converged to some maximal value. Note that the inference algorithm is particularly low cost computationally, with the optimal parameter values estimated in at most a few minutes. Once the best parameter set θ * is found, we calculate the mean relative error (MRE) which is defined as where θ * i and θ true,i represent the i -th estimated and true parameters respectively, and M denotes the number of the estimated parameters. Thus, the mean relative error reflects the deviation of the estimated parameters from the true parameters.

Inference from nascent mRNA data Mathematical model
The steady-state solution of the delay telegraph model  gives the distribution of the number of bound Pol II. In Appendix 6, we present an alternative approach to derive the steady-state solution. The reaction steps are illustrated in Figure 1a.
The position of a Pol II molecule on the gene determines the fluorescence intensity of the mRNA attached to it. In particular for fluorescence data acquired from smFISH PP7-GAL10, the fluorescence intensity of a single mRNA on the DNA locus looks like a trapezoidal pulse (see Figure 1b for an illustration). This presents a problem because although we can predict the distribution of the number of bound Pol II using the delay telegraph model, we do not have any specific information on their spatial distribution along the gene. However, since the delay telegraph model implicitly assumes that a Pol II molecule has fixed velocity and that Pol II molecules do not interact with each other (via volume exclusion), it is reasonable to assume that in steady-state, the bound Pol II molecules are uniformly distributed along the gene. This hypothesis is confirmed by stochastic simulations of the delay telegraph model where the position of a Pol II molecule is calculated as the product of the constant Pol II velocity and the time since its production.
By the uniform distribution assumption and the measured trapezoidal fluorescence intensity profile, it follows that the signal intensity of each bound Pol II has the density function g defined by where L 1 = 862 bp (base pairs), L 2 = 2200 bp, L = L 1 + L 2 as defined in Figure 1b. The indicator function χ [0,1] (s) = 1 if and only if s ∈ [0, 1] and δ 1 (s) is the Dirac function at 1. The probability of the signal s being between 0 and 1 is due to the first part of the trapezoid function and hence is multiplied by L 1 /L which is the probability of being in this region if Pol II is uniformly distributed. Similarly, the probability of s being 1 is due to the L 2 part of the trapezoid and hence the probability is L 2 /L by the uniform distribution assumption. Note that the signal s from each Pol II is at most 1 because in practice, the signal intensity from the transcription site is normalized by the median intensity of single cytoplasmic mRNAs (Zenklusen et al., 2008). The total signal is the sum of the signals from each bound Pol II. Hence, the density function of the sum is given by the convolution of the signal densities from each bound Pol II. Defining p(s|k) as the density function of the signal given there are k bound Pol II molecules, we have that p(s|k) is the k -th convolution power of g , that is where δ 0 (s) is the Dirac function at. Finally we can write the total fluorescent signal density function as where P(k; θ) is the steady-state solution of the delay telegraph model giving the probability of observing k bound Pol II molecules for the parameter set θ . Hence Equation (8)  . This approach can lead to a large state space which incurs a large computational cost. In contrast, in our method, we use FSP to solve for the delay telegraph model, i.e. the distribution of the discrete number of bound Pol II from which we construct (using convolution) the approximate distribution of the continuous nascent mRNA signal by assuming the Pol II is uniformly distributed on the gene. Since the state space of bound Pol II is typically not large, our method will typically be more computationally efficient than the one described in Xu et al., 2016. Generation of synthetic nascent mRNA data We generated synthetic smFISH signal data by using the SSA, modified to include delay to simulate the delay telegraph model . Specifically, we use Algorithm 2 described in Barrio et al., 2006. One run of the algorithm simulates the fluctuating number of bound Pol II molecules in a single cell. The total fluorescence intensity (mimicking smFISH) is obtained as follows. When a particular bound Pol II is produced by a firing of the transcription reaction G → G + N , we record this production time; since the elongation rate is assumed to be constant, given the production time we can calculate the position of the Pol II molecule on the gene at any later time and hence using Figure 1b we can deduce the fluorescent signal due to this Pol II molecule.
Specifically we normalize each transcribing Pol II's position to [0,1] and map the position to its normalized signal by where x is the normalized position on the gene. Thus at a given time, the total fluorescent signal from the n -th cell (the n -th realization of the SSA) equals where Jn is the number of bound Pol II molecules in the n -th cell, and {x j } with j = 1, . . . , Jn is the vector of all Pol II positions on the gene. The total signal from each cell is a real number but it is discretized into an integer. The kinetic parameters are chosen from the same region of parameter space as in (2), on the same equidistant mesh grid and with the same constraint on the effective transcription rate. Unlike the mature mRNA case, here there is no degradation rate; instead we have the elongation time, which we fix to τ = 0.5 (min) . Note that fixing this time is necessary since it is not possible to infer the three transcriptional parameters rates and the elongation time simultaneously because the steady-state solution of the delay telegraph model is a function of the non-dimensional parameter ratios ρτ , σ off τ and σonτ . Once a set of parameters is chosen, we use the modified SSA (as described above) to simulate the signal intensity in each of 10 4 cells.
Steps of the algorithm to estimate parameters from nascent mRNA data The inference procedure is essentially the same as steps (i)-(v) described in mature mRNA inference except for the following points.
In step (ii), the probability of observing a total signal of intensity i from a single cell is obtained by integrating p(s; θ) in Equation (8) Note that the integration over the interval of length 1 is to match the discretization of the synthetic data and θ ∈ Θ . Intuitively, one can always choose a positive integer K such that P(k) = 0 for any k ≥ K . The computation of the solution of the delay telegraph model P(k) can be done either using the analytical solution (evaluated using high precision) or using the finite state projection algorithm (FSP) Munsky and Khammash, 2006. In Appendix 6-figure 1 and Appendix 6-table 1, we show that the two methods yield comparable accuracy and CPU time.
For step (iii) we calculate the likelihood of observing the data given a chosen parameter set θ where q j is the discretized total signal intensity from cell j and N cell is the total number of cells. In the optimization, we aim to find ) .
The whole procedure (for both mature and nascent mRNA inference) is summarized by a flow-chart in Figure 1c.

Experimental data acquisition and processing
A diploid yeast strain of BY4743 background with a single integration of 14xPP7 loops at the 5'UTR of GAL10 (strain YTL047 Donovan et al., 2019) was used in this study. Four replicate yeast cultures were grown in synthetic complete media with 2% galactose to early mid-log (OD 0.5), fixed with 5% paraformaldehyde (PFA) for 20 min, permeabilized with 300 units of lyticase and hybridized with 7.5 pmol each of four PP7 probes labeled with Cy3 (Integrated DNA Technologies) as described in Trcek et al., 2012 andLenstra et al., 2015;Patel et al., 2021, resulting in four technical replicates. The PP7 probe sequences are: atat cgtc tgct cctt tcta , atat gctc tgct ggtt tcta , gcaa ttag gtac ctta ggat , aatg aacc cggg aata ctgc . Coverslips were mounted on microscope slides using mounting media with DAPI (ProLong Gold, Life Technologies). The coverslips were imaged on a Zeiss AxioObserver (Zeiss, USA) widefield microscope with a Plan-Apochromat 40x1.4 NA oil DIC UV objective and a 1.25 x optovar. For Cy3, a 562 nm longpass dichroic (Chroma T562lpxr), 595/50 nm emission filter (Chroma ET595/50 m) and 550/15 nm LED excitation at full power (Spectra X, Lumencor) were used. For DAPI, a 425 nm longpass dichroic (Chroma T425lpxr) and a 460/50 nm emission filter (Chroma ET460/50 m) and LED excitation at 395/25 nm at 25% power (Spectra X, Lumencor) were used. The signal was detected on a Hamamatsu ORCA-Flash4.0 V3 Digital CMOS camera (Hamamatsu Photonics, Japan). For each sample and each channel, we utilized the Micro-Manager software (UCSF) to acquire at least 20 fields-of-view based on the DAPI channel. Each field-of-view consisted of 13 z-stacks (with a z-step of 0.5 µm) at 25ms exposure for DAPI and 250ms exposure for Cy3.
A custom python pipeline was used for analysis (https://github.com/Lenstralab/smFISH; Pomp, 2022). Maximum intensity projected images were used to segment the cell and nucleus using Otsu thresholding and watershedding (segmentation 1). In addition, we segmented cells using CellProfiler (segmentation 2). The diffraction-limited Cy3 spots were detected per z-slice using band-pass filtering and refined using iterative Gaussian mask localization procedure (Crocker and Grier, 1996;Thompson et al., 2002;Larson et al., 2005;Larson et al., 2011 andCoulon et al., 2014). Cells in which no spots were detected were excluded from further analysis since a visual inspection indicated that these cells were not properly segmented or were improperly permeabilized.
Spots were classified as nuclear or cytoplasmic and the brightest nuclear spots were classified as transcription sites. The intensity of the brightest nuclear spot in a cell was normalized with the median fluorescence intensity of all the cytoplasmic spots in all cells. This is due to the fact that 90% of cytoplasmic mRNAs are isolated (Figure 4a), thus the median of the fluorescence signal of cytoplasmic mRNAs can be considered as the normalizing value. The distribution of the normalised intensity of the brightest nuclear spot, calculated over the cell population, is the experimental equivalent of the total fluorescent signal density function as given by the solution of the modified delay telegraph model, Equation (8).
The number of mature mRNA in each cell is given by counting the number of spots in the entire cell, that is nuclear plus cytoplasmic. The transcription site is counted as 1 mRNA, regardless of its intensity. We show in Figure 3c that this has negligible influence on the estimated parameters since the mean number of mature mRNA is much greater than 1. The distribution of the number of spots is the experimental equivalent of the solution of the telegraph model, that is the marginal distribution of mature mRNA numbers in steady-state conditions. The integrated nuclear intensity of each cell was calculated by summing the DNA content intensity (DAPI) of all the pixels within the nucleus mask. The distribution of the intensities was fit with a bimodal Gaussian distribution. Those cells whose intensity was within a standard deviation of the mean of the first (second) Gaussian peak was classified as G1 (G2) (see Figure 3e left). This gave similar results to a different cell cycle classication method using the Fried/Baisch model (Johnston et al., 1978) which was recently employed in Skinner et al., 2016. See Appendix 7-figure 1 for a comparison of the two methods. We note that cells in late G2 may contain two separate transcription sites, one in the mother and one in the bud. When the nucleus moves into the bud, buds often contain less DNA than G1 cells, and mothers contain more DNA than G1 cells, both of which are excluded from the analysis. When the DNA content of the mother and daughter is similar, both mother and daughter are counted separately as G1 cells. We note that this late G2 subpopulation is very small.
We did four independent experiments with a total number of cells equal to 2510, 6411, 4592, 3181, respectively. After classification, the numbers of G1 cells are 766, 2111, 1495, 904 and the number of G2 cells are 683, 1657, 1209, 1143, whereas the rest were classified as undetermined.

Data availability
The four smFISH datasets are available from https://osf.io/d5nvj/. These datasets include the maximum intensity projected images, the spot localization results, the nuclear and cellular masks used for merged, G1 and G2 cells and the analyzed results of the mature and nascent data. The analysis code of the smFISH microscopy data is available at https://github.com/Lenstralab/smFISH; Pomp, 2022. The code for the the synthetic simulations and the parameter inference is available at https:// github.com/palmtree2013/RNAInferenceTool.jl ;Fu, 2022. 22511104000) and Shanghai Center of Biomedicine Development. XF acknowledges the support from Shanghai Sailing Program (22YF1410700). TLL was supported by the Netherlands Organization for Scientific Research (NWO, gravitation program CancerGenomiCs. nl), Oncode Institute, which is partly financed by the Dutch Cancer Society, and the European Research Council (ERC Starting Grant 755695 BURSTREG). RG was supported by a Leverhulme Trust research award (RPG-2020-327). Additional files

Data availability
The four smFISH datasets are available from https://osf.io/d5nvj/. These datasets include the maximum intensity projected images, the spot localization results, the nuclear and cellular masks used for merged, G1 and G2 cells and the analyzed results of the mature and nascent data. The analysis code of the smFISH microscopy data is available at https://github.com/Lenstralab/smFISH (copy archived at swh:1:rev:b49af68653e9fdcab3fa48085f648fc86d8c659e). The code for the the synthetic simulations and the parameter inference is available at https://github.com/palmtree2013/RNAInfer-enceTool.jl (copy archived at swh:1:rev:be2fcc8f7a811a571a297d3e150395c0a73add09).
The following dataset was generated: Appendix 1 Accuracy of inference from synthetic mature and nascent mRNA data Inference from synthetic mature mRNA data with external noise In the main text, Figure 2, we showed how the addition of 5% external noise to synthetic mature mRNA data degrades the inference accuracy. In Appendix 1- figure 1 we show how the addition of a larger amount of external noise (10%) causes an even larger loss of accuracy. In particular for 91% of the parameters, the inference accuracy is higher when using nascent mRNA data (Appendix 1- figure 1a) and the median relative errors become very high for most parameters, especially for ρ and σ off in the limit of large f ON (Appendix 1-figure 1b).
Appendix 1-figure 1. Comparing inference accuracy using synthetic nascent mRNA data and synthetic mature mRNA data with 10% external noise (log-normal distributed noise is added to the initiation rate ρ to mimic external noise due to post-transcriptional processing that is only present in mature mRNA).
Appendix 1-figure 2. Inference with the telegraph model and delay telegraph model for six parameter sets. (a) Estimates using the inference algorithm with the telegraph model (with no external noise) for six parameter sets.
For both the ground truth and the estimated parameters, we fix the degradation rate d = 1 min −1 . (b). Estimates using the inference algorithm with the delay telegraph model for six parameter sets. For both the ground truth and Appendix 1-figure 2 continued on next page the estimated parameters, we fix the delay τ = 0.5 min . (c) Distributions from synthetic mature mRNA data fitted using the telegraph model. (d) Distributions from synthetic nascent mRNA data fitted using the delay telegraph model.
In the main text, Figure 2, we showed how the accuracy of parameter estimation is not uniform across parameter space. Here we investigate if there is any relationship between this accuracy and how well is a distribution of mature mRNA numbers / signal intensity fit by the inference algorithm. For 12 parameter sets (6 for the telegraph model - Appendix 1-figure 2a) and (6 for the delay telegraph model -Appendix 1-figure 2b), we evaluate the fits to the distribution of synthetic data in Appendix 1-figure 2c-d. The results show that independent of the accuracy of the parameters estimated by the inference algorithm, the fits of the delay telegraph and telegraph model distributions to the distributions generated from synthetic data are generally excellent.

Testing the variability of the inference procedure
To obtain a better understanding of the variability of the inference procedure, for each of the six parameter sets in Appendix 1-figure 2b and i.e. for the inference using the delay telegraph model, we generated 10 independent sets of synthetic data and used maximum likelihood to infer the parameters for each dataset. The mean and standard deviation of the parameters (computed over all 10 datasets) are shown in Appendix 1-table 1. The means are close to the true parameter values (in Appendix 1-figure 2b); this shows that the inference procedure is working correctly and the deviations from ground truth values are mostly due to noise in the synthetic datasets. We can define the sample variability of a parameter as the standard deviation divided by the mean. Ordering parameters by this quantity, we find that for all six parameter sets the error is largest for σ off . For small f ON , the parameters ordered by increasing error are: σon , burst size, ρ , f ON and σ off .
While for large f ON , the order is: ρ , f ON , σon , burst size and σ off . Note that this order is the same as we determined using the relative errors (from the ground truth) which is shown in Figure 2 of the main text.
Appendix 1-table 1. Mean and standard deviation of the parameters estimated from 10 independent synthetic datasets, generated for each parameter set in Appendix 1-figure 2.

Confidence intervals using profile likelihood
We perform a profile likelihood study (Kreutz et al., 2013) on the 12 parameter sets of synthetic mature/nascent mRNA data described in Appendix 1-figure 2a-b. We obtain the 95% confidence interval for each parameter. The results are shown in Appendix 1-table 2. We also compare the relative errors for each parameter (computed as (estimated value -ground truth)/ground truth using the data in Appendix 1- figure 2a and b) with the profile likelihood error (computed as (upper bound -lower bound)/optimal estimate using the bounds in Appendix 1-table 2 and the optimal values in Appendix 1- figure 2a and b). The results are shown in Appendix 1-table 3. Note that in most cases, the parameters ordered by relative error are in agreement with the parameters ordered by profile likelihood error. Effect of random perturbation of mature mRNA data on inference To assess the reliability of the inference results due to errors in spot counting, we redid the inference with synthetic mRNA data (and using the telegraph model) perturbed randomly by minus 1/plus 1/ unchanged with probability 1/3 . The results are shown in Appendix 1-table 4.
We found that when the fraction of ON time was very small (Set 1), there is a considerable effect of the perturbations on the values of the inferred parameters -this is because in this case, the mean number of mRNA is very small and hence a perturbation of one molecule is very significant. However as expected, the inference results are quite robust when the fraction of ON time is not too small (Sets 2-6).
Appendix 1-table 4. Effects of random perturbations on inference of parameters from mature mRNA data (using the telegraph model).
where Φ is a stochastic perturbation satisfying the following constraints Φ(S i ) is a random variable sampled from the distribution Log-normal (α, β) whose mean equals S i , and the standard deviation equals 0.1 * S i . This means the random perturbation keeps the mean value of the signal S i unchanged but adds noise with a coefficient of variation equal to 0.1.
In Appendix 1-table 5 we compare the results of inference using synthetic nascent mRNA data with the aforementioned stochastic perturbation and without. As for mature mRNA data, we find that the perturbation has only a significant impact when the fraction of ON time is very small.
Appendix 1-table 5. Inference using the delay telegraph model from synthetic nascent fluorescent data, with and without perturbation by log-normal noise.

True
Delay Perturbation In this section, we aim to precisely understand the differences between the telegraph and the delay telegraph models. For both models, we define the rate of switching from the ON state to OFF state as σ off , the rate of switching from the OFF state to the ON state as σon and the production rate of nascent mRNAs in the ON state as ρ . The first-order decay rate of nascent mRNA in the telegraph model is given by d and the delay time between initiation and degradation in the delay telegraph model is τ . Note that while the telegraph model was in the main text explained in terms of mature mRNA, in this section we use it as a model for nascent mRNA since we want to compare directly with the delay telegraph model. The telegraph and delay telegraph models can be solved exactly in steady-state conditions (Peccoud and Ycart, 1995;Xu et al., 2016) (an alternative derivation for the delay telegraph model is also given in Section F). From the generating function solution of the models, one can deduce expressions for the first and second centered moments in steady-state conditions: where ⟨n⟩ delay and ⟨n⟩ tele are the mean number of nascent mRNA in the delay telegraph and telegraph models respectively, and Var delay and Var tele are the corresponding variances in molecule numbers.
To understand the differences between the two models, we set the means of the two models to be the same (by choosing d = 1/τ ) and then compute the relative error in their variance predictions: where x and y are non-dimensional variables defined as x = τ (σ off + σon), y = (ρ/σ off )(σ off /(σ off + σon)) 2 .
Note that x is the ratio of the time for nascent mRNA to detach from the gene τ and the timescale of promoter switching 1/(σ off + σon) . The non-dimensional parameter y is a measure of the burstiness of gene expression since it increases with the mean burst size ρ/σ off and the fraction of time spent in the OFF state σ off /(σ off + σon) . In fact as σ off → 0 which implies y → 0 , the telegraph model converges to a constitutive model where the time between two successive nascent mRNA production events is exponentially distributed. Note that the relative errors in the Fano factor and the coefficient of variation squared are the same as the error in the variance since the means of the two models are the same. Using Equation (12), it is easy to see that the relative error between the two models vanishes in the limit of x → 0 (when the promoter switching timescales are much longer than the time spent by a polymerase on a gene) or in the limit of y → 0 (when there is no burstiness in gene expression). Hence, the telegraph model is an accurate approximation of the delay telegraph model in these two limits. It can be shown that in the first case, the distribution of nascent mRNA numbers is well approximated by the sum of two Poisson distributions, whereas for the second case the distribution is a Poisson. Note that since R is always positive, it follows that the telegraph model systematically underestimates the size of noise in nascent mRNA numbers. It can be further shown that R increases monotonically with x and y and the maximum attainable value is R = 1/2 (when x → ∞, y → ∞ ), that is, the worst prediction of the telegraph model is that the variance is half that of the delay telegraph model.
In Appendix 2-figure 1a and c we contrast the distributions of nascent mRNA predicted by the delay telegraph model with those predicted by the telegraph model for the case d = 1/τ where the means are matched, as also assumed for the calculation of the relative error above. Appendix 2- figure 1b shows the effect of increasing x (via τ ) and Appendix 2- figure 1d shows the effect of increasing y (via ρ ). The number distributions are constructed from the generating functions of the telegraph model Peccoud and Ycart, 1995 and of the delay telegraph model (Section F). Note that the shapes of the distributions of the two models can be considerably different, e.g. the cases ρ = 3 and ρ = 10 in Appendix 2- figure 1c shows that the nascent distribution from the delay model is bimodal with peaks at 0 and at a non-zero value but it is unimodal with a non-zero peak from the telegraph model. Hence we conclude that if we are interested in accurately predicting nascent mRNA number distributions then generally the Markovian telegraph model is not a good approximation of the non-Markovian delay telegraph model.
In the main text, we used the delay telegraph model to infer the synthetic data from the synthetic data generated by the delay SSA. For comparison, we repeat the same but now we use the telegraph model (rather than the delay telegraph model) to calculate step (ii) in the inference algorithm, that is P(k; θ) in Equation 8 in the main text is now chosen to be the steady-state solution of the telegraph model. Once the best parameter set θ * is found, we calculate two scores: (i) the fitness given by the smallest negative log-likelihood value found by the optimizer normalized over the sample size. (ii) the mean relative error (for definition see Equation (4.5) in the main text). In Appendix 2-figure 1, we show both of these scores obtained for 20 independent numerical experiments -clearly the error using the delay telegraph model for the inference algorithm is significantly lower than if the telegraph model is used.
These observations are further reinforced in Appendix 2- figure 1f where we show the best fit distributions and the corresponding relative errors in the estimates of the burst size and the burst frequency (bar chart insets). These are computed using the formula: α = Relative error in the burst frequency estimate = |σ * on −σon| |σ * on | , β = Relative error in the burst size estimate = We note that the distributions are well fit in all cases, using both telegraph and delay telegraph models. However the errors α and β are considerably larger for the former.
In Appendix 2-table 1 we show the true and estimated parameters for the 6 distributions in Appendix 2-figure 1f. The estimates for ρ are accurate independent of the choice of model; however, the estimates for the promoter switching rates σ off , σon are far more accurate using the delay telegraph model. Hence we conclude that although inference using the telegraph model provides a histogram that fits well with the synthetic data, the inferred parameter values have little meaning because they are not an accurate reflection of the true parameter values. The inference is done in two different ways, using either the telegraph model (green) or the delayed telegraph model (blue). (f) Distributions of total fluorescent intensity from synthetic data (red dots) fit using the inference algorithm with telegraph model (dashed green) or delayed telegraph model (blue) for 6 different parameter sets.
The insets show the relative errors in the estimates of the burst frequency ( α ) and of the burst size ( β ) calculated using Equation (13). Note that while both models provide a very good fit to the distribution from synthetic data, nevertheless parameter estimation is far more accurate using a delayed telegraph model. This is also reflected in (a) where we see low fitness scores for both models but a high mean relative error for estimates based on the telegraph model. The true and estimated parameters are shown in Appendix 2-  Confidence intervals for estimates from merged and cell-cycle specific mRNA data using segmentation 2 We obtain 95% confidence intervals for the parameters estimated from mature mRNA data under segmentation 2 (shown in Figure 3f of the main text). The confidence intervals are shown for σ off , σon, ρ in Appendix 3- Effect of synchronization of gene copies in G2 phase on parameter inference In the main text, we assumed that the transcription of two allele copies in the G2 phase are independent. Here we consider the opposite scenario where the two allele copies in G2 phase are perfectly synchronized with each other, i.e. when one copy switches from on to off, the other copy also switches from on to off. Note that the time at which mRNA is transcribed from each copy is however not the same. Using this modified simulation algorithm, we re-perform the inference using the experimental data under segmentation 2. The results are shown in Appendix 3-table  3. Comparing these values with those estimated for phase G2 under the assumption of allele independence (main text Figure 3f right panel), we find that ρ and f ON are very similar but the other parameters vary considerably.
Appendix 3-  Appendix 4-figure 1. Inference results using merged and cell-cycle specific nascent data. Experimental distributions (purple) are fit using the delay telegraph model (magenta curves).

Inference using curated data
We show the inference results using nascent mRNA data curated with the rejection method in Appendix 4-table 3 and Appendix 4-figure 2, and with the fusion method in Appendix 4table 4. We also used the profile likelihood method to obtain the 95% confidence intervals for the fusion method with k = 4 ; the results are shown in Appendix 4-table 5. We also reinferred the parameters in the G2 phase for fusion corrected data by assuming that the allele states are perfectly synchronised; these are shown in Appendix 4-table 6. using the delay telegraph model (magenta curves).

Appendix 4-table 2 Continued
Appendix 4- Appendix 5-figure 1. Autocorrelation functions of 10 4 simulated GAL10 intensity traces (solid blue lines). The transcriptional parameters for G1 and G2 cells in the four sets of experimental data were obtained using the fusion method (see Figure 6d of the main manuscript). A linear fit (dashed black line) was subtracted to correct the ACFs for switching from G1 to G2 (solid cyan lines).
We checked that the obtained distribution agrees exactly with the nascent mRNA distribution of Equation 7 in Xu et al., 2016.

Finite State Projection
The finite state projection (FSP) method Munsky and Khammash, 2006 is an efficient numerical method that can be implemented to solve the CME (23) to any desired degree of accuracy. Because Equation (23) is coupled with Equation (25), here we show how to use a two-step FSP method to obtain the probability distribution. By assuming that there exists a large N such that P(i, n, t) = 0 and P (i, n, t) = 0 for any i = 0, 1 and n ≥ N + 1 , we denote P(t) = (P(0, 0, t), P(0, 1, t), . . . , P(0, N, t), P(1, 0, t), P (1, 1, t), . . . , P(1, N, t)) T as a N 2 -dimensional vector and P (t) ∈ R N 2 similarly. We first use FSP method to solve Equation (25) with the following ordinary differential equation (ODE) up to time Secondly, we solve the following inhomogeneous ODE d dt P(t) = AP(t) + h(t − τ ) where S is the right-shift operator, i.e. for any x = (x 1 , x 2 , . . . , Hence, the solution of Equation (31) gives the approximate probability distribution P(t) for any t > 0 . Note that if one uses the generating function Equation 29 to compute the probability distribution, the computation for the high-order derivatives can cause numerical instabilities due to lack of arithmetic precision -compare the exact solutions with precision 85 and 300 in the left and right panels of Appendix 6-figure 1. Because FSP avoids the computation of the higher-order derivatives, it can be as numerically stable as computing the exact solution with high precision (right panel of Appendix 6-figure 1). In Appendix 6-table 1, we show the computational efficiency of FSP versus numerical evaluation of the exact solution -FSP and the exact solution with high precision are hence comparable both in terms of accuracy and efficiency. In the main text, we use FSP. The distribution of DNA content intensities (DAPI) was fit using a bimodal distribution; those cells whose intensity was around one peak were classified as G1 and those around the second peak were classified as G2 (main text Figure 3e). We also did the classification using an alternative method, based on the Freid/Baisch model, which was recently employed in Skinner et al., 2016. In Appendix 7- figure 1 we show the distributions of fluorescent signal intensity for cells in the G1 phase (top row) and for those in the G2 phase (bottom row) for the four data sets described in the main text. Note that the method (bimodal Gaussian or Freid/Baisch) used to classify the cells in G1/G2 does not alter the intensity distribution hence verifying the robustness of our classification method. Appendix 7-figure 1. Blue curves (Method 1) show the fluorescent intensity distributions of the four experimental data sets after the classification of cells into G1 and G2 phases using the Fried/Baisch model. Magenta curves (Method 2) show the same but using a bimodal Gaussian, as described in the main text.