A simple method for getting standard error on the ratiometric calcium estimator

Graphical abstract How to get error bars on the ratiometric calcium estimator? The two measurements areas (region of interest, ROI, on the cell body and background measurement region, BMR, outside of the cell) are displayed on the frame corresponding to one actual experiment. Two measurements, one following an excitation at 340 nm and the other following an excitation at 380 nm are performed (at each ’time point’) from each region. The result is a set of four measures: adu340 (from the ROI), adu340,B (from the BMR), adu380 and adu380,B.. The fact that the measurements as well as the subsequent quantities derived from them are random variable realization is conveyed throughout the figure by the use of Gaussian probability densities. The densities from the BMR are ’tighter’ because there are much more pixels in the BMR than in the ROI (the standard deviations of the densities shown on this figure have been enlarged for clarity, but their relative size has been preserved, the horizontal axis in black always starts at 0). The key result of the paper is that the standard deviation of the four Gaussian densities corresponding to the raw data (bottom of the figure) can be reliably estimated from the data alone, egσ380≈Gadu380+V, where V is the product of the CCD chip gain squared by the number of pixels in the ROI by the CCD chip read-out variance. The paper explains how to compute the standard deviation of the derived distributions obtained at each step of the calcium concentration estimation.


Specifications
Python codes, data and all the information required to reproduce the manuscript results can be found on GitLab: https://gitlab.com/c _ pouzat/gettingse-on-ratiometric-ca-estimator

Rational
Since its introduction by Grynkiewicz et al. [2] , the ratiometric indicator Fura-2 has led to a revolution in our understanding of the role of calcium ions (Ca 2+ ) in neuronal and cellular function. This indicator provides a straightforward estimation of the free Ca 2+ concentration ([Ca 2+ ]) in neurons and cells with a fine spatial and time resolution. The experimentalist must determine a 'region of interest' ( ROI ) within which the [Ca 2+ ] can be assumed to be uniform and is scientifically relevant. Fluorescence must be measured following excitation at two different wavelengths: typically around 340 and 380 nm; and, since cells exhibit autofluorescence or 'background fluorescence' at those wavelengths, the measured fluorescence intensity is made of two sources: the Fura-2 linked fluorescence and the autofluorescence. The measured intensity within the ROI is therefore usually corrected by subtrating from it an estimation of the autofluorescence intensity obtained from simultaneous measurements from a 'background measurement region' ( BMR ); that is, a nearby region where there is no Fura-2. At a given time the experimentalist will therefore collect a fluorescence intensity measurement from the ROI at 340 and 380 nm; we are going to write adu 340 and adu 380 these measurements, where 'adu' stands for 'analog to digital unit' and corresponds to the raw output of the fluorescence measurement device, usually a charge-coupled device ( CCD ); if the experimentalist is careful not to saturate the sensor, the adu count is proportional to the number of photo-electrons present in the pixel, or in the group of pixels when on-chip binning is used, at the end of the exposure period. The experimentalist will also collect intensity measurements from the BMR, measurements that we are going to write adu 340 ,B and adu 380 ,B . If P CCD pixels make the ROI and P B pixels make the BMR and if the illumination time at 340 nm is T 340 , while the illumination time at 380 nm is T 380 (both times are measured in s ), the experimentalist starts by estimating the fluorescence intensity per pixel per time unit following an excitation by a light flash of wavelengths λ ( λ = 340 or 380 nm) as: where an assumption of autofluorescence uniformity is implicitly made. The following ratio is then computed: This is an important and attractive feature of the method as well as the origin of its name. Since only ratios are subsequently used, geometric factors like the volume of the Fura loaded region under the ROI do not need to be estimated. The estimated [Ca 2+ ] that we will write Ca for short (the ' ' sign is used for marking estimated values) is then obtained, following [2 , Eq. (5) , p. 3447], with: where K e f f (measured in μM), R min and R max are calibrated parameters (the last two parameters are ratios and are dimensionless). R min is the ratio ( Eq. (2) ) observed in the absence of calcium, while R max is the ratio observed with a saturating concentration. K e f f is the calcium concentration at which the ratio is half way between R min and R max . If a set of experiments is performed on a given cell type with the same batch of Fura, as in the companion paper [3] , the calibration errors on these three parameters will be the same for each experiment. If different cell types are considered and/or different Fura batches are used, the calibration errors should be taken into account before making comparison of estimated calcium dynamics parameters (see [1] for discussion).
If we now want to rigorously fit [Ca 2+ ] dynamics models to sequences of Ca , we need to get standard errors , σ Ca , on our estimates. This is where the ratiometric method gets 'more involved', at least if we want standard errors from a single transient as opposed to a mean of many transients. We typically work ( e.g. [1,3] ) in a setting, using the so called 'added buffer approach', where we cannot get more than a single transient in given conditions since Fura is constantly diffusing into the recorded cell modifying thereby the time constant of calcium transients. It is worth pointing out that there is a more general interest in obtaining standard errors from a single transient: getting these fluorescence measurements requires shining UV light on the neurons we are recording from and generates photodamage. Despite the ubiquity of ratiometric measurements in neuroscience and cell physiology, we are aware of a single paper-by some of us [1] -where the 'standard error question' was directly addressed. The method proposed in [1] requires a 3 wavelengths protocol: measurements at 340, 380 and 360 (the isosbestic wavelength) nm; it drops, so to speak, the above advantage of working with a ratiometric estimator since it fits directly the adu 340 and adu 380 data (at the cost of estimating some geometry related parameters) and it requires a model of the autofluorescence dynamics if the latter is not stationnary. It therefore requires a slightly more complicated '3 wavelengths' recording protocol as well as a more involved fitting procedure. The dataset of the companion paper [3] exhibits a clear but reversible autofluorescence rundown that cannot be ignored since autofluorescence accounts for half of the signal in the ROI. Rather that constructing / tailoring the accurate enough autofluorence models required by the 'direct approach' of [1] we looked for an alternative method providing standard errors for the ratiometric estimator.

Fluorescence intensity
As detailed in [1,2] , the fluorescence intensities giving rise to the adu 340 , adu 340 ,B , adu 380 and adu 380 ,B signals can be written as: where F λB is the autofluorescence intensity per pixel per time unit at wavelength λ, K F ura is the Fura dissociation constant (a calibrated parameter measured in μM), [ F ura ] total , is the total (bound plus free) concentration of Fura in the cell (measured in μM) and φ is an experiment specific parameter (measured in 1 /μM/s ) lumping together the quantum efficiency, the neurite volume, etc (see [1] for details).
Recorded signals adu 340 , adu 340 ,B , adu 380 and adu 380 ,B As detailed and discussed in [1,4] , the signal adu λ recorded with a CCD chip whose gain is G and whose read-out variance is σ 2 read−out can be modeled as the realization of a Gaussian random variable ADU λ with parameters: with the obvious adaptation when dealing with the BMR signal: I λ is replaced by I λB and P is replaced by P B . Parameters G and σ 2 read−out are CCD chip parameters provided by the manufacturer. Calibration procedures are discussed in [1,4] and a comprehensive example with data and codes can be found in [5] . Our experience is that the values provided by manufacturers are good starting points; the user calibrated read-out noise is sometime slightly larger than the one specified by the manufacturer.
Variance estimates for adu 340 , adu 340 ,B , adu 380 and adu 380 ,B So, to have the variance of ADU λ we need to know I λ and for that we need to know [ Ca 2+ ] ( Eqs. (4) and (6) ) precisely what we want to estimate. But the expected value of ADU λ is G I λ ( Eq. (8) ), we can therefore use as a first approximation the observed value adu λ of ADU λ as a guess for G I λ , so in Eq. (9) we plug-in adu λ for G I λ , leading to: In other words, we will use the observed adu λ as if it were the actual fluorescence intensity times the CCD chip gain, ADU λ = G I λ , in order to estimate the variance. In doing so we will sometime slightly underestimate the actual variance (when the observed adu λ turns out to be smaller than ADU λ ) and sometime slightly overestimate it (when the observed adu λ turns out to be larger than ADU λ ). Since we are going to combine many such approximations, we expect-and we will substantiate this claim in Section 3 -that overall the under-estimations will be compensated for by the over-estimations.

Variance estimate for Ca
Now that we have a ˆ σ 2 ADU λ we can work with -that is, an estimate from the data alone -, we want to get ˆ σ 2 r ( Eq. (2) ) and ˆ σ 2

Ca
. We can either use the propagation of uncertainty (also referred to as error propagation, compounding of errors or delta method ) [6,7] together with Eqs. (2) and (3) , or a 'quick' Monte Carlo approach. We drop any explicit time index in the sequel in order to keep the equations more readable, but it should be clear that such variance estimates have to be obtained for each sampled point.

Propagation of uncertainty
This method requires, in the general case, an assumption of 'small enough' standard error since it is based on a first order Taylor expansion (see Section Appendix A for details). It leads first to the following expression for the variance, ˆ The variance ˆ σ 2 r of r in Eq. (2) is then: and the variance ˆ A remark on ˆ σ 2

Ca behavior
The last three Eqs. (11) -(13) can be used together with Eqs. (8) and (9) to understand why ˆ σ 2 Ca will increase with the calcium concentration and therefore why a weighted nonlinear least-square procedure is required [8][9][10] in order to get proper confidence intervals on calcium dynamics model parameters. Eq. (9) tells us that the variance of the raw signals is an increasing linear function of their means. When the calcium concentration increases, the recorded signal at 340 nm increases while the one at 380 nm decreases ( Fig. 1

Monte-Carlo method
Here we draw, k quadruple of vectors where adu λ is the observed value and z [ j] λ is drawn from a standard normal distribution. We then plug-in these quadruples into Eq. (1) leading to k couples: These k couples are 'plugged-in Eq.
(2) ' leading to k r [ j] : before plugging in the latter into Eq. (3) to get k Ca [ j] : The empirical variance of these simulated observations will be our ˆ Since the Monte-Carlo method requires milder assumptions (the variances do not have to be small) and is easy to adapt, we tend to favor it; to be on the safe side, users can use both methods and, if they disagree, plot a histogram of the Ca [ j] to make sure that the discrepancy source is the nonnormality of the latter.

Comment
The present approach based on a ˆ σ 2 Ca estimation is slightly less rigorous than the 'direct approach' of [1] but it is far more flexible since it does not require an independent estimation / measurement of [ F ura ] total . In line with the discussion following Eq. (3) , in the companion paper [3] we chose to consider the calibrated parameters K e f f , R min and R max as fixed.

Rational
Eqs. (4) - (7) , together with Eqs. (8) and (9) can be viewed as a data generation model. This means that if we choose model parameters values as well as an arbitrary [Ca 2+ ] time course, we can simulate measurements ( adu ) at both wavelengths in the ROI as well as in the BMR. We can then use these should be very close to a standard normal distribution. This is precisely what we are going to check .

Simulated data
We are going to use the first transient of dataset DA_121219_E1 of the companion paper [3] . The 'static' parameters -that is the parameters not link to the calcium dynamics -used for the simulation are the actual experimental parameters rounded to the third decimal ( Table 1 ). The simulated calcium dynamics is a monoexponential decay mimicking the tail of the transient: and the parameter values ( Table 2 ) are just a rounded version of the fitting procedure output (see companion paper [3] ). The simulated data obtained in that way are shown on Fig. 1 (blue traces) together with the actual data (red curves) they are supposed to mimic. At a qualitative level at least, our data generation model is able to produce realistic looking simulations.

Software and simulation details
The methodological details of the measurements to which the analysis presented in the present manuscript was applied are described in the companion paper [3] .
The use of scipy was kept to a bare minimum to maximize code lifeduration ( scipy tends to evolve too fast with minimal concern for backward compatibility). The random number generators used were therefore the ones of numpy : the uniform random number generator derives from the Permuted Congruential Generator (64-bit, PCG64) ( https://www.pcg-random.org/ ) [11] while the normal random number generator is an adaptation of the Ziggurat method [12] of Julia ( https://docs.julialang.org/en/v1/ ); unfortunately one has to check the source code of both numpy and Julia to find that out.

Are the standard errors of ratiometric estimator accurate?
Since the two ˆ σ 2 Ca estimation methods, propagation of uncertainty and Monte-Carlo, agree at each time point within 2%, we illustrate in this section the results obtained with the Monte-Carlo method.
We take next the simulated data (blue curves on Fig. 1 ) together with the simulated background signals (not shown) as if they were actual data and we compute the ratiometric estimator and its standard error as described in Section 2.4 , using k = 10 4 replicates. Fig. 2 shows the standardized residuals as well as the simulated data together with the true [Ca 2+ ], we know it since we used it to simulate the data! The upper part of Fig. 2 is only a qualitative way of checking that the normalized residuals follow a standard normal distribution. A quantitative assessment is provided by the Shapiro-Wilk W statistic, that is here: 0.987 ; giving a p-value of 0.128 . There is therefore no ground for rejecting the null hypothesis that the normalized residuals are IID draws from a standard normal distribution. As an additional, visual but less powerful test, we plot the empirical cumulative distribution function (ECDF) of the normalized residuals together with the theoretical (normal) one and with Kolmogorov ' s confidence bands ( Fig. 3 ). If the empirical ECDF arises from a normally distributed sample with mean 0 and SD 1, it should be completely contained in the 95% confidence band 95% of the time and in the 99% band, 99% of the time (these are confidence bands not collections of pointwise confidence intervals).
We conclude from these visual representations and formal tests that our normalized residuals follow the expected standard normal distribution, implying that our proposed method for getting the standard errors of the ratiometric estimator is fundamentally correct.

Discussion
We have presented a new and simple method for getting standard errors on calcium concentration estimates from ratiometric measurements. This method does not require any more data than what experimentalists using ratiometric dyes like Fura-2 are usually collecting: measurements at 340 and 380 nm both within a region of interest and within a background measurement region. Once the errors bars have been obtained, arbitrary models can be fitted to the calcium transients -by weighted nonlinear least-squares [10] -and meaningful confidence intervals for the parameters of these models will follow as illustrated in the companion paper [3] . The present contribution is therefore best viewed as a major simplification of the 'direct approach' of [1] . In contrast to the latter, the new method does not require a '3 wavelengths protocol', it does not require either a precise fit of the autofluorescence dynamics at the three wavelengths and is therefore much easier to implement. We provide moreover two independent implementations, one in C and one in Python , they are open source and freely available. The rather verbose Python implementation of the heart of the method ( Section 2.4 ) requires 25 lines of code and nothing beyond basic numpy functions. We are therefore confident that this method could help experimental physiologists getting much more quantitative results at a very modest extra cost.

Co-submission
This manuscript is a co-submission associated with Analysis of neuronal Ca 2+ handling properties by combining perforated patch clamp recordings and the added buffer approach , DOI: 10.1016/j.ceca.2021. 102411 .

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A Propagation of uncertainty
We outline in this section how to reach Eqs. (11) -(13) of Section 2.4.1 . We first need to remember that is X and Y are two independent random variables with mean E X = μ X and E Y = μ Y and variance Eq. (11) is a direct consequence of the last equality. If X is (approximately) normally distributed with ( X ∼ N (μ X , σ 2 X ) ) as well as Y , we can write: X ≈ μ X + Z 1 σ X and Y ≈ μ Y + Z 2 σ Y , where Z 1 and Z 2 are independent and follow a standard normal distribution, N (0 , 1) . If now Z = f (X, Y ) and the partial derivatives of f at (μ X , μ Y ) exist then: This is just a first order Taylor expansion and that is where the 'small enough standard error' assumption is necessary. Z is then (approximately) a linear combination of two independent standard normal random variables and we immediately get: Eq. (12) follows directly by computing the necessary partial derivatives, while Eq. (13) requires the computation of a single derivative.

B1 General features
The evolution of the aduλB is shown on Fig. 4 . We see that the autofluorescence runs down when high frequency flashes are applied during the 3 transients, with a partial recovery between transients.

B2 Within transient dynamics
The 'direct method' of [1] requires the knowledge of the autofluorescence value at each time point during a transient at both 340, 360 and 380 nm, since Eqs. (4) and (6) are fitted directly to the recorded adu 340 and adu 380 and they depend on the total Fura concentration at transient time that is estimated from the difference of the 360 nm measurements in the ROI and the BMR. We therefore take a closer look a the autofluorescence dynamics during the first transient ( Fig. 5 ).
At that stage we can fit a straight line plus a cosine function whose period is the duration of a transient. That's a good way to capture the main structure in the transient, but is still does not account for the full signal variability ( Fig. 5 ). As can be seen from the normalized residuals -the residuals divided by the standard deviation -that should be very nearly independent random draws from a standard normal distribution if the model is correct, there are finer structures left (like the double valley on the 380 nm residuals) meaning that those fits won't pass formal goodness of fit tests. Indeed if we apply Pearson's χ 2 tests to these stabilized residuals we get: • at 340 nm a residual sum of squares (RSS) 326 , leading to a P (χ 2 197 > 326) = 0.0 , • at 360 nm a RSS of 288 , leading to a P (χ 2 197 > 288) = 2.6e-05 ,  • at 380 nm a RSS of 275 , leading to a P (χ 2 197 > 275) = 0.000203 .
We are then left with three possibilities: 1. try to refine the 'straight line plus cosine function' empirical model in order to get acceptable fits, 2. try to get a better understanding of the autofluorescence dynamics, 3. find another way to get error bars on our estimates.
Since we wanted to propose an 'as general and easy as possible' method we chose the third approach in the present manuscript.