Ca analysis: An Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis (cid:2)

Here I present an Excel based program for the analysis of intracellular Ca transients recorded using ﬂuorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several meth-ods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs equally well as commercially available software, but has numerous advantages, namely creating a simpliﬁed, self-contained analysis workﬂow.


Introduction
It is now commonplace to record and store experimental data in digital form. These data must then be analyzed. My own work involves the study of changes of intracellular calcium by using fluorescent Ca indicators [1,2]. The apparatus used provides the user with raw data in the form of voltage changes that reflect changes in intracellular Ca. However, this raw signal must be processed to provide meaningful physiological information. It must be prepared by one of several methods ଝ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited. * Tel.: +44 0161 2751208.
E-mail address: david.greensmith@manchester.ac.uk to provide acceptable ways of representing changes in intracellular Ca and only then can the prepared data be analyzed to quantify physiological phenomenon. These phenomena are varied, but include quantifying absolute levels of Ca (diastolic and systolic) and also measuring the rate of change of Ca. Practically speaking, converting raw experimental data into meaningful physiological information requires several steps. Propriety software currently exists, such as ClampFit (Molecular devices LLC), Excel (Microsoft Corporation) and SigmaPlot (Systat Software Inc.) which can adequately perform some of these steps, but no single one can effectively complete the 0169-2607/$ -see front matter © 2013 The Author. Published by Elsevier Ireland Ltd. All rights reserved. http://dx.doi.org/10.1016/j.cmpb.2013.09.004 entire task making the process disjointed and time consuming. To improve the efficiency of this process, I have written an Excel based program. Excel is commonly used as a simple spread sheet and graphing package. However, by using Visual Basic for Applications (VBA), one can write macros that unlock Excel as a powerful tool for complex data manipulation. Arrange these macros into an efficient workflow and provide a user interface, and one essentially creates a self-contained program, using Excel as a chassis, where no user interface with Excel itself is required. Others have expanded Excel's capabilities in numerous ways, including using advanced features [3][4][5] creating bespoke add-ins [6,7], modeling biological systems [8][9][10] and writing self-contained programs [11].
Perhaps the greatest advantage of this approach is the fact that the program is essentially an Excel workbook containing all the necessary subroutines for the process. This means the user can, for a given bout of analysis, (such as recordings from a particular cell), save a copy of that workbook. Therefore, for a given cell, the raw data, prepared data, analysis process and results of analysis are self-contained and stored in one location. Subsequent and re-analysis can be performed by simply re-opening that workbook, negating the need to re-import raw data and re-export the result of preparation/analysis.
There are several other advantages. (1) Raw data can be imported from any source that allows an array of values to be copied to a PCs clipboard, such as Clampfit's basic data export routine. This negates the need to deal with unique, and program specific data file formats, which are at the mercy of alteration by version updates. (2) VBA is highly flexible allowing user specific modifications to be made to the core script without the need for re-compilation. (3) Compatibility issues are minimized as any PC capable of running Excel will run an Excel based program. (4) The financial benefits can be significant as a single piece of widely available software; Excel, can be used to create any number of bespoke software applications.
With regard to this program, for a given experiment, the program can hold the raw and prepared data of 10 specimen Ca transients. The absolute Ca levels described above can be measured and stored for each. In addition, for each specimen, the decay phase can be subjected to multiple simultaneous regression analysis. Although I discuss in terms of intracellular Ca analysis, any biological signal that requires analysis in this way can be processed by this program by simply bypassing the calibration steps.

Computational methods and theory
The program has 2 fundamental processing components. (1) Preparation of raw data, and (2) analysis of those prepared data. With regard to the latter, there are two sub components.

Subtraction of background fluorescence during raw data preparation
Experimentally, photons are collected from the entire field of view "seen" by fluorophore imaging apparatus. Therefore "raw" florescence is that of the entire field of view. It is therefore standard practice to subtract background fluorescence (here, we regard background fluorescence to be that of the field of view with the cell removed). Irrespective of how background fluorescence is determined by a particular investigator, it must be subtracted from that of the raw fluorescence. This program automatically performs this step in all the raw data preparation algorithms (see Section 3 for practical aspects). Therefore, in all the equations given in the following sections, F is fluorescence subsequent to the subtraction of background fluorescence.

Preparation of data from ratiometric indicators
Dual excitation and dual emission fluorophores provide raw data in the form of a nominator and a denominator signal that must be ratioed prior to analysis (For example see [12]). This program automatically process the raw data as per Eq. (1).
where F nom is the nominator fluorescent signal and F denom is the denominator fluorescent signal.

Preparation of data from non-ratiometric indicators
The program can prepare the raw signal generated by nonratiometric indicators in two ways. Calibration to a true intracellular Ca concentration (for example see [1]) is possible by implementation of Eq. (2) [13].
where F is fluorescence (see above), Kd is the fluorophore's dissociation constant, and F max is the fluorophore's maximal fluorescence.
Alternatively, raw data can be converted into a pseudo ratio (F/F 0 ) (for example see [14]) where all Ca transients are normalized to a given, or their own resting fluorescent value by implementation of Eq. (3).
where F is fluorescence and F 0 is fluorescence in diastole.

Calculation of absolute Ca levels
Values of diastolic and peak systolic Ca are determined using cursors (see Section 3.3). The diastolic Ca value is subtracted from the peak value to calculate the Ca transient amplitude.
If calibrated data are used, the result of this analysis gives absolute concentrations of Ca. If raw data prepared as F/F 0 or ratios are used, this analysis gives relative signal changes that reflect changes of Ca.

Non-linear regression
The program determines the rate constant of decay of the experimental data (Y) by regression analysis using least square convergence. First, an initial predicted fit is generated by basic analysis of the experimental curve. The program then improves the fit of this predicted curve using an iterative loop implemented by the Excel Add-In "Solver" [4,15,16]. During each iteration, the predicted curves parameters are adjusted so generating a new predicted curve. Following each iteration, the goodness of fit is evaluated by summing the squares of the differences between the predicted and observed values. The predicted curve that results in the lowest sum of squares is the best fit. If the fit is robust, then the parameters of the observed curve can be inferred from those of the predicted.

Fitting a single exponential decay to the experimental data
The predicted curve resulting from each iteration is generated by implementation of Eq. (4).
where Y predicted is the predicted value, Y 0 and A are the baseline and amplitude of the observed curve (Y) respectively and k is the rate constant of decay of Y. To prevent excessive run durations, Solver may only use a finite number of iterations. Therefore, the "first guess" parameters must be relatively close to those of the best fit. The first guess values for Y 0 and A are determined using cursors on a graphical user interface (Section 3.4). The initial value for Y 0 is that derived from the "baseline" cursor's Y-axis position and the initial value of A is calculated from subtraction of initial Y 0 from the value derived by the "peak" cursor's Y-axis position. The initial rate constant (k) is predicted from the experimental curves time constant ( ) as per Eq. (5).

Fitting a double exponential decay to experimental data
The predicted curve resulting from each iteration is generated by implementation of Eq. (6).
where Y predicted is the predicted value, Y 0 is the curves baseline, A 1 and A 2 are the amplitudes of the two exponentials and k 1 and k 2 are the rate constants of decay of the two exponentials. For the initial prediction, Y 0 is calculated as per Section 2.5.1. A 1 and A 2 are calculated as per Eq. (7a and 7b): and where A is the observed curves amplitude calculated as per Section 2.5.1, and a and b are user defined constants. The initial guesses for the rate constants k 1 and k 2 are calculated as per Eq. (8a and 8b). and where k is the single exponential rate constant calculated as per Section 2.5.1, and c and d are user defined constants.

Goodness of fit
The program allows the user to evaluate goodness of fit qualitatively by visual assessment of the overlaid predicted and observed curve (see Section 3.4). For a quantitative assessment, the program calculates the R 2 value of the fit as per Eq. (9).
where SS reg is the sum of the squares of the regression residuals (the difference of each Y data point and each Y predicted data point), and SS total is the sum of squares of the differences between each Y data point and the mean of Y.
To make R 2 of the single and double exponential fit directly comparable, and therefore determine which describes Y most reliably, R 2 is adjusted as per Eq. (10).
where n is the number of data points and p is the number of parameters used to determine Y predicted . For most experimentally recorded curves, n will be in the order of 10 3 and so R 2 will not differ significantly from adjusted R 2 .

Addition of other regression analysis by code integration
To date, the user has the option to fit experimental data with a single, double or simultaneous single and double exponential. However, any number of regression analyses could be incorporated with ease.

Program description and sample runs
The program employs a logical workflow, taking the user through a series of steps, arranged as worksheet tabs. Each tab is responsible for a given step, and are ordered such that the user is taken from raw data input, to raw data processing, to analysis and finally to output. The workflow for a given step is logically arranged as a series of buttons, each of which triggering a particular sub-routine. A workflow overview is provided in Fig. 1. We suggest that the program file is kept as a template, then, for a given experiment, the program is saved with a unique file name for a given cell or experiment. All the raw data, analysis and output for that experiment then becomes self-contained, and can be reopened for resumption of or re-analysis.
In the examples given in the following sections, specimen data were recorded from dog or rat cardiac ventricular myocytes loaded with either fura-2 or fluo-3.

Import of raw data
Two raw data sheets handle the import of raw data, one for ratiometric data, the other for non-ratiometric data. Here, I define a given specimen from an experiment as a treatment. Clicking a treatment's import button will bring data copied to the clipboard into the program.

Fig. 3 -Calculation of absolute levels. Diastolic, peak and the Ca transient amplitude are calculated in the "Amp + Dia" tab. The treatment to be analyzed is selected with the combo box (A) and plotted (B). The X-cursors (C), controlled with X-scroll bars (D) can be used to flank a region of interest which can then be expanded (E). The diastolic (F) and peak (G) selection cursors are controlled with the Y-scroll bars (H). The current measurements are displayed (I) and
can be sent to the "Collated sheet" once the user is content (J).

Data preparation
Preparation of non-ratiometric and ratiometric data is dealt with by the "Calibration" tab. (Fig. 2). First, the user must import the appropriate data for the preparation they wish to perform. All preparation types require background fluorescence data. Calibration of non-ratiometric data to [Ca] requires the indicators maximal fluorescence value (F max ) and the fluorphores Kd. The treatment to be prepared is selected with the combo box and the preparation type selected using the appropriate option button. To calculate ratios, or to calibrate to absolute calcium, the user just needs to click "Calibrate". For conversion to F/F 0 , the user needs to define the value of F 0 by selecting a treatment with the combo box then clicking "draw". The currently selected treatment's Ca transient is then plotted, and the user flanks the region to be averaged as F 0 using two cursors. Clicking define then averages that region, defining F 0 . Clicking "calibrate" then converts the currently selected treatment into F/F 0 . The user then has the option convert all the treatments with the current F 0 or re-define F 0 for each individual treatment.
The prepared data is then stored in the "Calibrated" tab for subsequent analysis by the program, or for export.

Calculation of absolute Ca levels
Calculation of diastolic, peak systolic and the calcium transient amplitude is performed by the "Amp + Dia" tab (Fig. 3).
The user selects a treatment using the combo box and clicks "Draw". The selected treatments Ca transient is then plotted, superimposed by a pair of X-axis cursors and a pair of Y-axis cursors. The position of these cursors is controlled using scroll bars. The user, if desired, can zoom to a region of interest by flanking that region with the X-axis cursors and clicking "Zoom". The user defines the diastolic and peak Ca level by aligning the bottom Y-axis cursor with the diastolic region of the Ca transient and the top Y-axis cursor with the Ca transient's peak respectively. The amplitude is calculated automatically. Clicking "Send Measurements" sends the measurements to the "Collated" sheet.

Regression analysis
Regression analysis, and so calculation of the rate constant of decay of the Ca transient is performed by the "RC" tab ( Figs. 4 and 5). The user selects a treatment using the combo box and clicks "Draw". The selected treatment's Ca transient is then plotted, superimposed by a pair of X-axis cursors and a pair of Y-axis cursors. Here, the X-axis cursors are used to flank the fit region. The user then aligns the bottom Y-axis cursor with the fit region's baseline, and the top Y-axis cursor with the fit region's peak. This provides the initial parameters for the iterative loop described in Section 2.5. The fit region can be zeroed (reducing the number of iterations required) by selecting the "Zero Fit Range" check box. Checking "Smart RC Prediction" enables the RC prediction algorithms outlined in Section 2.5. (Fig. 4)

, the selected region can be fit with a single and or double exponential. (A) This superimposes the predicted curve(s) onto the plot (B). If Smart RC Prediction is enabled, the location of Tau is also indicated (C). The solved parameters are displayed (D) as are the goodness of fit parameters (E). If the user is content with the fit, these parameters be sent to the "Collated" sheet (F).
The program is now ready to fit the selected region with a single exponential decay, a double exponential decay or both simultaneously, depending on the configuration of the fit check boxes. Clicking "Fit" fits the selected region with the appropriate regression(s) and the predicted curves are superimposed onto the plot so the user can evaluate the goodness of fit visually. If Smart RC Prediction is enabled, the position of tau is indicated with a vertical line. If the user is content with the fit(s), clicking "Send Measurements" sends the solved parameters to the "Collated" tab.
The goodness of fit parameters are also displayed (and exported upon click of "Send Measurements".) These data give a quantitative assessment of the goodness of fit for the single and or double regression. Where both fits are performed simultaneously, they allow the user to quantitatively decide which regression describes the observed curve most reliably.

Output of analysis
All measurements are sent to the "Collated" tab (Fig. 6). The data is arranged logically into tables, and automatically plotted as histograms. So long as the workbook is saved, the output of analysis will remain in the sheet. Or, the user can export as required.
If the user has opted to simultaneously fit the observed curve with a single and double exponential, clicking "Indicate Best Fit" gives an indication of which regression describes a given treatments data most reliably.

Accuracy of regression analysis
To test the reliability of the fit algorithms I compared the fit parameters derived using my program with those derived from the commercially available SigmaPlot. Each program was used to fit a single exponential to the decay phase of an example Ca transient, know to have a single exponential decay. I repeated this process, fitting a double exponential to an example Ca transient, know to have double exponential decay. The fit region compared was identical. Fig. 7 and Table 1   increases due to, for example, noise. To test how reliably this program would deal with such a scenario, a calcium transient with a relatively poor signal to noise ratio was fitted with a single exponential. Fig. 8 and Table 2 display the outcome of this fit. Fig. 8 visually demonstrates that the fit is robust and is comparable to that performed using SigmaPlot. Table 2 quantitatively confirms the fit is robust, and the program performs

Comparison of simultaneous regression
A useful feature of the program is the ability to simultaneously fit the same region with a single and double exponential so as to decide which describes the observed data most reliably. The goodness of fit parameters provide a quantitative assessment of which is best, and those from each fit are directly comparable due to the fit region being identical. Fig. 9 shows the result of an example simultaneous fit and Table 3 compares the goodness of fit parameters. It is clear visually that the double exponential is the better of the two fits. Comparison of the goodness of fit parameters in Table 3 confirms this.

Mode of availability
A copy of the program and instructions for initial set-up can be obtained by e-mailing the author.