Fast calculation of fluorescence correlation data with asynchronous time-correlated single-photon counting

Fluorescence correlation spectroscopy (FCS) is a powerful spectroscopic technique for studying samples at dilute fluorophore concentrations down to single molecules. The standard way of data acquisition, at such low concentrations, is an asynchronous photon counting mode that generates data only when a photon is detected. A significant problem is how to efficiently convert such asynchronously recorded photon count data into a FCS curve. This problem becomes even more challenging for more complex correlation analysis such as the recently introduced combination of FCS and time-correlated single-photon counting (TCSPC). Here, we present, analyze, and apply an algorithm that is highly efficient and can easily be adapted to arbitrarily complex correlation analysis. © 2003 Optical Society of America OCIS code: (170.6280) Spectroscopy, fluorescence and luminescence; (180.1790) Confocal microscopy; (300.2530) Fluorescence, laser-induced


Introduction
An important issue when detecting fluorescence signals at very low light intensities as e.g. in single molecule spectroscopy (SMS) [1] is the occurrence of long time intervals with few or no photon detection events. Usually photon detection intensities are continuously monitored within consecutive macroscopic time bins of equal width (e.g., 100 ns to 1 ms). This sampling mode becomes increasingly problematic at short bin width (i.e., high time resolution) under low light intensities such as applied in SMS. In this case, most bins are accounted by zero photon intensity and, thus, of no worth. For example, sampling photon detection intensities with a temporal resolution of 100 ns requires recording of 10 7 intensity values per second, although most of these values will be zero for signal intensities usually encountered in SMS experiments (typical count rates are between 10 3 and 10 5 counts per second). The situation becomes even more dramatic when desiring to measure not only fluorescence intensities but also fluorescence lifetime by using pulsed laser excitation and correlating the photon detection times on a nanosecond timescale with the pulse times of the laser, i.e. direct determination of the delay time between excitation and detection (time-correlated singlephoton counting or TCSPC). The conventional approach [2] requires of building up complete TCSPC histograms for every time bin. In case of TCSPC histograms containing 2 8 = 256 delay time channels and a moderate counting depth of only 4 bit per channel, one ends up with a data flow rate of 4×256×10 7 bit/s = 1.28 GByte/s, which is beyond any reasonable limit. A much more advantageous mode of data acquisition than the synchronous acquisition mode is to record data only in case of an actual photon detection event (asynchronous data acquisition). This is the philosophy behind the so-called time-tagged time-resolved (TTTR) data acquisition mode that was recently introduced in low-intensity fluorescence measurements [3][4][5]. In that mode, every photon detection event generates two numbers: the time-correlated single-photon correlation (TCSPC) time (photon arrival time with respect to most recent laser pulse), and the macroscopic arrival time on a continuous time axis with ca. 100 ns time resolution (time tag). Only these two numbers are transferred to and stored in the PC memory. Additionally, each photon can be associated with a third tag containing information about, e.g., which photo-diode in a multi-channel detection scheme has detected the actual photon. Thus, the data flow is directly proportional to the photon detection count rate, generating no overhead of redundant information as in a synchronous data acquisition mode. Although this asynchronous TTTR data acquisition mode is now widely used in timeresolved single-molecule fluorescence spectroscopy, an important question is how to efficiently convert the TTTR data into a fluorescence correlation spectroscopy (FCS) curve. FCS is a powerful tool for evaluating SMS measurements and has found innumerable applications, see, e.g., ref. [6]. In many applications, FCS curves are generated from raw fluorescence intensity signals by dedicated correlator hardware. However, in that case any information about fluorescence lifetime or fluorescence intensity characteristics that are not reflected in the FCS curve is completely lost. Recently, combinations of TCSPC and FCS have been proposed, either for improving signal-to-noise ratio of an FCS measurement by time-gating photons on the nanosecond timescale [4,7], a virtual multi-channel FCS measurement scheme (i.e., creating several FCS data for different fluorophores present in the sample) by using the fluorescence lifetime characteristics of different fluorophores as the distinguishing parameter (TCSPC-FCS) [8]. In these cases, hardware correlators can only be used if they are preceded by another piece of data processing hardware. Furthermore, hardware correlators must, by design, immediately process incoming data and store the results. This prohibits any sectioning of the data and elimination of unwanted sections. This may be a severe practical limitation, since measurements that are accidentally spoiled by e.g. diffusing dust particles must be completely discarded.
An alternative approach is to calculate the FCS curve by software on the basis of the directly recorded asynchronous photon count data. The straightforward approach would be to generate a continuous file of fluorescence intensity values in consecutive time bins of fixed bin width, and to use this file for a conventional autocorrelation calculation. But then again, one encounters the same unreasonably large amount of data as when directly recording fluorescence intensity values in a synchronous data acquisition mode. Recently, Eid et al. [9] and Magatti and Ferri [10] described new algorithms for converting asynchronously acquired photon counting data into correlograms. Here, we expand on their ideas and present an algorithm especially designed for the evaluation of TTTR photon count data, emphasizing its generalization to more complex correlation analysis techniques such as combining the autocorrelation analysis with the analysis of the fast fluorescence decay on the nanosecond time-scale (TCSCP-FCS).

Asynchronous single-photon counting data
Asynchronously measured time-tagged single-photon counting data consist of a linear file of detection times {t 1 , t 2 , …, t N } of the detected photons, where N is the total number of detected photons during a given measurement. As special feature of these detection times is that they are integer multiples of some minimal time δt, determined by the temporal resolution of the detection electronics. Without restriction of generality, it will be assumed that all times are measured in units of δt, so that all the numbers t j take integer values. The value g(τ) of the autocorrelation function for a given lag time τ is defined as where I(t) is the photon detection intensity at a given time t, and the brackets denote averaging over all possible values of time t. For a photon detection measurement with temporal resolution δt, the intensity values I(t) within consecutive time intervals can only take the values 1/δt or 0, depending on whether there was a photon detection event during a time interval of width δt or not. The average in Eq. (1) is then calculated as the sum over all consecutive time intervals of width δt, divided by the total number of summed intervals.

Autocorrelation with logarithmic time-scale
In practice, one does not compute the autocorrelation function for all possible values of lag time τ, but at increasingly spaced lag time values. If the temporal resolution of the photon detection is, e.g., 100 ns, and one desires to follow correlation processes up to a minute, possible values of lag time τ are any value between 100 ns and 60 s in intervals of 100 ns, resulting in 6 × 10 8 possible lag time value. Calculation of g(τ) for all of these values would be an enormous time-consuming numerical effort. Instead, the autocorrelation is calculated for only few, approximately logarithmically spaced values of τ, which is sufficient for most FCS applications. This calculation at selected, approximately logarithmically spaced lag times was first used for hardware correlators, due to restrictions in practical complexity [11], and remains applicable for software correlators. However, the latter permit a much wider choice of lag time spacing and can easily be reconfigured for the given experiment. A convenient choice for the values of τ is with j taking integer values starting with one and running up to some maximum number j max = n casc B; where B is some integer base number; the bracket   gives the integer part of the enclosed expression. The resulting lag times are grouped into n casc cascades with equal spacing of The advantage of such a choice of lag times is that all τ j have integer values so that fast integer arithmetic can be used in subsequent computations. For example, when using a base number value of B = 10 and n casc = 3, one obtains the lag time sequence {τ j } = {1,2,…,9,10,12,…,28,30,34,…,70}.

Synchronous photon intensity data: bin-and-correlate algorithm
As mentioned above, a straightforward way of calculating the autocorrelation function is to divide the total measurement time, t N − t 1 , into intervals of unit length δt, and to sort the detected photons into these intervals corresponding to their arrival times t j . The result is a synchronous photon detection intensity file I j with j running from 1 through t N − t 1 , where the I j can only adopt the values one or zero. The fluorescence autocorrelation can then be calculated as given by Eq. (1). In practice, such an approach is prohibitively memory demanding and computationally expensive. As an example, consider an experiment with an average count rate of 10 5 counts per second and a typical time resolution of δt = 100 ns. A measurement lasting one minute would result in an average number of 6⋅10 6 detected photons. Converting the photon arrival data into a synchronous binary intensity file with 100 ns temporal resolution results in the huge number of 6⋅10 8 time intervals, or little less than 100 MByte of data, whereas only 1 % of these will contain a nonzero value. In contrast, the asynchronous file of photon arrival times takes only an amount of 12 MByte, if one assumes that every photon is tagged with a 2 Byte long label containing its arrival time and some overflow information.

Asynchronous photon intensity data time-tag-to-correlation algorithm
An optimal FCS algorithm will work directly on the arrival times {t 1 , t 2 , …, t N }, without converting them into time-binned data. In its simplest form, our algorithm is rather straightforward. For a given lag time τ, a second vector of arrival times { In the beginning, the value of the autocorrelation at lag time τ is set to zero. The algorithm starts with the time t 1 in the original vector and moves to consecutive time entries in that vector until it encounters a value t j that is equal to or larger than 1 t′ . If 1 t t j ′ = , the value of the autocorrelation at lag time τ is increased by one. Next, the algorithm switches to the entries of the second vector and, starting with 1 t′ , moves to consecutive time entries in that vector until it encounters a value 1 j t′ that is equal to or larger than 1 j t . If t that is equal to or larger than 1 j t′ , and so on until the last entry in one of both vectors is reached. In this simplest form, the autocorrelation algorithm calculates, up to some constant factor, the probability of detecting a photon at some time t + τ if there was a photon detection event at time t.

Correlation time coarsening
When applying the algorithm only at the increasingly spaced lag times as given by Eq. (2), it will completely miss, e.g., the strong autocorrelation of any periodic signal with a repetition time not included within the vector of used lag times. To avoid that, one usually applies an averaging procedure by coarsening the time resolution of the photon detection times t j when coming to the calculation of the autocorrelation function at increasingly larger lag time. This is equivalent to the multiple-tau and multiple-sampling time correlation method employed in hardware correlators [12]. Such a procedure is easily incorporated into the present algorithm. Besides working only with the original and shifted vectors of arrival times, { K }, all time entries t j and j t′ are associated with weight values w j and j w′ that are all set to one at the start of the algorithm. In case of an equality k j t t ′ = , the autocorrelation is increased by the weight product k j w w ′ and not by one. A time coarsening step is inserted each time when finishing the calculations for one cascade of B lag times τ j with equal spacing and before starting with the next cascade of B lag times with doubled spacing: All values { N t t t , , , 2 1 K } used in the previous cascade are divided by two and rounded to the nearest lower integer value, which will occasionally leads to the occurrence of consecutive entries with the same time value. Before continuing the autocorrelation computation, such double entries are reduced to one entry, and the corresponding weight of that remaining entry is increased by the weight of the eliminated one. Thus, with increasing lag time τ j , the time scale underlying the autocorrelation calculation becomes increasingly coarser, and the total number of time entries to be processed increasingly smaller. To correct for the varying time scale of the autocorrelation calculation, one has finally to divide, at each lag time τ j , the calculated autocorrelation value by the corresponding time scale factor   B j 2 . As pointed out in Ref. [12], this time coarsening leads to a triangularly weighted average of the true value of the autocorrelation function.

Generation of cross-correlation data
The above algorithm can easily be generalized to more complex situations. In most photon detection experiments, it is desirable to detect the fluorescence within two detection channels, either for obtaining more information about the fluorescence (detection of fluorescence polarization or different emission colours) [13], or to eliminate the adverse effects of detector afterpulsing [14] on the short-time behaviour of the FCS curve. In all these cases, crosscorrelating the signals of both detection channels instead of autocorrelating the signals of each channel becomes an important task. This is easily realized with the above algorithm by using the time values of the first channel as the entries for the first vector { N (TCSPC-FCS) [8]. For both applications, the algorithm presented here is easily adapted. In case of time-gating, the procedure is straightforward: all entries in the vector { N t t t , , , 2 1 K } are eliminated where the lifetime tag of the corresponding photons lie outside the set timegate. In case of applying a filter function to the recorded photons that acts on the lifetime tag, the procedure is similarly simple: the initial weight values w j are set to the filter output instead to one, subsequently applying the FCS algorithm without any alteration. Thus, the presented calculation method is most general and allows arbitrarily complex correlation analysis of the TTTR data.

Computation time
The computational efficiency of the presented algorithm can theoretically be compared with that of the traditional bin-and-correlate approach.

Bin-and-correlate algorithm
Assuming a total measurement time T, and neglecting the computational load of transforming the time-tagged data into binned data, one has to perform roughly T/τ k multiplications for calculating the autocorrelation at τ = τ k (measurement time over minimum possible bin width). Thus, the total number of multiplications is estimated as where k runs over all lag times used.

Time-tag-to-correlation algorithm
It is more complicated to estimate the computational load for the algorithm presented in this paper, because it depends on the number of detected photons and their spacing in time. Let us assume a homogeneous photon detection rate (equally spaced photons), which is the worst possible case, because then the time coarsening and data reduction step has the smallest data reduction efficiency. If the complete number of detected photons is denoted by N, the average rate of photon detection events is N/T. On a time scale of 2 k , i.e., after k time-coarsening steps leading to a temporal resolution of 2 k (see above), the length of the vector { N t t t , , , 2 1 K } will be approximately equal to the total number of time intervals at that temporal resolution times the probability to find at least one photon detection event during time interval of length 2 k . The number of time intervals of width 2 k equals the total measurement time T divided by 2 k . Assuming furthermore a Poissonian statistics of photon detection, the probability to find at least one photon within a time interval of length 2 k is given by 1 − exp(-2 k N/T). Finally, assuming that the number of necessary multiplications for calculating the autocorrelation at a given lag time value is roughly proportional to the length of the vector {

Comparison of bin-and-correlate and time-tag-to-correlation algorithms
As an example, consider a vector of delay times as given by Eq. (2) with B = 10 and n casc = 17, thus covering a time range of nearly six orders of magnitude (from τ 1 = 1 up to τ 170 = 1310710). Let the smallest lag time of τ 1 = 1 correspond to 100 ns in experimental time, a temporal resolution typical for TTTR photon counting measurements. Typical average photon count rates range between 1 kHz and 1 MHz. The result for the ratio of L bin to L tttr is shown in Fig. 1. The ratio L bin /L tttr does not depend on total measurement time T but only on the count rate N/T which can be seen when inserting Eq. (3) and Eq. (4) into L bin /L tttr and noticing that the final result contains only the ratio N/T but not T alone. As can be seen, the time-tag-to-correlation algorithm is superior to the bin-and-correlate approach even at high count rates. Moreover, one has to bear in mind that the made estimates are very conservative and biased in favour of the bin-and-correlate method: Any computational load of converting the tagged into binned data was neglected, and the rather unphysical assumption of evenly spaced photons was considered for the time-tag-to-correlation approach. In a real experiment, such as the detection of sparse numbers of molecules within a confocal detection volume, photon arrivals will be rather bunched into bursts every time a molecule diffuses through the detection region.

Experimental application
The experimental set-up is described in detail in Refs. [3,5]. Briefly, the light of a pulsed diode laser (PDL 800, PicoQuant) with 635 nm wavelength is focused, through a water immersion objective with 1.2 numerical aperture (60 × PlanApo IR, infinity corrected, Olympus), into the diluted bead solution (~ 10 -9 M). The fluorescence light is collected through the same objective and, after passing a dichroic mirror (DRLP 650, Omega Optical), imaged onto a confocal aperture (50 µm diameter). After splitting the transmitted light into two detection channels by a 50/50-beam splitter, it is refocused onto the active area of two single-photon avalanche diodes (SPAD, SPCM-AQR 13, Perkin Elmer). The photon count signals of the SPADs are processed by a fast TCSPC electronics (TimeHarp 200, PicoQuant) and converted into a TTTR data stream of photon arrival and TCSPC times. The macroscopic temporal resolution of the TTTR data is 100 ns. The FCS algorithm was applied to data obtained from buffered aqueous solutions of fluorescently labelled latex beads (Crimson Red FluoSpheres® by Molecular Probes of ~36 nm diameter). These beads deliver a bright fluorescence signal and a well-defined slowly decaying autocorrelation curve due to their slow diffusion. Figure 2 shows the result of different autocorrelation algorithms applied to 10 seconds of a measurement at 200 µW excitation power. For studying the computation time of the time-tag-to-correlate algorithm at different fluorescence intensities, measurements were carried out at six different excitation intensities of 0.7, 2, 7, 20, 70, and 200 µW and lasted 60 minutes each. Thus, a large range of average photon count rates is covered. For FCS analysis, a vector of lag times as given by Eq. (2) with B = 10 and n casc = 20 and τ 1 equal to 100 ns was used, and the signal of both detection channels was cross-correlated (which is usually necessary for preventing any distortion of the final FCS curve by SPAD-afterpulsing). The algorithm was written in C, and it was compiled and executed on a standard PC with an AMD Athlon 1800+ processor running at 1.5 GHz. The computation times for the six measurements are shown in Fig. 3. A linear curve was fitted against the determined computation-times with a standard least-squares fit. As can be seen, the computation time scales approximately linearly with the count rate (and thus number of detected photons), as expected from Eq. (4). Only at the lowest count rate (excitation power of 0.7 µW) the computation time falls below the fitted line in Fig. 3. This due to the fact that for very sparse data, the time coarsening steps will lead to a much faster data reduction than at larger fluorescence intensities. Thus, the algorithm is most efficient at low light intensities typical for single molecule spectroscopy.

Conclusion
We have presented a powerful and complex algorithm for converting asynchronous photoncounting data into correlation curves. The algorithm is perfectly suited to low-intensity measurements as usually encountered in single-molecule and most fluorescence-correlation experiments. The computational time of the algorithm scales linearly with the number of detected photons but is completely independent on the total measurement time. A major advantage of the presented algorithm is its flexibility, allowing for, besides standard autocorrelation, the computation of more complex correlation function such as crosscorrelation, time-gated correlation, or sophisticated filtered correlation like TCSPC-FCS.