Hardware simulator for optical correlation spectroscopy with Gaussian statistics and arbitrary correlation functions

We present a new hardware simulator (HS) for characterization, testing and benchmarking of digital correlators used in various optical correlation spectroscopy experiments where the photon statistics is Gaussian and the corresponding time correlation function can have any arbitrary shape. Starting from the HS developed in [Rev. Sci. Instrum. 74, 4273 (2003)], and using the same I/O board (PCI-6534, National Instruments) mounted on a modern PC (Intel Core i7 CPU, 3.07 GHz, 12 GB RAM), we have realized an instrument capable of delivering continuous streams of TTL pulses over two channels, with a time resolution of ∆t = 50 ns, up to a maximum count rate of ⟨I⟩ ∼ 5 MHz. Pulse streams typical of dynamic light scattering and diffuse correlation spectroscopy experiments were generated and measured with a commercial hardware correlator, obtaining measured correlation functions that accurately match the expected ones. © 2014 Optical Society of America

OCIS codes: (070.4550) Correlators; (290.0290) Scattering; (300.0300) Spectroscopy.

References and links
1. B. J. Berne and R. Pecora, Dynamic Light Scattering (Wiley, New York, 1976).
2. D. J. Pine, D. A. Weitz, P. M. Chaikin, and E. Herbolzheimer, "Diffusing wave spectroscopy," Phys. Rev. Lett. 60, 1134-1137 (1988).
3. D. A. Boas, L. E. Campbell, and A. G. Yodh, "Scattering and imaging with diffusing temporal field correlations," Phys. Rev. Lett. 75, 1855-1858 (1995).
4. T. Durduran, R. Choe, W. B. Baker, and A. G. Yodh, "Diffuse optics for tissue monitoring and tomography," Rep. Prog. Phys. 73, 076701-076744 (2010).
5. K. Schatzel, "Single-photon correlation techniques," in Dynamic Light Scattering, R. G. W. Brown, ed. (Clarendon, Oxford, 1993), Chap. 2.
6. D. Magatti and F. Ferri, "Fast multi-tau real-time software correlator for dynamic light scattering," Appl. Opt. 40, 4011-4021 (2001).
7. D. Magatti and F. Ferri, "25 ns software correlator for photon and fluorescence correlation spectroscopy," Rev. Sci. Instrum. 74, 1135-1144 (2003).
8. M. Wahl, I. Gregor, M. Patting, and J. Enderlein, "Fast calculation of fluorescence correlation data with asynchronous time-correlated single-photon counting," Opt. Express 11, 3583-3591 (2003).
9. L.-L. Yang, H.-Y. Lee, M.-K. Wang, X.-Y. Lin, K.-H. Hsu, Y.-R. Chang, W. Fann, and J. D. White, "Real-time data acquisition incorporating high-speed software correlator for single-molecule spectroscopy," J. Microsc. 234, 302-310 (2009).
10. J. Buchholz, J. W. Krieger, G. Mocsar, B. Kreith, E. Charbon, G. Vamosi, U. Kebschull, and J. Langowski, "FPGA implementation of a 32x32 autocorrelator array for analysis of fast image series," Opt. Express 20, 17767-17781 (2012).
11. E. Schaub, "F2Cor: fast 2-stage correlation algorithm for FCS and DLS," Opt. Express 20, 2184-2195 (2012).
12. E. Schaub, "High countrate real-time FCS using F2Cor," Opt. Express 21, 2184-2195 (2012).
13. F. Ferri and D. Magatti, "Hardware simulator for photon correlation spectroscopy," Rev. Sci. Instrum. 74, 4273-4279 (2003).
14. J. W. Goodman, Statistical Optics (Wiley, New York, 2000).
15. J. A. Williamson, "Statistical evaluation of dead time effects and pulse pileup in fast photon counting. Introduction of the sequential model," Anal. Chem. 60, 2198-2203 (1988).
16. C. Zhou, G. Yu, F. Daisuke, J. H. Greenberg, A. G. Yodh, and T. Durduran, "Diffuse optical correlation tomography of cerebral blood flow during cortical spreading depression in rat brain," Opt. Express 14, 1125-1144 (2006).
17. K. Schatzel, "Noise on photon correlation data: I. Autocorrelation functions," Quantum Opt. 2, 287-305 (1990).
18. R. Peters, "Noise on photon correlation and its effects on data reduction algorithms," in Dynamic Light Scattering, W. Brown, ed. (Clarendon, Oxford, 1993), Chap. 3.
19. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C (Cambridge University Press, Cambridge, 1992).

#217149 $15.00 USD Received 17 Jul 2014; revised 6 Oct 2014; accepted 7 Oct 2014; published 4 Nov 2014 (C) 2014 OSA 17 November 2014 | Vol. 22, No. 23 | DOI:10.1364/OE.22.028002 | OPTICS EXPRESS 28002


Introduction
Optical correlation spectroscopy (OCS) is a powerful tool for studying many physical, biological and chemical systems that exhibit relaxation dynamics over a wide range of correlation times, from nanoseconds to milliseconds or greater. OCS works by measuring the time auto- or cross-correlation function of the intensity signal scattered by the sample under investigation. When the detected photons are singly scattered, the method is known under the classical names of Photon Correlation Spectroscopy (PCS) or Dynamic Light Scattering (DLS) [1], and has been applied for decades in the fields of particle sizing, laser Doppler velocimetry and, in general, the characterization of quasi-transparent complex fluids and soft-matter systems. Conversely, in the opposite limit, when the sample is very turbid, the photons are scattered so strongly that they undergo diffusive propagation, and the method, depending on the application, takes the original name of Diffusing Wave Spectroscopy (DWS) [2] or, as recently proposed in biomedical optics, the name of Diffuse Correlation Spectroscopy (DCS) [3,4]. In all these cases the statistics of the scattered photons is expected to be Gaussian and, consequently, the electric field correlation function can be recovered, via the well-known Siegert relation [5], from the intensity correlation function, which is the experimentally measured quantity.
In a typical OCS experiment, the intensity correlation function is computed by using a digital correlator, a device capable of computing, in real time, the correlation function of the pulse stream output by the photodetector. Digital correlators can be either hardware correlators, which are commercially available [see, for example, the ones manufactured by ALV (Langen, Germany), Brookhaven Instruments (Holtsville, NY), or Correlator.com (Bridgewater, NJ)], or homemade software correlators that have been proposed in the literature over the last decade [6-12]. In both cases, the correlators need to be accurately tested and benchmarked to compare their performance.
For this reason, about a decade ago, some of us developed [13] a new tool known as a hardware simulator (HS). The HS is an instrument capable of delivering a continuous stream of transistor-transistor logic (TTL) pulses with the desired statistical properties over one or two channels. Its maximum time resolution was set by the I/O board used (PCI-6534, National Instruments) and corresponded to a minimum temporal distance between pulses of ∆t = 50 ns. Conversely, the maximum count rates deliverable on one or two channels were determined by the performance of the personal computer (PC) used and, in particular, by the speed at which the data could be read from the hard disk. In [13], where a Pentium-4 PC and an ATA-66, 7200 rpm hard disk were used, the maximum deliverable count rates were ⟨I⟩ ∼ 350 kHz and ⟨I⟩ ∼ 80 kHz on one or two channels, respectively.
The HS developed in [13] was specifically designed for testing digital correlators used in an ideal OCS experiment where the electric field correlation function G_1(τ) = ⟨E(t) E*(t + τ)⟩ is a single exponential decay, i.e. G_1(τ) = exp[−τ/τ_1], and the detection is carried out well inside a single coherence area, so that the intensity correlation function G_2(τ) = ⟨I(t) I(t + τ)⟩, according to the Siegert relation [5], is given by G_2(τ) = ⟨I⟩² [1 + |g_1(τ)|²], with g_1(τ) = G_1(τ)/G_1(0). In this work we have removed the above restrictions, upgrading the HS with the possibility of generating pulse streams with an arbitrary correlation function and detecting the intensity over an (integer) number of coherence areas, N_ca ≥ 1. While implementing these new features, we have also boosted the overall performance of the HS and achieved, with a time resolution of ∆t = 50 ns, a maximum deliverable count rate higher than ⟨I⟩ ∼ 5 MHz on both channels, which is similar to the maximum count rates of common photon-counting detectors.

Theoretical background
In this section we will first describe how to synthesize a correlated stochastic signal z(t) that is characterized by a Gaussian distribution and whose autocorrelation function G_z(τ) can have any arbitrary shape. Secondly, we will show how to extract from z(t) a stream of pulses that simulates the output of an ideal detector and is characterized by a given average intensity ⟨I⟩ and an autocorrelation function equal to G_z(τ).
As is well known [14], when a stochastic variable y(t), characterized by a Gaussian probability distribution with average value ⟨y⟩, standard deviation σ_y, and correlation function G_y(τ) = ⟨y(t) y(t + τ)⟩, is fed through a time-invariant linear filter with a normalized impulse response function h (∫h(t)dt = 1), the resulting stochastic variable

z(t) = h(t) ⊗ y(t)  (1)

(the symbol ⊗ representing convolution) is still characterized by a Gaussian distribution, the same average value ⟨z⟩ = ⟨y⟩, and a reduced (see below) standard deviation σ_z. Its correlation function G_z(τ) = ⟨z(t) z(t + τ)⟩ can be written in terms of G_y(τ) as [14]

G_z(τ) = G_h(τ) ⊗ G_y(τ),  (2)

where

G_h(τ) = ∫ h(t) h(t + τ) dt.  (3)

If the data y(t) are zero-averaged (⟨y⟩ = 0) and δ-correlated, G_y(τ) = k σ_y² δ(τ) (where k is a constant with the dimensions of time), it is straightforward to show that

G_z(τ) = k σ_y² G_h(τ),  (4)

demonstrating that G_z(τ) and G_h(τ) have the same shape or, equivalently, that their normalized versions are the same, i.e. g_z(τ) = g_h(τ), where g_x(τ) = G_x(τ)/G_x(0). Therefore, if we want to generate synthetic data z(t) characterized by a given g_z(τ), we have to invert the equation

g_z(τ) = ∫ h′(t) h′(t + τ) dt,  (5)

solve it for h′(t), find the normalized filter h(t) = h′(t)/∫h′(t′)dt′, and finally apply Eq. (1) to the δ-correlated y(t) data. Equation (5) can be solved in the Fourier domain,

ĝ_z(ω) = |ĥ′(ω)|²,  (6)

where the hat operator denotes Fourier transform and we have used the Fourier-transform correlation theorem. Notice that in Eq. (6) ĝ_z(ω) is real and symmetric, being the Fourier transform of the real and symmetric function g_z(τ). Taking the square root of Eq. (6) gives

ĥ′(ω) = √(ĝ_z(ω)) e^{iϕ(ω)},  (7)

where ϕ(ω) is an arbitrary phase function. Among all the solutions satisfying Eq. (7), we can choose the one with ϕ(ω) = 0, implying that ĥ′(ω) is real and symmetric (because ĝ_z(ω) is real and symmetric). By inverse Fourier transforming Eq. (7) (with ϕ(ω) = 0) we get the solution

h′(t) = F⁻¹[√(ĝ_z(ω))],  (8)

which is a real and symmetric function, i.e. h′(−t) = h′(t).
In practice, we invert Eq. (5) by discretizing it on a time grid with a clock time ∆t much smaller than the width of g_z (∆t ≪ δ_{g_z}) and windowing the function g_z within a finite support [−τ_max, τ_max], so that it can be sampled over a finite number of points. The value of τ_max can be found according to the accuracy required for g_z, i.e. by zeroing all the τ-values for which |g_z(|τ| > τ_max)| ≤ ε, where ε is the required accuracy and we have used the property that g_z(τ → ±∞) → 0. Thus, the approximate version of Eq. (5), evaluated at the discrete lag-times τ_k = k∆t with k = 0, ±1, ±2, ..., ±k_max, reads

(g_z)_k = ∆t Σ_j h′_j h′_{j+k}.  (9)

Equation (9) can be solved numerically in the Fourier domain by computing, as outlined above, direct and inverse discrete Fourier transforms. In this way we obtain the function h′_j that is discretized on the same times (t_j = j∆t) as g_z, is made of the same number of points N_p = 2 j_max + 1 (j_max = k_max), and extends over the same finite interval [−t_max, t_max] = [−τ_max, τ_max] (see Fig. 6 in Appendix B).
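The discrete inversion of Eq. (9) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' LabVIEW implementation; the stretched-exponential target, the grid size, and the function name are arbitrary choices:

```python
import numpy as np

def impulse_response(g_z):
    """Solve the discretized Eq. (9) in the Fourier domain:
    F[g_z] = |F[h']|^2, taking the zero-phase solution phi(omega) = 0.
    g_z must be sampled symmetrically on tau_k = k*dt (odd length,
    peak at the central index)."""
    spec = np.fft.fft(np.fft.ifftshift(g_z))   # real and symmetric spectrum
    spec = np.clip(spec.real, 0.0, None)       # guard against tiny negative ripple
    return np.fft.fftshift(np.fft.ifft(np.sqrt(spec)).real)

# Target: stretched exponential g_z(tau) = exp[-(|tau|/tau_c)^alpha]
dt, tau_c, alpha = 50e-9, 1e-5, 0.5
k = np.arange(-2048, 2049)
g_z = np.exp(-(np.abs(k * dt) / tau_c) ** alpha)

h_p = impulse_response(g_z)                    # h'_j, real and symmetric

# Consistency check: the autocorrelation of h' must reproduce g_z
g_back = np.correlate(h_p, h_p, mode="same")
g_back /= g_back.max()
```

Normalizing the result as h_j = h′_j/(∆t Σ_j h′_j) then gives the filter to be used in the convolution step.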
The next step is to synthesize a continuous stochastic function y(t) by using the random generator available on the PC and generating a stream of random, uncorrelated numbers y_i with a Gaussian distribution, average value ⟨y⟩ = 0, and standard deviation σ_y. The function y(t) is assumed to be constant between launches, i.e. y(t) = y_i for any (i − 1/2)∆t ≤ t < (i + 1/2)∆t with i = 0, 1, 2, ..., ∞. A useful equivalent representation of such a function is

y(t) = Σ_i y_i R[(t − t_i)/∆t],  (10)

where R(x) is the rectangle function (R(x) = 1 for −0.5 ≤ x < 0.5 and 0 elsewhere) and t_i = i∆t. Equation (10) represents the discretized version of a δ-correlated function that is expected to remain correlated only for lag times |τ| < ∆t. Indeed, its correlation function reads (see Appendix A)

G_y(τ) = σ_y² Λ(τ/∆t),  (11)

where the triangular function Λ(x) is the correlation product of R(x) and is defined as Λ(x) = 1 − |x| for |x| ≤ 1 and 0 elsewhere. As expected, Eq. (11) shows that G_y(τ) = 0 as soon as |τ| ≥ ∆t. By inserting Eq. (11) into Eq. (2) and using the property that the width of G_y is much smaller than the width of G_h (∆t ≪ δ_G), we obtain the correlation of the function z(t) as

G_z(τ) = σ_y² ∆t G_h(τ),  (12)

which is the discretized version of Eq. (4) when the time resolution of the system is ∆t.
At this point the stream of correlated random numbers z_i can be obtained from the discretization of Eq. (1) that, after normalization of h′ so that h_j = h′_j/(∆t Σ_j h′_j), reads

z_i = ∆t Σ_j h_j y_{i−j},  (13)

where the convolution extends over N_p = 2 j_max + 1 points. Equation (13) is the critical part of the procedure described so far for the synthesis of the z_i numbers, because each new point z_i requires the computation and the sum of N_p products, a task that could become overwhelming when N_p is very large. We will describe how to cope with this problem at the end of the next section and report a practical example in Appendix B.
An important parameter, which will be used below for setting the amplitude of the signal to be synthesized, is the variance σ_z². This is given by the zero-value of Eq. (12), i.e.

σ_z² = G_z(0) = σ_y² ∆t G_h(0),  (14)

where we have used ⟨y⟩ = ⟨z⟩ = 0. Equation (14) shows that σ_z² and σ_y² are related to each other by the factor G_h(0) ∆t, which can be easily computed. Indeed, since G_h(0) = ∆t Σ_j h_j² and ∆t Σ_j h_j = 1, we can write

σ_z² = σ_y² ∆t² Σ_j h_j²,  (15)

or, in terms of the first two moments of h, as

σ_z² = σ_y² (∆t/δ_h) (⟨h²⟩/⟨h⟩²) ∼ σ_y² (∆t/δ_h),  (16)

where δ_h = N_p ∆t is the temporal width of the filter, ⟨h⟩ = (1/N_p) Σ_j h_j, ⟨h²⟩ = (1/N_p) Σ_j h_j², and in the right-hand-side term we have used the property ⟨h²⟩/⟨h⟩² ∼ 1 (which is a direct consequence of the normalization of h to 1). Equation (16) shows that the convolution filter damps the squared fluctuations of the variable y(t) by a factor proportional to ∆t/δ_h ≪ 1.
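The convolution step and its variance damping are easy to verify numerically. The sketch below assumes NumPy; the Gaussian-shaped filter is a hypothetical stand-in for the h_j obtained from the inversion of Eq. (9):

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 50e-9

# Hypothetical normalized filter on the linear grid: dt * sum(h_j) = 1
j = np.arange(-256, 257)
h = np.exp(-0.5 * (j / 64.0) ** 2)
h /= dt * h.sum()

# Eq. (13): z_i = dt * sum_j h_j y_{i-j}, with uncorrelated Gaussian y_i
sigma_y = 1.0
y = rng.normal(0.0, sigma_y, 200_000)
z = dt * np.convolve(y, h, mode="valid")

# Eq. (15): sigma_z^2 = sigma_y^2 * dt^2 * sum(h_j^2), damped by ~dt/delta_h
expected = sigma_y**2 * dt**2 * np.sum(h**2)
print(z.var(), expected)
```

The measured variance of z agrees with Eq. (15) to within the statistical accuracy of the finite stream.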
We can now use the above framework for generating synthetic intensity data similar to the ones collected in an OCS experiment carried out with polarized light. As mentioned above, in a system governed by Gaussian statistics, the intensity, G_2(τ), and field, G_1(τ), correlation functions are related to each other by the Siegert relation [5]

G_2(τ) = ⟨I⟩² [1 + β |g_1(τ)|²],  (17)

where β ≤ 1 is a geometrical factor (called coherence factor), which scales as the reciprocal of the number of coherence areas falling onto the detector, β ∼ 1/N_ca (for N_ca ≫ 1). Geometrically, for each coherence area, the electric field scattered by the sample can be represented as a vector performing a "random walk" in two dimensions, in which the two components, E_x and E_y, are independent stochastic variables representing the real and imaginary parts of the polarized electric field [14]. E_x and E_y are characterized by the same Gaussian probability distribution, with ⟨E_x⟩ = ⟨E_y⟩ = 0 and σ²_{E_x} = σ²_{E_y} = ⟨I⟩/2. Thus the intensity data stream collected within a single coherence area is

I_i = (E_x)_i² + (E_y)_i²,

where (E_x)_i and (E_y)_i are generated by using Eq. (13), in which the y_i are random numbers drawn from a Gaussian distribution with ⟨y⟩ = 0 and σ_y chosen according to Eq. (15) so that σ_z² = ⟨I⟩/2.
When N_ca > 1, the procedure has to be repeated N_ca times, so that the k-th intensity data stream is I_i^k = (E_x^k)_i² + (E_y^k)_i². In this way the single coherence-area intensities I_i^k sum up incoherently,

I_i = Σ_{k=1}^{N_ca} I_i^k,  (18)

and the correlation function of the overall intensity is

G_2(τ) = ⟨I⟩² [1 + (1/N_ca) |g_1(τ)|²].

Equation (18) represents the generalization of Eq. (11) of [13] to an arbitrary correlation function with an arbitrary coherence factor β = 1/N_ca, N_ca being an integer number.
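The incoherent-sum construction above can be checked numerically against the Siegert prediction g_2(0) = 1 + 1/N_ca. In the sketch below (NumPy assumed), `correlated_gaussian` is a hypothetical AR(1) stand-in for the filtered Gaussian streams, used only to keep the example short:

```python
import numpy as np

rng = np.random.default_rng(1)

def correlated_gaussian(n, tau_samples, rng):
    """AR(1) process: zero-mean, unit-variance Gaussian numbers with a
    roughly exponential correlation over ~tau_samples clock intervals
    (a stand-in for the filtered stream of the convolution step)."""
    a = np.exp(-1.0 / tau_samples)
    x = np.empty(n)
    x[0] = rng.normal()
    w = rng.normal(size=n) * np.sqrt(1.0 - a * a)
    for i in range(1, n):
        x[i] = a * x[i - 1] + w[i]
    return x

def intensity(n, n_ca, tau_samples, rng):
    """Incoherent sum of n_ca speckle intensities, each built from two
    independent Gaussian field components: I^k = E_x^2 + E_y^2."""
    I = np.zeros(n)
    for _ in range(n_ca):
        I += correlated_gaussian(n, tau_samples, rng) ** 2
        I += correlated_gaussian(n, tau_samples, rng) ** 2
    return I

def g2_zero(I):
    """g2(0) = <I^2> / <I>^2."""
    return np.mean(I**2) / np.mean(I) ** 2

I1 = intensity(400_000, 1, 20.0, rng)
I4 = intensity(400_000, 4, 20.0, rng)
print(g2_zero(I1), g2_zero(I4))   # expected ~2.0 and ~1.25
```

For N_ca = 1 the intensity distribution is exponential and g_2(0) → 2, while for N_ca = 4 the coherence factor drops to 1/4, as in the measurements of Fig. 4.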
If we express ⟨I⟩ as the average number of photons scattered per unit time, the number R_i of photons falling within the i-th interval ∆t is simply R_i = I_i ∆t. To obtain from R_i an integer number N_i, which represents the number of photocounts in the i-th interval ∆t_i, it is necessary to pass R_i through a Poisson filter (see Fig. 1 and [13]). The Poisson filter simulates what happens in an ideal photodetector, where the output is an integer random number N_i (the number of photoelectrons) sampled according to a Poisson distribution characterized by an average value equal to R_i. When R_i ≪ 1, the integers N_i are mostly either zeros or ones. This condition is easily fulfilled because the probability P(N_i ≥ 2) is equal (for an uncorrelated signal) to ⟨R⟩²/2, where ⟨R⟩ is the probability of having one photon per time interval, given by ⟨R⟩ = ⟨I⟩ ∆t. Thus, for example, even at a count rate as high as ⟨I⟩ = 1 MHz, with ∆t = 50 ns, we have P(N_i ≥ 2) = (0.05)²/2 ≈ 1.3 × 10⁻³. Once the photocounts N_i have been generated, the sequence of zeros and ones must be transformed into integrated arrival times and written onto the PC hard disk (see [13]). The space required on the hard disk is large but, definitely, much smaller than the standard capacities of modern hard disks, ∼ 10²-10³ GByte. For example, at a pulse count rate ⟨I⟩ = 1 MHz and for a measuring time of T = 100 s, the overall number of photocounts would be ∼ ⟨I⟩ T = 10⁸, and the corresponding file would be ∼ 1 GByte in size.
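A minimal sketch of the Poisson filter (NumPy assumed; a constant intensity is used for brevity, whereas in the simulator R_i comes from the synthesized correlated stream):

```python
import numpy as np

rng = np.random.default_rng(2)
dt = 50e-9            # clock time
rate = 1e6            # <I> = 1 MHz
n = 1_000_000

R = np.full(n, rate * dt)     # <R> = <I>*dt = 0.05 photons per clock
N = rng.poisson(R)            # ideal-detector photocounts N_i

# With <R> << 1 the N_i are mostly 0 or 1: P(N >= 2) ~ <R>^2/2 = 1.25e-3
print(np.mean(N >= 2))
```

The measured fraction of intervals with two or more counts reproduces the ⟨R⟩²/2 estimate quoted in the text.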

Simulator architecture
The simulator architecture is identical to the one reported in [13]. It uses the same I/O board (NI PCI-6534, National Instruments) but a much higher-performance PC (Intel Core i7 CPU, 3.07 GHz, 12 GB RAM) running under Windows 7 Enterprise N, 64 bit. The homemade software, originally developed under LabVIEW 6.1, was upgraded to LabVIEW 2010, optimized, and developed according to the scheme reported in Fig. 2 of [13].
The two main advantages of this work with respect to the original one are: (a) the reading task, which was the major limiting factor for the performance of the simulator, is now much faster. By using a standard hard disk (SATA, 3.0 Gb/s, 7200 rpm) we attain, for the reading task, a value of δt_1 ∼ 0.02 µs/pulse, which is a factor ∼ 100 smaller than the original value; (b) the "data buffer", which is limited by the RAM available on the host PC, is now much larger and set to a size of ∼ 8.8 × 10⁸. The combination of these two features boosts the performance of the hardware simulator to levels that, as a matter of fact, are limited only by the time resolution of the PCI-6534 board.
In practice, we are now able to generate, at the maximum time resolution of ∆t = 50 ns, streams of pulses up to count rates of ∼ 5 MHz on both channels. Note that a further increase of the count rate would be troublesome because pile-up effects [15] would become dominant and affect both the count rate and the statistics of the pulse stream delivered by the board. An example of this effect is shown below.
As a final remark, we should mention the main drawback of the new HS. While in the original work the generation of the correlated data z_i was done iteratively and, for each new z_i, only one single uncorrelated point y_i was needed (see Eq. (7) of [13]), in this work we have to use Eq. (13), which is a convolution equation. Thus, the generation of each new z_i requires the sum of a large number of terms, and the use of Eq. (13) can easily become impractical because of extremely long computational times. We have overcome this problem by resampling both h_j and y_i on a pseudo-geometrical progression time grid (instead of the traditional linear grid), so as to span the same lag-time interval [−τ_max, τ_max] with a much smaller number of points. In Appendix B we describe how to practically implement such a non-linear time grid and provide estimates for the corresponding computational times.

Fig. 1. (Adapted from Fig. 1 of [13]). Schematic diagram of the algorithm adopted for generating synthetic correlation data. E_x(t) and E_y(t) are the two components of the scattered electric field, I(t) is the intensity, N_i the sequence of counts (0 or 1) after the Poisson filter, and T(t) the integrated photon arrival times that are stored on the hard disk (HD). Optionally, detector defects can also be introduced.

Experimental results
The simulator was tested by processing its pulse stream with a commercial hardware correlator (model Flex2k-12x2 from Correlator.com) that adopts a multi-tau scheme [5] with a minimum delay time of 12.5 ns. Unless otherwise specified, all the tests were carried out by generating the pulse stream at the highest time resolution of ∆t = 50 ns, with a count rate of ⟨I⟩ = 10⁵ Hz, for a measuring time of T = 100 s.
The first test was aimed at showing the ability of the simulator to generate correlation functions of different shapes, such as those characterized by exponential decays with different stretched or compressed exponents α:

g_1(τ) = exp[−(τ/τ_c)^α].  (20)

In Fig. 2(a) we compare the fully coherent (N_ca = 1), normalized intensity autocorrelation functions g_2(τ) = 1 + (G_1(τ)/G_1(0))² for three values of α = 0.5, 1.0, 1.5 and the same τ_c = 10⁻⁵ s (lines) with the corresponding correlation functions measured with the Flex correlator (symbols). The data corresponding to the first 32 channels of the Flex correlator (16 channels @ 12.5 ns, 16 channels @ 25.0 ns) were binned and averaged so as to have lag times equal to multiples of the clock time ∆t = 50 ns (see details in [13]). As one can appreciate, all the curves are accurately matched by the measured correlation functions, with only slight or non-systematic residuals [panel (b)], whose overall deviations (except for the first two points, see below) were less than 1% rms. The small systematic deviations occurring in the range ∼ 10⁻⁷-10⁻⁵ s are due to the finite accuracy used in Eqs. (9) and (10) (ε = 5 × 10⁻³). The first two points, which exhibit much higher deviations, are due to a malfunctioning of the Flex correlator. The same kind of deviations are present in all the tests we have done, but do not show up when using other correlators such as the software correlator of [7] (data not shown).
The second test was based on a typical correlation function measured in a diffuse correlation spectroscopy [4] experiment for the monitoring of blood flow in living tissues. A general expression for the electric field autocorrelation function in a reflection geometry, with a point-source illumination at the tissue plane, semi-infinite slab boundary conditions, and the detector placed at a transversal distance ρ from the source, is given by Eq. (40) of [4] as

G_1(ρ, τ) = (v/4πD) [exp(−K(τ) r_1)/r_1 − exp(−K(τ) r_b)/r_b],  (21)

where

K²(τ) = (v/D) [µ_a + (1/3) α µ′_s k_0² ⟨∆r²(τ)⟩],

with r_1 = (ρ² + ℓ_tr²)^{1/2} and r_b = [ρ² + (ℓ_tr + 2z_b)²]^{1/2}. In Eq. (21) v is the speed of light in tissue, D is the photon diffusion coefficient in tissue, µ_a is the tissue absorption coefficient, µ′_s is the tissue reduced scattering coefficient, α is the fraction of dynamic photon scattering events, k_0 = 2π/λ is the wavenumber of the continuous-wave laser light at (medium) wavelength λ, ⟨∆r²(τ)⟩ is the mean-square displacement in time τ of the scattering particles, ℓ_tr is the photon transport mean free path (ℓ_tr = 1/µ′_s), and z_b is the boundary plane where the photon fluence is extrapolated to zero (see Fig. 5 and Table 1 of [4]).
When the particles are subjected to Brownian motion, their mean-square displacement is ⟨∆r²(τ)⟩ = 6D_b τ (D_b being their diffusion coefficient), whereas when they undergo a random flow it is ⟨∆r²(τ)⟩ = ⟨V²⟩ τ² (⟨V²⟩ being the mean-square particle velocity). Note that the dependence of Eq. (21) on the ratio v/D is only apparent, because D = v/[3(µ_a + µ′_s)].
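Equation (21) is straightforward to evaluate numerically. The sketch below (NumPy assumed) uses Brownian dynamics and illustrative, tissue-like parameter values (lengths in cm); the extrapolated-boundary choice z_b = 2ℓ_tr/3 is an assumption corresponding to an index-matched interface:

```python
import numpy as np

def g1_dcs(tau, rho, mu_a, mu_s, alpha, Db, lam):
    """Normalized field autocorrelation g1(tau) from Eq. (21) for a
    semi-infinite medium in reflection, with Brownian dynamics
    <dr^2(tau)> = 6*Db*tau.  Lengths in cm, tau in s.  The common
    prefactor v/(4*pi*D) cancels in the normalization."""
    k0 = 2.0 * np.pi / lam                  # wavenumber in the medium
    ltr = 1.0 / mu_s                        # transport mean free path
    zb = 2.0 * ltr / 3.0                    # extrapolated boundary (assumption)
    msd = 6.0 * Db * tau                    # Brownian mean-square displacement
    # K^2 = (v/D) * [mu_a + alpha*mu_s*k0^2*msd/3], with v/D = 3*(mu_a + mu_s)
    K = np.sqrt(3.0 * (mu_a + mu_s) * (mu_a + alpha * mu_s * k0**2 * msd / 3.0))
    r1 = np.sqrt(rho**2 + ltr**2)
    rb = np.sqrt(rho**2 + (ltr + 2.0 * zb)**2)
    G1 = np.exp(-K * r1) / r1 - np.exp(-K * rb) / rb
    return G1 / G1[0]                       # normalize at the first lag

tau = np.concatenate(([0.0], np.logspace(-7, -2, 200)))
g1 = g1_dcs(tau, rho=2.0, mu_a=0.1, mu_s=10.0, alpha=1.0,
            Db=1e-8, lam=785e-7)
```

The resulting curve decays monotonically from 1, with the characteristic non-exponential DCS shape used for the second test.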
In the same figure, reported as open symbols, are shown the g_2(τ) data obtained with the Flex correlator using the same working conditions as in Fig. 2. As in the previous case, the two curves are accurately matched by the measured correlation functions, with slight or non-systematic residuals, whose overall deviations (except for the first two points) were less than 1% rms.
The third test shows that the simulator is capable of delivering pulse streams characterized by two single-exponential decay times, with the detection carried out by integrating the scattered intensity over a few coherence areas. Thus,

G_1(τ) = A_1 exp(−τ/τ_1) + A_2 exp(−τ/τ_2),  (22)

and g_2(τ) = 1 + (1/N_ca)[G_1(τ)/G_1(0)]², with N_ca being the number of coherence areas. When τ_1 and τ_2 are sufficiently different and A_1 ∼ A_2, the normalized intensity correlation function g_2 associated to Eq. (22) deviates significantly from a single exponential decay and assumes a sigmoidal shape, as shown by the five solid lines of Fig. 4(a). All these curves have been obtained with τ_1 = 10⁻⁶ s, τ_2 = 10⁻⁴ s, A_1/(A_1 + A_2) = 0.30, and N_ca ranging from 1 to 5 (top to bottom). The correlation functions measured with the Flex correlator (open symbols) match quite accurately the expected curves, as evidenced by the residual plot reported in Fig. 4(b).

Finally, we discuss the main limitation of the hardware simulator, which is related to the maximum output count rate the simulator can deliver. As mentioned in Sect. 3, the time resolution of the PCI-6534 board is ∆t = 50 ns, meaning that, if two or more pulses are closer than 50 ns, pile-up effects [15] take place and only a single pulse is output on the port. Thus, if ⟨I⟩ is the true pulse count rate, the effective count rate delivered by the board is smaller and, for ∆t⟨I⟩ ≪ 1, can be approximated by

⟨I⟩_eff ≃ ⟨I⟩ (1 − ∆t ⟨I⟩).

As a consequence, the ratio ⟨I⟩_eff/⟨I⟩ becomes increasingly smaller as ⟨I⟩ gets higher. For example, we have ⟨I⟩_eff/⟨I⟩ ∼ 0.98 for ⟨I⟩ = 5 × 10⁵ Hz and ⟨I⟩_eff/⟨I⟩ ∼ 0.75 for ⟨I⟩ = 5 × 10⁶ Hz. Pile-up effects also cause a reduction of the rms fluctuations of the signal, because high fluctuations are dampened and thresholded to one count per clock. Fig. 5(a) shows an example of how the intensity correlation function changes as the count rate is increased: the pulse streams, delivered at a fixed ∆t = 50 ns clock, were characterized by the same single exponential decay [Eq. (20)] with α = 1 and τ_c = 10⁻⁴ s, and with count rates varying from 100 kHz to 10 MHz. The open symbols represent the autocorrelation data taken with the Flex correlator for a measuring time of T = 100 s (blue and green symbols) and T = 10 s (red symbols). Surprisingly, whereas the correlation amplitude decreases remarkably with increasing count rate, the effects on the shape of g_2 are much less pronounced. Indeed, when fitting the data to single-exponential decay functions, the fits appear rather accurate [residuals slightly systematic only for the 10 MHz data (red symbols), see panel (b)], and the recovered correlation times appear quite close to the expected ones, i.e. τ_c = (1.00 × 10⁻⁴ ± 1.38 × 10⁻⁷) s, (0.99 × 10⁻⁴ ± 0.62 × 10⁻⁷) s, and (0.93 × 10⁻⁴ ± 2.65 × 10⁻⁷) s for the 100 kHz, 2 MHz, and 10 MHz count rates, respectively. Thus, we conclude that pile-up only moderately affects the relaxation dynamics of the system.
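The pile-up reduction of the delivered count rate is easy to reproduce with a toy model (NumPy assumed): draw a speckle (exponentially distributed) intensity for each clock interval, Poisson-count the photons, and let the board emit at most one pulse per clock. Only the single-time intensity distribution matters for the mean rate, so ignoring temporal correlations is harmless here:

```python
import numpy as np

rng = np.random.default_rng(3)
dt = 50e-9          # board clock
n = 2_000_000       # number of simulated clock intervals

ratios = {}
for rate in (5e5, 5e6):
    lam = rate * dt
    # Fully coherent speckle: the instantaneous intensity is
    # exponentially distributed with mean <I> (Gaussian statistics)
    I = rng.exponential(rate, n)
    counts = rng.poisson(I * dt)        # photons emitted per clock interval
    out = np.minimum(counts, 1)         # the board delivers at most 1 pulse/clock
    ratios[rate] = out.mean() / lam     # <I>_eff / <I>
    # to first order in dt*<I> this reproduces <I>_eff ~ <I>(1 - dt*<I>)
    print(f"<I> = {rate:.0e} Hz: ratio = {ratios[rate]:.3f}")
```

For this intensity distribution the exact ratio is 1/(1 + ∆t⟨I⟩) ≈ 1 − ∆t⟨I⟩, close to the values quoted in the text; at 5 MHz the first-order approximation visibly starts to break down.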
As a final comment, we would like to point out that pile-up effects would be greatly reduced if a faster board were used. For example, with the PCI-6562 board (National Instruments), which has a time resolution of ∆t = 5 ns, pile-up effects would be reduced by a factor of 10 with respect to the PCI-6534 board used in this work. Thus, even at a count rate ⟨I⟩ > 1 MHz, the correlation function of the generated pulses would present no appreciable distortions, exactly as for the blue data of Fig. 5(a). Notably, common photon-counting avalanche photodiodes (e.g. the SPCM series by Excelitas, Canada), which are utilized in methods such as DCS, operate with a similar maximum count rate.

Conclusions
In this work we have developed a new hardware simulator (HS) capable of delivering a continuous stream of TTL pulses with Gaussian statistics and an arbitrary correlation function. The HS is ideal for testing and benchmarking digital correlators used in various OCS experiments where the correlation function may differ significantly from a single exponential decay. Indeed, the latter can be recovered only in an ideal OCS experiment as, for example, in the case of DLS measurements taken on a monodisperse dilute particulate sample when the detection is carried out well inside a single coherence area. Conversely, in many cases these conditions are not met, as for complex systems exhibiting sub-diffusive or stretched dynamics, or in diffuse correlation spectroscopy experiments carried out in biomedical optics for monitoring blood flow in living tissues.
The HS developed in this work makes use of a commercial I/O board (National Instruments, PCI-6534), a modern PC (Intel Core i7 CPU, 3.07 GHz, 12 GB RAM), and homemade code developed under LabVIEW 2010 (National Instruments). It is capable of delivering pulses over one or two channels with a time resolution of ∆t = 50 ns, up to a maximum count rate of ⟨I⟩ ∼ 5 MHz on both channels. The two channels are synchronized with the same clock, so that their outputs can be profitably used for testing the cross-correlation capability of a correlator and for checking its efficiency in the removal of detector afterpulses.
The proper functioning of the HS was ascertained by computing the autocorrelation function of the output pulses with a commercial hardware correlator, namely the Flex2k-12x2 (by Correlator.com). We synthesized pulse streams corresponding to various exponential and non-exponential decay correlation functions, at different count rates, with the detection carried out over different numbers of coherence areas. In all cases, the measured correlation functions were in excellent agreement with the expected ones.
Besides its use for testing and benchmarking digital correlators, the HS can also be exploited for investigating and optimizing their statistical performance, which is described by the variance-covariance matrix C[g_2(τ_k), g_2(τ_m)] of the normalized intensity correlation function g_2 [17,18]. This matrix, whose diagonal elements are directly related to the signal-to-noise ratio of g_2, depends on many experimental factors, such as the measuring time T, the average count rate ⟨I⟩, and the number of collected coherence areas N_ca. However, C[g_2(τ_k), g_2(τ_m)] also depends on (i) the features of the detection hardware components, such as photodetector defects, and (ii) the specific procedure used by the correlator for recovering g_2 from the raw data, such as the adopted normalization method and the multi-tau scheme [5]. In this respect, the HS offers a great opportunity because, knowing exactly the incoming intensity and the corresponding incoming pulse sequence, it provides an easy way to select and/or design the specifications of the detection chain and to optimize the correlator architecture. Various OCS-based instruments could benefit from such optimization. For example, it was previously demonstrated that the signal-to-noise ratio of common DCS measurements depends on the correlator architecture and can be tuned for specific probe geometries and tissue conditions [16].
Finally, it is worth mentioning that, although the HS developed in this work was designed for synthesizing only pulse streams characterized by Gaussian statistics (and arbitrary correlation functions), it could also be implemented to simulate pulse patterns with different statistics, such as those encountered in fluorescence correlation spectroscopy (FCS). The latter is a quite popular and powerful technique, widely used for investigating the dynamics of many physical, chemical and biophysical systems, even at the level of single-molecule dynamics. The implementation of the HS for the various applications of FCS is not straightforward, and we are currently working on this topic.

Appendix B

The practical implementation of the synthesis procedure proceeds through the following steps:

(1) sample the target correlation function g_z on the linear time grid τ_k = k∆t over the finite support [−τ_max, τ_max] [solid line of Fig. 6(a)];

(2) solve Eq. (9) in the Fourier domain for h′_j and find the normalized response function as h_j = h′_j/(∆t Σ_j h′_j) [solid line of Fig. 6(b)];
(3) to speed up the convolution procedure [Eq. (13)], resample h_j on a non-linear time grid so as to cover the same interval [−t_max, t_max], but with a much smaller number of points. Since h(t) is symmetrical [h(t) = h(−t)] and asymptotically decays to zero [h(t → ±∞) → 0], it is convenient to choose a time grid with symmetric intervals that widen progressively as they move away from t = 0. Let us indicate the new time grid with the times t̃_q (q = 0, ±1, ±2, ..., ±q_max), selected so that t̃_0 = 0 and t̃_q = n(q)Δt, where n(q) is an integer such that n(q) = −n(−q). In this way, the time grid is symmetric around t = 0, all the t̃_q are multiples of Δt, and q_max is chosen so that t̃_{q_max} = t_max. A convenient way to define such a grid is to set the total number of points Ñ_p = 2q_max + 1 to be used and let the modulus |n(q)| increase according to a pseudo-geometrical progression that starts with |n(1)| = 1. The resampled grid will thus look like [−t_max, ..., n(q)Δt, ..., t_max], with the size of the intervals Δt̃_q between adjacent times increasing accordingly. We then define the resampled values h̃_q as the average values of h inside each interval, where Δn(q) = n(|q| + 1) − n(|q|) is equal to the number of clocks Δt inside the interval. A sketch of the new time grid, together with the first few values of h̃_q, is reported in Fig. 7. The overall resampling of the function h corresponding to the g_z of Fig. 6(a) is shown in Fig. 6(b), where the number of points is Ñ_p = 100.

(4) determine σ_y according to Eq. (16), in which σ_z² = ⟨I⟩/2.

(5) initialize the simulator by generating, for each of the two components E_x and E_y of the electric field, a number N_p of random points y_i drawn from a Gaussian distribution with standard deviation σ_y and average value ⟨y⟩ = 0. Use these two data sets to load two data buffers of N_p points each.
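Step (3) can be sketched as follows; the growth ratio of the pseudo-geometric progression and all function names are our own assumptions, not taken from the text:

```python
import numpy as np

def make_grid(q_max, n_max, ratio=1.3):
    """Pseudo-geometric progression |n(1)| = 1, ..., n(q_max) = n_max,
    mirrored to negative q so that n(-q) = -n(q) (step 3).
    The growth ratio 1.3 is an assumed tuning parameter."""
    n = [1]
    while len(n) < q_max:
        n.append(max(n[-1] + 1, int(round(n[-1] * ratio))))
    n[-1] = n_max                       # enforce n(q_max)*dt = t_max
    n = np.array(n, dtype=int)
    return np.concatenate((-n[::-1], [0], n))

def resample_h(h, grid):
    """Replace h_j (linear clock grid, index j = n + j_max) by its
    average value over the clocks nearest to each grid point n(q)."""
    j_max = (len(h) - 1) // 2
    h_res = np.empty(len(grid), dtype=float)
    for k, nq in enumerate(grid):
        lo = grid[k - 1] if k > 0 else grid[0]
        hi = grid[k + 1] if k < len(grid) - 1 else grid[-1]
        a = (lo + nq) // 2 + j_max      # midpoints between neighbours
        b = (nq + hi) // 2 + j_max + 1
        h_res[k] = h[a:b].mean()
    return h_res
```

For a grid of Ñ_p = 21 points covering ±1000 clocks, each h̃_q averages all the h_j closer to n(q) than to its neighbours, so the dense tails of h are compressed into a handful of points while the region around t = 0 keeps the full clock resolution.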
(6) for the component E_x (or E_y) of the field, generate a new random number y_i (Gaussian distribution with standard deviation σ_y and average value ⟨y⟩ = 0) and update the buffer so that it works as a First-In-First-Out (FIFO) buffer: the new point is inserted into the first position, all the others are moved backwards by one position, and the oldest point is discarded. Select the central point of the buffer and resample the y_i according to the same time grid defined at point (3). If i_0 = i − j_max denotes the central index of the N_p points of the buffer, the resampled values ỹ_q are taken at the buffer positions i_0 + n(q) [Eq. (B.2)]. A sketch of the resampling protocol for the points y_i is illustrated in Fig. 8.
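Steps (5) and (6) can be sketched with a bounded deque acting as the FIFO buffer; σ_y, the buffer length and all names are placeholders of our own, not values from the text:

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
sigma_y = 1.0            # would come from Eq. (16); arbitrary here
N_p = 2001               # buffer length, N_p = 2*j_max + 1

# step (5): one buffer per field component, filled with Gaussian points;
# a bounded deque gives the FIFO behaviour of step (6) for free
buf_x = deque(rng.normal(0.0, sigma_y, N_p), maxlen=N_p)

def advance(buf):
    """Step (6): push one new Gaussian point; the oldest one is
    discarded automatically because the deque is bounded."""
    buf.append(rng.normal(0.0, sigma_y))

def resample_y(buf, grid):
    """Pick the buffer points lying on the non-linear grid of step (3),
    with the central buffer point (index j_max) at grid index q = 0;
    `grid` holds the integers n(q)."""
    j_max = (len(buf) - 1) // 2
    arr = np.asarray(buf)
    return arr[j_max + np.asarray(grid)]
```

At each clock tick one call to `advance` shifts the whole stream by one position, after which `resample_y` extracts only the Ñ_p points actually needed for the non-linear-grid convolution.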
(7) compute (E_x)_i or (E_y)_i as the convolution between y and h by using Eq. (B.3), where z_i stands for (E_x)_i or (E_y)_i. Equation (B.3) is the generalization of Eq. (13) carried out on the non-linear time grid Δt̃_q. Notice that in Eq. (B.3) the sum extends over Ñ_p = 2q_max + 1 points, a figure that is usually ≪ N_p. Thus the use of Eq. (B.3) instead of Eq. (13) is rather convenient as far as computational times are concerned (see the discussion at the end of this Appendix).

(9) calculate I_i according to Eq. (18).
(10) calculate R_i = I_iΔt and pass R_i through a Poisson filter with an average count equal to R_i. For the realization of a Poisson filter see, for example, [19]. The output of the Poisson filter is an integer count N_i (0, 1, 2, ...).

(11) if desired, introduce photodetector defects such as afterpulsing, dead time, and dark counts.

(12) if N_i = 0, no pulse is present. Return to point (6) for the next step.

(13) if N_i = 1, one pulse is present. Its arrival time is computed by counting the number of steps elapsed since the occurrence of the last pulse. The corresponding integrated arrival time, equal to the number of steps elapsed since the first one, is recorded on the hard disk.

(14) if N_i ≥ 2, two or more pulses are present (pile-up effects). This case is handled exactly as point (13), and all the extra pulses are discarded.

(15) return to point (6) for the next step.
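Steps (9)–(15) can be sketched as follows, with numpy's Poisson generator standing in for the Poisson filter of Ref. [19]; photodetector defects (step 11) are omitted, pile-up (step 14) is handled by keeping at most one pulse per clock, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 50e-9                       # clock, 50 ns

def pulses_from_intensity(I, dt):
    """Steps (9)-(15): turn an intensity trace I_i (counts/s) into
    integrated arrival times, expressed in clock units since the
    start of the stream."""
    R = I * dt                   # step (10): mean count per clock, R_i
    N = rng.poisson(R)           # Poisson filter -> integer counts N_i
    # steps (12)-(14): clocks with N_i >= 1 emit exactly one pulse;
    # extra pulses within a clock (pile-up) are discarded
    return np.flatnonzero(N >= 1)

# usage: constant <I> = 1e5 Hz over 1e5 clocks (5 ms of stream)
times = pulses_from_intensity(np.full(100_000, 1e5), dt)
```

With ⟨I⟩ = 10⁵ Hz and Δt = 50 ns the mean count per clock is only 5 × 10⁻³, so pile-up (N_i ≥ 2) is rare, consistent with the accuracy test described below.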
The accuracy of the overall procedure has been tested by comparing, for the same data of Fig. 6(a), the correlation functions of the intensity signal I_i and of the corresponding pulse stream N_i with the theoretical curve g_2(τ) = 1 + |g_z(τ)|². In order to avoid spurious effects in the generation of the N_i's, we did not introduce any photodetector defects and kept the count rate low enough (⟨I⟩ = 10⁵ Hz) that pile-up effects are negligible at the clock Δt = 50 ns. The averaging times were T = 20 s and T = 100 s for the I_i and N_i signals, respectively. The results, reported in Fig. 9(a), show an excellent match between the two recovered correlations and the theoretical one. As expected, due to the shot noise introduced by the Poisson filter, the N_i-correlation is somewhat noisier than the I_i-correlation, but with no systematic deviations.

As a final remark, we would like to discuss and emphasize the gain in computational time that Eq. (B.3) provides with respect to Eq. (13). This is particularly relevant when increasingly broader g_z(t) functions are considered and smaller ε is required. Indeed, the number of points required in Eq. (13) scales as N_p ∼ α(ε)(δg_z/Δt), where α(ε) is a decreasing function of ε that depends on the shape of g_z(t). For example, in the case of a g_z decaying as a single exponential with τ_c = 10⁻⁵ s, with Δt = 50 ns, we need N_p ∼ 2 × 10³ points to attain an accuracy of ε = 5 × 10⁻³. As soon as τ_c = 10⁻⁴ or 10⁻³ s, the number of points grows to N_p ∼ 2 × 10⁴ or ∼ 2 × 10⁵, figures that make the use of Eq. (13) fairly impractical.
Conversely, with the use of Eq. (B.3), the number of points Ñ_p = 2q_max + 1 can be lowered by orders of magnitude (Ñ_p/N_p ≤ 10⁻²) and the computational times correspondingly reduced, although not by the same factor because of the time required for the resampling of the y_i data [Eq. (B.2)]. For example, by using Eq. (B.3) (with Ñ_p = 100, Δt = 50 ns, ε = 5 × 10⁻³), pulse streams with a duration of T = 1 s, characterized by single-exponential decay times τ_c = 10⁻⁴ s and τ_c = 10⁻³ s, would take on our PC ∼0.4 h and ∼4 h, respectively. Conversely, by using Eq. (13), they would take ∼1.1 h and ∼19 h, i.e. computational times larger by factors of ∼3 and ∼5.
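Eq. (B.3) is not reproduced in this excerpt; a plausible form, weighting each resampled product h̃_q·ỹ_q by the number of clocks Δn(q) in its interval (our assumption), illustrates why summing Ñ_p ≈ 100 terms instead of N_p ∼ 10⁴–10⁵ is so much cheaper. A sketch under that assumption, checked against the plain linear sum of Eq. (13):

```python
import numpy as np

def convolve_linear(y, h):
    """Eq. (13): plain sum of N_p products on the linear clock grid."""
    return float(np.dot(y, h))

def convolve_nonlinear(y_res, h_res, dn):
    """Assumed form of Eq. (B.3): sum over the resampled points,
    each product weighted by the interval width Δn(q) in clock units."""
    return float(np.sum(h_res * y_res * dn))

# sanity check with constant signals: both routes must give N_p,
# since the interval widths Δn(q) sum to the total number of clocks
N_p = 2001
dn = np.array([500, 490, 10, 1, 1, 10, 490, 499])   # widths summing to 2001
z_lin = convolve_linear(np.ones(N_p), np.ones(N_p))
z_non = convolve_nonlinear(np.ones(len(dn)), np.ones(len(dn)), dn)
```

Here the non-linear route multiplies and sums 8 terms instead of 2001, which is the source of the speed-up quoted above; for structured (non-constant) signals the two results agree only up to the resampling accuracy ε.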

Fig. 2. (a): Normalized intensity correlation functions (symbols) obtained by measuring with the Flex2k-12x2 correlator (by Correlator.com) pulse streams characterized by the same decay time τ_c = 10⁻⁵ s and different stretched/compressed exponentials with α = 0.5, 1.0, 1.5 [see Eq. (20)]. The solid lines represent the expected behaviors. The pulses were delivered with a Δt = 50 ns clock, at a count rate ⟨I⟩ = 10⁵ Hz, for a measuring time T = 100 s. (b): Relative deviations between data and expected curves.

Fig. 3. (a): Normalized intensity correlation functions (symbols) obtained by measuring with the Flex2k-12x2 correlator two pulse streams characterized by a field correlation function given by Eq. (21) (see text for the parameter values). The solid lines represent the expected behaviors. The pulses were delivered with a Δt = 50 ns clock, at a count rate ⟨I⟩ = 10⁵ Hz, for a time T = 100 s. (b): Relative deviations between data and expected curves.

Fig. 4. (a): Normalized intensity correlation functions (symbols) obtained by measuring with the Flex2k-12x2 correlator a pulse stream characterized by two single-exponential decay times [see Eq. (22)], with τ_1 = 10⁻⁶ s, τ_2 = 10⁻⁴ s and A_1/(A_1 + A_2) = 0.30. The different data sets refer to data acquired with different numbers of coherence areas, from N_ca = 1 to 5. The solid lines represent the expected behaviors. The pulses were delivered with a Δt = 50 ns clock, at a count rate ⟨I⟩ = 10⁵ Hz, for a time T = 100 s. (b): Relative deviations between data and expected curves.

Fig. 5. (a): Normalized intensity correlation functions (symbols) obtained by measuring with the Flex2k-12x2 correlator pulse streams characterized by the same single-exponential decay time τ_c = 10⁻⁴ s and different count rates, from 100 kHz to 10 MHz. The solid lines represent fits to single-exponential decay functions. The pulses were delivered with a Δt = 50 ns clock, for a measuring time of T = 100 s (blue and green symbols) and T = 10 s (red symbols). (b): Relative deviations between data and fitting curves.

Fig. 7. (a): Sketch of the resampling procedure for the function h(t). The original values h_j (black dots), sampled at the times t_j, are replaced by the values h̃_q.

Fig. 8. Comparison between the convolution carried out by using Eq. (13) [panel (a)] and Eq. (B.3) [panel (b)]. Equation (13) uses a linear time grid, and the generation of each new point z_i requires the computation of a very large number of terms, namely the sum of N_p = 2j_max + 1 products. Conversely, Eq. (B.3) uses a pseudo-geometrical-progression time grid, where the last N_p points y_i of panel (a) are resampled on the same time grid used for h (crosses), with y_{i_0} (red cross) being the central point with index q = 0. In this case the number of terms to be multiplied and summed is Ñ_p = 2q_max + 1. Since typically Ñ_p ≈ 100, while N_p might be very large (N_p ∼ 10⁴–10⁵), the use of Eq. (B.3) with respect to Eq. (13) is very convenient.

Fig. 9. Comparison between the theoretical intensity correlation g_2(τ) = 1 + |g_z(τ)|² (solid line) and the two correlations computed by correlating the intensity signal I_i and the corresponding pulse stream N_i. The theoretical g_z(τ) is the same as that of Fig. 6(a). The averaging times were T = 20 s and T = 100 s for the I_i and N_i signals, respectively. Both curves match the theoretical one excellently, with non-systematic residuals that, because of the shot noise introduced by the Poisson filter, are higher in the case of the N_i-correlation.