Multi-exposure laser speckle contrast imaging using a video-rate multi-tap charge modulation image sensor.

Multi-exposure laser speckle contrast imaging (MELSCI) systems based on high frame rate cameras are suitable for wide-field quantitative measurement of blood flow. However, high-speed camera-based MELSCI requires high power consumption, large memory, and high processing capability, which may lead to relatively large and expensive hardware. To realize a compact and cost-efficient MELSCI system, we discuss an application of the multi-tap CMOS image sensor originally designed for time-of-flight range imaging. This image sensor operated in the global shutter mode and every pixel was provided with multiple charge-storage diodes. Multiple images for different exposures were acquired simultaneously because exposure patterns were programmable to implement an arbitrary exposure duration for each tap. The frame rate was close to video frame rates (30 frames per second (fps)) regardless of the exposure pattern. The feasibility of the proposed method was verified by simulations that were performed with real speckle images captured by a high-speed camera at 40 kfps. Experiments with a four-tap CMOS image sensor demonstrated that a flow speed map was obtained at a moderate frame rate such as 35 fps for a moving ground glass plate and 45 fps for flowing Intralipose, which were linearly moved at speeds of 1-5 mm/s.


Introduction
Laser speckle contrast imaging (LSCI) is a wide-field and non-contact optical method to image the two-dimensional (2D) blood flow maps with high spatial and temporal resolution in real-time [1]. It has been widely adopted for blood flow studies in recent years. Laser speckle is observed when coherent laser light is scattered by randomly distributed medium such as tissues and its interference measured by a photodetector. Moving particles like red blood cells (RBCs) in the tissue fluctuate the speckle pattern. The generated speckle pattern is detected by charge coupled device (CCD) or complementary metal oxide semiconductor (CMOS) image sensors. The speckle pattern, which is temporally fluctuating due to the flow of RBC, is averaged over the accumulation time of image sensor, T. The speckle contrast, K, is modeled by the following equation in terms of a region of interest (ROI) and is statistically related to the RBC flow speed [2]: Here, <I> represents the mean of light intensity, I, in the ROI. When there is no flow, the speckle pattern is still. Therefore, the contrast K becomes the highest, i.e. 1. Meanwhile, the speckle pattern on the captured image is blurred if the scatterers move during the accumulation time of the image sensor. Thus, K approaches 0. λ is the wavelength of the laser. ρ is the fraction of total light that is Doppler-shifted. β is a calibration factor that accounts for speckle averaging effect. v ne is the variance due to non-ergodic processes and v n is the variance due to instrumental noise. Both variance terms can be viewed as an offset. In the following discussions, we combine v n and v ne together as single variance, v o . The flow speed v ∝ 1/τ c (τ c , correlation time) is estimated by fitting K(T) for multiple T's with Eq. (1). The LSCI-based flowmetry technique started to advance in the 1990s for imaging blood flow in the retina and skin, with the development of faster digital acquisition techniques and processing technologies. Briers and Webster proposed the first scheme of LSCI to monitor capillary blood flow [1]. Later, several authors have contributed to improve LSCI over the past two decades. LSCI was utilized to measure cerebral blood flow [3], tissue perfusion [4,5], burn scar perfusion [6], and human retinal vein perfusion [7].
Image acquisition of conventional LSCI systems is mostly based on a single-exposure time. Unfortunately, single exposure LSCI suffers from a limited range of flow sensitivity and underestimates large changes in blood flow speed [8]. An extension of the LSCI technique called multiple exposure LSCI (MELSCI), has improved the performance of LSCI [2]. MELSCI estimates the flow speed from a set of contrast values measured at different exposure times. Consequently, the sensitivity range and accuracy of the flow speed calculations improves. Studies based on MELSCI include chronic imaging of cortical blood flow [9] and in vivo imaging of microvascular changes in an angiogenic environment [10]. MELSCI systems using high-speed cameras [11][12][13][14] have the capability of real-time quantitative measurement. However, they require considerable processing capability due to the high frame rate, which is not energy efficient. Furthermore, large memory for signal processing is necessary, which will lead to relatively large hardware.
In general, pulsatile blood flow is at about 1 Hz. To observe the blood flow changes for each pulsation, a video rate of 30 fps is adequate. However, the fast frame rate higher than 1kfps is required to record the fast changing of the speckle pattern. The aim of this study is to build a MELSCI system, which is capable of monitoring the speckle fluctuation efficiently at video rate, by taking advantage of the programmable exposure of a multi-tap CMOS image sensor [15][16][17], which contributes to low power consumption and requires less processing capability.
The remainder of the paper is structured as follows; Section 2 describes the operation principle of laboratory-designed multi-tap charge modulation pixels and its application to MELSCI. Selection of exposure patterns in the proposed MELSCI is also mentioned. Section 3 discusses system performance by simulation based on real images captured by a high-speed camera. In Section 4, the proposed method is demonstrated with a 4-tap CMOS image sensor originally designed for time-of-flight (TOF) range imaging. Future issues are discussed in Section 5 and Section 6 concludes this paper.

Multi-tap charge modulation pixels
Multi-tap or multi-bucket CMOS image sensors are applied in this paper to minimize the redundancy of sampling data points, and to bridge the gap between 30fps and 1000fps while not suffering from inter-frame delays in exposure [18,19]. A lateral electric field charge modulator (LEFM) developed at Shizuoka University is capable of implementing a pixel with multiple pieces of charge storage and has advantages such as high-speed and loss-less complete charge transfer and suppressed charge back-flow from the storage to the photodiode. Figure 1 shows an example of layout, equivalent schematic, and timing chart of 4-tap pixel based on LEFM. The pixel of this image sensor consists of photodiode (PD), four charge storage diodes (FD1-4) and drain. Although the charge storage is embodied as a floating diffusion (FD) in this figure, two-step charge transfer structures to suppress reset noise have been developed [20,21]. The photo-generated charges in the PD are quickly transferred to one of the storage diodes and the drain by the 5 sets of controllable gates (G1-4 and GD). While G1 is activated and the other gates are deactivated, the photo-electrons generated in the photodiode region are transferred to FD1. This operation is repeated for the gates G2, G3 and G4 and the charges are integrated in the corresponding FDs. Thus, arbitrary exposure is implemented by programming the GX (X = 1-4) signals. GD is a gate which drains the charges in the photodiode. During the image readout, the photo-charges should not be transferred to any FD, so that GD is activated and the other gates are deactivated. This image sensor can sample the charges efficiently at the higher modulation frequency than video rate to detect rapid fluctuation in speckle intensity and read out the images for different exposures simultaneously at near video rate. With this approach, there is no dead time during the charge modulation, and hence the problem of inter-frame delay does not occur.

Exposure patterns
In this paper, two exposure patterns depicted in Fig. 2 are discussed. The unit exposure time or the shortest exposure time is represented by T 0 . The relative exposure time window durations are equal ( Fig. 2(a)) and exponential ( Fig. 2(b)). Note that the images for multiple exposure times are synthesized by summing up the images of adjacent exposure time windows. The ratios of the shortest to the longest exposure times are 1:4 (equal pattern) and 1:8 (exponential pattern). In the design, a tradeoff between signal-to-noise ratio (SNR) and the measurable flow speed range is considered. To cover a wide speed range, the ratio of the longest exposure and the shortest exposure should be large. However, the SNR of the captured image for a shorter exposure time becomes worse. Therefore, it is effective to utilize multiple taps for the same exposure time and average the calculated contrast values among the taps. As shown in Fig. 2(a), an exposure time, T 1 , is given by the four images of G1, G2, G3, G4. Therefore, the standard deviation of the calculated contrast will be improved by a factor of 2 by averaging. To synthesize images for an effective exposure time of T 2 = 2T 0 , three combinations are possible. For T 3 = 3T 0 , there are two combinations. On the other hand, for the exponential pattern, only 2 combinations exist to synthesize the images for T 1 and T 2 , which means the exponential pattern suffers from larger noise.

Data acquisition with a high-speed camera
The effectiveness of the proposed exposure patterns was verified by simulation in MATLAB R2014b based on the measured images with a high-speed camera. The specifications of the highspeed camera are shown in Table 1. As solid and fluid scatterers, a ground glass plate (Edmund Optics, 45-656, 220 grits, 100× 100 mm dimension and thickness of 1.60 mm) and Intralipose (Otsuka Pharmaceutical Co., Ltd., Intralipose 20%), respectively, were measured. Figure 3(a) shows the schematic of experimental setup, which is also utilized in the experiment with the 4-tap CMOS image sensor. A CW wavelength-stabilized laser (ONDAX, RO-USB-785-PLR-100-1, λ of 785nm, output power of 100 mW) with linear polarization was used as a light source. During high-speed camera measurement, the phantom of ground glass plate and Intralipose were illuminated with laser power of 8 mW and diameter of 10mm. A high-speed CMOS global shutter camera (Mikrotron, MC1362) and visible and near-infrared C-mount lens (Edmund Optics, #67-715, focal length of 25mm) were used for imaging. To remove the non-scattered surface reflection of the laser illumination, an analyzer was placed in front of the imaging lens. The ground glass plate was placed on a motorized stage (OptoSigma, OSMS20-85). The ground glass plate moved in one direction (forward) to provide the same translation repeatedly. From the correlation taken between the captured frames at different time period, the random nature of speckles generated by moving the ground glass plate was confirmed. To control the flow speed of Intralipose, a syringe pump (YMC, YSP-101) was used. The Intralipose was pumped into a capillary tube (YMC, P-0027) with inner and outer diameters of 1mm and 1.58mm, respectively, which was placed on a scattering tissue phantom to resemble the blood flow. In the measurement, the phantoms were observed for the speeds of 1-5 mm/s with a 1 mm/s step. This flow range is relevant to clinical studies [22][23][24][25][26]. The f -number of the imaging lens was set to 2 to give the highest contrast change in correspondence to the shortest to longest exposure time. The speckle-to-pixel size ratio was calculated as 0.19. The observed region was 5.0 × 1.0 mm 2 , which corresponded to 160 × 32 pixels on the high-speed image sensor. The raw speckle images were captured at the frame rate of 40 kfps or with a temporal resolution of 25µs. The image acquisition was carried out for 2s, corresponding to 80,000 frames acquired. In data analysis, 19 data sets each of whose observation period was 102.4ms were generated (Fig. 4). Every data set was composed of 4096 frames which correspond to 102.4ms. The combination of exposures times was generated based on a binary tree structure [14]. The shortest exposure time was 25µs and a total of 4096 speckle images were summed up to generate the longest exposure time of 102.4ms. A set of 4096 images were divided into pairs of successive images. The corresponding pixel values in both images were added together to generate a single image to double the exposure time. This process is iteratively repeated to generate exposure times of 25 µs, 50µs, and so on. Note that multiple images are obtained for each effective exposure time except the longest exposure. For example, 4096 images for 25µs, 2048 images for 50µs, etc.
To simplify the calculation, the squared speckle contrast value, K 2 , is used in the following parts. K 2 value was calculated for each synthesized image, and the K 2 values for the same effective exposure were averaged to reduce the variation. Thus, a dense K 2 curve that consisted of 13 exposure times (25µs -102.4ms) was obtained. A fixed ROI size of 7 × 7 was used to calculate the K 2 value. Fig. 4. Data processing of high-speed camera images. The squared contrast value, K 2 , for each exposure time was calculated for each data set. K 2 and speed were obtained by further averaging of the 19 data sets.
The K 2 curves are shown in Fig. 5 were obtained by fitting K 2 , which was normalized by K 2 measured from the static ground glass plate. Therefore, β was excluded in the fitting because β served as a scaling factor in Eq. (1). The fitting was performed with the "fminsearchbnd" function available from MathWorks File Exchange. Three parameters, ρ, x = T/τ c , and v o were estimated by fitting under the condition [0 < ρ < 1, 0 < v o < ∞, 0 < τ c < ∞]. The quality of the fitting was evaluated by using root mean square error (RMSE). We collected measurements from the ground glass and Intralipose. The K 2 values for the Intralipose are smaller than those for the ground glass due to increased Brownian motion associated with Intralipose.

Simulation of the 4-tap Image Sensor
Using the high-speed camera data set, we next simulated speckle images obtained by the 4-tap CMOS image sensor. Multi-tap images {I i (x, y) (i = 1, 2, 3, 4)} were generated by the following equation: where (x, y) denotes the pixel position, i is the index of the tap, I (x, y; t) represents the captured high-speed images, N is the number of frames (N = 4096), and G i (t) is an exposure pattern function for tap-i, which is shown in Fig. 2. T min was 25µs. K 2 for the ground glass, which was normalized by the K 2 for the static ground glass plate, was calculated in the fixed ROI of 7 × 7 pixels in the center of the captured image.
In the fitting, one additional data point was considered with an assumption that the normalized K 2 becomes unity for a sufficiently short exposure time, (e.g. 1µs in this paper). This is because only four data points were not satisfactory for fitting. Figures 6 and 7 show the fitted curves for the ground glass for the equal exposure pattern and the exponential exposure pattern, respectively. In each figure, several unit exposure times were given under the constraint that the longest exposure time is less than 33ms (one frame period for the video rate). The data points in the figures were given by the mean of the simulated K 2 among the 19 data sets. The fitted curves for the longer unit exposure time more resemble those of the high-speed camera shown in Fig. 5(a).

Comparison of Experimental High-Speed Camera to Simulated 4-tap CMOS Sensor Data to Estimate Flow Speeds
The estimated speeds by the high-speed camera and the simulated 4-tap CMOS image sensor were compared. Because it is difficult to quantify the absolute value of speed using the MELSCI system due to several hindrances such as improper statistical model and interpretation of inferred correlation time, τ c , and the speed, v [27], the estimated speed (1/τ c ) was normalized to the slowest flow speed (1mm/s). A linear fit is also shown in the plots for the unit exposure time associated with the highest sensitivity, as determined by the steepest slope. As shown in Fig. 8, a longer exposure time mostly resulted in higher sensitivity. The quality of the estimated speed was evaluated by the flow speed-to-noise ratio (FNR), which was defined as the quotient of the mean estimated speed to the standard deviation of estimated speed. The FNRs for the T 0 associated with the highest FNR are shown in Table 2. Overall, the equal exposure pattern provided higher FNR than the exponential pattern. One possible reason is that the variation of the estimated speed is significantly affected by the K 2 for the shortest exposure time. As shown in Fig. 2, the number of averaged data for the shortest exposure were 2 and 4 for the equal and exponential exposure patterns, respectively. Our data (Fig. 8) suggests that the estimated speed for the exponential exposure fluctuates more than that of the equal exposure. The feasibility of the 4-tap sensor was evaluated by relative error. The positive and negative signs show over estimation and under estimation of flow speed, respectively. From the relative estimation error analysis as shown in Table 3, it is clear that 4-tap sensor has the ability to measure the flow speed with acceptable errors. For various T 0 's, the mean FNR was calculated as shown in Fig. 9. The clear trend of two peaks was observed in mean FNR vs T 0 plot for equal and exponential exposure patterns (ground glass plate and Intralipose). Because of Brownian motion associated with Intralipose, the fitted K 2 curve does not have flat part at shorter exposure time as shown in Fig. 5(b). Therefore, the shorter T 0 can estimate the flow speed with the better mean FNR as shown in Fig. 9(b) and (d).

4-tap CMOS image sensor
To demonstrate the proposed method, a 4-tap CMOS image sensor originally developed for time-of-flight range imaging was used. Pixel size and the effective pixel count were 22.4 × 22.4 µm 2 and 132 × 84 pixels, respectively. For details, refer to [28].
The timing chart of the 4-tap charge modulator pixels is depicted in Fig. 10. Because this image sensor operates in the global shutter mode, the charge modulation and accumulation period are separated from the image readout period. The former period duration was determined by the shutter pattern and T 0 . The latter period was fixed (∼2.8ms).

Basic characterization
In the experiment, the same optical setup in Fig. 3 was used. The imaging area was 15 × 10 mm 2 . The f -number of the imaging lens was 11 to give the largest K 2 to noise ratio [29]. The ratio of speckle to pixel size was calculated as 0.57. The T 0 values for the ground glass and Intralipose were 6.4ms and 4.8ms, respectively. K 2 was calculated in the ROI of 7 × 7 for ground glass and Intralipose over 20 frames. The calculated K 2 for each frame was averaged and fitted with the speckle model in Eq. (1). The same processing was applied as the four-tap simulation in Sec. 3.2.2. The measured K 2 was normalized by K 2 of the static ground glass for the longest exposure time of 6.4ms [30,31]. Because K 2 should be constant for a static object regardless of the exposure time, the longest exposure time was used to define K 2 to mitigate effects of sensor and photon shot noise. The measurement was performed with the equal exposure pattern. K 2 and the normalized measured speeds for the ground glass and Intralipose are shown in Fig. 11.
To obtain the absolute speed map, the calibration based on a simple linear fitting was performed based on the following equation [13]: where γ and δ are calibration factors. Thus, the calibrated speed is given by Figure 12 shows the estimated absolute speed maps for the ground glass and Intralipose for several speeds. To calibrate the estimated speed, (γ, δ) for the ground glass and Intralipose were (0.04, 0.0143) and (0.2594, 0.2898), respectively. The flow speed maps were obtained from a single frame and no averaging of frames were performed. FNR of various T 0 was calculated in the ROI of 7 × 7 pixels. FNRs of equal and exponential exposure patterns are shown in Table 4. The measured FNRs with the 4-tap CMOS image sensor show that the equal exposure pattern provides better performance than exponential exposure pattern.

Discussion
The purpose of this paper is to demonstrate performance of a MELSCI system with an image sensor operating at a moderate frame rate to reduce the power consumption of image sensor and processing hardware. In previous works, with a single-photon avalanche diode (SPAD) image sensor, 21 exposures from 200µs to 20ms were implemented for in-vivo imaging of the mouse brain [11]. It utilized the fast frame rate capability to increase the acquisition speed and single-shot acquisition MELSCI (sMESI) to acquire thousands of frames for the shortest exposure time with negligible inter-frame delay. In post-processing, the images for longer exposures were synthesized. MELSCI systems using a high frame rate CMOS image sensor were implemented with a field programmable gate array (FPGA) [12,13]. In the recent work of MELSCI, the blood perfusion images are processed by a FPGA that is connected to a 1000-fps 1-Mega-pixel camera [14]. Seven exposure images from 1ms to 64ms were captured so that the effective sampling rate was 15.6 fps. In this paper, MELSCI was performed with video rate acquisition. However, to taking advantage of the programmable exposure, four taps may not be optimal. Investigation on the number of optimal taps per pixel and exposure patterns is necessary in future work. The required processing power for the proposed method should be evaluated as well.
In designing a multi-tap CMOS image sensor, we have a tradeoff between the number of taps per pixel and the pixel size. The charge transfer gate, charge storage diode, and readout circuit are necessary for each tap. We have confirmed that approximately 10 × 10µm 2 pixel is possible in a 0.18µm CMOS image sensor process and 22.4 × 22.4µm 2 pixel with 8 taps has been developed in 0.11µm CMOS image sensor process [32]. A larger pixel area is needed for more taps, which will cause slower optics because a large f -number would be required for such large pixels. Basically, pixels with many taps are useful to measure the whole shape of the K 2 curve. In this paper, the parameter β was not included in the fitting because the estimation of β required measuring the first flat part of the K 2 curve. To achieve this, a very short exposure time as short as tens of µs is necessary. An increase in the number of taps can contribute to two advantages: the increase of the estimated signal to noise ratio due to the averaging effect and complete estimation of all the parameters, ρ, β, τ c , and v o when illumination intensity is sufficiently strong. It is possible to implement very short exposure even in the 4-tap CMOS image sensor. However, we will obtain a very dark image because the brightness of the image is simply proportional to the exposure time. This drawback has been solved by averaging multiple frames in the high-speed camera. Acquisition of the K 2 value for a very short exposure is an issue in the proposed method.
Optimization of the image sensor is also an issue. In this paper, an image sensor dedicated to time-of-flight range imaging was used. However, TOF sensors have a low conversion gain to accumulate a large number of electrons, typically, a hundred thousand electrons and the noise level is high. We have developed high-sensitivity low-noise two-tap and four-tap time-resolving CMOS image sensors with the two-step charge transfer structure to perform true correlated double sampling [20,21]. Based on such technology, it is possible to design multi-tap CMOS image sensors specialized in MELSCI.

Conclusion
We proposed a power-efficient MELSCI system based on a laboratory-developed multi-tap charge modulation image sensor for blood flow speed imaging at near video rate (30fps). The multi-tap CMOS image sensor operates in the global shutter mode and every pixel is equipped with multiple charge storage diodes. The exposure of each tap is programmable. The problem of inter-frame delay does not occur because there is no gap time in exposure. To evaluate the effectiveness of the proposed method and to find the suitable exposure pattern and duration, the simulation with the real speckle images captured by a high-speed camera at 40kfps was performed. A ground glass plate on a motorized stage and flowing Intralipose in a tube were measured. Two exposure patterns: equal and exponential exposures were discussed. The simulation results suggested that the equal exposure provided a larger speed-to-noise ratio than the exponential exposure. Based on the simulation, a flow speed measurement with a 4-tap CMOS image sensor originally designed for time-of-flight range imaging was performed. It is confirmed that the 4-tap sensor can measure flow speed and mapped the flow image.

Funding
Japan Society for the Promotion of Science (18H01497, 18H05240); Arnold and Mabel Beckman Foundation; National Institutes of Health (P41 EB015890).