Direct mapping from diffuse reflectance to chromophore concentrations in multi-fx spatial frequency domain imaging (SFDI) with a deep residual network (DRN).

Spatial frequency domain imaging (SFDI) is an emerging technology that enables label-free, non-contact, and wide-field mapping of tissue chromophore contents, such as oxy- and deoxy-hemoglobin concentrations. It has been shown that the use of more than two spatial frequencies (multi-fx) can vastly improve measurement accuracy and reduce chromophore estimation uncertainties, but real-time multi-fx SFDI for chromophore monitoring has been limited in practice by the slow speed of available chromophore inversion algorithms. Existing inversion algorithms must first convert the multi-fx diffuse reflectance to optical absorption, and then solve a set of linear equations to estimate chromophore concentrations. In this work, we present a deep learning framework, denoted as a deep residual network (DRN), that directly maps diffuse reflectance to chromophore concentrations. The proposed DRN is over 10x faster than the state-of-the-art method for chromophore inversion and enables a 25x improvement in the frame rate for in vivo real-time oxygenation mapping. The proposed deep learning model will help enable real-time and highly accurate chromophore monitoring with multi-fx SFDI.


Introduction
Spatial frequency domain imaging (SFDI) is an emerging label-free technique that can provide quantitative tissue chromophore concentrations on a pixel-by-pixel basis in a wide-field format [1-5]. It has been applied to numerous biomedical scenarios for oxy- and deoxy-hemoglobin mapping, such as burn wound monitoring, tumor monitoring, clinical tissue flap monitoring, and others [2,3,6-10]. The details of SFDI image acquisition and processing have been described elsewhere [1]. Briefly, to obtain chromophore concentration maps, a series of sinusoidal patterns with different spatial frequencies is projected onto the tissue, and the corresponding reflectance images are collected by the camera. The collected images are demodulated and calibrated to obtain diffuse reflectance (R_d) images, which are then fed into an inverse model that maps diffuse reflectance to optical absorption maps. The calculated optical absorption values at different wavelengths are finally used to calculate chromophore concentrations by solving a set of linear equations based on Beer's law.
In order to separate tissue absorption from scattering, a minimum of two spatial frequencies is required in SFDI measurements [11]. However, it has been shown that SFDI chromophore extraction with only two spatial frequencies is subject to relatively large measurement uncertainties.
The diagram of the SFDI instrument utilized in this work is shown in Fig. 1(a). Two LEDs of different wavelengths (685 nm and 851 nm) were used sequentially as light sources to illuminate the digital micromirror device (DMD, V-650L, ViALUX, Chemnitz, Germany), where the light was spatially modulated. The choice of LED wavelengths was based on Mazhar et al., which was the first comprehensive study on wavelength selection for SFDI chromophore mapping [11]. The modulated light patterns of different spatial frequencies were then projected onto the tissue, and the reflectance images were collected by the camera (BFS-U3-04S2M, FLIR Systems, Oregon, United States). For multi-fx SFDI, a spatial frequency combination of 0, 0.05, 0.1, 0.2, and 0.4 mm⁻¹ was shown to possess small chromophore extraction uncertainties and was used in this work [12,20]. For each nonzero spatial frequency, patterns of three phases (0°, 120°, and 240°) were projected for measurement, leading to 14 images per wavelength (12 phase images for the four nonzero spatial frequencies, plus a planar illumination image and a dark measurement image for 0 mm⁻¹). The collected phase images were sequentially demodulated and calibrated to create diffuse reflectance maps, with pixel values normalized between 0 and 1. The diffuse reflectance (R_d) maps were created for each spatial frequency and each wavelength. Specifically, for the demodulation of AC (0.05, 0.1, 0.2, and 0.4 mm⁻¹), we followed the original Cuccia et al. phase stepping method, as shown in Eq. (1), where I is the intensity of the demodulated image and I_1, I_2, and I_3 represent raw images acquired for each of the phase-shifted illumination patterns [1].
$$ I = \frac{\sqrt{2}}{3}\sqrt{(I_1 - I_2)^2 + (I_2 - I_3)^2 + (I_3 - I_1)^2} \tag{1} $$
For the demodulation of DC (0 mm⁻¹), we subtracted the dark measurement image from the planar illumination image, so that only two (instead of three) images were required. In addition, for system calibration, we used a reference phantom with known optical properties and the "white" Monte Carlo model as the forward model to retrieve the diffuse reflectance maps [21]. At each wavelength, the demodulated images of the sample and reference phantom at spatial frequency f_x were used to calculate the diffuse reflectance of the sample at each pixel, as shown in Eq. (2).
$$ R_d(f_x) = \frac{I(f_x)}{I_{\mathrm{ref}}(f_x)}\, R_{d,\mathrm{ref}}(f_x) \tag{2} $$
In Eq. (2), R_d(f_x), R_d,ref(f_x), I(f_x), and I_ref(f_x) represent the diffuse reflectance of the sample, the diffuse reflectance of the reference phantom, the demodulated image of the sample, and the demodulated image of the reference phantom, respectively, all at spatial frequency f_x. As a result, each pixel location on the camera corresponds to an R_d vector of 10 elements (2 wavelengths × 5 spatial frequencies). Key to the extraction of chromophore concentrations is the mapping from the R_d vector to chromophore values. In previous SFDI works, the R_d maps were first converted to optical absorption values at each wavelength using an inverse model (such as the NS or MLP), and then a set of linear equations was solved with Beer's law to extract oxy- and deoxy-hemoglobin concentrations (denoted as HbO2 and HHb, respectively). This process is illustrated in Fig. 1(b) with blue arrows, next to corresponding example images of diffuse reflectance, optical absorption, and extracted chromophores. Such two-step extraction is time-consuming for multi-fx SFDI measurements and is too slow for real-time clinical monitoring applications. To address this speed bottleneck, we developed a deep residual network (DRN) that directly maps R_d maps to chromophore concentrations with over 10x speed improvement, represented by the pink arrow in Fig. 1(b). In the next subsection, we detail the DRN framework and demonstrate its advantages over the state-of-the-art method (i.e., MLP).
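The demodulation and calibration steps referenced as Eqs. (1) and (2) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the AC formula is the standard three-phase expression from Cuccia et al., and the synthetic pattern at the end exists only to show that a known AC amplitude is recovered.

```python
import numpy as np


def demodulate_ac(i1, i2, i3):
    """Three-phase demodulation for one nonzero spatial frequency.

    i1, i2, i3 are raw images at 0, 120, and 240 degree phase shifts.
    """
    i1, i2, i3 = (np.asarray(x, dtype=float) for x in (i1, i2, i3))
    return (np.sqrt(2.0) / 3.0) * np.sqrt(
        (i1 - i2) ** 2 + (i2 - i3) ** 2 + (i3 - i1) ** 2
    )


def demodulate_dc(planar, dark):
    """DC (0 mm^-1) image: planar illumination minus the dark frame."""
    return np.asarray(planar, dtype=float) - np.asarray(dark, dtype=float)


def calibrate(i_sample, i_ref, rd_ref):
    """Scale the sample/reference ratio by the phantom's known R_d."""
    ratio = np.asarray(i_sample, dtype=float) / np.asarray(i_ref, dtype=float)
    return ratio * rd_ref


# Synthetic check (assumed test data): a sinusoid with known AC amplitude.
rng = np.random.default_rng(0)
amplitude = rng.uniform(0.1, 0.5, size=(4, 4))        # true AC amplitude per pixel
offset = 1.0                                           # DC level
phase0 = rng.uniform(0.0, 2.0 * np.pi, size=(4, 4))   # arbitrary spatial phase
shifts = np.deg2rad([0.0, 120.0, 240.0])
raw = [offset + amplitude * np.cos(phase0 + s) for s in shifts]
recovered = demodulate_ac(*raw)                        # should match amplitude
```

Because the three phase shifts are evenly spaced, the DC offset and the spatial phase cancel in the pairwise differences, which is why the recovered image depends only on the AC amplitude.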

Deep residual network (DRN) for direct mapping from R_d to chromophores
For this work we developed a deep learning model that accepts the measured multi-fx R_d vector as input and directly outputs oxy- and deoxy-hemoglobin concentrations (denoted as HbO2 and HHb, respectively). The power of deep learning lies partially in its ability to automatically detect and approximate nonlinear patterns in high-dimensional space [22]. This is ideal for multi-fx SFDI chromophore extraction, which maps an n-dimensional R_d space (where n is the number of spatial frequencies) to a 2D chromophore space. The previous deep learning model for multi-fx SFDI chromophore extraction required two-step processing, which was relatively time-consuming and resulted in a slow frame rate for real-time oxygenation mapping [12]. Meanwhile, deep neural networks have been successfully applied to other problems in the field, such as optical parameter estimation with temporally resolved reflectance [23,24]. To the best of our knowledge, this is the first deep learning model that directly maps diffuse reflectance to chromophore concentrations for multi-fx SFDI.
The structure of the deep learning model is shown in Fig. 2(a). It is a densely connected network with cross-layer connections designed with reference to ResNet and DenseNet [25,26]. This network structure tends to have fewer free parameters than other fully connected deep neural networks, and is faster to train without loss of accuracy [25,26]. The inputs of the model are the R_d values at different spatial frequencies and wavelengths. Here we demonstrate the case of 5-fx and 2 wavelengths, i.e., [0, 0.05, 0.1, 0.2, 0.4] mm⁻¹ and [685, 851] nm. The diffuse reflectance values of 5 spatial frequencies were used as input to the DRN. The 5-fx R_d vectors were respectively processed by three layers with two, four, and two neurons. The outputs of those layers were added, and the results were used as input to the following fully connected layers. The final output of the DRN was an estimate of the tissue chromophore contents, specifically oxy- and deoxy-hemoglobin concentrations.
One of the challenges in building a deep learning model for direct chromophore mapping was the proper generation of training samples. The training data required for the DRN in this work were diffuse reflectance-chromophore concentration pairs. Because no prior SFDI model could directly generate R_d-chromophore concentration pairs, we designed a data generation strategy using a combination of the "white" Monte Carlo model and Beer's law. The "white" Monte Carlo model is a widely used SFDI forward model that maps optical properties to the corresponding diffuse reflectance at different spatial frequencies [21]. For the training data generation, a wide range of tissue optical properties was fed into the Monte Carlo model, and the corresponding diffuse reflectance at the desired spatial frequencies was generated.
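To make the idea of a cross-layer (skip) connection concrete, the following NumPy forward pass shows a generic fully connected network with one additive skip. It is only a sketch under stated assumptions: the hidden width, the number of layers, the ReLU activation, and the random weights are all hypothetical and do not reproduce the architecture in Fig. 2(a). Only the input size (10 R_d values) and output size (two chromophore concentrations) follow the text.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def init_dense(rng, n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    weights = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
    return weights, np.zeros(n_out)


def forward(rd, params):
    """Map R_d vectors (N, 10) to two outputs (N, 2) with one additive skip."""
    (w_in, b_in), (w1, b1), (w2, b2), (w_out, b_out) = params
    h = relu(rd @ w_in + b_in)      # input projection
    r = relu(h @ w1 + b1)           # residual branch, first layer
    r = r @ w2 + b2                 # residual branch, second layer
    h = relu(h + r)                 # cross-layer addition (skip connection)
    return h @ w_out + b_out        # linear output: [HbO2, HHb]


rng = np.random.default_rng(0)
HIDDEN = 16  # hypothetical width, not taken from the paper
params = [
    init_dense(rng, 10, HIDDEN),
    init_dense(rng, HIDDEN, HIDDEN),
    init_dense(rng, HIDDEN, HIDDEN),
    init_dense(rng, HIDDEN, 2),
]
rd_batch = rng.uniform(0.0, 1.0, size=(5, 10))  # 5 pixels, 10 R_d values each
prediction = forward(rd_batch, params)          # untrained, shape (5, 2)
```

Since every pixel is an independent row of the input matrix, a full image can be processed as one batched matrix product, which is one reason a direct mapping of this kind is fast for large arrays.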
An optical absorption range of [0.001, 0.15] mm⁻¹ with 0.002 mm⁻¹ increments and a reduced scattering range of [0.51, 2.0] mm⁻¹ with 0.04 mm⁻¹ increments were used in the training data generation. These ranges were chosen to cover the typical optical properties of tissue at the near-infrared wavelengths utilized in the system. In order to generate the R_d-chromophore pairs, absorption values for 685 nm and 851 nm were sampled in a full permutation manner from the above absorption dataset. The corresponding chromophore concentrations were calculated with Beer's law, resulting in paired R_d-chromophore data for training. The generated pairs were filtered based on a wide physiological range of chromophore concentrations in the literature ([5, 250] µM for oxy-hemoglobin and [5, 100] µM for deoxy-hemoglobin) [8,12,27]. The generation of the training data took approximately 13 seconds on the desktop and produced 1,088,776 R_d-chromophore pairs.
The DRN training was performed using the MATLAB neural network toolbox and TensorFlow with Keras as a high-level application programming interface (API). Hyperparameters, including the number of layers and the number of neurons, were tuned in Keras using Adam optimization with an initial learning rate of 0.001 and a batch size of 128 [28]. The mean squared error was used as the loss function. The training was completed after 1000 epochs, which took approximately 1 h. The trained model was implemented as a MATLAB function to facilitate speed comparisons to prior methods. The DRN model is publicly available for download in Code 1 (Ref. [29]).
In addition to the training data, a test set of 100,000 R_d-chromophore pairs was generated. The test data generation procedure was identical to that of the training data, except that the relevant optical properties were randomly selected in the range of [0.001, 0.15] mm⁻¹ for absorption and [0.51, 2.0] mm⁻¹ for scattering.
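The Beer's law labeling step of the data generation can be sketched as follows. The extinction matrix below contains illustrative placeholder values only (they are not taken from the paper, and tabulated hemoglobin extinction coefficients should be substituted in practice), and the function and variable names are hypothetical. The "white" Monte Carlo forward model that produces the matching R_d vectors is not shown.

```python
import numpy as np

# Placeholder extinction matrix in mm^-1 per µM (illustrative values only,
# NOT from the paper). Rows: 685 nm, 851 nm. Columns: HbO2, HHb.
EXTINCTION = np.array([[6.0e-5, 5.0e-4],
                       [2.4e-4, 1.6e-4]])


def absorption_to_chromophores(mua, extinction=EXTINCTION):
    """Invert Beer's law: mua (..., 2) in mm^-1 -> [HbO2, HHb] (..., 2) in µM."""
    return np.asarray(mua, dtype=float) @ np.linalg.inv(extinction).T


# Absorption grid from the text: 0.001 to 0.15 mm^-1 in 0.002 mm^-1 steps.
mua_grid = 0.001 + 0.002 * np.arange(75)          # last value is 0.149

# Full permutation of absorption values across the two wavelengths.
mua_685, mua_851 = np.meshgrid(mua_grid, mua_grid, indexing="ij")
mua_pairs = np.stack([mua_685.ravel(), mua_851.ravel()], axis=1)

# Label each absorption pair with chromophore concentrations.
conc = absorption_to_chromophores(mua_pairs)

# Keep only pairs inside the physiological ranges quoted in the text.
keep = (
    (conc[:, 0] >= 5.0) & (conc[:, 0] <= 250.0)
    & (conc[:, 1] >= 5.0) & (conc[:, 1] <= 100.0)
)
mua_kept, conc_kept = mua_pairs[keep], conc[keep]
```

With two wavelengths and two chromophores the linear system is square, so the inversion is a single 2 × 2 matrix inverse applied to every absorption pair at once. The number of pairs that survive the filter depends on the extinction coefficients used, so the placeholder values here will not reproduce the pair count reported above.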
To check for overfitting, the trained model was applied to the test (unseen) data and the estimated chromophore concentrations were compared with the known ground truth, as shown in Fig. 2(b). The percent error was 0.5 ± 1.7% for oxy-hemoglobin estimation and 0.4 ± 1.4% for deoxy-hemoglobin. The trained DRN model was unbiased with small standard deviation errors on the new dataset, indicating no sign of overfitting and reasonable generalization capability.
[12,20]. First, we randomly generated a test set of 10,000 R_d-chromophore pairs following the procedures described above, and calculated the percent errors for chromophore extraction with both 2-fx and multi-fx. The chromophore estimation was conducted with optical property extraction followed by inversion with Beer's law. The experiment was performed under different noise levels from 0-3% on the R_d values, and the results are summarized in Table 1. The data show that multi-fx SFDI performs similarly to 2-fx SFDI for chromophore extraction in the noise-free scenario. With increasing noise, both 2-fx and multi-fx SFDI had increased errors. However, multi-fx consistently performed as well as or better than 2-fx in terms of both the average and the standard deviation of the percent error. In particular, under 3% Gaussian noise, the standard deviation of the percent error with multi-fx SFDI was reduced by approximately 35% for oxy-hemoglobin and 40% for deoxy-hemoglobin compared to 2-fx.
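The noise experiments summarized by percent errors can be sketched as below. Two assumptions are made here that the text does not spell out: the noise is applied multiplicatively (a noise level of 3% means a Gaussian with standard deviation equal to 3% of each R_d value), and the percent error is signed. The function names are hypothetical.

```python
import numpy as np


def add_relative_gaussian_noise(rd, level, rng):
    """Perturb R_d with zero-mean Gaussian noise; level=0.03 means 3% std.

    Assumption: noise is relative (multiplicative) to each R_d value.
    """
    rd = np.asarray(rd, dtype=float)
    return rd * (1.0 + level * rng.standard_normal(rd.shape))


def percent_error_stats(estimate, truth):
    """Mean and standard deviation of the signed percent error."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    err = 100.0 * (estimate - truth) / truth
    return err.mean(), err.std()


rng = np.random.default_rng(0)
rd_clean = rng.uniform(0.05, 0.9, size=(10_000, 10))   # synthetic R_d vectors
rd_noisy = {lvl: add_relative_gaussian_noise(rd_clean, lvl, rng)
            for lvl in (0.0, 0.01, 0.02, 0.03)}
```

Each noisy set would then be passed through the inversion method under test, and `percent_error_stats` applied to the estimated and ground-truth concentrations to obtain entries of the form mean ± standard deviation.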

Comparison with state-of-the-art multi-fx SFDI chromophore extraction method
The proposed DRN model was then compared with the state-of-the-art multi-fx SFDI chromophore extraction method (i.e., MLP), as well as the conventionally used iterative method, in terms of both speed and accuracy under different noise levels.
The accuracy of chromophore extraction was compared between the three methods. Specifically, a new test set of 10,000 R_d-chromophore pairs was generated as before. Different levels of unbiased (zero-mean) Gaussian noise (with standard deviations of 0%, 1%, 2%, and 3%) were introduced to the R_d values, and the chromophore concentrations were estimated by the three competing methods. The chromophore extraction accuracy was quantitatively compared for each noise level, as shown in Table 2. With increasing noise, the chromophore extraction errors increased for all three methods, and the errors of the DRN were similar to those of the iterative solver and the MLP at each noise level. Specifically, the DRN had a similar error standard deviation and a moderately larger average percent error compared to the iterative solver and the MLP. We note that the average extraction error of the DRN could be reduced with more complex network structures at the cost of slightly decreased speed. The current structure was chosen to combine significantly improved speed with accuracy comparable to previous methods. As shown in Fig. 3, the proposed DRN method was graphically compared with the iterative method and the MLP in terms of extracted oxy- and deoxy-hemoglobin concentrations under 3% Gaussian noise. The extracted chromophore concentrations from the three methods matched the known ground truth values and had comparable accuracies.
In terms of speed, the computational time cost was compared by generating hemoglobin arrays from 1 × 1 to 720 × 540 (full camera image size). The R_d-chromophore pairs were randomly generated as before. The chromophore concentrations were extracted by the MLP and DRN methods.
The iterative method was not included in the speed comparison since it would take > 10 hours to extract optical properties and chromophore concentrations for a 720 × 540 pixel FOV, making it impractical for real-time use [12]. The experiments were repeated 10 times to reduce random effects, and the time costs were recorded. The computations were conducted using MATLAB on a desktop computer with an Intel i7-8700 3.2 GHz CPU and 16 GB RAM. The resulting speed performance is shown in Table 3. For a single data point and a small array of 10 × 10, the DRN was 3x faster than the MLP. For an array size of 30 × 30, the DRN was 7x faster than the MLP. When the array size increased to 50 × 50, the DRN was 10x faster than the MLP. For larger array sizes from 100 × 100 to 720 × 540, the speed improvement of the DRN remained approximately 10x. Overall, the data show that the speed improvement of the DRN is smaller for small arrays (from 1 × 1 to 30 × 30), while for larger arrays (from 50 × 50 to 720 × 540) the DRN achieved a speed improvement of an order of magnitude. For the 100 × 100 R_d maps (i.e., 10,000 inversions from R_d vector to oxy- and deoxy-hemoglobin concentrations), the MLP took 8.4 ms on average. In contrast, the proposed DRN method took 0.8 ms, which is approximately 10x faster than the MLP. Furthermore, for a full-sized image of 720 × 540 pixels, the MLP took over 490 ms, while the DRN took only approximately 48 ms. In both cases, the proposed DRN was about an order of magnitude faster than the state-of-the-art MLP method.
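The speed-up factors quoted above follow directly from the reported timings, as the short calculation below shows. The timing values are the ones stated in the text (the 490 ms and 48 ms figures are approximate), and the dictionary layout and variable names are just one hypothetical way to organize them.

```python
# Reported average inversion times in milliseconds (from the text).
timings_ms = {
    "100x100": {"mlp": 8.4, "drn": 0.8},
    "720x540": {"mlp": 490.0, "drn": 48.0},   # approximate values
}

# Speed-up of the DRN relative to the MLP for each array size.
speedups = {size: t["mlp"] / t["drn"] for size, t in timings_ms.items()}

# Per-pixel DRN cost for a full camera frame, in nanoseconds.
pixels_full = 720 * 540
drn_ns_per_pixel = timings_ms["720x540"]["drn"] * 1e6 / pixels_full

for size, factor in speedups.items():
    print(f"{size}: DRN is {factor:.1f}x faster than MLP")
print(f"Full frame: {pixels_full} pixels, {drn_ns_per_pixel:.0f} ns per pixel")
```

Both ratios come out slightly above 10, consistent with the order-of-magnitude improvement stated for large arrays.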

In vivo real-time chromophore mapping with improved frame rate
The proposed DRN model for direct chromophore mapping was integrated with the multi-fx SFDI system for in vivo chromophore measurements. Molar concentrations of oxy-hemoglobin and deoxy-hemoglobin from the dorsal side of a normal volunteer's moving hand were measured in a wide-field, non-contact manner. The measurement was performed in accordance with an institutionally approved protocol. For proof-of-concept, the multi-fx SFDI measurements were conducted with [0, 0.05, 0.1, 0.15, 0.2] mm⁻¹ spatial frequencies and three phases (0°, 120°, and 240°) for each nonzero spatial frequency, while 0 mm⁻¹ was measured with planar illumination followed by a dark measurement to account for ambient light. The measurements were repeated every 0.2 s for 24 s. The image acquisition, demodulation, calibration, chromophore extraction for oxy- and deoxy-hemoglobin, and visualization were performed in real time. The imaging field-of-view (FOV) was approximately 5 × 4 cm, and the camera was controlled by an external trigger to synchronize with the DMD and light source. The measurement wavelengths were 685 nm and 851 nm. The subject's hand moved freely upward and downward, and a 250 × 300 pixel region of interest (ROI) at the center of the FOV was used to demonstrate the measured real-time changes of average oxy- and deoxy-hemoglobin concentrations. A video captured during the measurements demonstrating the real-time chromophore mapping is shown in Visualization 1. The final frame of the video is shown in Fig. 4. Figures 4(a)-4(c) show the extracted oxy-hemoglobin concentration, deoxy-hemoglobin concentration, and oxygen saturation maps, respectively. Figure 4(d) shows the changes of average hemoglobin concentrations in the ROI (dashed red box) induced by the hand movement.
Overall, the proposed DRN model enabled a real-time frame rate of 5 Hz for multi-fx SFDI wide-field chromophore mapping, a 25-fold improvement over the state-of-the-art 0.2 Hz of the MLP [12].

Frame rate analysis for real-time chromophore extraction
The time cost of the real-time chromophore extraction was analyzed post-measurement, including data import, demodulation, calibration, DRN inversion, visualization, and wait time. Data corresponding to 120 frames (i.e., 24 s) of the video were used for the analysis. The average and standard deviation of the time cost for each part were calculated, as shown in Fig. 5. In the experiment, the total time cost from image import to the visualization of extracted chromophore maps was approximately 176 ms per frame, leading to a ∼5.6 Hz upper limit for the frame rate in the current setup. In the experimental demonstration, the target frame rate was set to 5 Hz. Therefore, a 24 ms wait time was added in the program to wait for the images of the next frame to be streamed to the desktop hard drive by the camera in real time. In addition, approximately 46% of the time for each frame was spent on data import. There are therefore a few ways to potentially reduce the time cost and further increase the real-time monitoring frame rate. For example, one can assign part of the computer RAM for image storage, which would presumably give faster data transfer. Alternatively, one can use two CPU threads in parallel, one for reading streamed data and one for image processing. We discuss a few other directions to further enhance this work in the next section.
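The frame-rate budget described above reduces to a few lines of arithmetic. The numbers are those reported in the text, and the variable names are hypothetical.

```python
processing_ms = 176.0      # import through visualization, per frame (reported)
target_rate_hz = 5.0       # frame rate used in the demonstration
duration_s = 24.0          # length of the recorded measurement

max_rate_hz = 1000.0 / processing_ms          # upper limit without any waiting
frame_period_ms = 1000.0 / target_rate_hz     # time available per frame at 5 Hz
wait_ms = frame_period_ms - processing_ms     # idle time added per frame
n_frames = int(round(duration_s * target_rate_hz))

print(f"Upper limit: {max_rate_hz:.2f} Hz")
print(f"Wait time per frame at {target_rate_hz:.0f} Hz: {wait_ms:.0f} ms")
print(f"Frames in {duration_s:.0f} s: {n_frames}")
```

The same calculation indicates how much any reduction in per-frame processing time, for example from faster data import, would raise the achievable frame rate.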

Discussion and conclusions
In this work, we proposed a deep residual network (DRN) model for direct chromophore mapping from diffuse reflectance, without extracting optical absorption or solving linear equations. Compared to the state-of-the-art method (i.e., MLP), the proposed method extracts chromophore concentrations 10x faster. The proposed method also enabled 5 Hz real-time oxygenation monitoring with multi-fx SFDI, which was 25x faster than what was previously achieved with the MLP. The speed improvement with the DRN was accomplished by replacing the optical absorption extraction and linear equation inversion with a direct mapping operation. In this work, real-time in vivo chromophore extraction with two wavelengths was demonstrated for proof-of-concept and head-to-head comparison with the state-of-the-art method. While a 10x improvement was achieved for the mapping from R_d values to chromophore concentrations, a 25x improvement was observed for the real-time oxygenation monitoring with multi-fx SFDI. It is important to note that the remaining 2.5x factor was due to hardware improvement. Specifically, in the previous work by Zhao et al., a commercial SFDI system was used, where the data acquisition and transfer to the hard drive took a few seconds [12]. In this work, to demonstrate the improved speed of chromophore extraction, we built a custom SFDI system that employed an Arduino board to control and synchronize the camera, DMD, and LEDs with TTL triggers. Combined with the proposed DRN, this significantly improved the real-time oxygenation monitoring rate with multi-fx SFDI. In addition, it is important to note that while the DRN enables a 5 Hz processing rate as demonstrated in this work, 5 Hz multi-fx SFDI also requires correspondingly fast real-time data acquisition.
Furthermore, while multi-fx SFDI is more accurate than the single-phase method (i.e., SSOP), the single-phase method is intrinsically faster in data acquisition, and recent advances in SSOP and related works have made it capable of fast measurement and real-time use [30-34].
Going forward, there are a few directions that can further improve and expand the capability of this new DRN inversion algorithm. For example, the DRN model can be incorporated with the SFDI system for real-time clinical monitoring applications. The accuracy of the model can potentially be improved by using more densely sampled data during training. The DRN framework can also be adopted for other wavelength pairs and spatial frequency combinations, as well as for the extraction of other chromophores such as water and lipids [35]. Additionally, the speed of the DRN inversion can be improved with GPU-based computations in which pixels are processed in parallel. Furthermore, real-time processing with other programming languages such as C and C++ may be faster than the Matlab program used in this work. Finally, implementation of the SFDI processing pipeline onboard (e.g., with camera FPGA) may further enhance the real-time monitoring frame rate.
In summary, this work introduced a deep residual network for direct mapping from diffuse reflectance to chromophore concentrations for multi-fx SFDI, which provides over 10x improvement in inversion speed while retaining the substantially reduced measurement uncertainties of multi-fx SFDI relative to commonly used 2-fx SFDI. This method also enabled real-time in vivo chromophore mapping with a 25x frame rate improvement compared to the state-of-the-art.