Tissue imaging depth limit of stimulated Raman scattering microscopy

Stimulated Raman scattering (SRS) microscopy is a promising technique for studying tissue structure, physiology, and function. Similar to other nonlinear optical imaging techniques, SRS is severely limited in imaging depth due to the turbidity and heterogeneity of tissue, regardless of whether imaging is performed in the transmissive or epi mode. While this challenge is well known, important imaging parameters (namely maximum imaging depth and imaging signal-to-noise ratio) have rarely been reported in the literature. It is also important to compare epi mode and transmissive mode imaging to determine the best geometry for many tissue imaging applications. In this manuscript we report the achievable signal sizes and imaging depths using a simultaneous epi/transmissive imaging approach in four different murine tissues: brain, lung, kidney, and liver. For all four cases we report maximum signal sizes, scattering lengths, and achievable imaging depths as a function of tissue type and sample thickness. We report that for murine brain samples thinner than 2 mm transmissive imaging provides better results, while samples 2 mm and thicker are best imaged with epi imaging. We also demonstrate the use of a CNN-based denoising algorithm to yield a 40 µm (24%) increase in achievable imaging depth.

coherent Raman microscopy: stimulated Raman scattering (SRS) microscopy and coherent anti-Stokes Raman scattering (CARS) microscopy. Both experiments utilize ultrafast laser pulses to coherently excite vibrational transitions in target molecules. Though their origins of signal and detection methods differ, both experiments typically operate in either transmission mode (for thin samples) or epi mode (for thick samples).
Unfortunately, there have been few reports to date on the penetration depth limit of coherent Raman microscopy. Studies that explicitly report the imaging depths achieved by coherent Raman microscopy in tissue suggest a limit of roughly 20 µm to 100 µm [16-20], with differences in tissue type accounting for the large variance. Additionally, there have been no reports (to our knowledge) directly comparing the performance of epi and transmissive coherent Raman imaging side by side. Transmissive imaging is often advantageous for optically thin samples, and epi imaging is the default for optically thick samples (typically live animals or whole excised tissue). For samples of intermediate optical thickness (for example, sectioned tissues) there is no clear understanding of which imaging modality will provide the best results. A better understanding of these fundamental imaging limits is crucial to the continued development of coherent Raman techniques as they gain popularity in the biomedical imaging space.
Here we report a comprehensive study of epi and transmissive SRS signal size and imaging depth using a variety of ex vivo murine tissues (brain, lung, liver, and kidney). We begin by characterizing SRS signal size and imaging depth through simultaneous epi and transmissive imaging of murine brain samples of varying thicknesses. As one would expect, epi (transmissive) SRS signal size and imaging depth increase (decrease) as tissue thickness grows, reflecting an increase (reduction) in backscattered (transmitted) photons. When tissue thickness reaches 2 mm, epi and transmissive signal sizes are roughly equivalent, and epi images provide slightly higher penetration depths than transmissive images. Using this method, we further characterized three additional tissue types (kidney, lung, and liver), each of which displays different scattering character reflecting its distinct structure and chemical composition. Finally, we applied a recently developed convolutional neural network (CNN) based denoising algorithm to SRS images and demonstrated the ability to resolve structural features at depths exceeding 210 µm in epi imaging of brain, representing a 40 µm increase in imaging depth. We hope that this work will serve as a useful benchmark for the growing number of experimentalists entering the SRS microscopy field and a useful source to consult when considering how best to image a given sample.

Sample preparation
Murine tissue was harvested from recently sacrificed animals provided by UW Animal Use Training Services (AUTS) according to IACUC protocol 3388-03. After excision, samples of varying thicknesses were cut using razor blades with pre-measured spacers (250 µm, 500 µm, 1 mm, and 2 mm) to ensure accurate and uniform sample thicknesses. In the case of brain imaging, samples were prepared and imaged at all four thicknesses. Liver, lung, and kidney samples were all imaged at 1 mm thicknesses.

SRS imaging
SRS images were collected using the homebuilt SRS microscope shown in Fig. 1. The pump (800 nm) and Stokes (1040 nm) pulses are provided by the tunable and static outputs, respectively, of a dual output ultrafast oscillator (Insight DeepSee +, SpectraPhysics). The Stokes pulse is directed through an electro-optic modulator, which modulates the 80 MHz output pulse train at 20 MHz and provides the reference frequency used for lock-in detection. After modulation the Stokes pulse is sent onto a delay stage (used to control the temporal delay between the pump and Stokes pulses) and then to a grating stretcher used to impart linear chirp [21]. The delay stage position sets the vibrational frequency at which images are collected, and it was fixed at the temporal delay corresponding to 2920 cm⁻¹ for the duration of these experiments. The pump pulse is directed through 60 cm of H-ZF52A glass to provide similar chirp. The pump and Stokes pulses are recombined at a dichroic mirror and routed to a set of galvanometer mirrors and finally into the back aperture of a microscope objective (Olympus XLPLN25XWMP2). Transmitted probe light is collected by a condenser lens and directed to a photodiode for lock-in detection. In the epi direction, backscattered and depolarized light is recollected by the focusing objective and isolated using a polarizing beam splitter. After filtering out residual Stokes light, backscattered pump photons are sent to a separate photodiode for lock-in detection. Transmissive and epi images were collected simultaneously for all samples and fields of view. SRS images were collected using 40 mW average power in both the pump and Stokes pulse trains. Each image samples a 285 µm × 285 µm field of view. All images were collected with an acquisition time of 4 s.

Deep learning training and denoising
Deep learning denoising was performed as reported previously [22]. Briefly, SRS images of murine brain samples were acquired at a depth of ∼20 µm at low power (40 mW pump, 2 mW Stokes) and high power (40 mW pump, 40 mW Stokes). A U-Net deep learning architecture (publicly available code originally developed by Ounkomol et al. [23] and optimized for this application) was used to train a denoising algorithm that takes the low power (low SNR) images as input and predicts a corresponding high power (higher SNR) image of the same field of view, similar to previously reported methods [24,25]. We then used the trained algorithm to denoise images taken deep in the murine brain at high power, where SNR is low due to scattering and absorption of light in tissue.
The algorithm used here was supplied with 40 fields of view at both low and high power corresponding to signal and truth respectively. The 40 fields of view were randomly split into 10/30 test/train pairs. The algorithm was then trained over the course of 50,000 epochs using a learning rate of 0.001 with an Adam optimizer, momentum values of 0.5 and 0.999, and a batch size of 30 images. Following training, the algorithm was then fed the fields of view acquired deep in the brain for denoising. Training and denoising were performed on the University of Washington Hyak Mox supercomputer equipped with an Nvidia P100 graphics processing unit.
The training session lasted ∼10 hours, while denoising of the test and deep images took ∼0.1 seconds per field of view.
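As an illustration of the data split and optimizer settings described above, the following minimal Python sketch randomly divides 40 field-of-view indices into 10 test and 30 training pairs and collects the quoted hyperparameters in one place. The function name, dictionary keys, and random seed are hypothetical and are not taken from the original training code; the two "momentum values" are assumed to correspond to the two beta coefficients of the Adam optimizer.

```python
import numpy as np

# Hyperparameters quoted in the text; the key names are illustrative only.
TRAINING_CONFIG = {
    "learning_rate": 1e-3,
    "adam_betas": (0.5, 0.999),  # assumed mapping of the two "momentum values"
    "batch_size": 30,
    "epochs": 50_000,
}


def split_fields_of_view(n_fov=40, n_test=10, seed=0):
    """Randomly assign field-of-view indices to test and train sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_fov)
    # The first n_test shuffled indices form the test set, the rest the train set.
    return np.sort(order[:n_test]), np.sort(order[n_test:])


test_idx, train_idx = split_fields_of_view()
```

Each index would refer to one low power/high power image pair of the same field of view, so that the input image and its target never end up in different sets.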

Effects of tissue thickness on epi and transmissive SRS signal sizes and imaging depths
As mentioned above, murine brain tissue samples were cut to varying thicknesses (250 µm, 500 µm, 1 mm, and 2 mm) and simultaneously imaged in transmissive and epi geometries. Throughout each experiment, images were collected down to 240 µm at 5 µm intervals. The DC (unmodulated) and AC (modulated) portions of the epi and transmissive signal were recorded as a function of depth. Each measurement was performed in triplicate (on different fields of view) for each tissue thickness. Samples were prepared to enable imaging of areas with similar structural features and chemical composition across multiple samples. All brain data discussed in this manuscript refers to images collected of the cortex. Transmissive DC signal size (in the form of absolute photocurrent detected) as a function of depth and tissue thickness is shown in Fig. 2(A). As one may expect, this signal represents total pump photons detected and the signal size is highest for the thinnest sample (250 µm, black) and decreases with thickness. Depth-dependent epi DC signal sizes, which increase as a function of tissue thickness, are shown in Fig. 2(B). The trends observed agree with what one might expect from thicker tissues enabling more scattering. Differences in the magnitude of the DC signal between transmissive and epi imaging largely reflect the differing collection efficiencies of the two imaging modalities.
Maximum recorded AC (SRS) signal sizes follow the same trends as the DC signal sizes for both transmissive and epi imaging. As expected, thinner samples (< 1 mm) yield significantly higher signals in transmissive mode than in epi mode. When samples reach 2 mm thick, however, epi and transmissive signal sizes are functionally identical. Maximum DC and SRS signal sizes in both epi and transmissive mode are compiled below in Table 1, along with the calculated modulation depths (the ratio of AC to DC signal) recorded using each imaging modality across all samples. In principle, the modulation depths recorded using epi and transmissive images should be identical. However, here we report roughly a factor of two shallower modulation depth for epi images in thin samples than for those recorded in a transmissive geometry. This is possibly because, in thin tissue, a larger fraction of the detected photons are reflected from the cover slide or tissue surface. Figures 2(C) and 2(D) show the recovered transmissive and epi SRS signal sizes as a function of depth, respectively. Unlike the DC signals, SRS signal decreases exponentially as a function of depth. The exponential decay of each curve can be fit to Eq. (1) to determine tissue scattering length:

$$S(z) = A\,e^{-z/L_s} \qquad (1)$$

where $S(z)$ is the SRS signal at depth $z$, $A$ is the maximum signal intensity, $z$ is the depth beneath the surface, and $L_s$ is the effective scattering length [26]. Scattering lengths recovered from the 250 µm, 500 µm, 1 mm, and 2 mm thick samples using transmissive (epi) imaging are 119.3 ± 5.4 µm (104.2 ± 1.4 µm), 110.3 ± 1.5 µm (101.9 ± 1.2 µm), 105.0 ± 1.1 µm (95.7 ± 1.1 µm), and 91.4 ± 1.9 µm (83.8 ± 1.1 µm), respectively. We note that the measured scattering lengths generally match the lower end of the previously reported range of 90 µm to 120 µm [27,28]. One likely contribution to the low values measured here is that, unlike two-photon fluorescence, SRS uses two different wavelengths, which makes it more susceptible to aberration-induced (both chromatic and spherical) signal degradation. This additional signal decrease with depth manifests as a shorter scattering length. Detailed comparison with the literature is further complicated by the dependence of scattering length on animal age and sample preparation.
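The scattering-length fit described above can be sketched in a few lines of Python. This example assumes that Eq. (1) is a single exponential of the form A·exp(−z/L_s) and recovers the parameters with a straight-line least-squares fit to the logarithm of the signal. That is a simple stand-in and not necessarily the fitting routine used for the reported values. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def fit_scattering_length(depth_um, signal):
    """Fit signal = A * exp(-depth / L_s) via a linear fit to log(signal)."""
    depth_um = np.asarray(depth_um, dtype=float)
    signal = np.asarray(signal, dtype=float)
    keep = signal > 0  # the logarithm is only defined for positive values
    slope, intercept = np.polyfit(depth_um[keep], np.log(signal[keep]), 1)
    return np.exp(intercept), -1.0 / slope  # (A, L_s in micrometers)


# Synthetic, noise-free decay sampled every 5 µm down to 240 µm.
# The amplitude and scattering length are made-up illustration values.
depths = np.arange(0, 245, 5)
synthetic = 2.0 * np.exp(-depths / 100.0)
amplitude, scattering_length = fit_scattering_length(depths, synthetic)
```

On real data a log-linear fit weights the noisy deep frames heavily, so restricting the fit to depths where the signal is well above the noise floor, or using a weighted nonlinear fit, may be preferable.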

Imaging depth limit of SRS in murine brain
While direct signal size measurements and comparisons are useful from a benchmarking point of view, the figures of merit in many biological imaging experiments are signal-to-noise ratio (SNR) and effective imaging depth. For SNR measurements, the signal value used was calculated as the mean pixel value of the field of view at a given depth. We chose this approach, as opposed to using the brightest features in an image to determine signal size, in an attempt to quantify the depth at which meaningful structural information could be observed. The noise value used in the calculation was the standard deviation of the deepest frame of each image stack. This frame was chosen because the negligible amount of SRS signal present at extreme depths suggests that its standard deviation is dominated by shot noise rather than by heterogeneities in signal magnitude across the field of view, which are visible in images taken at shallower depths. Figure 3 shows plots of epi and transmissive image SNR as a function of depth for four different tissue thicknesses. Maximum SNR for 250 µm thick tissue is 10 times higher in transmissive imaging than in epi imaging. The achievable imaging depths for the transmissive and epi modalities (defined as the depth at which SNR first falls below 2.0, shown by black dashed lines in Fig. 3) are 205 µm and 110 µm, respectively. The difference becomes much smaller as tissue thickness increases. In agreement with the observed trends for both transmissive AC and DC signal size, maximum SNR and imaging depth decrease with tissue thickness for transmissive measurements. The maximum transmissive SNR for 2 mm thick tissue is 4 times lower than that for 250 µm thick tissue. The SNR and imaging depth for the epi geometry likewise follow the trends seen for epi AC and DC signal sizes, in that they increase as a function of tissue thickness. The 2 mm thick tissue yields a maximum epi SNR that is 3 times higher than that of 250 µm thick tissue.
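The SNR and imaging-depth definitions above translate directly into a short calculation. The Python sketch below is only an illustration under stated assumptions: the image stack is an array ordered from the shallowest to the deepest frame, and the function names and the tiny synthetic stack are hypothetical.

```python
import numpy as np


def snr_vs_depth(stack):
    """SNR per frame for a stack of shape (n_depths, height, width).

    Signal is the mean pixel value of each frame. Noise is the standard
    deviation of the deepest (last) frame, as described in the text.
    """
    stack = np.asarray(stack, dtype=float)
    noise = stack[-1].std()
    signal = stack.mean(axis=(1, 2))
    return signal / noise


def imaging_depth(depths_um, snr, threshold=2.0):
    """Return the first depth at which SNR falls below threshold, else None."""
    below = np.flatnonzero(np.asarray(snr) < threshold)
    if below.size == 0:
        return None
    return float(np.asarray(depths_um)[below[0]])


# Tiny synthetic stack: a zero-mean pattern with unit standard deviation
# added to a different mean level at each depth (illustration values only).
pattern = np.array([[-1.0, 1.0], [1.0, -1.0]])
levels = np.array([10.0, 5.0, 2.5, 1.5])
example_depths = np.array([0.0, 50.0, 100.0, 150.0])
example_stack = levels[:, None, None] + pattern
example_snr = snr_vs_depth(example_stack)
example_limit = imaging_depth(example_depths, example_snr)
```

Because the noise estimate comes from a single frame, this definition implicitly assumes that the deepest frame contains essentially no SRS contrast.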
Peak SNR and maximum imaging depth values for all four tissue thicknesses are compiled in Table 2. In the case of thinner samples (250 µm to 1 mm) it is unsurprising that peak SNR is so much lower in epi images than in transmission images. Since SRS microscopy is in general a shot-noise limited experiment, SNR will scale as the square root of the average power of the detected pump pulse [29]. Based solely on the differences between transmission and epi DC signal sizes, thinner tissues should see a factor of 3.5-5 lower SNR in epi imaging. Here we report peak epi SNRs that are a factor of 3.7-9.7 lower than their transmission counterparts. This outsized loss of SNR in epi images can be traced back to the differences between epi and transmissive SRS signal modulation depths reported in Table 1.
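The square-root argument above can be written out explicitly. The sketch below assumes an idealized shot-noise-limited detector and uses symbols that do not appear in the original text: $I$ for the detected DC pump signal, $m$ for the modulation depth (the AC/DC ratio), and the subscripts T and E for transmissive and epi detection.

```latex
\begin{align}
\mathrm{SNR} &\propto \frac{m\,I}{\sqrt{I}} = m\sqrt{I}, \\
\frac{\mathrm{SNR}_T}{\mathrm{SNR}_E} &= \frac{m_T}{m_E}\sqrt{\frac{I_T}{I_E}} .
\end{align}
```

With equal modulation depths, a predicted SNR ratio of 3.5-5 corresponds to a DC signal ratio of roughly 12-25. If the epi modulation depth is about half the transmissive value, as reported for thin samples, the predicted ratio doubles to roughly 7-10, which is close to the upper end of the measured 3.7-9.7 range.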
At a thickness of 2 mm, however, epi imaging yields a higher peak SNR (41.75 compared to 34.8) and a slightly deeper imaging depth (135 µm compared to 130 µm) than transmissive imaging. This reversal of SNR suggests that for brain samples thicker than 2 mm, it is advantageous to use epi imaging. We expect that imaging SNR and penetration depth will further increase in intact brain due to even higher numbers of backscattered photons.

Epi and transmissive imaging through different murine tissues
Following analysis of epi and transmissive SRS signals in murine brain tissue, we sought to gain a better understanding of SRS imaging in additional types of murine tissue. To that end, simultaneous transmissive and epi imaging experiments were conducted on 1 mm thick slices of murine lung, liver, and kidney tissue. Raw SRS and DC signal sizes in epi and transmissive modes for each tissue sample are shown in Figs. 4(A) and 4(B), respectively. Compared with the measurements conducted on 1 mm thick murine brain tissue slices, kidney and liver tissue yield rather similar transmissive DC signals but roughly half the epi DC signal. Lung tissue, on the other hand, yields a transmissive DC signal ∼60% lower and an epi DC signal four to eight times higher than all other studied tissues, suggesting that lung tissue is a much stronger scatterer than brain, kidney, or liver tissue. SRS signal sizes as a function of depth for each of the three tissue types during transmissive and epi imaging are shown in Figs. 4(C) and 4(D), respectively. To compare the scattering properties of each tissue type, the signal decay curves were fit with Eq. (1) to determine the scattering length of photons in each tissue type. The results of the fittings are shown in Table 3. SNRs as a function of depth were also calculated for each of the three tissue types. The imaging depths and peak SNRs are shown in Figs. 4(C) and 4(D). For both kidney and liver tissues, transmissive imaging provided a higher peak SNR than epi imaging, in agreement with the trends we report above for brain tissue. Transmissive imaging also achieved a significantly greater imaging depth than epi imaging for kidney and liver tissues. In the case of lung tissue, however, epi imaging yielded a higher SNR and comparable imaging depth.

Deep learning to enhance imaging depth
After comparing the epi and transmission SRS imaging modalities across a variety of murine tissues, we decided to explore possible avenues to increase the potential imaging depth of SRS microscopy. To that end, we employed our previously reported CNN-based denoising technique to determine whether its denoising capabilities would translate into deeper achievable imaging depths. Figure 5 shows the application of a machine learning denoising algorithm to epi-collected SRS images at various depths. Figures 5(A)-5(D) show data from an epi imaging experiment on a 2 mm thick sample of murine brain tissue at depths of 15 µm, 100 µm, 170 µm, and 210 µm, respectively. The corresponding denoised images are shown in Figs. 5(E)-5(H). In Fig. 5(D) some distinct features such as nuclei and axons are visible, but the quality of the image as a whole is quite low. By comparison, Fig. 5(H) shows significantly more identifiable features and a higher quality image.
With respect to quantitative measurements of quality at these depths, the image shown in Fig. 5(C) has an SNR of 2.1, which means it represents the maximum imaging depth based on the criterion we outline above. For the denoised images, however, SNR as defined above is not a suitable metric for comparing image quality, as there is no field void of signal features post-denoising. Other commonly used metrics such as peak SNR, root mean squared error, Pearson's correlation coefficient, and structural similarity index are also not useful here, as a reliable truth image is not available for comparison at this imaging depth. As such, the signal-to-background ratio (SBR) is calculated for each image and compared. SBR was chosen as a comparison metric over SNR because the SNR of images run through the CNN denoising algorithm was found to be largely independent of depth. We suspect this is caused by the algorithm adjusting the average value of the input images to better match the images used in its training, resulting in all images having very similar SNRs after denoising, regardless of the SNR of the input image. To calculate SBR, an area of low signal and few features is selected for both the noisy and denoised image (for example, the dark area in the bottom center of Figs. 5(C)-5(D) and 5(G)-5(H)). The average pixel value of this area is taken to be the background. Then 6 lines spanning the field of view are selected in the image (3 horizontal and 3 vertical, each set of 3 equally spaced from one another). The peak pixel value from each of these lines is taken as signal plus background. The background value is subtracted from the peak value and the result is divided by the background value to give the SBR for that line. The SBR for the image is the average value from the 6 lines sampled on the image. We have chosen to use an area average for the background because noisy images (Fig. 6, shown in the Appendix) have large variance along a sampled line, and using the minimum pixel value along such a line as the "background" will often grossly inflate SBR values for an image.
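The SBR procedure above can be sketched in Python as follows. This is a minimal illustration and not the analysis code used for the reported values. The function name, the way the background region is passed in, and the exact placement of the equally spaced lines (interior rows and columns that divide the image evenly) are assumptions.

```python
import numpy as np


def signal_to_background(image, background_region, n_lines=3):
    """SBR averaged over n_lines horizontal and n_lines vertical profiles.

    background_region is a (row_slice, col_slice) tuple marking a dark,
    feature-poor area whose mean pixel value is used as the background.
    """
    image = np.asarray(image, dtype=float)
    background = image[background_region].mean()
    n_rows, n_cols = image.shape
    # Equally spaced interior rows and columns (assumed line placement).
    rows = np.linspace(0, n_rows - 1, n_lines + 2)[1:-1].round().astype(int)
    cols = np.linspace(0, n_cols - 1, n_lines + 2)[1:-1].round().astype(int)
    # The peak along each line is taken as signal plus background.
    peaks = [image[r, :].max() for r in rows] + [image[:, c].max() for c in cols]
    return float(np.mean([(p - background) / background for p in peaks]))


# Synthetic 9 x 9 image with a flat background of 1 and three bright pixels
# (illustration values only).
example = np.ones((9, 9))
example[2, 5] = 4.0
example[4, 4] = 3.0
example[6, 2] = 5.0
example_sbr = signal_to_background(example, (slice(0, 2), slice(0, 2)))
```

Using an area mean for the background, rather than the minimum along each line, keeps a single dark noisy pixel from inflating the ratio.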
Using the outlined method for calculating the SBR, Figs. 5(C)-5(D) exhibit SBRs of 2.6 and 2.3, respectively. After denoising, the same fields of view shown in Figs. 5(G)-5(H) exhibit SBRs of 7.2 and 6.4, respectively. Thus, the deep learning denoising used here improves the SBR of images at the defined limit of image quality by a factor of over 2.5. These data indicate that deep learning provides a promising avenue for extending the depth limitations of SRS microscopy. Line plots of the images used in this calculation are provided in the Appendix.

Discussion
SRS microscopy is a powerful technique for the characterization of biological samples. When a transmissive collection is not possible, or the sample is thick enough that transmissive imaging results in minimal signal collection, an epi imaging geometry offers a viable alternative. The relative performances of epi and transmissive imaging are, as we demonstrate above, dependent on many parameters including tissue type and tissue thickness.
In the case of murine brain tissue, we report that tissue samples under 2 mm are most effectively imaged using a transmissive imaging geometry. Samples 2 mm or greater in thickness yield the best results in terms of peak SNR and imaging depth when imaged in an epi geometry. The threshold for which tissue thickness yields higher SNR in epi mode is strongly dependent on tissue scattering length. For example, lung tissue (which exhibited the shortest scattering length of all the interrogated tissues) yielded significantly better images in an epi geometry at a thickness of just 1 mm. We also report significant variation in absolute signal magnitude and achievable imaging depth between different tissue types, and even between samples of the same tissue type, demonstrating the role tissue heterogeneity plays in determining imaging quality. We note that while we only measured one Raman band, tissue scattering length is only a function of wavelength and tissue composition and thus is independent of which Raman band is imaged.
Finally, we demonstrated the potential utility of CNN-based denoising algorithms for achieving deeper maximum imaging depths. An algorithm developed for these experiments was able to denoise images and increase imaging depth from 170 µm to 210 µm in 2 mm thick murine brain tissue samples. This 40 µm increase in imaging depth suggests that deep learning based denoising algorithms could play an important role in biological imaging in the coming years.
There are several other avenues to improve signal size and imaging depth in coherent Raman imaging that could be explored as well. In our measurements, the maximum number of pump photons detected in the epi direction is <10% of total pump photons. Saar et al. have shown that, using an annular detector for epi imaging, up to 28% of photons can be collected [30]. In combination with polarizing beam splitter-based epi imaging, it is possible to increase the collection efficiency four-fold and the sensitivity two-fold, pushing imaging depth over 250 µm. Of course, this comes at the cost of additional experimental complexity [30]. Imaging at longer pump/Stokes wavelengths has also been shown to increase imaging depth in phantom samples due to increased scattering length; however, this approach comes at the cost of SRS signal intensity [31]. The actual benefit of using long wavelengths for tissue imaging warrants further study. Correcting for optical aberration is another approach that can push the imaging depth even deeper. A previous report shows that coherent Raman signal can be increased 6-fold in muscle tissue [16], potentially allowing a further increase of 50-100 µm or more. Tissue clearing methods can also be used to achieve deeper imaging depths [20,32], though some methods have been shown to alter the structure of the cleared tissue [33,34]. However, tissue clearing (with the exception of skull clearing [35]) is incompatible with live processes and is therefore limited to the study of fixed tissue.
In conclusion, we have conducted a comparative study of epi and transmissive imaging in various types of murine tissue. Throughout this study we have characterized epi and transmissive imaging efficacy as a function of tissue thickness and tissue type, and we report the recovered scattering lengths, maximum imaging depths, SNRs, and absolute signal magnitudes. Additionally, we have shown that CNN-based denoising algorithms can increase the maximum imaging depth in coherent Raman microscopy experiments, though further experiments are required to ascertain the quantitative utility of this approach.

Funding
National Institutes of Health (R35GM133435).