Multimodal microscopy for the simultaneous visualization of five different imaging modalities using a single light source

Abstract: Optical microscopy has been widely used in biomedical research as it provides photophysical and photochemical information of the target in subcellular spatial resolution without requiring physical contact with the specimen. To obtain a deeper understanding of biological phenomena, several efforts have been expended to combine such optical imaging modalities into a single microscope system. However, the use of multiple light sources and detectors through separated beam paths renders previous systems extremely complicated or slow for in vivo imaging. Herein, we propose a novel high-speed multimodal optical microscope system that simultaneously visualizes five different microscopic contrasts, i.e., two-photon excitation, second-harmonic generation, backscattered light, near-infrared fluorescence, and fluorescence lifetime, using a single femtosecond pulsed laser. Our proposed system can visualize five modal images with a frame rate of 3.7 fps in real-time, thereby providing complementary optical information that enhances both structural and functional contrasts. This highly photon-efficient multimodal microscope system enables various properties of biological tissues to be assessed.


Introduction
In biological research and medical diagnosis, optical imaging has been widely used owing to its high spatial resolution at the cellular, subcellular, and molecular levels and its minimally invasive nature [1][2][3]. Because cells and tissues are composed of various molecules, when these biological specimens are struck by a high-energy light source such as a laser, several different types of photophysical events occur depending on the properties of the target molecules within them. Accordingly, different types of optical imaging techniques exist to target and visualize each photophysical event [4]. Depending on the purpose of observation in biology and medicine, one or two key optical signals generated by these events are typically imaged.
Although optical microscopy can provide high spatial resolution, information on a single optical characteristic is sometimes insufficient to fully understand biomedical phenomena. Therefore, enhancing both structural and functional contrasts is essential to determine the key characteristics of microstructures in biomedical investigations [5][6][7]. In particular, incorporating multiple imaging techniques can help enhance the structural and functional contrasts of an image. Consequently, considerable effort has been devoted to integrating multiple optical imaging techniques into one system under a single objective lens for co-registered images, known as multimodal optical microscopy (MOM), to overcome the disadvantages and leverage the advantages of each imaging technique [8][9][10][11].
In many previous studies regarding MOM systems, multiple imaging modalities were developed by installing multiple light sources and detectors in a system; however, this renders the MOM system complicated and expensive [12]. In addition, owing to the use of multiple light sources, many existing multimodal microscopies require a switching method that converts the beam path from one imaging modality to another, thereby significantly decreasing the multimodal image acquisition speed [8,13]. One of the most well-known methods to implement multiple imaging modalities using a single light source is to combine multiple nonlinear optical (NLO) microscopies [9,10,14,15,16]. However, in most cases, the imaging modalities are limited to the capabilities of NLO microscopy (e.g., two-photon excitation, second-harmonic generation, or coherent anti-Stokes Raman scattering (CARS) microscopy). Even though multimodal NLO microscopy is relatively easy to implement using a single light source, the simultaneous acquisition of all modality images is difficult when more than three modalities are integrated into one system [17,18]. Catheter-based optical coherence tomography (OCT) has also been successfully coupled with fluorescence imaging modality for in vivo real-time imaging; however, the number of imaging modalities that the system can provide remains limited, and the fluorescence signals are not fully co-registered with OCT images [19].
In contrast to previous studies, we developed a system that simultaneously acquires five different imaging modalities by exploiting most emission signals generated by a single femtosecond pulsed laser. Based on the notion that a femtosecond laser beam in the near-infrared (NIR) wavelength can produce multiple emission spectra with various peak wavelengths ranging from 300 to 900 nm, we segregated the emission signals for each wavelength band and each signal was detected individually using multiple detectors. Our MOM system enables simultaneous imaging of reflectance confocal microscopy (RCM), two-photon excitation (TPE), second-harmonic generation (SHG), near-infrared fluorescence (NIRF), and fluorescence lifetime imaging microscopy (FLIM). In particular, real-time FLIM technique was crucial to incorporate a chemical imaging contrast into our system; hence, a high-speed fluorescence lifetime imaging technique developed previously by our group was integrated into the MOM system [19,20]. Conventional FLIM techniques utilizing the time-correlated single-photon counting (TCSPC) method provide extremely slow imaging speed due to the long signal acquisition time for single-photon detection and complicated post-processing time for lifetime curve reconstruction. By contrast, the analog mean-delay (AMD) FLIM method adopted in our MOM system provides real-time data acquisition and processing. More importantly, these five different imaging modalities enable the detection of structural and biochemical properties within tissues of normal and atherosclerotic plaque build-up of the rabbit abdominal aorta. Our MOM system successfully identified collagen, elastin, and a lipid-rich region in an atherosclerosis model. In particular, the detailed structure within the plaque area, such as foam cells, became more visible through further statistical analysis of FLIM images. 
Figure 1 shows an illustration that depicts the key concept of the signal generation and detection process of our single-light source MOM system. A femtosecond pulsed laser was tuned at the NIR wavelength (780 nm) and focused on the target specimen through a high numerical aperture (NA) objective lens to induce both nonlinear and linear optical processes. In nonlinear optical processes, SHG and TPE signals can be generated by non-centrosymmetric molecules and various endogenous fluorophores within the tissue, respectively (Fig. 1(a)) [21,22]. In addition to these nonlinear effects, some portion of the incident light can backscatter, thereby providing RCM signals (Fig. 1(a)). Furthermore, if the tissue contains any fluorophore that can be excited by NIR light, some other portion of the incident light can produce a single-photon NIRF signal (Fig. 1(a)). Our MOM system acquires all four different signals generated from different optical processes using each correlated imaging modality. As the central wavelengths of the optical signals differ from each other (Fig. 1(b)), we can chromatically separate them using multiple dichroic beam splitters (DBSs) and detect them individually using four sets of emission filters and detectors. The fluorescence lifetime of NIRF was measured by processing NIRF and RCM signals. The RCM signals were used as the instrument response function (IRF) when calculating the fluorescence lifetime.

Multimodal imaging system development
A femtosecond pulsed oscillator (Mira 900F, Coherent) pumped by a 10-W diode-pumped solid-state laser (DPSS, Verdi V10, Coherent) was used as the light source. A Faraday isolator (BB-05-I-800-000-900, Electro-Optic Technology) was placed in front of the laser exit to prevent the light reflected from the MOM optics from returning to the femtosecond oscillator cavity. A pair of Glan-Taylor calcite polarizers (GT5-B, Thorlabs) was used to adjust the laser output power and thereby prevent photodamage to the tissue. The output wavelength of the femtosecond laser was tuned to optimize the signal-to-noise ratio (SNR) for all five modalities. In particular, 780 nm was selected according to the excitation wavelength of SHG and NIRF signals (from ICG) [23,24]. The output power can be adjusted by changing the polarization angle of the first polarizer while keeping the polarization angle of the second polarizer the same as that of the laser output. The beam was enlarged using a beam expander and then scanned two-dimensionally using a galvanometric scanner (6230H, Cambridge Technology) and a 4-kHz resonant scanner (CRS 4K, Cambridge Technology) along the slow and fast axes, respectively. The scanners allow two-dimensional scanning with a variable field of view (from 150 × 150 µm to 600 × 600 µm) and an image size of 1024 × 1024 pixels. The double-telecentric relay lens was composed of two achromatic doublets as a scan lens and an achromatic doublet as a tube lens. The relay optics were designed to provide uniform imaging performance over the entire field of view while avoiding vignetting when the light passes through the infinity port and the internal optics of the commercial microscope (DMi8, Leica Microsystems) attached to the MOM system, and the back aperture of the objective lens (OL, XLUMPLFLN-20XW, 1.0NA, Olympus). The relay optics were assembled from achromatic doublets (Thorlabs) in a double-telecentric configuration with 2.5× magnification.
Since all five modalities in our MOM system share the same illumination beam path, the optics has been optimized to achieve target resolution and diffraction-limited performances over the broad bandwidth ranging from near-ultraviolet (UV) to near-infrared (NIR) wavelengths.
As depicted in Fig. 1(c), our system employs both descanned and non-descanned detection schemes for multimodal image acquisition. The non-descanned detection optics acquires the emission light of the TPE and SHG signals through a size-customized cold mirror (FF705-Di01, Semrock) and a collecting lens. The collecting lens optics is composed of three commercial singlets (Thorlabs). The optics were designed to provide over 98% of theoretical transmission without showing the vignetting on both channels. The TPE and SHG signals were split using a DBS (Di02-R405, Semrock), transmitted through emission filters (EMs) FF694/SP and FF01-392/23 (Semrock), and detected separately using two different photomultiplier tubes 1 (PMT1, H7422A-50) and 2 (PMT2, H7422P-40, Hamamatsu Photonics), respectively. Different types of PMTs were used for the TPE and SHG channels, considering each emission wavelength. The SHG signal was detected with H7422P-40, which has the gain spectrum located in the shorter range of the wavelength spectrum than H7422P-50 for TPE. The signals detected by the two PMTs were amplified using current-to-voltage amplifiers (C6438-01, Hamamatsu Photonics) and transferred to a two-channel frame grabber (PCI-1410, National Instruments) embedded in the computer to generate images. RCM and NIRF images were acquired through the descanned detection optics collecting the light returning through the relay optics and the two-axis scanner. The NIRF and RCM signals were first separated from the illumination beam path by using a short-pass dichroic beam-splitter (DBS, FF791-SDi01, Semrock). Since the reflection signal is much higher than the fluorescence signal, we have intentionally chosen the beam-splitter as dichroic to maximize the efficiency of the fluorescence channel and keep the signal intensity and gain of the RCM channel comparable to the NIRF channel. 
Most of the RCM signal (780 nm) will transmit through this DBS, and only a small fraction will be reflected and directed to the RCM detection optics. After passing the DBS, the beam was split by a 9:1 BS: 90% to the NIRF detector and 10% to the RCM detector. This BS again reduces the RCM signal to a level similar to that of the NIRF signal. Each signal was passed through an EM (FF01-832/37 and FF01-769/41, Semrock, respectively) and a pinhole (150 and 100 µm, respectively) to be detected by the PMTs (PMT3 and PMT4, H7826-01, Hamamatsu Photonics). In front of the RCM detection unit, a polarizer paired with a quarter-waveplate (WPQ20ME-780, Thorlabs) was placed between the scan lens and tube lens to filter out the reflected signal from optical components other than the specimen. The signals were amplified by the current-to-voltage amplifiers (C9663, Hamamatsu Photonics), and the voltage outputs were directly sampled using a two-channel digitizer (U5309A, Keysight). In the descanned detection part, the observed lateral and axial resolutions are approximately 0.9 and 2.6 µm, respectively. In the non-descanned detection part, the observed lateral and axial resolutions are approximately 0.9 and 3.4 µm, respectively (Fig. S1). The laser output power was adjusted to 20 mW to prevent thermal damage to the tissue [25]. The energy received by the tissues was less than 0.16 J/cm², which has been shown to be safe with a femtosecond laser.
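As a quick consistency check on the spectral separation described here, the short Python sketch below tests whether the 780-nm backscattered light and its second harmonic (390 nm) fall inside the respective emission-filter passbands. It assumes that each filter designation reads as "center wavelength/bandwidth" in nanometers; the text states this explicitly only for the 392/23 filter, so applying the same reading to 769/41 and 832/37 is an assumption. The function names are illustrative and not part of the system software.

```python
# Minimal sketch: check which filter passbands contain the expected signal
# wavelengths. ASSUMPTION: "CCC/WW" in a filter name means center/bandwidth
# in nm (stated in the text only for the 392/23 SHG filter).

EXCITATION_NM = 780.0
SHG_NM = EXCITATION_NM / 2.0  # second harmonic is at half the wavelength

# (center_nm, bandwidth_nm) for each detection channel's emission filter
FILTERS = {
    "SHG": (392.0, 23.0),
    "RCM": (769.0, 41.0),
    "NIRF": (832.0, 37.0),
}


def passband(center_nm, width_nm):
    """Return the (low, high) edges of an idealized rectangular passband."""
    half = width_nm / 2.0
    return center_nm - half, center_nm + half


def in_band(wavelength_nm, center_nm, width_nm):
    """True if the wavelength lies inside the idealized passband."""
    low, high = passband(center_nm, width_nm)
    return low <= wavelength_nm <= high


if __name__ == "__main__":
    for name, (center, width) in FILTERS.items():
        low, high = passband(center, width)
        print(f"{name}: {low:.1f}-{high:.1f} nm")
    print("SHG (390 nm) in SHG filter:", in_band(SHG_NM, *FILTERS["SHG"]))
    print("780 nm in RCM filter:", in_band(EXCITATION_NM, *FILTERS["RCM"]))
    print("780 nm in NIRF filter:", in_band(EXCITATION_NM, *FILTERS["NIRF"]))
```

Under this idealized model, the reflected excitation light is admitted by the RCM filter and rejected by the NIRF filter, which matches the channel assignment in the text. Real filters have finite edge steepness, so this is only a first-order check.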

Data acquisition and processing
Custom-built software was used to visualize the RCM and NIRF images acquired with a digitizer, and the TPE and SHG images acquired with a frame grabber. The RCM and NIRF signals were then used to reconstruct the FLIM images. We designed and developed multiple custom electronic boards to synchronize all five modalities to the laser pulse. The sync signal from the laser was utilized to convert the PMT signals of the TPE and SHG channels into the standard RS-170 format. This same sync signal was also used to trigger the digitizer for RCM and NIRF acquisitions. A high-speed FLIM technology called the AMD method was adapted to achieve the desired multimodal imaging speed [26][27][28]. Instead of utilizing the stochastic measurement of TCSPC, which detects single photons and hence yields a slow imaging speed, the AMD method stretches the photon signal in the time domain to improve the photon economy and increase the imaging speed. We use a digitizer with two channels to collect the reflection and fluorescence signals from the specimen. The intensity images of each signal are reconstructed into RCM and NIRF images, respectively. Then, custom processing software measures the mean temporal delay of each channel from the laser pulse sync based on the following equation [28]:

$$\tau = \tau_{fl} - \tau_{irf} = \frac{\int t\, i_{fl}(t)\, dt}{\int i_{fl}(t)\, dt} - \frac{\int t\, i_{irf}(t)\, dt}{\int i_{irf}(t)\, dt}$$

where $\tau$, $\tau_{fl}$, and $\tau_{irf}$ are the fluorescence lifetime, the mean temporal delay of the acquired fluorescence signal $i_{fl}(t)$, and the mean temporal delay of the IRF signal $i_{irf}(t)$, respectively. The IRF refers to the sum of the temporal delays introduced in the laser pulse signal by the instrument, including both the optical and electrical systems. Since the reflectance signal from the specimen was designed to pass through the same beam path as the fluorescence signal and to be delivered to the digitizer through the same electrical circuit, it can be regarded as the instrument response function of the system.
Therefore, the fluorescence lifetime (τ) can be calculated by simply subtracting the mean-temporal delay of the RCM channel from the mean-temporal delay of the NIRF channel. We also leveraged previously developed FLIM-related technologies to effectively integrate FLIM with the four other imaging modalities and enable real-time visualization. A parallel computing library known as Threading Building Blocks (Intel) was utilized to process the large amount of data (2 Gb/s) coming from the digitizer in real-time [20]. In addition, a line-to-pixel referencing technique [28] combined with an error compensation method for high-speed femtosecond lasers was used in this system [29]. During data acquisition, the images were displayed in the user interface in real-time and simultaneously saved to a hard-disk drive at approximately 3.7 frames per second (fps). The images were further processed using custom-developed MATLAB-based software to remove imaging artifacts and noise (Fig. S2) [30,31]. Unprocessed and processed image snapshots are shown in Fig. S2.
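The mean-delay subtraction can be illustrated with a small, self-contained Python sketch. It is not the authors' real-time implementation (which runs on digitizer data with Threading Building Blocks); it only demonstrates the principle on a simulated trace. The Gaussian IRF shape, the 10-ps sampling step, the 15-ns window, and all function names are assumptions chosen for illustration.

```python
import math


def mean_delay(times, signal):
    """Intensity-weighted mean arrival time: sum(t * i(t)) / sum(i(t))."""
    total = sum(signal)
    if total <= 0:
        raise ValueError("signal has no positive area")
    return sum(t * s for t, s in zip(times, signal)) / total


def amd_lifetime(times, fluorescence, irf):
    """Lifetime estimate = mean delay of fluorescence - mean delay of IRF."""
    return mean_delay(times, fluorescence) - mean_delay(times, irf)


def simulate(tau_ns, dt_ns=0.01, n=1500, irf_center_ns=3.0, irf_sigma_ns=0.4):
    """Simulate an IRF (hypothetical Gaussian) and the fluorescence trace
    obtained by convolving it with a single-exponential decay."""
    times = [k * dt_ns for k in range(n)]
    irf = [math.exp(-0.5 * ((t - irf_center_ns) / irf_sigma_ns) ** 2)
           for t in times]
    decay = [math.exp(-t / tau_ns) for t in times]
    fluorescence = [0.0] * n
    for i, amplitude in enumerate(irf):
        if amplitude < 1e-9:  # skip negligible IRF samples to save time
            continue
        for j in range(n - i):
            fluorescence[i + j] += amplitude * decay[j]
    return times, fluorescence, irf


if __name__ == "__main__":
    times, fluorescence, irf = simulate(tau_ns=0.5)
    print(f"estimated lifetime: {amd_lifetime(times, fluorescence, irf):.3f} ns")
```

Because the mean delay of a convolution equals the sum of the mean delays of its factors, the difference recovers the decay constant without curve fitting. The estimate is slightly biased by the finite sampling step and by truncation of the time window.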

FLIM validation
The accuracy and reliability of our high-speed lifetime microscopy system were validated using two standard samples with known lifetime values [32]. We prepared 160 and 80 µM indocyanine green (ICG) solutions dissolved in milk, placed them onto a glass-bottom dish (200350, SPL), and obtained the lifetime values. The lifetime values of these samples were measured to be 0.4091 ± 0.0190 and 0.5838 ± 0.1321 ns, which agree well with the previously reported lifetime values of 0.43 and 0.56 ns, respectively. In addition, to ensure the reliability of our FLIM system, calibration using a mirror, which should yield a lifetime value of 0 ns, and a standard sample (160-µM ICG solution dissolved in milk) was performed prior to each measurement.

Atherosclerotic rabbit preparation
A rabbit atherosclerotic model was prepared via balloon injury and high-cholesterol diet feeding. Balloon injury was induced at the vertebral L2-L4 positions of the infrarenal aorta of a male New Zealand white rabbit weighing approximately 3-3.5 kg (NZWR, DooYeol Biotech, Korea). The rabbit was fed a 1% high-cholesterol diet (1% cholesterol and 5% peanut oil, C-30293, Research Diets) for a week before the balloon injury was induced [33]. The rabbit continued to be fed the 1% high-cholesterol diet for three weeks after the balloon injury. Subsequently, we reduced the cholesterol amount to 0.1% (0.1% cholesterol and 5% peanut oil, C-30293, Research Diets) for four weeks before the rabbits were sacrificed. All animal experiments were conducted according to the protocol approved by the Institutional Animal Care and Use Committee of Korea University (KOREA-2018-0066-C2).

Tissue sample preparation
Seven weeks after the balloon injury, the rabbits were euthanized via CO2 gas inhalation. Atherosclerotic plaque and normal arterial segments were obtained from the balloon-injured abdominal aorta and non-injured proximal aorta, respectively. The resected tissue segments were embedded in optimal cutting temperature compound (Sakura Finetek, Japan) and sectioned at 50-µm thickness. ICG (Tokyo Chemical Industry) was chosen as the NIRF probe. Before imaging, the tissue sections were incubated overnight in 3-mM ICG dissolved in distilled water, washed three times with PBS, and mounted with mounting medium (Crystal mount, Biomeda). In addition, 3D multimodal images were acquired from swine skin tissue. The swine skin tissue was resected as a small piece (approximate size: 1 × 1 × 1 cm), incubated overnight in 3-mM ICG dissolved in distilled water, washed thoroughly with distilled water, and placed on a coverglass-bottom dish (SPL).

Histological validation
The morphological examination was performed with hematoxylin and eosin (H&E, Scytek, USA) staining, and the presence and distribution of lipids, collagen, and macrophages within the tissues were confirmed with oil red O (ORO, Scytek), picrosirius red (PSR, Scytek), and immunostaining with RAM11 (M0633, DAKO), respectively. For histological validation, normal and plaque segments were serially sectioned at 10-µm thickness, and each staining was performed according to the manufacturer's instructions. To immunostain macrophages, tissue sections were fixed with ice-cold acetone for 10 min and treated with peroxidase blocking solution (DAKO, Glostrup, Denmark) for 5 min at room temperature to block endogenous peroxidase. Slides were incubated with mouse anti-rabbit macrophage clone RAM11 (1:500; M0633, DAKO) for 30 min at room temperature and washed three times with Tris-buffered saline with 0.1% Tween 20 Detergent (TBST) for 10 min. Sections were incubated with EnVision Horse Radish Peroxidase (HRP) systems (K4001, DAKO) for 30 min at room temperature. After washing three times with TBST for 10 min each, slides were transferred to the 3,3′-diaminobenzidine (DAB) substrate (GBI Labs, Mukilteo, WA, USA) and soaked for 5 min. Subsequently, tissue sections were counterstained with hematoxylin, followed by dehydration, clearing, and mounting.

Correlation analysis between ICG lifetime and lipid distribution
To confirm the correlation between ICG lifetime and lipid distribution, ORO-stained aorta images and FLIM images were compared (Figs. S4 and S5). First, we converted the ORO images from the RGB color space to the HSV color space; subsequently, we determined the ORO-positive pixels and ORO-staining level using the HSV scales. The H, S, and V scales represent hue, saturation, and value, respectively. Among these three scales, we used H to identify ORO-positive pixels. Only pixels within the range of -72° to -21.6° in the H scale were considered ORO-positive pixels (Fig. S6(b)). Subsequently, the S scale was used to determine the ORO-staining level. The ORO-negative pixels were assigned a level of zero, regardless of their original saturation values (Fig. S6(d)). Then, the mean ORO-staining levels at three different locations (1. plaque in the atherosclerotic vessels, 2. media of the atherosclerotic vessels, and 3. normal vessels) were compared to the mean lifetime values of the matching location in the sister section (Fig. S7). The two values were plotted on a scatterplot and fitted to a first-order polynomial curve with a 95% prediction interval (Fig. S8).
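The hue-gated saturation measure described above can be sketched in a few lines of Python using the standard-library `colorsys` module. This is an illustrative reading of the procedure, not the authors' code: it assumes 8-bit RGB input, expresses hue as a signed angle in (-180°, 180°] so that the -72° to -21.6° window maps to the magenta-red side of the hue circle, and reports saturation on a 0-1 scale. The function names are hypothetical.

```python
import colorsys

# Hue window for ORO-positive pixels, in signed degrees (from the text)
HUE_MIN_DEG = -72.0
HUE_MAX_DEG = -21.6


def signed_hue_deg(r, g, b):
    """Hue of an 8-bit RGB pixel as a signed angle in (-180, 180] degrees."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    deg = h * 360.0
    return deg - 360.0 if deg > 180.0 else deg


def oro_level(r, g, b):
    """Saturation (0-1) if the hue is in the ORO-positive window, else 0."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue = signed_hue_deg(r, g, b)
    return s if HUE_MIN_DEG <= hue <= HUE_MAX_DEG else 0.0


def mean_oro_level(pixels):
    """Mean staining level over an iterable of (r, g, b) pixels in a region."""
    levels = [oro_level(r, g, b) for r, g, b in pixels]
    return sum(levels) / len(levels)


if __name__ == "__main__":
    region = [(200, 30, 120), (30, 30, 200), (128, 128, 128)]
    for px in region:
        print(px, round(signed_hue_deg(*px), 1), round(oro_level(*px), 3))
    print("mean level:", round(mean_oro_level(region), 3))
```

Setting ORO-negative pixels to zero before averaging, as in the text, means the regional mean reflects both how many pixels are stained and how strongly.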

Statistical analysis
Six atherosclerotic vessel wall images and three normal vessel wall images were analyzed. Prior to analysis, the background pixels were excluded from all images. The distribution and mean values of the fluorescence lifetimes were obtained from the following three locations: 1. plaque in the atherosclerotic vessels (n = 6), 2. media of the atherosclerotic vessels (n = 5), and 3. normal vessels (n = 3). To obtain the probability distribution function shown in Fig. S9, we eliminated pixels that exhibited an intensity level less than 10 because lifetime values measured at low intensities are considered unreliable in the AMD method (the images were processed to 8 bits; therefore, 255 was the maximum intensity level). In particular, the media and plaque areas were analyzed separately from the same atherosclerotic vessel FLIM images based on each distinct area segregated by the elastin boundary shown in the TPE images of the same ROI using ImageJ software (US National Institutes of Health, Bethesda, MD). Subsequently, to separate the image pixels within the FLIM images into these two groups, the fluorescence lifetime threshold t was determined using Otsu's method, which minimizes the weighted sum of the variances of the two groups [34]. The weighted sum of the variances of the two groups, known as the intra-class variance, can be calculated as follows:

$$\sigma_w^2(t) = \omega_{normal}(t)\,\sigma_{normal}^2(t) + \omega_{plaque}(t)\,\sigma_{plaque}^2(t)$$

where $\omega_{normal}$ and $\omega_{plaque}$ are the probabilities of the two groups, whereas $\sigma_{normal}^2$ and $\sigma_{plaque}^2$ are the variances of these two groups. Minimizing the intra-class variance is equivalent to maximizing the inter-class variance, which can be written as follows:

$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_{normal}(t)\,\omega_{plaque}(t)\,\left[\mu_{normal}(t) - \mu_{plaque}(t)\right]^2$$

where $\mu_{normal}$ and $\mu_{plaque}$ represent the means of the two groups, and $\sigma^2$ is the variance of the entire group. By repeating the calculation of the inter-class variance for all possible t values, we can obtain the threshold t that maximizes the inter-class variance.
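The threshold search by Otsu's method can be sketched as follows. This is a minimal pure-Python version that scans every split point of the sorted lifetime values and keeps the one maximizing the inter-class variance (the product of the two group probabilities and the squared difference of their means); the original analysis may instead have worked on a binned histogram, so the exhaustive scan over sample values is an assumption. The function name and the example values are hypothetical and are not measured data.

```python
def otsu_threshold(values):
    """Return the threshold that maximizes the inter-class variance.

    The data are sorted once; every boundary between two distinct
    neighboring values is tried as a split, and the midpoint of the
    best boundary is returned.
    """
    data = sorted(values)
    n = len(data)
    if n < 2 or data[0] == data[-1]:
        raise ValueError("need at least two distinct values")
    total = sum(data)
    best_threshold, best_between = None, -1.0
    running = 0.0  # sum of the lower group data[0:k]
    for k in range(1, n):
        running += data[k - 1]
        if data[k] == data[k - 1]:
            continue  # not a valid boundary between distinct values
        w_low = k / n
        w_high = 1.0 - w_low
        mu_low = running / k
        mu_high = (total - running) / (n - k)
        between = w_low * w_high * (mu_low - mu_high) ** 2
        if between > best_between:
            best_between = between
            best_threshold = 0.5 * (data[k - 1] + data[k])
    return best_threshold


if __name__ == "__main__":
    # Synthetic bimodal lifetimes in ns (illustrative only)
    lifetimes = [0.20] * 50 + [0.25] * 50 + [0.40] * 30 + [0.45] * 30
    print("threshold (ns):", otsu_threshold(lifetimes))
```

Maximizing the inter-class variance avoids recomputing both group variances at every candidate threshold, since only running sums and counts are needed.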

MOM system
Our high-speed MOM system achieved a maximum frame rate of 3.7 fps when producing all five different modality images simultaneously (Fig. S10). To visualize these five different imaging modalities in real-time, AMD FLIM technology, instead of TCSPC technology, was utilized. In addition, to process the significant amount of data produced from these detecting and visualizing processes without any delay, a parallel computing method established in our previous study was adapted to this MOM system [20,29,35]. Consequently, the total time required to process the raw data as well as generate and save multimodal images composed of more than five million pixels was less than 93 ms (Fig. S11), which was much lower than the data acquisition time per frame (approximately 270 ms, Fig. S10) and provided a safety factor of 3 for data streaming in real-time.
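The timing figures quoted above can be cross-checked with simple arithmetic. The Python sketch below derives the frame period from the 3.7-fps frame rate, the pixel count of one five-modality frame from the 1024 × 1024 image size, and the ratio between acquisition time and processing time. It assumes that each of the five modality images has the full 1024 × 1024 size; the constant and function names are illustrative.

```python
# Values taken from the text
FRAME_RATE_FPS = 3.7
MODALITIES = 5
IMAGE_SIDE_PX = 1024      # ASSUMPTION: every modality image is 1024 x 1024
PROCESSING_MS = 93.0      # upper bound on processing time per frame


def frame_period_ms(fps):
    """Acquisition time per frame in milliseconds."""
    return 1000.0 / fps


def pixels_per_multimodal_frame(modalities, side_px):
    """Total pixels across all modality images of one frame."""
    return modalities * side_px * side_px


def safety_factor(period_ms, processing_ms):
    """How many times the processing fits into one acquisition period."""
    return period_ms / processing_ms


if __name__ == "__main__":
    period = frame_period_ms(FRAME_RATE_FPS)
    print(f"frame period: {period:.0f} ms")
    print("pixels per frame:",
          pixels_per_multimodal_frame(MODALITIES, IMAGE_SIDE_PX))
    print(f"safety factor: {safety_factor(period, PROCESSING_MS):.1f}")
```

The results are consistent with the stated values: a period of about 270 ms, slightly more than five million pixels per multimodal frame, and a margin close to 3.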

Multimodal imaging of ICG stained rabbit abdominal aorta
Considering that the atheromatous plaque aortic tissue exhibits distinct structural and chemical characteristics, we regarded it as a suitable imaging target for our MOM system. Atheromatous plaque tissue is composed of fibrous connective tissue (or cells), lipids, macrophages, elastin, and collagen, which are known to have different excitation wavelengths and different morphological characteristics. Given that each modality in our system is segregated according to different excitation wavelengths, these features of the plaque tissue were considered well suited for exploiting all modalities. In addition, we chose the rabbit model over other animal models because the size of the rabbit aorta was best suited for the given field of view (FOV, 500 µm × 500 µm) of the system. Given an emission filter with a central wavelength of 392 nm and bandwidth of 23 nm, we expected the SHG signals to capture the unique structural characteristics of collagen fibrils within the aorta [36]. Based on the excitation wavelength (780 nm) and emission band (405-694 nm) afforded by the dichroic mirror and emission filter installed in front of the TPE detector, the TPE signals captured the distributions of intrinsic extracellular and intracellular fluorophores such as elastin, NAD(P)H, and flavin [37,38]. RCM detects the overall topography of the surface and the tomography of intra-tissue based on the reflected light from the normal aorta and atherosclerotic plaque. NIRF signals with an appropriate NIRF probe can be useful for augmenting the contrast by staining the microstructures within the tissue while minimizing signal crosstalk caused by autofluorescence [39]. In addition, through the FLIM signal from NIRF, we expected to visualize the local chemical properties based on the fluorescence lifetime [40]. The chemical properties of the plaque region were one of our main concerns.
ICG is a commonly used in vivo contrast agent that can be directly injected into the blood vessel with Food and Drug Administration (FDA) approval. In addition, ICG has a high binding affinity for lipid molecules, rendering it useful for detecting plaque regions [19,41]. Therefore, we used ICG as the NIRF probe to stain the tissue samples. For a comparative study, both a normal artery and atherosclerotic plaque build-up obtained from the rabbit abdominal aorta were imaged. Figure 2 shows the multimodal images from both the normal artery and plaque build-up. All images in Fig. 2 show a portion of the arterial wall cross-section, and the lumen is located on the right side of the images. Ten successive images were averaged and processed for denoising to generate a clear image of each modality. As shown in Figs. S2 and S3, the snapshot shows sufficiently clear images in most modalities, but the SHG image is further enhanced by averaging. The normal arterial wall shows three main vascular structures: adventitia, media, and intima.
In the atheromatous plaque sample, an immensely thickened intima is particularly noticeable adjacent to the media. The structural elements of the region of interest (ROI) from the normal and atherosclerotic aortas are described in Fig. 2(a) and 2(g), respectively. Above all, the thicknesses of the two samples are distinctly different. In the atherosclerotic sample, a large plaque area appears immediately next to the media layer, and the intima layer is deflected from this ROI. In the SHG images, signals assumed to be collagen fibers appear in the entire normal aorta (Fig. 2(b)). Similarly, in the atherosclerotic section, SHG signals appear in the entire ROI region, including the tunica adventitia, media, and plaque area (Fig. 2(h)). This result seems to be reasonable considering that collagen deposits at the plaque area as neointimal hyperplasia progresses [42,43]. The TPE image of the normal tissue (Fig. 2(c)) shows the detailed structures of the arterial wall, particularly the tortuous shape of the elastic fibers (or elastin) of the media, internal elastic lamina, and external elastic lamina. This unique form of elastic fiber, which allows the aorta to be compliant and withstand the blood pressure, can serve as an indicator of the boundaries between the media and intima and between the media and adventitia [44]. This form is also observed in atherosclerotic tissue; however, as shown in Fig. 2(i), it appears to be slightly altered; the elastic fibers of the media and elastic lamina become somewhat straightened compared with the normal tissue, and some of the internal elastic lamina appears to be ruptured (denoted with white arrows) (Fig. 2(i)). Autofluorescence signals not only highlight the elastic fiber within the tissue, but also appear in the entire plaque region where smooth muscle cells, foam cells, macrophages, lipids, and collagen exist.
Whereas the RCM images of the normal tissue show relatively low contrasts across the entire surface (Fig. 2(d)), several bright spots indicating highly reflective matter or surface appeared in the plaque region of the RCM image of the artery with atherosclerotic plaque build-up (Fig. 2(j)). Owing to the high contrast of these bright spots in the plaque region, the signal from the media and adventitia is almost invisible. Based on previous studies demonstrating that lipids or nuclei can yield intense reflection signals and our observation that the signal occurred primarily in the plaque region, we presumed that the signal can be generated by lipids or nuclei of the cells within the plaque [45,46]. In the NIRF images, the structural contrast is enhanced through ICG staining, and the same wavy structures of the elastic fiber observed in the TPE image are shown. ICG cannot selectively bind to a specific target molecule or protein but has a higher affinity for certain substances such as lipids or lipoproteins; therefore, it is often utilized to visualize plaques in vivo [24,41]. In normal tissues, ICG seems to be absorbed more into the external elastic lamina than other parts of the tissue; thus, more intense NIRF signals appear in the external elastic lamina area. By contrast, in the atherosclerotic aorta, the NIRF signals unveil detailed structures of the plaque and internal elastic lamina (Fig. 2(k)). An intense signal close to the saturation level is observed at the left side of Fig. 2(k), which might indicate the perivascular tissue attached around the adventitia and harvested together with the aorta. For reference, the merged image can be seen in Fig. S12.
Finally, Figs. 2(f) and 2(l) show the fluorescence lifetime distribution of ICG in normal and atherosclerotic aortas, respectively. The normal aorta shows a relatively uniform lifetime distribution across the entire region. By contrast, the fluorescence lifetime changes notably within the plaque and perivascular areas, as compared with the media and adventitia areas (Fig. 2(l)).

Histological evaluation
Sister sections of both normal and atherosclerotic tissues were stained and analyzed to histologically evaluate the tissues and to compare the results to images obtained using the MOM system. We used staining agents that can exhibit the general histological characteristics of the plaque region (H&E, PSR, ORO, and RAM11) to identify intracytoplasmic structures, collagen, lipids, and macrophages, respectively. The H&E image of the normal tissue predominantly shows nuclei that are elongated and aligned with a particular pattern, identified as the nuclei of smooth muscle cells (SMCs) within the media (Fig. 3(a)). By contrast, many round-shaped nuclei were observed throughout the plaque area in the H&E images of the atherosclerotic tissue (Fig. 3(e)). These circular nuclei are assumed to be associated with foam cells, macrophages, or SMCs. In the normal aorta, collagen fibers appeared in the entire area of the aortic wall (Fig. 3(b)); however, lipids (Fig. 3(c)) and macrophages (Fig. 3(d)) are not conspicuous. In general, atherosclerotic plaques can be characterized by intimal thickening, rupture of the internal elastic lamina, fibrous caps, and intimal hyperlipidemia-induced macrophage (or foam cell) infiltration, which can be shown in histological images [47,48]. A thickened intimal wall (Figs. 3(e)-(h)) containing abundant collagen (Fig. 3(f)), lipids (Fig. 3(g)), and macrophages (Fig. 3(h)) is observed in the plaque area. Based on this histologic assessment, we identified some of the sources of signals obtained from our MOM system. As expected, the SHG signals exhibit a distribution pattern similar to that of PSR, in which Type-I and Type-II collagen can be visualized. In the NIRF images, the signal originating from ICG (Figs. 2(e) and 2(k)) differs slightly from the ORO-stained pattern (Figs. 3(c) and 3(g)).
More specifically, the NIRF signals in the normal samples and the media of the atheromatous plaque samples suggest that they do not originate from the ICG bound to lipids but from the ICG absorbed into the tissue with a relatively high permeability. Considering the non-specific binding characteristic of ICG (although it has a high affinity for lipids), it is reasonable that the NIRF and ORO regions do not match perfectly. However, when ORO-stained images are compared with the FLIM images (Figs. 2(f) and 2(l)), the fluorescence lifetime appears to change in the presence of lipids.

Correlation between fluorescence lifetime and plaque
To investigate the mechanism by which the FLIM signal reflects the chemical properties of the aortic tissue and how the FLIM signal can be correlated with the distribution of lipids, we performed an additional analysis as follows. First, we attempted to obtain a direct correlation between ORO-staining levels and fluorescence lifetime. ORO-staining levels were determined from the S (saturation) channel of ORO images converted to the HSV color space (see representative images shown in Figs. S6 and S7(a)). In addition, the mean lifetimes at three different locations defined by their histological elements (1. plaque in the atherosclerotic vessels, 2. media of the atherosclerotic vessels, and 3. normal vessels) were obtained. The scatter plot of ORO-staining level versus fluorescence lifetime revealed a strong positive correlation, with a Pearson's R-value of 0.8233 (P < 0.0003) (Fig. S8).
Fig. 4 (caption, partial). (e-g) Lifetime values of pixels lower than the threshold; (h-j) lifetime values of pixels higher than the threshold. Significance was determined by unpaired two-tailed t-test (* P < 0.05; ** P < 0.01; *** P < 0.001; **** P < 0.0001). Scale bars = 100 µm.
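The correlation analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names `mean_saturation` and `pearson_r`, the per-ROI averaging, and all numeric values are assumptions made only for illustration. The staining level is taken as the mean HSV saturation of the ORO pixels in an ROI, and Pearson's R is then computed against the mean lifetime of the matching ROI.

```python
import colorsys
import math


def mean_saturation(rgb_pixels):
    """Mean HSV saturation (0..1) of an iterable of (r, g, b) pixels in 0..255."""
    sats = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1]
            for r, g, b in rgb_pixels]
    return sum(sats) / len(sats)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ROIs: a few ORO-stained RGB pixels per ROI and a mean
# lifetime (ns) for the matching FLIM ROI. All values are made up.
oro_rois = [
    [(230, 225, 228), (235, 230, 232)],   # weakly stained
    [(220, 150, 150), (215, 140, 145)],   # moderately stained
    [(200, 40, 50), (190, 30, 45)],       # strongly stained
]
lifetimes_ns = [0.26, 0.30, 0.35]

staining_levels = [mean_saturation(roi) for roi in oro_rois]
r = pearson_r(staining_levels, lifetimes_ns)
print(f"Pearson R = {r:.3f}")
```

A significance value such as the reported P would additionally require a t-distribution test on R, which is omitted here.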
To further compare the fluorescence lifetime values within the plaque and non-plaque areas, the three locations defined above were classified as plaque and non-plaque areas: location 1 was classified as a plaque area, and locations 2 and 3 were classified as non-plaque areas. As shown in Fig. 4(a), the fluorescence lifetime values of the two groups, plaque and non-plaque areas, differed clearly. To separate the image pixels within the FLIM images into these two groups, the fluorescence lifetime threshold was determined from the population distribution of the fluorescence lifetime in six ROIs (Fig. S9) using Otsu's method [34]. The fluorescence lifetime threshold, which segregates the atherosclerotic aorta into media and plaque, was determined to be 0.3059 ns. The FLIM images were segmented using this threshold value. Figures 4(b), 4(c), and 4(d) show the original NIRF intensity, FLIM, and intensity-weighted FLIM (lifetime values multiplied by NIRF intensities) images, respectively. The second row (Figs. 4(e)-(g)) and third row (Figs. 4(h)-(j)) show images composed of pixels with fluorescence lifetime values lower and higher than the threshold, respectively. The intensity-weighted FLIM images composed of pixels below the threshold show the media region (Figs. 4(e)-(g)). By contrast, the intensity-weighted FLIM images composed of pixels above the threshold show the plaque region (Figs. 4(h)-(j)). In particular, circular clusters that appear to be lipid-rich macrophages or foam cells are conspicuous, as shown in Fig. 4(j).
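The thresholding and segmentation steps above can be sketched as follows. This is an illustrative, pure-Python version of Otsu's method applied to a flat list of lifetime values; the function names, the 256-bin histogram, and the sample data are assumptions and do not reproduce the authors' implementation or their 0.3059 ns result.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold maximizing between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(h * c for h, c in zip(hist, centers))

    best_var, best_k = -1.0, 0
    w0, sum0 = 0, 0.0
    for k in range(bins - 1):
        w0 += hist[k]                    # pixels in the lower class
        sum0 += hist[k] * centers[k]
        w1 = total - w0                  # pixels in the upper class
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_k = var_between, k
    return lo + (best_k + 1) * width     # upper edge of the last lower-class bin


def split_by_lifetime(lifetimes, intensities, threshold):
    """Intensity-weighted FLIM values split into below/above-threshold images."""
    below, above = [], []
    for tau, inten in zip(lifetimes, intensities):
        weighted = tau * inten           # lifetime multiplied by NIRF intensity
        below.append(weighted if tau < threshold else 0.0)
        above.append(weighted if tau >= threshold else 0.0)
    return below, above


# Hypothetical pixel data (lifetimes in ns, NIRF intensities in counts).
lifetimes = [0.24, 0.26, 0.27, 0.25, 0.34, 0.36, 0.35, 0.33]
intensities = [100, 120, 110, 90, 200, 220, 210, 190]
t = otsu_threshold(lifetimes)
media_like, plaque_like = split_by_lifetime(lifetimes, intensities, t)
```

Pixels that fail the threshold test are set to zero so that the two outputs keep the geometry of the original image, matching the way the below-threshold and above-threshold rows are presented.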
The mechanism underlying the increase in the fluorescence lifetime of ICG in lipid-rich regions remains to be elucidated. Previous studies have reported that the fluorescence lifetime of ICG can change depending on the solvent used (e.g., water, milk, or 1% intralipid solution), and it was demonstrated that the fluorescence lifetime increased in the presence of lipids [32]. To investigate the correlation between the lipid concentration and the fluorescence lifetime of ICG, we acquired FLIM images of 20 µM ICG dissolved in 1%, 4%, and 16% intralipid solutions (Fig. S13). The results show that the ICG lifetime is positively correlated with the concentration of the intralipid solution. As such, the fluorescence lifetime of ICG can change depending on the lipid concentration. Based on this relationship, we can identify the presence of lipids and show changes in lipid density from the ICG fluorescence lifetime.

Discussion
In this study, we developed a high-speed MOM system that utilizes a single femtosecond pulsed laser to visualize multiple imaging contrasts. The key technical advantage of our system is that five complementary, co-registered types of information can be visualized simultaneously within the same ROI in real time. The imaging speed at 1024 × 1024 pixels was approximately 3.7 fps (approximately 3.8 megapixels per second), which we believe is fast enough for live imaging of stationary organs such as the skin, brain, or kidney. In particular, considerable effort was devoted to constructing a system with a simple optical beam path and a single light source. Our system was designed by slightly altering the optical design of a conventional two-photon laser scanning microscope system. A light source, a set of scanners, relay optics, an objective lens, and other auxiliary optics for illuminating the specimen were shared among the different imaging modalities. Subsequently, four different detectors with four different filters, a two-channel frame grabber, and a two-channel digitizer were used to detect four different signals and process five imaging contrasts. Furthermore, all five imaging modalities produced optically sectioned images. In the RCM, NIRF, and FLIM channels, out-of-focus signals are rejected by pinholes. Because the nonlinear optical processes are restricted to the focal spot, TPE and SHG have intrinsic confocality. Therefore, we can ensure that all pixels of the five modalities originate from the same location and depth in the scanning field at the same time.
To demonstrate the 3D imaging performance of our system, we additionally performed quick 3D multimodal imaging of swine skin tissue. A 500 × 500 × 100 µm³ (x/y/z) region was scanned. Figure 5(a) shows the composite image of SHG (blue), RCM (green), and NIRF (red), which represents the overall tissue structure from the top to the bottom of the epidermis, above the dermal-epidermal junction. In the SHG and TPE modalities, the wrinkle structure of the stratum corneum is visible. A hole-like structure (marked with white arrowheads) that seems to be part of the honeycomb pattern of the keratinocytes was visible in the RCM channel at depths between 50 and 60 µm from the top surface. As depicted in Figs. 5(e) and 5(f), a strong NIRF signal from ICG is visible. Three-dimensional scanning shows that the ICG signal is distributed in a layer about 30 to 50 µm deeper in the epidermis than the signals in the SHG/TPE channels [49,50]. This 3D imaging capability is powerful because all five modality images are acquired simultaneously, facilitating the investigation of dynamic biological events inside live tissues. For example, cardiac muscle contraction or immune cell rolling within blood vessels are interesting dynamic events that could be captured with our MOM system [51,52]. Another advantage of our system is that it uses a wavelength-tunable laser (Mira 900F, Coherent). The central wavelength of the laser output can be tuned from 700 nm to 980 nm, and we can ideally obtain the fluorescence signals excited in the range from ∼380 to ∼500 nm, which includes a wide range of autofluorescence signals. In this study, we used a wide-bandwidth emission filter with 780 nm excitation, but we can more precisely target the molecule of interest by changing the emission filter to a more specific and narrower bandwidth and changing the central wavelength of the laser. For example, if we switch the emission filter to a central wavelength of 460 nm and a bandwidth of 40 nm and tune the laser to a central wavelength of 720 nm, we can more precisely target the nicotinamide adenine dinucleotide (NADH) autofluorescence signal.
In addition, while the current imaging speed is quite fast, we can further improve the imaging speed to video-rate with some modifications to the system. The imaging speed can be increased by replacing current 4 kHz scanners with 8 or 16 kHz scanners and applying bi-directional scanning. The pixel clock provided by 8 and 16 kHz resonant scanners is typically 512 pixels per scan, resulting in higher frame rates. For example, by replacing the current 4 kHz scanner with an 8 kHz scanner and reducing the number of imaging pixels to 512 × 512, the fastest possible imaging speed will be increased to approximately 15 fps and further improved with bi-directional scanning to approximately 30 fps. More importantly, the accuracy of FLIM will be unaffected because the pixel rate remains unchanged.
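The frame-rate estimates above follow from simple arithmetic, sketched below. The sketch assumes one image line per resonant-scanner period for unidirectional scanning and two for bi-directional scanning, and it ignores flyback and other overhead, so it gives an upper bound rather than the measured rate (the 4 kHz, 1024-line case yields about 3.9 fps against the reported 3.7 fps). The function names and the uniform-dwell approximation are assumptions made for illustration.

```python
def frame_rate(line_rate_hz, lines_per_frame, bidirectional=False):
    """Ideal frames per second for a resonant scanner (no flyback overhead)."""
    lines_per_second = line_rate_hz * (2 if bidirectional else 1)
    return lines_per_second / lines_per_frame


def pixel_dwell_s(line_rate_hz, pixels_per_line):
    """Average dwell time per pixel, assuming one sweep lasts half a period."""
    sweep_time_s = 1.0 / (2 * line_rate_hz)
    return sweep_time_s / pixels_per_line


print(frame_rate(4000, 1024))                      # 3.90625 (ideal, current setup)
print(frame_rate(8000, 512))                       # 15.625
print(frame_rate(8000, 512, bidirectional=True))   # 31.25

# Dwell time per pixel is the same for both configurations, which is
# consistent with the statement that the pixel rate remains unchanged.
print(pixel_dwell_s(4000, 1024), pixel_dwell_s(8000, 512))
```

Halving the number of pixels per line while doubling the line rate leaves the per-pixel dwell time, and hence the number of photons collected per pixel, unchanged under these assumptions.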
Conversely, the use of a single light source has the drawback that the laser illumination cannot be optimized for each modality; namely, illumination properties such as the average output power, central wavelength, pulse width, peak power, or repetition rate cannot be adjusted to the optimum for each of the five modalities. Accordingly, the emission level must be matched to the weakest-emitting modality using neutral density (ND) filters (NDK01, Thorlabs), which compromises the quantum yields or fluorescence efficiencies of the other modalities. Because single-photon fluorescence typically exhibits higher excitation efficiency than nonlinear effects such as TPE and SHG, we sometimes needed to place an ND filter in front of the NIRF (single-photon excitation) channel to avoid saturation. In that case, photons above the saturation level in the NIRF channel were wasted in our MOM system. However, we were able to stably acquire images of reliable quality from all five imaging modalities, including NIRF and NIRF-FLIM, for a sufficient period of time. As such, the SNR and bit depth of the NIRF channel were not greatly affected, and this disadvantage was considered minor. Apart from this argument, we believe our system is still superior in terms of photon efficiency because it fully utilizes the emissions by detecting most signals generated at wavelengths from 300 to 900 nm. The optimization of other parameters such as pulse width, peak power, or repetition rate is also trivial because the two NLO modalities, TPE and SHG, share their optimal parameters, and the NIRF and RCM modalities are not affected by these parameters.
In addition to these hardware advantages, the images acquired by our MOM system proved to be highly useful for biological imaging, as each imaging modality provides a unique structural or chemical contrast. Targeting atherosclerotic aorta tissue with complicated structural and chemical characteristics, our MOM system simultaneously visualized the distribution and shape of elastin, collagen, SMCs, and lipid-rich cells based on TPE, SHG, RCM, NIRF, and FLIM signals. More specifically, TPE and SHG effectively imaged the detailed structures of the lamellar units within the arterial wall at high resolution with excellent optical sectioning. NIRF-FLIM with ICG staining proved to be suitable for investigating the distribution and density of lipids within tissues. Moreover, the fluorescence lifetime of ICG changes with lipid concentration and highlights lipid-rich macrophages within the plaque area.
Given that ICG is an FDA-approved fluorophore that can be applied to the human body, our MOM system can be utilized to visualize lipid concentration maps and co-register them with other structural imaging modalities of live tissue in real time. Previous studies demonstrated that the fluorescence lifetime of ICG is altered when it permeates the tumor mass [53]. Pal et al. proposed an anti-EGFR conjugated NIRF probe, which successfully improved the sensitivity and specificity of NIRF-FLIM for in vivo tumors [54]. These reports suggest that our MOM system can be extended to cancer diagnosis and oncology research. In addition, to further expand the applicability of the FLIM modality in our system, our NIRF-based FLIM can be switched to a TPE-based FLIM system. Currently, the fluorescence lifetimes of the NIRF signal are calculated by processing and utilizing the signal from the RCM channel as the IRF. For the TPE-FLIM system, signals from SHG can be utilized as the IRF to calculate the lifetime of the TPE signal [29]. Because TPE captures endogenous autofluorescence inside tissues, such as that of NADH associated with cellular metabolism, the metabolic state or phenotype of the target cells and tissues can be detected [55]. Sun et al. demonstrated that metabolically active cancerous sites exhibited decreased lifetime signals owing to stronger NADH signals, which have a shorter lifetime than the autofluorescence signal from the surrounding matrix, i.e., collagen [56]. With reference to these findings, we expect our MOM system to be applicable for monitoring cancerous tissues by simultaneously imaging the metabolic state of cells and the structural contrast of the extracellular matrix within the tissue. Currently, the objective lens part of the proposed MOM system is integrated into an inverted microscope, which may restrict in vivo applications. Therefore, to enable the application of the system to a wider range of in vivo or clinical applications, the objective lens part can be integrated into a catheter or a handheld microscope. We believe that these advances will provide a deeper understanding in biological research and medical diagnosis by enabling multi-faceted observations of various biological events.
Disclosures. The authors declare that there are no conflicts of interest related to this article.
Data availability. Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.
Supplemental document. See Supplement 1 for supporting content.