Multifrequency-swept optical coherence microscopy for high-speed full-field tomographic vibrometry in biological tissues

Because conventional laser Doppler vibrometry and Doppler optical coherence tomography require mechanically scanned probes that cannot simultaneously measure the wide-range dynamics of bio-tissues, a multifrequency-swept optical coherence microscopy technique with wide-field heterodyne detection was developed. A 1024 × 1024 × 2000 voxel volume was acquired with an axial resolution of ~1.8 μm and an acquisition time of 2 s. Vibration measurements at 10 kHz were performed over a wide field of view. Wide-field tomographic vibration measurements of a mouse tympanic membrane are demonstrated to illustrate the applicability of this method to live animals.


Introduction
In recent years, there has been an increasing demand for measuring the vibration mode of living tissues in order to analyze their mechanobiological properties, which are directly related to the physiological function of multicellular tissues. For instance, in the areas of middle ear and inner ear research, a variety of optical coherence tomography (OCT) techniques, which have been replacing laser Doppler vibrometry [1][2][3], are being widely adopted for measuring biomechanics deep in subsurface tissue composed of multi-layered cells [4][5][6][7][8][9][10][11][12]. Low-coherence OCT with a Doppler vibrometer [4][5][6], phase-sensitive spectral domain (SD)-OCT [7][8][9][10], and swept-source (SS)-OCT [11,12] have been used for measuring high-speed vibrations from various surfaces within the organ of Corti and the tympanic membrane. However, these techniques involve scanning single measurement points along the two lateral dimensions based on fiber-optic OCT. Moreover, multiple A-scans are required for observing vibrational displacement in the region of interest (ROI). Therefore, conventional SS-OCT and SD-OCT cannot simultaneously measure the wide-range dynamics of bio-tissues, which is thought to be crucial for elucidating the biomechanics. To address this issue, en-face OCT with a full-field vibrometry technique might be necessary.
To detect en face images with a much higher transverse spatial resolution than that of conventional fiber-optic-based OCT, optical coherence microscopy (OCM) techniques [13][14][15][16][17][18][19][20][21][22][23][24][25] have been developed for full-field (FF) OCT based on spatial low-coherence interferometry. An FF-OCT image can be produced by illuminating the whole field of view with a low-coherence light source and acquiring interferometric images with an area camera, using a comparatively simple arrangement without x-y lateral scanning. With this alternative FF-OCM technique, an ultrahigh resolution better than 1 μm axial and 0.5 μm transverse was achieved using high-numerical-aperture (NA) optics and a superluminescent diode (SLD) [13]. Furthermore, high-speed FF-OCT systems, which are superior to conventional SS-OCT or SD-OCT systems, have been developed by utilizing a heterodyne detection method with a high-speed camera [14]. However, these FF-OCM techniques have not been used to analyze the functional characteristics of biological tissues because their primary purpose so far has been to achieve high imaging resolution [15][16][17] and suppression of motion artifacts [18][19][20][21]. As an advanced FF-OCT system for assessing the biomechanical properties of the retinal vascular system, Spahr et al. (2015) demonstrated full-field measurement of arterial and venous pulsations by employing phase-sensitive FF-SS-OCT with a high-speed camera [22]. This technique was capable of measuring axial expansion directly with a temporal resolution of 0.5 ms. In this method, however, the intervolume phase differences were utilized to obtain the axial expansion after acquiring a whole volume during one wavelength sweep, which inherently requires a very fast and expensive camera.
For periodic vibrations caused by sound stimuli in the range of a few tens of kHz, such as those occurring in the cochlea and tympanic membrane, other FF-OCT techniques are needed that are specialized for effective vibration measurement and permit the use of even a conventional image sensor.
For FF vibration measurements of vibrating surfaces, several interferometric techniques have recently been proposed [23][24][25]. Multiplexed digital stroboscopic holograms were proposed for two-dimensional (2D) surface vibration measurements [23]. Stroboscopic white-light interferometry with a pulsed LED was used to measure the vibration modes of MEMS in the megahertz frequency range [24]. Further, stroboscopic digital holography was utilized for three-dimensional (3D) vibrometric measurements of human tympanic membranes [25]. These techniques, however, require complex configurations with a stroboscopic scheme, which may make it difficult to combine them with an OCT system. To date, these techniques have not been demonstrated for the tomographic analysis of multicellular tissues.
Herein, we report the development of a new technique, termed "multifrequency-swept optical coherence microscopic vibrometry" (MS-OCMV), which provides not only 3D volumetric OCT but also FF vibration measurement of the spatial distribution of the amplitude, phase, and frequency without lateral x-y scanning. A previous prototype, an MS en-face OCT system [26] fabricated with a Fizeau-type interferometer, was improved by combining an ultralong-working-distance microscope and an ultrahigh-speed CMOS camera. The proposed system is based on two key techniques, i.e., multifrequency-swept OCT (MS-OCT) [26-28] and wide-field heterodyne interferometric vibrometry (WHIV) [29]. By implementing multifrequency-swept interferometry, low-coherence tomographic measurements can be performed without axial scanning of a reference mirror, which permits the use of an ultralong-working-distance microscope in the interferometer. Thus, the defocusing that occurs in mechanical axial scans of the reference mirror can be prevented. Additionally, sinusoidal phase modulation of the sample arm can be conducted by using the WHIV technique with a vibrated reference mirror, in which a dual-phase-modulated heterodyne signal is produced to generate a beat frequency component detectable by an image sensor. Thus, the internal surface of biological tissue vibrating at a much higher frequency than the frame rate of the camera can be measured in a wide field of view. The operating principle is explained and the performance validation of the MS-OCMV system is reported through OCT measurement of a 20-μm-thick glass plate and vibration measurement of a planar mirror at a frequency of 10 kHz. Further, high-speed 3D volumetric OCT and wide-field vibration measurement were performed using a mouse tympanic membrane as a biological tissue sample to demonstrate the potential applicability of this novel technique as an optical tool for biomechanical analysis.

Instrumentation
The MS-OCMV system is based on the OCM technique, which is used to achieve high transverse spatial resolution. The instrumentation is shown in Fig. 1(a). The system consists of an inverted ultralong-working-distance microscope, a Michelson-type polarization interferometer, a low-coherence optical comb light source with a broadband superluminescent diode (SLD), and a system control computer (PC).
The SLD (T-850-HI, Superlum), which had a center wavelength of 850 nm and a full width at half maximum (FWHM) of 120 nm, was utilized to generate multifrequency light using a Fabry-Perot (FP) filter. Figure 1(b) shows the spectrum of the multifrequency light produced by the FP filter. The FP filter consisted of two plates. One plate was attached to a piezoelectric actuator (MD-140L, MESS-TEK) to vary the cavity length, d. The second plate was fixed in place. A variable range of 180 μm was achieved by applying a synchronized ramp wave from a function generator (AFG2000, Tektronix). The voltage amplitude was amplified to 150 V by a high-power amplifier (TZ-0.5P, Matsusada Precision). The generated low-coherence comb was directed to the interferometer. The combination of the initial polarizer (P) and the polarizing beam splitter (PBS) divided the beam into reference and sample beams with a variable ratio chosen according to their reflectance. Because the returning reference and sample beams were orthogonally polarized by the quarter-wave plates (QWPs) and the PBS, the interference irradiance was controlled by the analyzer before it reached the detector. These polarization configurations allowed us to optimize the interference contrast, even though the power reflected from the bio-tissue was quite low compared to the reference power. The reference mirror was a 10-mm-thick polished glass plate with a reflectance of 4% in visible light. For the FF vibration measurement utilizing the WHIV technique, the reference mirror was vibrated using a piezoelectric transducer (PZT) to apply a sinusoidal phase modulation to the reference path length.
The inverted ultralong-working-distance microscope imaged the interference pattern of the beams, which interacted with the sample and reference mirror surfaces. This interference pattern was then projected onto the CMOS chip. The working distance of the microscope was approximately 200 mm to ensure sufficient space for in-vivo measurement of a guinea pig tympanic membrane. The transverse resolution of the microscope was 5.04 μm, with a numerical aperture, NA, of 0.067. The transverse measurement area was approximately 5.5 mm (horizontal) × 5.5 mm (lateral).
A high-speed CMOS camera (FASTCAM Mini AX200, Photron) was employed as the detection camera in this system. This camera offered high sensitivity corresponding to ISO 40000 and a capture speed of 1000 fps at a resolution of 1024 × 1024 pixels. It took 1 s to acquire 1024 × 1024 × 1000 voxels, which corresponds to approximately 1 G voxel/s. The frame rate and trigger pulse were synchronized with the function generator controlled by the system control PC. The system control PC generated the trigger pulse for image acquisition, and a voltage was supplied to the piezoelectric actuator of the FP filter via a function generator controlled by LabVIEW software. All captured images were processed and analyzed using MATLAB software. A graphics processing unit (GPU, Tesla K40, NVIDIA) was used for parallel calculations of the volume rendering and processing of large-volume voxel data.

Principle of MS-OCT
In the OCT measurement mode, sequentially captured images acquired in conjunction with the axial scan (i.e., A-scan) form an en-face OCT volume. In this system, the axial scan was conducted by varying the interval frequency of the optical comb. As mentioned above, the incident light from the SLD passing through the FP filter became the low-coherence comb (or quasi-comb) with a constant interval frequency, Δν. Because Δν and the cavity length, d, are related by Δν = c/(2d), Δν can be swept by varying d. The interference signal exhibits repeating fringe peaks (i.e., high-order interference) with a constant interval of d [26]. The interference signal near the first-order interference peak position is described by Eq. (1), where A_0 is the bias component and B is the overall interference amplitude distribution, including the coherence function and the attenuation coefficient arising from the finesse value of the FP filter. L is the optical path difference (OPD), taking the refractive index into account. B is given by Eq. (2), where B_0 is the interference contrast, F is the finesse value of the FP filter, and h is the inverse Fourier transform of the envelope of the whole SLD spectrum, which represents the coherence function. Equation (2) indicates that improving the finesse value involves a tradeoff between suppressing the attenuation of coherence and the total loss of optical power. When the value of d is near that of L, a strong interference fringe is observed. The distribution of |B(d)| has a maximum and the phase of the fringe is equal to 0 or 2π rad when d = L/2. Thus, varying d, which changes the first-order interference position, is utilized for axial scanning. The distribution of |B(d)| can be extracted from the interference signal by means of the Hilbert transform. The width of |B(d)| describes the coherence length, which is inversely proportional to the spectral bandwidth of the light. The axial resolution is defined here as the FWHM of |B(d)|².
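The relation Δν = c/(2d) and the inverse dependence of the coherence length on bandwidth can be checked numerically with a short sketch. The cavity length of 10 mm used below is a hypothetical value chosen only for illustration (the nominal cavity length is not stated here), and the coherence-length formula assumes a Gaussian spectrum, which the SLD may not have; the function names are likewise illustrative.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def comb_interval_hz(cavity_length_m):
    """Comb interval frequency of an FP cavity of length d: delta_nu = c / (2 d)."""
    return C_LIGHT / (2.0 * cavity_length_m)


def gaussian_coherence_length_m(center_wavelength_m, fwhm_bandwidth_m):
    """FWHM coherence length in air, ASSUMING a Gaussian spectrum:
    l_c = (2 ln 2 / pi) * lambda_0**2 / delta_lambda."""
    return (2.0 * math.log(2.0) / math.pi) * center_wavelength_m ** 2 / fwhm_bandwidth_m


# Hypothetical cavity length (assumption, not a value from the paper).
d0 = 10e-3
# Sweeping d over the 180 um variable range changes the comb interval slightly.
print(comb_interval_hz(d0), comb_interval_hz(d0 + 180e-6))

# SLD parameters from the Instrumentation section: 850 nm, 120 nm FWHM.
print(gaussian_coherence_length_m(850e-9, 120e-9))
```

Under the Gaussian assumption the estimate is on the order of a few micrometers, the same scale as the measured axial resolution; exact agreement should not be expected because the real spectral shape and the FP filter both modify the envelope.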
When the measured sample is a complex, multilayered object such as biological tissue, the positions of the strong interference fringes correspond to the internal reflecting positions of the sample during an axial scan. The distribution of |B(d)|² provides the MS-OCT signal along the depth direction. This signal is acquired at each pixel of the CMOS camera using the microscope in order to produce an en-face OCT image.

Principle of wide-field vibration measurements
In the full-field vibration measurement mode, the WHIV technique was utilized. This technique was previously proposed for measuring high-speed vibrations utilizing a conventional charge-coupled device image sensor [29] and applied to en-face OCT measurements [26]. As shown in Fig. 2, the targeted surface in the ROI at an arbitrary depth was determined by varying d. When the targeted surface vibrates at a frequency of f_s, the reference mirror (RM) vibrates at a slightly different frequency of f_r = f_s + δf, thus producing the dual sinusoidal phase-modulating signal expressed by Eq. (3), where Z_s, Z_r, φ_s, and φ_r are the vibrational amplitudes and phases of the sample and the RM, respectively. In the CMOS image sensor, the high-frequency components of the interference signal are averaged, whereas the beat signal at the frequency δf, which is lower than the frame rate f_FPS, can be detected. When f_r is set to satisfy δf < f_FPS < 2δf, the interference signal at the vibrating surface near L (i.e., the position of first-order interference) is given by Eq. (4), where J_n is the n-th Bessel function of the first kind and δφ is the relative phase, defined as |φ_r − φ_s|. The interference amplitude, B, can be assumed to be nearly constant because the vibration amplitude is in the sub-micrometer range. α is the overall phase of the interference fringe that does not contribute to the vibration. By setting f_r to satisfy δf = f_FPS/8, one period of the heterodyne signal can be captured by eight frames of the CMOS camera. The number of frames must be a power of two (2^N frames) for phase matching of the discrete fast Fourier transform (FFT).
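A form of the dual phase-modulated signal and of its low-frequency part that is consistent with the definitions above can be sketched with the Jacobi-Anger expansion. The sign convention of the modulation, the symbol θ(t), and the frame-averaging bracket are assumptions made for this sketch and may differ from the notation of the original equations.

```latex
\begin{aligned}
\theta(t) &= Z_s \sin(2\pi f_s t + \varphi_s) - Z_r \sin(2\pi f_r t + \varphi_r),
  \qquad f_r = f_s + \delta f, \\
I(t) &= A + B \cos\bigl(\alpha + \theta(t)\bigr), \\
\langle I \rangle(t) &\approx A + B \cos(\alpha)
  \Bigl[ J_0(Z_r) J_0(Z_s)
  + 2 \sum_{n \ge 1} J_n(Z_r) J_n(Z_s)
    \cos\bigl( n\,(2\pi\,\delta f\, t + \varphi_r - \varphi_s) \bigr) \Bigr].
\end{aligned}
```

Only the terms in which the sample and RM harmonics have equal order survive the averaging over the exposure, because all other terms oscillate at frequencies near f_s or higher. With a two-sided Fourier coefficient normalized by the number of frames, the zero-frequency and δf components of this expression are A + B cos(α) J_0(Z_r) J_0(Z_s) and B cos(α) J_1(Z_r) J_1(Z_s), respectively.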
To obtain the vibration amplitude distribution, image processing based on the Hilbert transform was utilized to remove the spatial artifacts of A and the carrier component cos(α) from the distributions of F(0) = A + B cos(α) J_0(Z_r) J_0(Z_s) and F(δf) = B cos(α) J_1(Z_r) J_1(Z_s), respectively. From the extracted distributions of B J_0(Z_r) J_0(Z_s) and B J_1(Z_r) J_1(Z_s), the spatial interference intensity distribution B is removed by calculating the ratio R_01. The ratio of the zero- and first-order components is obtained as R_01 = K |J_0(Z_s)|/|J_1(Z_s)|, where K = |J_0(Z_r)/J_1(Z_r)| is a known value because Z_r can be adjusted before the measurements. By comparing the theoretical and experimental values of R_01, the amplitude distribution, Z_s, can be uniquely estimated under the condition Z_s < 2.4 rad. Finally, the vibrational amplitude in terms of length can be calculated as Z_s λ/(4π). If extraction of the zero-order component is difficult because of bias noise and low-frequency noise (LFN), a second-order component can be used instead. However, the vibration amplitude must be large enough for the second-order component to be detected. In this case, the ratio of the first- and second-order components, R_12, is used in place of R_01.
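The comparison of theoretical and experimental R_01 values can be illustrated with a simple numerical inversion. The sketch below is one possible approach, not necessarily the procedure used in this work: it evaluates the Bessel functions from their power series, inverts R_01 by bisection on the interval below 2.4 rad where the ratio decreases monotonically, and converts the phase amplitude to length with Z_s λ/(4π). The use of λ = 850 nm and all function names are assumptions for illustration.

```python
import math


def bessel_j(order, x, terms=30):
    """Bessel function of the first kind J_n(x) from its power series
    (adequate for the small arguments used here)."""
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + order))
        * (x / 2.0) ** (2 * m + order)
        for m in range(terms)
    )


def ratio_r01(z_s, z_r):
    """R_01 = K |J_0(Z_s)| / |J_1(Z_s)| with K = |J_0(Z_r) / J_1(Z_r)|."""
    k = abs(bessel_j(0, z_r) / bessel_j(1, z_r))
    return k * abs(bessel_j(0, z_s)) / abs(bessel_j(1, z_s))


def estimate_z_s(r01_measured, z_r, lo=1e-6, hi=2.4, iterations=60):
    """Invert R_01 by bisection; R_01 decreases monotonically on (0, 2.4) rad."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ratio_r01(mid, z_r) > r01_measured:
            lo = mid  # ratio too large, so the true Z_s is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phase_to_length_nm(z, wavelength_nm=850.0):
    """Vibration amplitude in length units: Z * lambda / (4 pi)."""
    return z * wavelength_nm / (4.0 * math.pi)


def length_to_phase(amplitude_nm, wavelength_nm=850.0):
    """Inverse of phase_to_length_nm."""
    return 4.0 * math.pi * amplitude_nm / wavelength_nm


# Round trip with illustrative amplitudes (RM: 77.2 nm, sample: 41.7 nm).
z_r = length_to_phase(77.2)
r01 = ratio_r01(length_to_phase(41.7), z_r)
print(phase_to_length_nm(estimate_z_s(r01, z_r)))
```

Bisection is chosen here only because it needs no derivative and cannot leave the interval in which the estimate is unique; a lookup table of theoretical R_01 values would serve equally well.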

OCT imaging capability
To evaluate the axial resolution of our system, a 20-μm-thick polished glass plate with a profile irregularity of less than λ/10 (λ = 683 nm) was used as a test sample. The axial scan range was approximately 200 μm. A total of 1000 interference images were captured sequentially for the measurement. The acquisition time for a 3D volume of 1024 × 1024 × 1000 voxels was 1 s. Figure 3(A) shows an A-scan interferogram of a first-order interference fringe acquired by one pixel of the high-speed CMOS camera after signal processing. There are two interference peaks, which correspond to the two reflecting surfaces of the glass plate. The axial resolution was estimated to be 2.4 μm based on the optical thickness of the glass plate, taking the refractive index into account. Figure 3(B) shows the cross-sectional image of the sample. This image is presented in linear grayscale.
To validate the use of MS-OCMV in scattering media such as biological tissue, we examined a dehydrated mouse tympanic membrane. The intensity of the returning light from the RM and the tympanic membrane was 5.5 μW when the power of the incident low-coherence comb was 1.8 mW. The sensitivity of the camera was sufficient to capture the image without binning or accumulation. A total of 2000 interference images were captured sequentially at a frame rate of 1000 fps. The acquisition time for a 3D volume consisting of 1024 × 1024 × 2000 voxels was 2 s. The transverse captured area was approximately 4.4 mm (horizontal) × 4.4 mm (lateral). Figure 4 shows the results of volume rendering. The resulting image reproduced both the flexed and damaged sections of the membrane well, with high axial and transverse resolution over a wide field. The thickness of the membrane was estimated to be 5 μm, which is in accordance with the thickness reported in conventional histological studies.

Validation of WHIV technique in MS-OCM
Prior to the vibration measurements of the mouse tympanic membrane, a thick aluminum planar mirror was measured to validate the usefulness of the WHIV technique in the MS-OCM system with the high-speed CMOS camera. Axial vibrations of 10 kHz were elicited on the mirror surface by using the PZT. Meanwhile, the reference mirror was vibrated at a different frequency of 10.125 kHz to produce a beat frequency of 125 Hz, which was one-eighth of the frame rate. The acquisition time required to measure 2048 frames was approximately 2 s, and thus 256 periods of the heterodyne signal were detected during one acquisition.
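The timing figures quoted above follow from a few lines of arithmetic, sketched below. The function name and the dictionary keys are illustrative choices, not part of the acquisition software.

```python
def heterodyne_plan(sample_hz, frame_rate_hz, frames_per_period, n_frames):
    """Derive the heterodyne acquisition parameters from the frame rate.

    The beat frequency is chosen so that one beat period spans an integer
    number of camera frames.
    """
    beat_hz = frame_rate_hz / frames_per_period
    return {
        "reference_hz": sample_hz + beat_hz,               # f_r = f_s + delta_f
        "beat_hz": beat_hz,                                # delta_f
        "acquisition_s": n_frames / frame_rate_hz,         # total recording time
        "periods": n_frames / frames_per_period,           # beat periods recorded
        "resolution_hz": frame_rate_hz / n_frames,         # FFT bin spacing
        "beat_bin": beat_hz * n_frames / frame_rate_hz,    # FFT bin holding delta_f
    }


# Values from the mirror experiment: 10 kHz stimulus, 1000 fps, 8 frames
# per beat period, 2048 frames.
plan = heterodyne_plan(10_000.0, 1000.0, 8, 2048)
for name, value in plan.items():
    print(name, value)
```

Because 2048 frames contain exactly 256 beat periods, the beat frequency falls on FFT bin 256 with no spectral leakage into neighboring bins.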
The vibration amplitude of the sample was varied by supplying sinusoidal modulation signals with voltage magnitudes of 7, 1, and 0.1 V_pp. The reference mirror was calibrated to vibrate at an amplitude of approximately 77.2 nm. Figure 5 shows the differences in the vibration amplitude distributions and the corresponding phase distributions. The amplitude increased with increasing voltage applied to the PZT. Table 1 shows the average amplitude values and spatial fluctuations for each applied voltage. The average amplitude values corresponded to the amplitudes measured by a conventional laser vibrometer based on point measurements, also shown in Table 1. The estimation error of the amplitude distribution was evaluated by the spatial fluctuations, in terms of the standard deviation of the amplitude distributions, because the sample was a rigid material that was expected to show little spatial variation in its dynamics. This measurement error arose from nonuniform illumination power and image-processing error, as elucidated in previous research [30], and it limits the accuracy of this method.
Figure 6 shows spectral intensity distributions obtained by FFT of representative temporal heterodyne signals acquired at one pixel of the camera. The frequency resolution was approximately 0.488 Hz, and the frequency of the first-order component, F(δf), was set to an exact integer multiple of it (in this case, 256 times, corresponding to 125 Hz). Therefore, the frequency components of the sound stimuli were selectively detected. The signal-to-noise ratio (SNR) of the F(δf) component when the voltage was 0.1 V_pp was approximately 8 dB, which indicates that this method is sufficiently sensitive on a nanometer scale. Owing to the high frame rate of 1 kHz, the F(δf) component could be extracted and completely separated from the LFN. The LFN was distributed within 0-30 Hz and could disturb detection of the F(δf) component if δf were in that range. Thus, δf should be much higher than the LFN frequency, which is the reason a high-speed camera is recommended for practical use in animal experiments.
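The selective detection of the beat component can be illustrated with a single-bin discrete Fourier transform applied to a synthetic pixel signal. This is a minimal sketch under stated assumptions: the bias, amplitude, and phase values are arbitrary, the signal is noise-free, and the normalization (two-sided coefficient divided by the number of frames) is a choice made here so that a component 2a·cos(...) yields a coefficient of magnitude a.

```python
import cmath
import math


def dft_bin(samples, k):
    """Two-sided DFT coefficient at bin k, normalized by the number of samples."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return acc / n


frame_rate = 1000.0   # fps
n_frames = 2048
beat_hz = 125.0       # delta_f, one-eighth of the frame rate

# Hypothetical single-pixel signal: bias plus one beat component.
bias, first_order, phase = 100.0, 3.0, 0.7
signal = [
    bias + 2.0 * first_order * math.cos(2.0 * math.pi * beat_hz * i / frame_rate + phase)
    for i in range(n_frames)
]

k_beat = round(beat_hz * n_frames / frame_rate)   # bin index of delta_f
f0 = dft_bin(signal, 0)                           # zero-order (bias) component
f1 = dft_bin(signal, k_beat)                      # first-order beat component
leak = dft_bin(signal, k_beat + 1)                # neighboring bin

print(k_beat, abs(f0), abs(f1), cmath.phase(f1), abs(leak))
```

Since the record holds an integer number of beat periods, the neighboring bin stays at the level of rounding error, which is why no window function is needed in this idealized case.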
To reduce the spatial fluctuation in the resulting amplitude distribution, the second-order component was used for estimating the amplitude of the planar mirror. As shown in Fig. 6(A), the second-order component could be extracted from the FFT spectral intensity distribution when the supply voltage was 7 V_pp. Figure 7(A) shows the resulting vibration amplitude distribution obtained by utilizing the ratio R_12. The average amplitude value was 41.7 ± 3.0 nm (mean ± standard deviation), which agreed well with the value obtained by the conventional laser vibrometer. The second-order component did not contain the bias component. Furthermore, the spatial distributions of the interference amplitude B and the phase term cos(α) included in the second-order component were almost the same as those of the first-order component. Therefore, a more accurate vibration amplitude distribution could be obtained.

Wide-field vibration measurements in bio-tissue
The vibration of a mouse tympanic membrane was measured using the proposed MS-OCMV system. As described in a previous section, the heterodyne interference signal appeared on the same plane at a depth position determined by d, which swept the interference peak position (i.e., the coherence gate). Therefore, the ROI in depth was selected by varying d. After selecting the ROI position, a pure sinusoidal sound wave was applied to the sample by using a custom-made PZT speaker. The stimulus frequency and amplitude were 10 kHz and 90 dB SPL, respectively. The amplitude and phase distributions of the sound-induced vibration of the membrane over a wide field of view are shown in Fig. 8. As in the previous experiment on the mirror vibration, 2048 frames, which included 256 periods of the heterodyne signal, were analyzed with a frequency difference of δf = 125 Hz. As shown in Fig. 8(A)-8(D), we were able to distinguish the spatial vibration amplitude and phase distributions at arbitrary depths (55 μm and 75 μm from the zero-OPD position). The vibration distributions followed the inclination of the curved membrane surface.
In this measurement, the first- and second-order components were used to estimate the amplitude distributions because the distributions of the zero-order component could not be extracted effectively owing to LFN and a large bias component. Owing to the comparatively large vibration amplitude of the membrane, in the 100 nm range, the second-order components could be extracted clearly, as shown in Fig. 8(E) and 8(F). By utilizing them, the amplitude distributions were successfully estimated. The average vibration magnitude was 228.8 nm, which agrees well with the vibration responses reported using other methods [10,11].

Discussion
The sensitivity of the MS-OCMV system was 45 dB in terms of the SNR of the imaging contrast, as shown in Fig. 9, which is inferior to the sensitivity of conventional OCMs. The major factors degrading the sensitivity are technical limits such as the power reduction caused by the FP filter, the sensitivity limit of the ultrahigh-speed camera, and dispersion mismatch or unbalanced reflectivity between the sample and reference arms of the interferometer. Higher-finesse FP filters may improve the coherency of the low-coherence comb, although the total illumination power might be attenuated. Utilizing a supercontinuum generated by a frequency-tunable fiber laser comb [28] is expected to improve both the coherency and the illumination power. The ripples (side peaks) of the interference-peak envelope may affect the imaging resolution and contrast, especially when the sample is biological tissue. As shown in Fig. 9, the ratio of the intensity between the main peak and the side peak of the interferogram was 15 dB, which can cause ghost peaks in very thinly layered media. To prevent this phenomenon, an optical filter that reshapes and cuts off the edges of the whole spectral envelope should be applied. Another approach to compensating for ghost peaks is to denoise the data using a sparse method based on a lattice-based redundant transform as an adaptive filter [30].
Another factor causing the deterioration of imaging contrast is the accumulation effect of the detector. The interferometer is very sensitive to the phase change caused by path length variation. Therefore, accumulation during the exposure time in a pixel of the image sensor may blur the detected interference signal [18]. We investigated the contrast reduction caused by the accumulation time and scan speed. We assume that the axial displacement at a constant scanning speed v produces a phase variation that is linear in time. The interference signal of the n-th frame, accumulated over the exposure time T, then consists of a fringe term that starts from the initial phase φ_(n−1) at the beginning of the accumulation and is scaled by a sinc function of v. This shows that the fringe contrast varies with the axial scanning speed v. Therefore, we can define the speed limit v_lim as the first zero of the sinc function. In this system, v_lim is calculated as v_lim = 425 μm/s at T = 1 ms (1000 fps) and λ = 850 nm. Figure 10 shows the normalized imaging contrast as a function of v. In our experiments, the scanning speeds of the glass plate and mouse tympanic membrane measurements were v_glass = 200 μm/s and v_tymp = 100 μm/s, respectively, so the theoretical fringe contrasts were estimated to be 68% and 91% of the maximum contrast, respectively. For the vibration measurement, the fringe contrast deterioration from Eq. (4) can be estimated in the same manner. Assuming that the vibration amplitude is 100 nm, the instantaneous maximum scanning speed corresponds to approximately 25 μm/s because δf = 125 Hz, for which the contrast is more than 99%. Thus, measuring nanometer-scale vibration in biological tissue has little effect on the contrast deterioration in this system.
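The dependence of the fringe contrast on scanning speed can be reproduced with a short calculation. The sketch below assumes a double-pass phase of 4πvt/λ integrated uniformly over the exposure, which gives a contrast of |sinc(2vT/λ)| with its first zero at v = λ/(2T); this form is an assumption chosen because it reproduces the quoted limit of 425 μm/s, and the function names are illustrative.

```python
import math


def speed_limit_um_s(exposure_s=1e-3, wavelength_um=0.85):
    """Scan speed at the first zero of the sinc contrast: v_lim = lambda / (2 T)."""
    return wavelength_um / (2.0 * exposure_s)


def fringe_contrast(scan_speed_um_s, exposure_s=1e-3, wavelength_um=0.85):
    """Normalized fringe contrast |sin(pi x) / (pi x)| with x = 2 v T / lambda.

    ASSUMES a double-pass phase 4 pi v t / lambda averaged over the exposure.
    """
    x = 2.0 * scan_speed_um_s * exposure_s / wavelength_um
    if x == 0.0:
        return 1.0
    return abs(math.sin(math.pi * x) / (math.pi * x))


print(speed_limit_um_s())      # limit for T = 1 ms, lambda = 850 nm
print(fringe_contrast(200.0))  # glass plate scan
print(fringe_contrast(100.0))  # tympanic membrane scan
print(fringe_contrast(25.0))   # equivalent speed quoted for 100 nm vibration
```

Under these assumptions the contrasts come out at roughly 0.67, 0.91, and above 0.99, close to the values quoted above; the small difference for the glass plate case may reflect rounding or a slightly different model.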

Summary
We have demonstrated the use of an MS-OCMV system for wide-field OCT measurements and vibration distribution measurements in fundamental biomechanical research. The use of an ultrahigh-speed CMOS camera and an ultralong-working-distance microscope allowed full-field 3D imaging of a tympanic membrane sample and mapping of the spatial vibration amplitude and phase distributions. A volume of 1024 × 1024 × 2000 voxels was acquired in 2 s (corresponding to 1 G voxel/s) with high transverse and axial resolution. We have therefore shown the applicability of this method to in-vivo, in situ vibration measurements of living animals undergoing surgery. Finally, the proposed MS-OCMV is likely to provide a platform for mechanobiology, targeting vibrating organs including the heart and cochlea.