Rapid imaging of tympanic membrane vibrations in humans

Functional imaging of the human ear is an extremely challenging task because of its minute anatomic structures and nanometer-scale motion in response to sound. Here, we demonstrate noninvasive in vivo functional imaging of the human tympanic membrane under various acoustic excitations, and identify unique vibration patterns that vary between human subjects. By combining spectrally encoded imaging with phase-sensitive spectral-domain interferometry, our system attains high-resolution functional imaging of the two-dimensional membrane surface, within a fraction of a second, through a handheld imaging probe. The detailed physiological data acquired by the system would allow measuring a wide range of clinically relevant parameters for patient diagnosis, and provide a powerful new tool for studying middle and inner ear physiology.


Introduction
Diagnosing hearing problems requires assessment of the multiple factors involved in the mechanical transduction of sound into the inner ear. These tests are among the most common diagnostic procedures in the world, where approximately 466 million people (over 5% of the world population) suffer from some form of disabling hearing loss [1,2]. In most cases, early detection and intervention could significantly reduce the negative impact of hearing problems; in some types of conductive hearing loss (CHL) such as otosclerosis [3] and otitis media [4], for example, preventive treatment may include prescription of various drugs [5,6] that could obviate the need for hearing aids [7] or surgical intervention [8].
A range of technologies is available today for hearing diagnosis, including pure tone audiometry [9] and pneumatic otoscopy [10]; however, most techniques are relatively subjective and provide only partial information, which limits effective diagnosis. More objective approaches for functional hearing diagnosis include measuring the auditory brainstem response (ABR), otoacoustic emissions (OAEs) under different acoustic stimuli [11,12], and acoustic reflections of single [13] and multiple [14] harmonic stimuli.
A direct measurement of the tympanic membrane movements would be a highly desirable clinical test, as it may provide invaluable information on the physiological status of the membrane itself, as well as on the middle and inner ear. Several experiments have demonstrated the feasibility of optical interferometry to measure acoustic vibrations; laser Doppler vibrometry (LDV) [15] was proven effective for measuring nanometer-scale motion at a single point of the tympanic membrane [15][16][17]. By scanning the membrane point by point, optical coherence tomography (OCT) allowed 3D imaging of an ex vivo tympanic membrane, including its acoustic vibrations map, and several groups have measured the dynamics and structure of the tympanic membrane in vitro [18,19]. Recently, phase-sensitive OCT allowed in vivo measurement of the middle ear and the tympanic membrane vibrations in human subjects at multiple locations [19][20][21][22][23].
Despite these developments, in-vivo imaging of both amplitude and phase across the continuous membrane surface could be challenging using beam scanning techniques, mainly due to the inevitable motion artifacts that occur during the slow data acquisition. Scan-free methods such as stroboscopic holography [24][25][26] were demonstrated ex vivo on surgically exposed membranes within a fresh temporal bone; however, in vivo imaging with this technique may be challenging due to the relative complexity of the holographic imaging apparatus.
Single-shot phase-sensitive imaging of an entire line would be ideal for addressing the challenge of in vivo high-speed nanometric imaging of a vibrating surface, as it allows the use of line cameras whose line period can be shorter than a single acoustic period. By encoding a single lateral dimension with wavelength, spectrally encoded endoscopy [27,28] allows high-resolution imaging through a single optical fiber using compact imaging probes. Adding low-coherence, phase-sensitive spectral-domain interferometry allowed interferometric spectrally encoded endoscopy (ISEE) to image nanometric-scale surface vibrations within a fraction of a second [29,30]. Briefly, ISEE measures the spectral interference between a reference and spectrally encoded reflections from the target tissue. Under acoustic stimulation, the axial tissue motion induces wavelength-dependent phase shifts that are then captured by a high-speed spectrometer. By slowly scanning the imaging line across the tissue, the full vibration pattern is recovered with high lateral resolution and nanometric axial sensitivity.
In this work, we design and construct an ISEE system capable of imaging the tympanic membrane in human subjects. After testing the optical performance of the system and adjusting its user interface to allow single-hand operation, we demonstrate effective imaging of various vibration patterns in two volunteers under different harmonic stimuli. We next demonstrate the system's potential for extracting relevant functional parameters by employing a single-line measurement technique, which permits rapid measurements of the membrane response to continuously varying sound amplitudes and frequencies.

Experimental setup
The optical setup of the imaging probe (Fig. 1) was designed to allow single-hand operation by the clinician, and is based on our previous bench-top setups of ISEE [29,30]. Light from a fiber-coupled broadband (50 nm, 840 nm center wavelength) superluminescent diode (SLD) array was split by a 50/50 fiber coupler (FC) to the sample and reference arms of a Michelson interferometer. At the sample arm, light was collimated by an 11 mm focal-length lens (L1), scanned by a single-axis galvanometric scanner (GS), diffracted by a 1200 lines/mm transmission grating (G), magnified (2×) by an achromatic telescope (L2, L3) and focused on the tissue surface using an additional imaging lens (100 mm focal length, L4). Light reflected from the tissue propagated back through the same optical path, was coupled into the single-mode fiber (which serves as an effective pinhole for confocal imaging), and was measured by a high-speed line camera (Basler AG, spL4096-70km, 70 kHz maximum line rate) within a custom-built spectrometer. Additional components within the probe included an optical shutter (OS), a conventional otoscope (OT) with an optical window (W) replacing its original lens, and a robust handgrip (H) for guiding the speculum (SP) into the ear canal. Widefield illumination of the sample was obtained using the integral white-light illumination of the otoscope.
Real-time widefield imaging of the tissue was attained using a dichroic mirror (transmission threshold 650 nm, DM), a low-pass filter (LP, 90% transmission below 750 nm), a Fourier-plane iris (I), an additional lens (L5, 40 mm focal length), and a small camera (C, 30 Hz). At the reference arm, a delay line was used to adjust the axial imaging depth, placing the virtual reference plane (RP) behind the tympanic membrane to prevent mirror artifacts. Two polarization controllers (not shown) were used at the sample and reference fibers to improve interference contrast. The sound stimulus (created with Audacity software, 16-bit) was played at a 44.1 kHz sampling rate by a PC sound card through a Samsung EO-HS3303WE earphone that was attached to the otoscope pneumatic port (EP). The excitation sound amplitude was estimated by placing a calibrated microphone (TRAM Lavalier Microphones, model TR50) within an ear model that simulates the human ear canal.
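As an illustration of the stimulus format described above (16-bit samples played at 44.1 kHz), the following minimal sketch synthesizes a pure tone in Python. It is not the authors' procedure, which used Audacity; the function name `pure_tone_pcm16` and the `level` parameter (a fraction of digital full scale, not a calibrated sound pressure level) are assumptions made for this example.

```python
import math

SAMPLE_RATE_HZ = 44_100  # playback sampling rate stated in the text
PCM16_MAX = 32_767       # largest positive signed 16-bit sample value


def pure_tone_pcm16(freq_hz, duration_s, level=0.5, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return a pure tone as a list of signed 16-bit integer samples.

    `level` is a hypothetical fraction of digital full scale (0 to 1);
    it is not a calibrated dB value.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be between 0 and 1")
    n_samples = int(round(duration_s * sample_rate_hz))
    peak = level * PCM16_MAX
    return [
        int(round(peak * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)))
        for n in range(n_samples)
    ]


# A 2000 Hz tone lasting 0.2 s, the duration of one full-frame scan.
tone = pure_tone_pcm16(2000.0, 0.2)
```

The resulting sample list could be written to a WAV file or passed to any audio playback library; the acoustic level reaching the ear would still need to be measured, as was done in the text with a calibrated microphone.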
The field of view of the system was a square of 4.5 × 4.5 mm². The line-camera imaging speed was adjusted according to the exact acoustic stimulation, and was chosen to maintain twelve measurements per single acoustic period for all frequencies. For example, a full scan with 2000 Hz excitation was acquired at 24 kHz line rate, resulting in a total of 400 acoustic periods per single frame acquired during 0.2 s. System sensitivity (ratio between the signal from a perfect reflector and the noise floor) at this imaging speed was approximately 69 dB, and the axial imaging range was limited mainly by (twice) the Rayleigh range (approximately 6.6 mm), which was somewhat smaller than the 9 mm coherence range determined by the spectrometer resolution (0.034 nm). The axial sensitivity, i.e. the accuracy of determining the axial location of the membrane, was approximately 5 nm, determined by the effective width of the measured membrane. The lateral optical resolution was approximately 12.5 µm (FWHM of the line-spread function), but was digitally reduced in the horizontal dimension to approximately 30 µm due to the 32-pixel window of the Hilbert transformation [29]. As a result, a single 4096×5150-pixel raw image, acquired by a single y-scan of the galvanometric scanner, yielded a three-dimensional surface having 128×128 lateral resolvable points that were sampled by 256×256 pixels. Membrane axial motion was computed by multiplying the phase difference between the Hilbert transforms of subsequent spectral interferograms by λ_i/(4π) [30], where λ_i denotes the encoding wavelength at each location along the spectrally encoded line (see Fig. 1). Motion artifacts due to axial probe motion were removed by filtering out the resulting uniform, non-periodic spectral phase shifts.
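The acquisition arithmetic and the phase-to-displacement conversion described above can be sketched as follows. This is a minimal illustration rather than the authors' processing code: the function names are hypothetical, and the conversion assumes the standard double-pass (reflection) relation, in which an axial displacement Δz changes the round-trip phase by 4πΔz/λ.

```python
import math

SAMPLES_PER_PERIOD = 12  # line measurements per acoustic period (from the text)


def line_rate_hz(stimulus_hz, samples_per_period=SAMPLES_PER_PERIOD):
    """Line-camera rate that keeps a fixed number of samples per acoustic period."""
    return stimulus_hz * samples_per_period


def periods_per_frame(stimulus_hz, frame_time_s):
    """Number of acoustic periods elapsed during one full-frame scan."""
    return stimulus_hz * frame_time_s


def phase_to_displacement_nm(delta_phase_rad, wavelength_nm):
    """Axial displacement for a given interferometric phase change.

    Assumes reflection geometry: delta_z = delta_phi * lambda / (4 * pi).
    """
    return delta_phase_rad * wavelength_nm / (4.0 * math.pi)


rate = line_rate_hz(2000)                 # line rate for a 2000 Hz stimulus
periods = periods_per_frame(2000, 0.2)    # acoustic periods in a 0.2 s frame
shift_nm = phase_to_displacement_nm(math.pi / 2, 840.0)  # quarter-cycle phase step
```

With the values quoted in the text, the first two calls reproduce the 24 kHz line rate and the 400 acoustic periods per frame. In the real system the wavelength would vary along the spectrally encoded line; 840 nm is used here only because it is the stated center wavelength.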
The in vivo experiments included five sets of measurements: (i) single two-dimensional imaging of volunteer 1 with 2000 Hz pure-tone 90 dB excitation, (ii) comparative two-dimensional imaging of volunteers 1 and 2 with 5500 Hz pure-tone 90 dB excitation, aiming to demonstrate the differences between subjects, (iii) two-dimensional imaging of volunteer 2 under different pure-tone 90 dB excitations between 750 Hz and 5500 Hz, (iv) single-line measurement with a continuous amplitude sweep 0-90 dB in volunteer 1, and (v) single-line measurement with a continuous frequency sweep in volunteer 1 (sound amplitude not calibrated).

Results
For in vivo imaging, the volunteer was seated next to the imaging probe, which was supported by a jointed arm for reducing its effective weight and allowing continuous imaging with minimal motion artifacts. After guiding the speculum into the (right) ear canal, a full vibration image (Fig. 2(a) and Visualization 1) was captured by a single y-scan of the galvanometric mirror during 200 ms under pure-tone 2000 Hz acoustic excitation. The same field of view was also imaged by the widefield camera (Fig. 2(b)) and included the umbo point (U), the cone of light (CL), the malleus lateral process (MLP) and the reflections from the spectrally encoded line (SE). Two axial cross-sections of the vibration movie are shown in the lower panel of Fig. 2(a), corresponding to the blue (CL-U) and red (U-MLP) lines in the top 3D panel. The characteristic curved surface of the tympanic membrane is clearly visible in the height map (Fig. 2(c)), which was computed using a windowed Fourier transformation at each sample location. The axial resolution in this image was 940 µm, and depended on the imaging bandwidth at each sample location (approximately 0.33 nm). Several features are worth noting in the resulting vibration movie (Visualization 1 and Fig. 2(a)). First, the curved and angled membrane made imaging of its most posterior part (left-hand side of the frame) challenging, mainly due to the weak reflections from that region. In contrast, the strong reflections from the cone of light often saturated the spectrometer camera and limited our dynamic range. Second, vibration amplitudes at the lateral process of the malleus were as high as ±50 nm, notably higher than those at the cone of light and the umbo. The malleus itself, a rigid bone that touches the tympanic membrane at the umbo and with its lateral process, appears to be vertically translating and rotating, as evidenced by the approximately 90° oscillation difference between these two points. And third, at 2000 Hz the membrane did not oscillate uniformly; rather, the regions of the cone of light and the malleus lateral process oscillated ahead of the umbo region by approximately 90°, and the resulting oscillation waves appear to propagate inward, from these regions toward the umbo.
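The 940 µm axial resolution quoted above can be checked against the 0.33 nm per-location bandwidth with the usual coherence-length estimate. The sketch below assumes a Gaussian spectral envelope and propagation in air, neither of which is stated in the text, so it should be read as a plausibility check rather than as the authors' calculation; the function name is hypothetical.

```python
import math


def axial_resolution_um(center_wavelength_nm, bandwidth_nm):
    """FWHM axial resolution for a Gaussian spectrum, in micrometers.

    Uses delta_z = (2 * ln 2 / pi) * lambda0**2 / delta_lambda, the standard
    low-coherence interferometry estimate (an assumption for this sketch).
    """
    prefactor = 2.0 * math.log(2.0) / math.pi  # approximately 0.44
    dz_nm = prefactor * center_wavelength_nm ** 2 / bandwidth_nm
    return dz_nm / 1000.0


# 840 nm center wavelength and 0.33 nm bandwidth per sample location (from the text).
resolution_um = axial_resolution_um(840.0, 0.33)
```

Under these assumptions the estimate comes out within about one percent of the quoted 940 µm, which suggests that the two numbers in the text are mutually consistent.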
The vibration patterns in different human subjects could be quite different from one another, as evidenced by comparing the vibration patterns in two healthy volunteers under 5500 Hz excitation (Visualization 2 and Fig. 3). In volunteer 1 (healthy male, 50 years old) the region near the cone of light appears to be leading the oscillatory wave that propagates outward, reaching the umbo approximately half a period later; whereas in volunteer 2 (healthy male, 38 years old) the oscillation wave propagates mainly upward toward the region of the malleus lateral process, and at the same time moves toward the umbo, which oscillates with an opposite phase. In agreement with previous measurements [20,21,25,26], the vibration patterns of the tympanic membrane change considerably under different excitation frequencies (Fig. 4 and Visualization 3). As sound frequency increases, surface vibrations vary from uniform in-phase motion up to 1500 Hz, through various asymmetric patterns at 2000-3500 Hz, to patterns with some rotational symmetry at 4000-5500 Hz [31]. Note that these in vivo patterns differ from the in vitro patterns measured in excised human temporal bone [25,26], most likely due to the physiological and mechanical differences between living and excised tissues. Our measured vibration patterns appear mostly similar to those measured in humans by endoscopic OCT [20], except for the rotational patterns revealed in Figs. 3-4 for frequencies above 4000 Hz. The oscillation amplitudes at the umbo region were in the range 0-50 nm, in agreement with those measured ex vivo by Cheng et al. in fresh human temporal bones (0-50 nm). Some of the image artifacts visible in Figs. 2-4 include a prominent vertical dark line caused by loss of signal at a narrow spectral band, most likely due to polarization deviation between the sample and the reference, which was not perfectly compensated across the entire encoding bandwidth. Occasional horizontal lines (visible mainly in Fig. 4 at 750 Hz, 1000 Hz and 2500 Hz excitations) were caused mainly by occasional unintended axial motion of the probe relative to the membrane, causing temporal fringe washout that could not be recovered by data processing.
In order to extract meaningful clinical data that would be compatible with conventional methods for hearing diagnosis [32], the spectrally encoded line was placed at the center of the field of view, constantly illuminating both the umbo region and the cone of light. Without any vertical scanning, the system can continuously measure the membrane response to various excitation parameters, where the raw data includes the rapidly varying spectral interferogram (x-axis) as a function of time. To demonstrate the system's ability to assess membrane linearity in response to sound stimulus, the carrier frequency was set to 2000 Hz and the amplitude was increased linearly from zero to 90 dB over a period of 46 ms (Fig. 5). The raw data file (4096 × 2500 pixels total, Fig. 5(a)) reveals the axial harmonic (2000 Hz) movements of the membrane, visible as continuous periodic shifts of the interference fringes (50 kHz acquisition line rate, 25 pixels per single acoustic period). The increased amplitude as a function of time is clearly visible in three high-magnification views (white rectangles) of the raw data. The vibration amplitude (Fig. 5(b), blue data points) at a single location on the membrane (marked by an arrow in Fig. 5(a)) fits well to the acoustic excitation wave (red line), demonstrating the linear relation (R² = 0.983) between sound amplitude and membrane displacement. Once the excitation reached its maximum amplitude after 46 ms, it abruptly dropped to zero. Note that the membrane oscillation decayed only after approximately 2 ms, a decay time that could be attributed to earphone ringing.
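The linearity assessment described above amounts to fitting a straight line to displacement versus excitation amplitude and reporting the coefficient of determination. The sketch below shows one conventional way to compute it with ordinary least squares; the text does not specify the fitting procedure, the function name is hypothetical, and the sample values are synthetic numbers invented for illustration, not measured data.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R^2 value."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0:
        raise ValueError("x values must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0.0:
        raise ValueError("y values must not all be equal")
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic illustration only: relative drive amplitude vs. displacement in nm.
drive = [0.0, 0.25, 0.5, 0.75, 1.0]
displacement_nm = [0.5, 12.0, 26.0, 37.0, 50.5]
slope, intercept, r_squared = linear_fit_r2(drive, displacement_nm)
```

An R² close to 1, as with the 0.983 reported in the text, indicates that a straight line accounts for nearly all of the variation in displacement over the tested amplitude range.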
The single-line measurement technique could also be useful for measuring the frequency response of different parts of the membrane, including their relative phases. To demonstrate this capability, the imaging line was kept at its position between the umbo and the cone of light (Fig. 6, top-right inset), the line camera was set to a constant 50 kHz line rate, and a linear excitation frequency chirp was applied between 500 Hz and 6250 Hz. The periodic displacements of both umbo (blue line) and cone of light (red line) followed the general frequency response of the sound system, which was not calibrated for this measurement. The phase difference (green line) between the two points, however, drifted during the sweep: under 2 kHz the two points oscillated with similar phases, with only minor differences (smaller than 15°) up to 3.5 kHz. Between 3.5 kHz and 5 kHz the phase difference increased gradually, with the cone of light lagging behind the umbo point below 4250 Hz, reaching a complete out-of-phase 180° oscillation at 4300 Hz (the strong ±π jumps are an artifact caused by noise on the wrapped phase), and oscillating ahead of the umbo up to 5000 Hz, where the two points returned to in-phase oscillation.
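The ±π jumps mentioned above are a general property of wrapped phase: near a 180° difference, a small change moves the reported value from one end of the wrapped interval to the other. The sketch below illustrates this with a single-frequency Fourier projection of two synthetic displacement traces. The text does not describe how the relative phase was extracted, so this method, the function names and the test signals are all assumptions made for illustration.

```python
import cmath
import math


def tone_phasor(samples, freq_hz, sample_rate_hz):
    """Complex amplitude of `samples` at `freq_hz` (single-bin Fourier projection)."""
    acc = 0j
    for n, s in enumerate(samples):
        acc += s * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
    return 2.0 * acc / len(samples)


def wrapped_phase_difference_deg(a, b, freq_hz, sample_rate_hz):
    """Phase of trace `a` relative to trace `b`, wrapped to [-180, 180] degrees."""
    ratio = tone_phasor(a, freq_hz, sample_rate_hz) * tone_phasor(
        b, freq_hz, sample_rate_hz
    ).conjugate()
    return math.degrees(cmath.phase(ratio))


def synthetic_trace(freq_hz, lead_deg, n_samples, sample_rate_hz):
    """Unit-amplitude sinusoid leading the reference by `lead_deg` (synthetic data)."""
    lead = math.radians(lead_deg)
    return [
        math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz + lead)
        for n in range(n_samples)
    ]


RATE_HZ = 50_000  # line rate used for the single-line measurements (from the text)
FREQ_HZ = 4_000   # hypothetical test frequency; 500 samples span 40 full periods
reference = synthetic_trace(FREQ_HZ, 0.0, 500, RATE_HZ)
just_below = wrapped_phase_difference_deg(
    synthetic_trace(FREQ_HZ, 170.0, 500, RATE_HZ), reference, FREQ_HZ, RATE_HZ
)
just_above = wrapped_phase_difference_deg(
    synthetic_trace(FREQ_HZ, 190.0, 500, RATE_HZ), reference, FREQ_HZ, RATE_HZ
)
```

Here a true lead of 170° is reported as roughly +170°, while a true lead of 190° is reported as roughly -170°, so noise of only a few degrees around 180° is enough to flip the sign of the wrapped value.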

Discussion
In vivo imaging of tympanic membrane dynamics in response to various acoustic stimuli would be invaluable for studying the function of this important organ. Because the tympanic membrane is physically connected to the malleus, the first of the three ossicles, its dynamics are strongly affected by the entire chain of mechanical sound conduction and consequently by the mechanical properties of the inner ear as well.
The system presented and demonstrated in this study can capture the full vibration patterns of the human tympanic membrane in response to an arbitrary acoustic stimulus, thanks to the combination of spectrally encoded endoscopy with phase-sensitive spectral-domain low-coherence interferometry. The higher spectral resolution of the spectrometer (0.025 nm) compared to that of the imaging probe (0.075 nm) allows the system to record spectral interference from every sample location, and while the optical bandwidth that encodes this location allows relatively low axial resolution (approximately 940 µm), the relative phases of the modulated spectra could be captured with extremely high accuracy, yielding axial sensitivities of only a few nanometers. Such nanometric sensitivity, when combined with high-speed heterodyne line measurements at up to 70 kHz, allows the direct detection of the axial acoustic oscillations of a three-dimensional surface within a fraction of a second.
In comparison to OCT, which is capable of high-resolution imaging of the membrane structure and thickness, as well as deeper structures of the middle ear [18][19][20][21][22][23][33], the advantages of ISEE stem mainly from the single-shot line acquisition, which results in considerably faster imaging, fewer motion artifacts and simpler imaging probes. For example, capturing a single-frequency vibration pattern of the human tympanic membrane in vivo by Kim et al. [21] required 95 seconds for completing a single scan. In contrast, the ISEE movie presented in Fig. 2 was captured within only 0.2 second, faster by more than two orders of magnitude. Furthermore, ISEE imaging at 11 different sound frequencies (Fig. 4) required only 2.3 seconds, and single-line measurements required even shorter times. Such rates are particularly important in the clinic, where fast, comprehensive functional measurements are often essential for timely diagnosis. The need to scan only a single axis also allows smaller imaging probes that could be held and operated by a single clinician. While our system was assembled mainly from off-the-shelf components and hence was relatively bulky, single-handed operation was still possible, and future versions of the system could be made considerably smaller and lighter. An additional advantage of ISEE is the continuous lateral scanning, allowing continuous full 2D sampling of the membrane surface without the interpolation typically used in OCT [22,23]. For 3D imaging, however, ISEE is limited by its axial resolution (Fig. 2(c)), which is typically lower by 1-2 orders of magnitude compared to OCT due to the small bandwidth that encodes each lateral location. Hence, when compared to OCT, ISEE cannot directly measure the tympanic membrane thickness, may be affected by reflections from ossicles beneath the membrane, and is therefore less suitable for high-resolution 3D imaging of the outer and middle ear.
Note that the axial nanometric sensitivity is similar to that of OCT, as phase-sensitive spectral-domain interferometry is common to both modalities.
Some technical issues must be addressed before the system could make an impact in the clinic. First, the user interface of the system must be refined to allow more user-friendly and fluent operation of the imaging probe. Many of our imaging attempts suffered from motion artifacts and sub-optimal polarization control, which resulted in occasional fringe washout and consequent data loss. Such issues could be resolved by providing real-time visual feedback to the clinician, improving the physical contact between the otoscope speculum and the patient's ear canal, and by using faster acquisition rates and polarization-maintaining optics.
Second, in order to obtain meaningful physiological data of clinical significance, the system needs to be calibrated according to protocols that are commonly used for hearing tests. For example, while the frequency-sweep data shown in Fig. 6 recorded the true axial displacements of the tympanic membrane, its magnitude does not reflect the true frequency response of the membrane because the true excitation sound amplitudes were unknown to us, and depended mainly on the frequency response curve of the earphones used in these experiments. An effective calibration procedure would require high-quality broadband earphones and sound systems that must be fully integrated into our modified otoscope system. Note, however, that while such calibration is required for measuring the exact membrane mechanical properties, non-calibrated (but repeatable) measurements could still be useful for clinical diagnosis, for example by comparing the frequency response curves between different patients under identical excitation waves.
Third, while measuring mechanical nonlinearities in the ear such as those caused by the stapedius reflex [34] would also be extremely valuable from a clinical perspective, our measurements did not detect any significant nonlinearity (see for example Fig. 5), most likely due to the relatively noisy environment of the optics laboratory where our experiments were performed. Future experiments with a calibrated system and proper acoustic isolation may allow detection of the stapedius reflex and imaging of this phenomenon with high spatial resolution. Finally, the high acquisition speed and relative simplicity of ISEE may also allow straightforward assessment of an array of clinical conditions, such as observation of changes in the tympanic membrane following surgery or trauma. In the clinic, the proposed system could also help diagnose otosclerosis and otitis media, and may even help identify and study otoacoustic emissions in patients with tinnitus.
In summary, we have presented and demonstrated an optical system suitable for high-resolution in vivo imaging of tympanic membrane vibrations. With nanometric axial sensitivity and single-shot line acquisition, the full vibration maps of the tympanic membrane were recorded noninvasively within a fraction of a second, revealing vibration patterns that vary between different healthy human subjects. In the future, these new capabilities could serve as a powerful tool for patient diagnosis, providing a set of highly sensitive measurements that could detect numerous physiological and clinically relevant parameters.