High-resolution infrared imaging of biological samples with third-order sum-frequency generation microscopy

We studied the use of vibrationally resonant, third-order sum-frequency generation (TSFG) for imaging of biological samples. We found that laser-scanning TSFG provides vibrationally sensitive imaging of lipid droplets and structures in sectioned tissue samples. Although the contrast is based on the infrared activity of molecular modes, TSFG images exhibit a high lateral resolution of 0.5 µm or better. We observed that the imaging properties of TSFG resemble those of coherent anti-Stokes Raman scattering (CARS) microscopy, offering a nonlinear infrared alternative to coherent Raman methods. TSFG microscopy holds promise as a high-resolution imaging technique in the fingerprint region, where coherent Raman techniques often provide insufficient sensitivity.


Introduction
The mid-infrared (MIR) wavelength range roughly stretches from 3 to 10 µm, and covers the energy range of many fundamental molecular vibrations. The absorption of MIR radiation by molecular vibrational modes forms the basis of the contrast seen in IR spectroscopy and microscopy. In Fourier transform IR (FTIR) microscopy, the absorption signatures are recorded as a function of position in the sample, producing a chemical map without the need for sample labeling. This capability is of tremendous relevance to biomedical imaging, and many efforts have shown that FTIR microscopy holds promise as a label-free imaging method for tissue inspection. [1][2][3][4][5] A major limitation of MIR microscopy, however, is its relatively low spatial resolution. The diffraction-limited resolution is dictated by the wavelength of light, corresponding to a practical point spread function in the 4 to 7 µm range for the fingerprint vibrational bands. Although individual cells can be examined [10,11], the intracellular structures are typically not resolved in MIR microspectroscopic imaging because of the limited resolution. The low resolution affects the diagnostic capabilities of the MIR approach to imaging. Additionally, since MIR radiation is strongly absorbed by water over a wide range, microscopic imaging of hydrated tissue samples in this wavelength regime is challenging. For this reason, most imaging studies are performed on dehydrated samples, thus largely excluding applications that focus on aqueous specimens or live cells and tissues.
Several approaches have been undertaken to tackle some of the issues above. In terms of improving the detection strategy, a handful of methods have been developed that convert the MIR information to a visible signal. For example, the principle of up-conversion of MIR radiation in a nonlinear material has been used for this purpose [12], enabling the detection of IR images with a silicon-based camera. [13,14] Another technique uses a secondary visible laser beam to probe the MIR-induced photothermal effect in the sample. [15][16][17] The heat induced refractive index changes in the sample affect the beam properties of the visible beam upon traversing the specimen, which can be captured with a visible photodetector. In both approaches, expensive MIR detectors can be avoided and replaced by more efficient and cost effective detectors in the visible/NIR.
Other efforts have focused on improving the spatial resolution of MIR-based imaging. The most commonly used microscope focusing element is a reflective Cassegrain objective, which is commercially available and features numerical apertures (NA) up to ∼ 0.7. However, over the past three decades, there has been limited progress in developing reflective objectives with a higher NA. An alternative focusing configuration uses a solid immersion lens (SIL) to improve the NA of the reflective mirrors. For instance, a silicon SIL has been used to improve the MIR imaging resolution by twofold. [18] Yet, the resolution of the imaging system is still determined by the IR-diffraction limit. The spatial resolution in the photothermal MIR imaging technique, on the other hand, is not determined by the diffraction-limited focal volume of the MIR excitation alone, but rather by the overlap volume of the MIR pump and the visible probe. The dimensions of this interaction volume are roughly equal to the diffraction-limited spot size of the tightly focused visible laser beam, thus offering a sub-µm imaging resolution. Recent implementations of the photothermal MIR microscopy technique have reached a lateral resolution in the 0.6 µm range [16,17], which is about an order of magnitude better than the resolution set by the IR-diffraction limit of the same lens.
The discussion above emphasizes the advantages of a technique like photothermal MIR microscopy. Shifting the probing of the MIR absorption event to the visible range of the spectrum both relaxes the constraints on the photodetector and substantially improves the effective spatial resolution of the imaging system. It is also possible to probe MIR-excited vibrational modes in a fully coherent manner with nonlinear optical (NLO) microscopy. A current example is sum-frequency generation (SFG) microscopy [19], which uses a MIR pump beam to prepare a coherent superposition of the ground state and a vibrationally excited state. The coherence is subsequently probed with a visible/NIR laser pulse, producing a SFG signal in the visible range of the spectrum. Like photothermal MIR microscopy, SFG benefits from shifting the detection to a visible wavelength. Laser-scanning SFG microscopy offers MIR-sensitive images with 0.6 µm lateral resolution and image acquisition times that are similar to the acquisition times in coherent Raman scattering (CRS) microscopy. [20] Nonetheless, SFG microscopy is a second-order NLO technique that probes materials that exhibit non-centrosymmetry. While SFG is a good method for studying fibrillar collagen [21,22] or cellulose [23,24], structures that fulfill the requirement for non-centrosymmetry, it is less useful as a general IR-sensitive imaging technique.
We have recently shown the feasibility to perform third-order SFG (TSFG) measurements in a laser-scanning microscope. [25] In TSFG, the MIR-driven coherence is probed with a two-photon hyper-Raman interaction, producing a coherent signal in the visible part of the spectrum, as shown in Figure 1(A). Unlike SFG, which probes the χ (2) properties of the sample, TSFG is a four-wave mixing technique that is sensitive to the χ (3) properties of the material. Like CRS methods, TSFG is not limited to certain symmetries of the sample, and can thus be used as a general vibrationally sensitive imaging technique. Whereas CRS techniques are sensitive to Raman-active modes of the sample [26], TSFG is sensitive to IR-active modes. In other words, the TSFG microscope can be regarded as a nonlinear equivalent of the MIR-absorption microscope.
Although the feasibility of measuring TSFG has been demonstrated, it is unclear what the potential of the technique is in the context of biological and biomedical imaging. In this contribution, we push TSFG in the direction of biomedical imaging applications. In particular, we discuss the resolution and spectral dependence of the signal in the CH-stretching range of the spectrum. We also study the TSFG signal of neutral lipids in aqueous medium and compare the image features with those of coherent anti-Stokes Raman scattering (CARS) images. We furthermore present the first TSFG images of intact biological tissue. These results indicate that TSFG microscopy holds promise as an IR-sensitive NLO imaging tool for biological and biomedical imaging applications.

Light source
The light source consists of a Yb-doped fiber laser (aeroPULSE, NKT Photonics), which produces a 76 MHz pulse train at 1032 nm (ω p ). The pulse width is about 10 ps. The laser output synchronously pumps an optical parametric oscillator (Levante OPO, A.P.E. Berlin). The OPO produces a signal (ω s ) and idler (ω i ) beam through parametric down-conversion of the pump photon (ω p = ω s + ω i ). The OPO has been modified to accommodate the generation of mid-IR radiation. The linear cavity is resonant with the signal and includes a fanned periodically-poled crystal that allows signal generation in the 1350-2000 nm range, corresponding to 2200-4300 nm for the idler beam (2320-4550 cm −1 ). At an average pump power of 7.5 W, the power in the signal beam is ∼ 1.5 W, while the idler power is ∼ 550 mW. All three output beams (ω p , ω s , ω i ) are attenuated with a half waveplate and a polarizer, and the transverse beam profiles are improved with spatial filters. The ω p and ω s beams are delayed with translation stages such that all pulse trains overlap temporally after combining the beams collinearly on custom dichroic mirrors (Layertec, GmbH). A schematic of the beam routing is shown in Figure 1(B).
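The energy-conservation relation ω p = ω s + ω i fixes the idler frequency once the pump and signal wavelengths are chosen. The following Python sketch illustrates that bookkeeping; the function names and the example signal wavelengths are illustrative choices and are not taken from the OPO documentation.

```python
def idler_wavelength_nm(pump_nm: float, signal_nm: float) -> float:
    """Idler wavelength from energy conservation: 1/lambda_i = 1/lambda_p - 1/lambda_s."""
    if signal_nm <= pump_nm:
        raise ValueError("signal wavelength must be longer than the pump wavelength")
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)


def wavenumber_cm1(wavelength_nm: float) -> float:
    """Convert a vacuum wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


PUMP_NM = 1032.0  # pump wavelength quoted in the text

# Example signal wavelengths (assumed values inside the quoted 1350-2000 nm range)
for signal_nm in (1456.0, 1500.0):
    idler_nm = idler_wavelength_nm(PUMP_NM, signal_nm)
    print(f"signal {signal_nm:.0f} nm -> idler {idler_nm:.0f} nm "
          f"({wavenumber_cm1(idler_nm):.0f} cm^-1)")
```

Under these assumptions, a signal wavelength near 1456 nm places the idler close to the 2820 cm −1 setting used in the experiments below.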

Microscope
The microscope is based on a laser-scanning imaging system (Fluoview 300, Olympus), interfaced with an inverted microscope frame (IX71, Olympus). The scanning mirrors are aluminum coated, and support all beams used in the experiments reported here. The beams pass an f = 50 mm CaF 2 scan lens and an f = 200 mm CaF 2 tube lens before entering a 0.65 NA reflective objective (5007, Beck). For the TSFG experiments, the powers at the sample are 1-10 mW for the idler beam and 10-60 mW for the pump beam. For the CARS experiments, the idler beam is blocked and the signal beam is allowed to pass through the sample at average powers of 1-20 mW. Signals are collected in the forward direction by a 1.4 NA oil immersion condenser. A dichroic mirror splits the collected light into two detection channels: one isolates the TSFG signal through a 450 ± 10 nm bandpass filter, and the other captures the CARS signal through a 780 ± 20 nm bandpass filter. Both detectors are photomultiplier tubes (7422, Hamamatsu) equipped with pre-amplifiers (TIA60, Thorlabs). Signals are detected in single-photon counting mode by passing them through a discriminator (F-100T, Advanced Research Instruments), which converts each counting event into a TTL pulse. The TTL pulses are subsequently stretched in time by a waveform generator (AFG3102, Tektronix) to optimize the integration of the signal directly on the ADC card. The pixel dwell time is 1 µs. To improve image contrast, frame averaging is used with up to 60 averages, giving rise to a maximum acquisition time of about a minute per averaged image.
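The relation between pixel dwell time, frame averaging and total acquisition time can be checked with a short calculation. The sketch below is only an estimate: the frame size is not stated in the text, so the 1024 × 1024 pixel value is an assumption, and scanner overhead such as flyback is ignored.

```python
def averaged_frame_time_s(width_px: int, height_px: int,
                          dwell_s: float, n_averages: int) -> float:
    """Total pixel-dwell time for an averaged frame, ignoring scanner overhead."""
    return width_px * height_px * dwell_s * n_averages


DWELL_S = 1e-6     # pixel dwell time quoted in the text
N_AVERAGES = 60    # maximum number of frame averages quoted in the text
FRAME_PX = 1024    # ASSUMPTION: the frame size is not given in the text

total_s = averaged_frame_time_s(FRAME_PX, FRAME_PX, DWELL_S, N_AVERAGES)
print(f"{FRAME_PX} x {FRAME_PX} px, {N_AVERAGES} averages: {total_s:.1f} s")
```

With the assumed frame size, the dwell time alone accounts for roughly one minute per averaged image, in line with the figure quoted above.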

Sample preparation
The D 2 O used in the experiments was purchased from Sigma-Aldrich. A droplet of Olympus immersion oil was used as the mineral oil sample. Barium titanate (BaTiO 3 ) nanoparticles of diameter < 0.1 µm were obtained from Sigma-Aldrich. Tissue samples were derived from adult mouse adipose tissue and eyelid skin. Excised skin was embedded in OCT medium and cut into slices of either 10 or 25 µm thickness. Sectioned tissue was placed on a standard microscope glass slide and rinsed with room temperature saline buffer (either H 2 O or D 2 O based) to remove the OCT medium. A borosilicate coverslip was placed on the immersed tissue and the sample was sealed with epoxy adhesive to prevent dehydration of the specimen. On some slides, droplets of neutral lipid that had formed during the cutting process were present. These droplets were used in some of the experiments as representative neutral lipids for characterizing the TSFG signal.

Observation and characteristics of the TSFG signal
With the ω i beam tuned to 2820 cm −1 , the TSFG signal at 2ω p + ω i is expected at 446 nm. Using a droplet of mineral oil on a glass coverslip as the sample, we indeed observe a signal in the TSFG channel when the ω i and ω p beams are overlapped in time. Interestingly, we also observe a signal near 652 nm, which we attribute to the TSFG signal at 2ω i + ω p . Both signals observed from the oil droplet show a spectral dependence when the ω i beam is tuned, indicating that both TSFG signals probe the molecular vibrational resonance. Nonetheless, because the average power of the MIR beam is purposely kept much lower than the average power of the pump beam, the signal at 2ω p + ω i is significantly stronger than the signal at 2ω i + ω p . For this reason, the TSFG signals are detected in the wavelength range near 445-450 nm. The inset in Figure 2(A) shows the dependence of the 2ω p + ω i TSFG signal on the average power of the ω i and ω p beams. A linear and a quadratic fit confirm that the TSFG signal scales linearly with the power of the ω i beam and quadratically with the power of the ω p beam. Figure 2(A) shows the TSFG signal as a function of the ω i frequency for the oil droplet (red circles). The signal peaks near 2820 cm −1 and dips near 3000 cm −1 . The spectral profile does not directly mimic the IR absorption profile of the oil, but rather shows a dispersive lineshape. Similar to CARS spectral lineshapes [26], the interference between resonant and nonresonant χ (3) contributions produces a dispersive profile. [25] The maximum TSFG signal from the oil samples in this range appears at 2820 cm −1 . The TSFG signal observed from a droplet of water is shown by the blue triangles. Near 2820 cm −1 the signal from water is significantly weaker than that from mineral oil, enabling imaging of concentrated aliphatic compounds without interference from a strong water signal, but near 3000 cm −1 the water signal has increased and is stronger than the signal from oil. Beyond 3050 cm −1 , we observe a decrease of the water signal when the vibrational frequency is increased.
Again, the maximum TSFG signal appears significantly red-shifted from the maximum IR absorption in this range (∼ 3400 cm −1 ). A similar observation has been made for the CARS spectrum of water [27], and is due to the interference of resonant and nonresonant components. Note that the signal from D 2 O (green triangles) is relatively constant in the CH-stretching range (2800-3000 cm −1 ), due to the absence of O-H resonances. As expected, we observe an increase in the TSFG signal from D 2 O when ω i is tuned to ∼2550 cm −1 , which is closer to the O-D resonances.
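The signal wavelengths and power scaling discussed in this section follow from simple frequency bookkeeping. The Python sketch below computes both TSFG wavelengths from the nominal 1032 nm pump and a 2820 cm −1 idler, and encodes the linear (idler) and quadratic (pump) power dependence as a relative, unitless quantity. The function names are illustrative. With these nominal inputs the 2ω p + ω i signal comes out near 450 nm, a few nm from the value quoted above and inside the 450 ± 10 nm detection band, while the 2ω i + ω p signal comes out near 652 nm.

```python
def tsfg_wavelengths_nm(pump_nm: float, idler_cm1: float) -> tuple[float, float]:
    """Return the wavelengths (nm) of the 2w_p + w_i and w_p + 2w_i signals."""
    pump_cm1 = 1.0e7 / pump_nm
    two_pump_plus_idler = 1.0e7 / (2.0 * pump_cm1 + idler_cm1)
    pump_plus_two_idler = 1.0e7 / (pump_cm1 + 2.0 * idler_cm1)
    return two_pump_plus_idler, pump_plus_two_idler


def relative_tsfg_signal(idler_power: float, pump_power: float) -> float:
    """Relative 2w_p + w_i signal: linear in idler power, quadratic in pump power."""
    return idler_power * pump_power ** 2


blue_nm, red_nm = tsfg_wavelengths_nm(pump_nm=1032.0, idler_cm1=2820.0)
print(f"2w_p + w_i: {blue_nm:.1f} nm, w_p + 2w_i: {red_nm:.1f} nm")

# Doubling the pump power at fixed idler power quadruples the expected signal.
gain = relative_tsfg_signal(3.0, 60.0) / relative_tsfg_signal(3.0, 30.0)
print(f"signal gain for 2x pump power: {gain:.0f}x")
```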
A measure of the lateral spatial resolution is given in Figure 2(B), which shows the dependence of the TSFG signal when scanning a 0.1 µm BaTiO 3 particle through focus. The full-width half maximum (FWHM) of the resulting profile is 0.45 µm, corresponding to a resolution that is significantly higher than the resolution defined by the IR-diffraction limit. A measure for the axial resolution can be obtained by scanning an oil/glass interface axially through focus at 2820 cm −1 .
The results of such a step-edge scan are shown in Figure 2(C). We observe that the transition from the nonresonant glass signal to the resonant oil signal occurs in about 10 µm. A similar dependence is observed for the CARS signal, underlining the comparable performance of the two techniques. Among other factors, the axial resolution is largely dictated by the microscope objective. An elongated focal volume in the axial dimension is characteristic for laser beams focused by reflective Cassegrain objectives. Although the axial resolution is much lower than what is common for NLO microscopes that use high NA refractive objective lenses, the probing volume is truly three dimensional because of the two-photon probing interaction with ω p . We note that unlike TSFG, regular IR-based microscopes do not exhibit an intrinsic 3D resolution.
Figure 3(A) shows a TSFG image of lipid droplets immersed in water. The neutral lipids are obtained from adult mouse white adipose tissue and thus represent a relevant tissue component. The droplets are clearly observed in the image at 2820 cm −1 , whereas the TSFG signal from water is significantly weaker at this vibrational frequency. When the MIR frequency is tuned to 3005 cm −1 , shown in Figure 3(B), the signal from the droplets drops significantly while the water signal has increased. The droplets now appear with negative contrast relative to an elevated signal from the surrounding medium. The corresponding CARS images are shown in panels 3(C) and 3(D). There is a striking resemblance between the TSFG and CARS images. In both cases the droplets show a clear positive signal when tuned near the symmetric CH 2 stretching mode, and inverted contrast when tuned off-resonance at 3005 cm −1 . This resemblance underlines the similarities between TSFG and CARS, as both signals are coherent, carry vibrationally resonant and nonresonant contributions, and scale as |χ(3)|².
The main difference between the techniques is that TSFG drives IR-active vibrational modes whereas CARS drives Raman-active modes. Note that some of the subtle differences between the image features in the TSFG and CARS images can be attributed to the slight difference between the axial position where the TSFG and CARS signals are maximized. Under the current imaging conditions, the TSFG signal is weaker than the CARS signal. This is reflected in the signal-to-noise ratio (SNR) extracted from the resonant signals of the images shown in Figure 3. The SNR for TSFG is ∼ 6 while the SNR for CARS is twice as high.

Image features of TSFG microscopy
Because TSFG is also closely related to third-harmonic generation (THG), it is perhaps surprising that the TSFG signal from the bulk inside the droplets is strong and that the interface between the droplet and the aqueous medium is not significantly highlighted. In THG imaging, the Gouy phase shift mismatch between the incident field ω and the signal field 3ω is significant [28,29], suppressing the signal from the bulk. At interfaces, phase shifts occur that can offset some of the Gouy phase mismatch, producing far-field THG radiation within the cone angle of the condenser lens. The interface sensitivity of THG is not clearly seen in the TSFG images, indicating that phase-matching conditions are more relaxed in TSFG. [25] Consequently, like CARS, TSFG can be used as a general imaging method that is not significantly affected by interface effects.

TSFG microscopy of biological tissue samples
For imaging of biological tissue samples it is important to reduce the effects of heating of the sample due to MIR absorption, as well as minimize the average power of the pump beam. In the following, we reduce the MIR power to 3 mW and the pump power to 30 mW, and study the feasibility to perform non-destructive imaging of tissue samples. In addition, to relax the imaging conditions, the samples were immersed in a D 2 O-based phosphate buffer. Figure 4(A) shows a TSFG image of a sectioned mouse eyelid skin containing the so-called Meibomian gland. Meibomian glands have previously been studied with CRS microscopy. [30,31] The gland consists of acini that produce the lipid-rich meibum, which is compacted in ductules and collected in a central duct for release through the gland's orifice. The image shows a cross-section through several ductules, which contain a high concentration of compacted wax esters and cholesterol esters. Despite the reduced average powers of the incident beams, the TSFG image clearly identifies the lipid-rich regions in the tissue when the MIR frequency is set at 2820 cm −1 . When tuned off-resonance to 3005 cm −1 , the signal from the meibum-rich areas becomes very faint, emphasizing the vibrational contrast of the TSFG signal. Note that unlike in Fig. 3, the lipid-rich regions appear as positive signals relative to the surrounding matrix. This is due to the much weaker signal of D 2 O compared to H 2 O at 3005 cm −1 . The corresponding CARS images, taken with 1 mW of Stokes light and shown in panels 4(C) and 4(D), mirror the observations seen in TSFG, confirming the lipid origin of the observed TSFG image features.
The TSFG signal from the tissue disappears when either the ω i or ω p beam is blocked, or when the pulse trains are temporally offset. We find that the TSFG signal is spectrally confined to the expected wavelength range (445-450 nm), and no significant emission is observed that is red-shifted from the expected TSFG wavelength. This implies that nonlinearly excited fluorescence does not significantly contribute to the observed signal under the current excitation conditions.
We observe that the signal magnitude, image resolution and contrast rapidly deteriorate as a function of depth of the image plane in the tissue. We obtain good quality images down to ∼ 25 µm in the tissue, but the magnitude of the TSFG signal is insufficient for generating acceptable images deeper into the tissue under the excitation conditions used here. This signal deterioration can be attributed to several factors, including attenuation of the MIR excitation beam and the low quality of the focal volume deeper into the tissue due to scattering. Nonetheless, as shown here, TSFG microscopy produces acceptable images of thin tissue sections, which is a common target in MIR microspectroscopy applications.

Discussion
In this work, we have examined the TSFG signal of samples and materials of biological relevance. The TSFG technique is based on an optical nonlinearity that has not been used before for imaging biological samples. Although electronically resonant and nonresonant versions of the TSFG interaction have been used as contrast mechanisms in NLO microscopy [32,33], a vibrationally sensitive variation of the technique constitutes a new approach to biological imaging. The MIR-based TSFG interaction is exclusively sensitive to IR-active vibrational modes [25], and thus offers a direct NLO analogue of MIR absorption microspectroscopy. An attractive feature of TSFG is that it can be integrated into a laser-scanning NLO microscope. [25] This implies that IR-vibrational signals can be acquired alongside Raman-vibrational signals through CRS on the same imaging platform. We have not used advanced preparation protocols to specifically enhance the TSFG signal. Samples are mounted on a standard microscope glass slide (thickness 1 mm) and covered with a borosilicate coverslip (thickness 0.17 mm). When illuminating the sample from the coverslip side, MIR attenuation is present but acceptable, and imaging can proceed in a practical manner. Since the TSFG signal is in the 450 nm range, it is not significantly attenuated by the thicker microscope glass slide.
The experiments show a clear spectral dependence of the TSFG signal. Similar to the vibrational signatures in CARS, the vibrational lineshapes in TSFG appear dispersive, indicating the involvement of a nonresonant background. The dispersive lineshape is a clear indication that the signal is coherent, i.e. it does not rely on MIR-induced population transfer to a vibrational state followed by an incoherent hyper-Raman anti-Stokes interaction. The latter mechanism would produce dissipative lineshapes, contrary to what is observed in the experiments discussed here.
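The difference between a dispersive and a dissipative lineshape can be illustrated with a toy model. The Python sketch below assumes a single Lorentzian resonance added to a real, frequency-independent nonresonant term. All parameter values (center frequency, linewidth, amplitudes) are hypothetical and chosen only for illustration; they are not fitted to the measured spectra, which contain several overlapping CH-stretching modes.

```python
OMEGA0 = 2850.0    # assumed resonance position, cm^-1
GAMMA = 10.0       # assumed half-width, cm^-1
AMPLITUDE = 10.0   # assumed resonant amplitude (arbitrary units)
CHI_NR = 1.0       # assumed real nonresonant background (arbitrary units)


def chi_resonant(omega: float) -> complex:
    """Single-Lorentzian resonant susceptibility A / (omega0 - omega - i*gamma)."""
    return AMPLITUDE / complex(OMEGA0 - omega, -GAMMA)


def coherent_signal(omega: float) -> float:
    """Coherent signal |chi_NR + chi_R|^2, which yields a dispersive profile."""
    return abs(CHI_NR + chi_resonant(omega)) ** 2


def dissipative_signal(omega: float) -> float:
    """Absorption-like profile Im(chi_R), which peaks at the resonance itself."""
    return chi_resonant(omega).imag


grid = [2700.0 + i for i in range(301)]          # 2700-3000 cm^-1 in 1 cm^-1 steps
peak = max(grid, key=coherent_signal)            # maximum of the coherent profile
dip = min(grid, key=coherent_signal)             # minimum of the coherent profile
absorption_peak = max(grid, key=dissipative_signal)

print(f"coherent peak at {peak:.0f}, dip at {dip:.0f}, "
      f"absorption peak at {absorption_peak:.0f} cm^-1")
```

In this model the coherent signal peaks on the low-frequency side of the resonance and dips on the high-frequency side, whereas the dissipative profile is centered on the resonance. This is qualitatively the behavior described above.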
The TSFG signal of water in the CH-stretching range of the spectrum shows a significant spectral dependence. This indicates the involvement of the resonant excitation of OH-stretching modes, which extend well into this range in the IR absorption spectrum. Nonetheless, the TSFG signal of pure water at 2820 cm −1 is less than the TSFG signal from bulk lipid, enabling the relatively straightforward visualization of lipid materials in aqueous media. Although the TSFG signal itself is not overwhelmed by a water signal, direct absorption of the MIR radiation by water, followed by heating, cannot be avoided. Therefore, similar to any MIR-type measurement, caution should be taken to limit sample heating through absorption by water. Besides heating, absorption also affects the penetration depth. This is especially challenging in the CH-stretching range, the focus of the present work, but should be more relaxed in the fingerprint region where absorption by water is significantly less.
The spatial resolution in TSFG microscopy is determined predominantly by the two-photon interaction of the NIR pump beam, offering an imaging resolution in the 0.5 µm range. This resolution is virtually independent of the MIR wavelength, and is thus constant while scanning the vibrational energy in the imaging experiment. The TSFG process probes the MIR-induced vibrational coherence of the sample, and is not directly sensitive to the thermal population of vibrational states in the material. This implies that the resolution is also independent of light induced heating and the subsequent heat dissipation dynamics. In other words, the resolution should be the same in samples with either slow or fast heat dissipation properties.
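A rough estimate shows why the pump beam dominates the resolution. The Python sketch below makes three simplifying assumptions: the focal spot FWHM is approximated as 0.51 λ/NA, the focal profiles are treated as Gaussians, and the effective excitation profile is taken as the product of the squared pump intensity and the idler intensity. It neglects the central obscuration of the Cassegrain objective and any aberrations, so the numbers are indicative only.

```python
def spot_fwhm_um(wavelength_um: float, na: float) -> float:
    """Approximate FWHM of a diffraction-limited focal spot (0.51 * lambda / NA)."""
    return 0.51 * wavelength_um / na


def tsfg_fwhm_um(pump_fwhm_um: float, idler_fwhm_um: float) -> float:
    """FWHM of I_p^2 * I_i for Gaussian profiles: 1/w^2 = 2/w_p^2 + 1/w_i^2."""
    return (2.0 / pump_fwhm_um ** 2 + 1.0 / idler_fwhm_um ** 2) ** -0.5


NA = 0.65                     # objective NA quoted in the text
PUMP_UM = 1.032               # pump wavelength quoted in the text
IDLER_UM = 1.0e4 / 2820.0     # idler wavelength at 2820 cm^-1

pump_fwhm = spot_fwhm_um(PUMP_UM, NA)
idler_fwhm = spot_fwhm_um(IDLER_UM, NA)
effective_fwhm = tsfg_fwhm_um(pump_fwhm, idler_fwhm)

print(f"MIR spot: {idler_fwhm:.2f} um, pump spot: {pump_fwhm:.2f} um, "
      f"estimated TSFG spot: {effective_fwhm:.2f} um")
```

Under these assumptions the estimated TSFG spot is sub-µm, several times smaller than the MIR spot, and changes only weakly with the MIR wavelength because the pump term dominates the sum.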
The TSFG signals from tissue sections are reasonable and show clear vibrational contrast. The imaging properties of the TSFG modality are very comparable to those of CARS. Images could be acquired at low average powers of the incident beams, ensuring sample integrity during imaging. We found that the TSFG signal was acceptable up to tissue sections with a thickness of ∼ 25 µm. Thicker tissue samples gave rise to deteriorating TSFG signals, both in terms of signal strength and resolution. Adaptive optics approaches may help to improve the image quality at deeper depths, although the penetration depth is determined primarily by absorption rather than by scattering alone. Nonetheless, even without further improvements, TSFG already appears to work well with thin tissue slices, i.e. the type of samples that are of relevance to histopathology.
Among other factors, the image quality is also determined by the focusing properties of the microscope objective. Although the reflective objective enables achromatic focusing of NIR and MIR laser beams, the overall confinement and quality of the focal volume is substantially less compared to a refractive objective. The limited axial resolution of the Cassegrain objective implies that smaller features in the image are lost due to substantial signal integration along the axial dimension. To achieve high quality depth-resolved images similar to images commonly obtained in CRS with refractive objectives, it will be necessary to develop dedicated refractive lens systems that can handle both NIR and MIR light.
The TSFG technique brings some of the advantages of NLO imaging into the realm of IR-based microscopy. It offers the convenience of laser-scanning microscopy, a higher resolution and simplified signal detection. Effectively, it moves the complexity of the experiment from the detection side to the excitation side. The picosecond MIR OPO system used in this work is a reliable and robust source, producing tunable narrow band radiation for selective excitation of vibrational modes. Conceptually, this form of microspectroscopy is similar to discrete frequency IR (DFIR) microscopy carried out with QCL excitation in the fingerprint region. [7] However, compared to an array of QCL modules, the OPO light source has a much larger footprint and requires occasional realignment. Further improvements of the light source will help make TSFG more practical. For instance, the picosecond pulses in our current system are not fully compressed to their Fourier transform limited pulse duration. There is much room for shortening the pump pulse, improving and shortening the output pulses from the OPO, and increasing the TSFG signal efficiency by at least one order of magnitude. These simple measures would reduce the effective image acquisition time and make TSFG more practical and robust. The success of techniques like SFG and TSFG, which rely on the generation of tunable MIR light, is likely to drive further developments of the light source, including reducing its footprint and cost.
Although the work here focuses on vibrational imaging in the CH-stretching range (∼2800-3000 cm −1 ), the real benefits of TSFG are likely found in the fingerprint region (∼800-1800 cm −1 ). The vibrational response of IR-active CH-stretching modes is weak compared to some of the most prominent fingerprint modes. Indeed, whereas in Raman spectroscopy the strongest spectral lines are found in the CH-stretching range for biological materials, in IR-based spectroscopy the strongest modes are located in the fingerprint region. Although the fingerprint vibrational range is outside the tuning range of our MIR OPO system, it is possible to perform an additional difference frequency generation (DFG) step to generate picosecond light of sufficient pulse energy in the 5-10 µm regime. With much higher vibrationally resonant χ(3) values in the fingerprint region, combined with the |χ(3)|² dependence of TSFG, we may expect signals that are up to two orders of magnitude stronger than in the CH-stretching range. The strong fingerprint TSFG signals are likely to outperform CARS in this range, where CARS signals are weak and often overwhelmed by background contributions. Therefore, the fact that TSFG microscopy already produces acceptable images in the challenging CH-stretching range is an encouraging prelude to TSFG imaging in the fingerprint region.
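The scaling argument in this paragraph can be written out explicitly. The expression below is a sketch that assumes the resonant contribution dominates the nonresonant one and that the excitation intensities are unchanged between the two spectral ranges. The symbols I_p and I_i (pump and idler intensities), the labels fp and CH, and the ratio r are introduced here for illustration only.

```latex
I_{\mathrm{TSFG}} \propto \left|\chi^{(3)}\right|^{2} I_{p}^{2}\, I_{i},
\qquad
\frac{I_{\mathrm{TSFG}}^{\mathrm{fp}}}{I_{\mathrm{TSFG}}^{\mathrm{CH}}}
\approx
\left|\frac{\chi^{(3)}_{R,\mathrm{fp}}}{\chi^{(3)}_{R,\mathrm{CH}}}\right|^{2}
= r^{2}
```

For example, a resonant susceptibility that is ten times larger (r = 10) would correspond to a signal that is about one hundred times stronger, i.e. two orders of magnitude.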

Conclusion
TSFG microscopy is a new technique that enables the generation of images with contrast based on the IR-activity of molecular vibrations. The TSFG process relies on a third-order light-matter interaction and is sensitive to the χ (3) properties of the sample. This nonlinear version of the IR microscope offers a high lateral resolution of 0.5 µm and an axial resolution of about 10 µm in its current implementation, thus enabling intrinsic 3D imaging with vibrational sensitivity. The technique can be incorporated into a laser-scanning microscope that uses common visible photodetectors, offering straightforward visualization of biological samples in a manner similar to coherent Raman scattering microscopy. With anticipated improvements of the TSFG approach, fast 3D IR imaging is within reach, and high-resolution mapping of tissues in the fingerprint region is a realistic possibility.