Artifact free time resolved near-field spectroscopy

We report on the first implementation of ultrafast near-field measurements carried out with the transient pseudoheterodyne detection method (Tr-pHD). This method is well suited for efficient and artifact-free pump-probe scattering-type near-field optical microscopy with nanometer-scale resolution. The Tr-pHD technique is critically compared to other data acquisition methods and found to offer significant advantages. Experimental evidence for the advantages of Tr-pHD is provided in the near-IR frequency range. Crucial factors involved in achieving proper performance of the Tr-pHD method with pulsed laser sources are analyzed and detailed in this work. We applied this novel method to femtosecond time-resolved and nanometer spatially resolved studies of the photo-induced effects in the insulator-to-metal transition system vanadium dioxide.
© 2017 Optical Society of America
OCIS codes: (320.0320) Ultrafast optics; (180.4243) Near-field microscopy.
Vol. 25, No. 23 | 13 Nov 2017 | OPTICS EXPRESS 28589 #300935 https://doi.org/10.1364/OE.25.028589 Journal © 2017 Received 27 Jun 2017; revised 22 Sep 2017; accepted 23 Sep 2017; published 3 Nov 2017
References and links
1. D. N. Basov, R. D. Averitt, D. van der Marel, M. Dressel, and K. Haule, “Electrodynamics of correlated electron materials,” Rev. Mod. Phys. 83(2), 471–541 (2011).
2. R. D. Averitt and A. J. Taylor, “Ultrafast optical and far-infrared quasiparticle dynamics in correlated electron materials,” J. Phys. Condens. Matter 14(50), R1357–R1390 (2002).
3. A. H. Zewail, “Femtochemistry: Atomic-Scale Dynamics of the Chemical Bond,” J. Phys. Chem. A 104(24), 5660–5694 (2000).
4. T. Kampfrath, K. Tanaka, and K. A. Nelson, “Resonant and nonresonant control over matter and light by intense terahertz transients,” Nat. Photonics 7(9), 680–690 (2013).
5. J. Orenstein, “Ultrafast spectroscopy of quantum materials,” Phys. Today 65(9), 44–50 (2012).
6. M. Liu, H. Y. Hwang, H. Tao, A. C. Strikwerda, K. Fan, G. R. Keiser, A. J. Sternbach, K. G. West, S. Kittiwatanakul, J. Lu, S. A. Wolf, F. G. Omenetto, X. Zhang, K. A. Nelson, and R. D. Averitt, “Terahertz-field-induced insulator-to-metal transition in vanadium dioxide metamaterial,” Nature 487(7407), 345–348 (2012).
7. D. J. Hilton, R. P. Prasankumar, S. Fourmaux, A. Cavalleri, D. Brassard, M. A. El Khakani, J. C. Kieffer, A. J. Taylor, and R. D. Averitt, “Enhanced Photosusceptibility near Tc for the Light-Induced Insulator-to-Metal Phase Transition in Vanadium Dioxide,” Phys. Rev. Lett. 99(22), 226401 (2007).
8. V. R. Morrison, R. P. Chatelain, K. L. Tiwari, A. Hendaoui, A. Bruhács, M. Chaker, and B. J. Siwick, “A photoinduced metal-like phase of monoclinic VO2 revealed by ultrafast electron diffraction,” Science 346(6208), 445–448 (2014).
9. A. X. Gray, M. C. Hoffmann, J. Jeong, N. P. Aetukuri, D. Zhu, H. Y. Hwang, N. C. Brandt, H. Wen, A. J. Sternbach, S. Bonetti, A. H. Reid, R. Kukreja, C. Graves, T. Wang, P. Granitzka, Z. Chen, D. J. Higley, T. Chase, E. Jal, E. Abreu, M. K. Liu, T. C. Weng, D. Sokaras, D. Nordlund, M. Chollet, H. Lemke, J. Glownia, M. Trigo, Y. Zhu, H. Ohldag, J. W. Freeland, M. G. Samant, J. Berakdar, R. D. Averitt, K. A. Nelson, S. S. P. Parkin, and H. A. Dürr, “Ultrafast THz Field Control of Electronic and Structural Interactions in Vanadium Dioxide,” arXiv:1601.07490 (2016).
10. Z. He and A. J. Millis, “Photoinduced phase transitions in narrow-gap Mott insulators: The case of VO2,” Phys. Rev. B 93(11), 115126 (2016).
11. D. Fausti, R. I. Tobey, N. Dean, S. Kaiser, A. Dienst, M. C. Hoffmann, S. Pyon, T. Takayama, H. Takagi, and A. Cavalleri, “Light-Induced Superconductivity in a Stripe-Ordered Cuprate,” Science 331(6014), 189–191 (2011).
12. J. Zhang, X. Tan, M. Liu, S. W. Teitelbaum, K. W. Post, F. Jin, K. A. Nelson, D. N. Basov, W. Wu, and R. D. Averitt, “Cooperative photoinduced metastable phase control in strained manganite films,” Nat. Mater. 15(9), 956–960 (2016).
13. E. Dagotto, “Complexity in Strongly Correlated Electronic Systems,” Science 309(5732), 257–262 (2005).
14. E. Dagotto, T. Hotta, and A. Moreo, “Colossal Magnetoresistant Materials: The Key Role of Phase Separation,” Phys. Rep. 344(1-3), 1–153 (2001).
15. G. Campi, A. Bianconi, N. Poccia, G. Bianconi, L. Barba, G. Arrighetti, D. Innocenti, J. Karpinski, N. D. Zhigadlo, S. M. Kazakov, M. Burghammer, M. Zimmermann, M. Sprung, and A. Ricci, “Inhomogeneity of charge-density-wave order and quenched disorder in a high-Tc superconductor,” Nature 525(7569), 359–362 (2015).
16. F. Chen, M. Xu, Q. Q. Ge, Y. Zhang, Z. R. Ye, L. X. Yang, J. Jiang, B. P. Xie, R. C. Che, M. Zhang, A. F. Wang, X. H. Chen, D. W. Shen, J. P. Hu, and D. L. Feng, “Electronic identification of the parental phases and mesoscopic phase separation of KxFe2-ySe2 superconductors,” Phys. Rev. X 1(2), 021020 (2011).
17. M. M. Qazilbash, M. Brehm, B. G. Chae, P. C. Ho, G. O. Andreev, B. J. Kim, S. J. Yun, A. V. Balatsky, M. B. Maple, F. Keilmann, H. T. Kim, and D. N. Basov, “Mott Transition in VO2 Revealed by Infrared Spectroscopy and Nano-Imaging,” Science 318(5857), 1750–1753 (2007).
18. A. S. McLeod, E. van Heumen, J. G. Ramirez, S. Wang, T. Saerbeck, S. Guenon, M. Goldflam, L. Anderegg, P. Kelly, A. Mueller, M. K. Liu, I. K. Schuller, and D. N. Basov, “Nanotextured phase coexistence in the correlated insulator V2O3,” Nat. Phys. 13(1), 80–86 (2017).
19. M. K. Liu, M. Wagner, E. Abreu, S. Kittiwatanakul, A. McLeod, Z. Fei, M. Goldflam, S. Dai, M. M. Fogler, J. Lu, S. A. Wolf, R. D. Averitt, and D. N. Basov, “Anisotropic Electronic State via Spontaneous Phase Separation in Strained Vanadium Dioxide Films,” Phys. Rev. Lett. 111(9), 096602 (2013).
20. Y. Zhu, Z. Cai, P. Chen, Q. Zhang, M. J. Highland, I. W. Jung, D. A. Walko, E. M. Dufresne, J. Jeong, M. G. Samant, S. S. P. Parkin, J. W. Freeland, P. G. Evans, and H. Wen, “Mesoscopic structural phase progression in photo-excited VO2 revealed by time-resolved x-ray diffraction microscopy,” Sci. Rep. 6(1), 21999 (2016).
21. M. Liu, A. J. Sternbach, and D. N. Basov, “Nanoscale electrodynamics of strongly correlated quantum materials,” Rep. Prog. Phys. 80(1), 014501 (2017).
22. L. Stojchevska, I. Vaskivskyi, T. Mertelj, P. Kusar, D. Svetin, S. Brazovskii, and D. Mihailovic, “Ultrafast Switching to a Stable Hidden Quantum State in an Electronic Crystal,” Science 344(6180), 177–180 (2014).
23. T. L. Cocker, V. Jelic, M. Gupta, S. J. Molesky, J. A. J. Burgess, G. De Los Reyes, L. V. Titova, Y. Y. Tsui, M. R. Freeman, and F. A. Hegmann, “An ultrafast terahertz scanning tunnelling microscope,” Nat. Photonics 7(8), 620–625 (2013).
24. J. M. Atkin, S. Berweger, A. C. Jones, and M. B. Raschke, “Nano-optical imaging and spectroscopy of order, phases, and domains in complex solids,” Adv. Phys. 61(6), 745–842 (2012).
25. M. Wagner, A. S. McLeod, S. J. Maddox, Z. Fei, M. Liu, R. D. Averitt, M. M. Fogler, S. R. Bank, F. Keilmann, and D. N. Basov, “Ultrafast Dynamics of Surface Plasmons in InAs by Time-Resolved Infrared Nanospectroscopy,” Nano Lett. 14(8), 4529–4534 (2014).
26. G. X. Ni, L. Wang, M. D. Goldflam, M. Wagner, Z. Fei, A. S. McLeod, M. K. Liu, F. Keilmann, B. Özyilmaz, A. H. Castro Neto, J. Hone, M. M. Fogler, and D. N. Basov, “Ultrafast optical switching of infrared plasmon polaritons in high-mobility graphene,” Nat. Photonics 10(4), 244–247 (2016).
27. M. Wagner, Z. Fei, A. S. McLeod, A. S. Rodin, W. Bao, E. G. Iwinski, Z. Zhao, M. Goldflam, M. Liu, G. Dominguez, M. Thiemens, M. M. Fogler, A. H. Castro Neto, C. N. Lau, S. Amarie, F. Keilmann, and D. N. Basov, “Ultrafast and nanoscale plasmonic phenomena in exfoliated graphene revealed by infrared pump-probe nanoscopy,” Nano Lett. 14(2), 894–900 (2014).
28. M. Eisele, T. L. Cocker, M. A. Huber, M. Plankl, L. Viti, D. Ercolani, L. Sorba, M. S. Vitiello, and R. Huber, “Ultrafast multi-terahertz nano-spectroscopy with sub-cycle temporal resolution,” Nat. Photonics 8(11), 841–845 (2014).
29. M. A. Huber, F. Mooshammer, M. Plankl, L. Viti, F. Sandner, L. Z. Kastner, T. Frank, J. Fabian, M. S. Vitiello, T. L. Cocker, and R. Huber, “Femtosecond photo-switching of interface polaritons in black phosphorus heterostructures,” Nat. Nanotechnol. 12(3), 207–211 (2017).
30. M. A. Huber, M. Plankl, M. Eisele, R. E. Marvel, F. Sandner, T. Korn, C. Schüller, R. F. Haglund, Jr., R. Huber, and T. L. Cocker, “Ultrafast Mid-Infrared Nanoscopy of Strained Vanadium Dioxide Nanobeams,” Nano Lett. 16(2), 1421–1427 (2016).
31. S. A. Dönges, O. Khatib, B. T. O’Callahan, J. M. Atkin, J. H. Park, D. Cobden, and M. B. Raschke, “Ultrafast Nanoimaging of the Photoinduced Phase Transition Dynamics in VO2,” Nano Lett. 16(5), 3029–3035 (2016).
32. F. Kuschewski, S. C. Kehr, B. Green, Ch. Bauer, M. Gensch, and L. M. Eng, “Optical nanoscopy of transient states in condensed matter,” Sci. Rep. 5(1), 12582 (2015).
33. H. Wang, L. Wang, and X. G. Xu, “Scattering-type scanning near-field optical microscopy with low-repetition-rate pulsed light source through phase-domain sampling,” Nat. Commun. 7, 13212 (2016).


Introduction
Ultrafast optical techniques provide access to processes that occur with extreme rapidity, enabling novel routes to control and interrogate the complex energy landscapes of materials at the focus of modern condensed matter physics [1,2]. Ultrafast techniques have provided insights into coherent motions at atomic length scales [3], excitation or interrogation of selective electronic, lattice, spin or magnetic modes [4–6], and domain growth [7]. In materials where multiple degrees of freedom compete, ultrafast studies have allowed researchers to identify the degrees of freedom associated with emergent phenomena [8–10]. Additionally, ultrashort light pulses have granted access to hidden states of matter [11,12], creating novel opportunities for material discovery and control.
In the case of quantum materials with strong electronic correlations, spatial complexity across phase transition boundaries demands that measurements be performed with nanometric spatial resolution [13]. This spatially resolved approach is needed to map phase inhomogeneities, which are thought to play a fundamental role in the emergent behavior of a broad class of quantum materials including, but not limited to, colossal magnetoresistance manganites [14], Cu- and Fe-based high-Tc superconductors [15,16], and transition metal oxides [17–19]. Merging ultrafast techniques with nanometer spatial resolution both brings the unique merits of ultrafast measurements to the nanoscale and enables the exploration of connections between spatial and temporal responses at extremely small time and length scales [20–23]. It is therefore imperative to develop advanced tools for time-resolved investigation at the nanoscale.
Scattering-type near-field optical microscopy (s-SNOM) is well suited for optical spectroscopy and imaging at 10-20 nm length scales. The spatial resolution afforded by this method is independent of the wavelength of the radiation used [24]. A number of works in which nano-Fourier transform infrared (nano-FTIR) spectroscopy [25–29] and electro-optic sampling (EoS) [28] were used have provided a robust demonstration of the potential to couple ultrafast lasers to s-SNOMs, successfully circumventing the diffraction limit and gaining access to time-resolved information at the nanoscale. Recent results have also shown the strong potential to perform rapid time-resolved nano-imaging with s-SNOM [30,31], which enables a detailed exploration of the role of inhomogeneities across quantum phase transitions in complex materials. All these results have demonstrated that ultrafast s-SNOM is a powerful technique with a bright future [25–33]. We also remark that spatially and temporally resolved measurements utilizing scanning tunneling microscopy (STM) have been presented [22,23].
One potential difficulty in s-SNOM measurements is that a large contribution from background radiation accompanies the desired near-field signals. Decades of experience with s-SNOMs gained by the nano-optics community have identified experimental practices that allow one to suppress the background and thereby acquire genuine, artifact-free near-field data. One potent approach for eliminating the impact of background radiation is the pseudoheterodyne detection (pHD) method [34]. However, the pHD method has yet to be adapted to pulsed laser sources. In this work we demonstrate that the pHD method is compatible with pulsed laser sources. Based on extensive analysis and modeling we conclude that pHD acquisition is imperative, at least in the specific cases that are detailed in Appendix C. We then present results for a prototypical insulator-to-metal transition system, VO2, gathered with Tr-pHD with a probe wavelength near 1.5 μm and a pump wavelength near 780 nm. These data are free from the adverse influence of background radiation and set the stage for future spatio-temporal exploration of quantum materials at the nanoscale.

Overview of Time-Resolved Near-Field Techniques
The aim of this work is to develop a framework for time-resolved near-field measurements that are guaranteed to be artifact free. We begin by discussing the experimental components that are needed to gain access to genuine time-resolved near-field information, Fig. 1(a).
The centerpiece is an AFM with the tip of the cantilever illuminated with infrared lasers. The metallic AFM probe is polarized by incident light and, together with its mirror image in the sample, generates an evanescent electric field that is confined to the radius of curvature of the tip (10-20 nm), a feat that stems from the near-field coupling between the tip and the sample. The AFM tip is then re-polarized by the tip-sample interaction and radiation is scattered into the far-field [24,35]. This radiation, which contains background radiation as well as radiation produced by the near-field interaction, is then sent to a detector. The backscattered radiation from the AFM is usually superimposed with light from a reference arm in order to form an interferometric receiver. Interferometry can be used to eliminate the multiplicative contribution of diffraction-limited background radiation and provide phase information, as will be detailed below [34,36,37]. Since the voltage generated by common detectors, $u$, is proportional to the light intensity rather than its electric field, we consider the square of the sum of all electric fields:
$$u \propto \left|E_{ref} + E_{BG} + E_{NF}\right|^{2}, \tag{1}$$
where $E_{ref}$, $E_{BG}$, and $E_{NF}$ are the electric field phasors, respectively, of the reference arm, the background contribution, and the radiation scattered from the near-field. To experimentally eliminate terms that do not contain near-field information, the well-established tapping technique [38] is commonly used. Within this approach, all terms that are not proportional to $E_{NF}$ can be made negligible (Appendix A). Thus, when the tapping technique is used the detected intensity contains only the terms:
$$u \propto \left(E_{BG}^{*}E_{NF} + E_{ref}^{*}E_{NF} + \mathrm{c.c.}\right) + \left|E_{NF}\right|^{2}. \tag{2}$$
Typically, the amplitudes of the electric field phasors from the background and reference arm are orders of magnitude larger than that from the near-field. Thus the last term in Eq. (2), which is quadratic in the near-field amplitude, is negligible and will not be considered in the remainder of this paper. If no reference arm is used, only one term is measured (Appendix C), which is generally referred to as the self-homodyne detection (sHD) method:
$$u^{sHD} \propto E_{BG}^{*}E_{NF} + \mathrm{c.c.} = 2\left|E_{BG}\right|\left|E_{NF}\right|\cos(\phi_{NF}-\phi_{BG}), \tag{3}$$
where $\phi_{NF}$ and $\phi_{BG}$ are the phases of the near-field and background phasors. Data acquired within the sHD method are proportional to the amplitude of the background electric field and coupled non-linearly to the background phase. These contributions introduce the so-called "multiplicative background" contribution to the sHD signal (Section 3, Appendix C). It is immediately clear from an examination of Eq. (3) that more advanced methods are required to eliminate the contribution of background radiation. We provide a qualitative discussion, as well as quantitative estimates, of the influence of background radiation in Appendices A and C.
If the reference arm is added, we are left with:
$$u^{HD} \propto 2\left|E_{BG}\right|\left|E_{NF}\right|\cos(\phi_{NF}-\phi_{BG}) + 2\left|E_{ref}\right|\left|E_{NF}\right|\cos(\phi_{NF}-\phi_{ref}), \tag{4}$$
which is often described as the homodyne detection (HD) method, with $\phi_{ref}$ the phase of the reference phasor. The HD signal is background-free provided the amplitude of the reference field is much stronger than that of the background field, $|E_{ref}| \gg |E_{BG}|$ [39]. To totally eliminate the multiplicative background contribution, the so-called pseudoheterodyne detection (pHD) method (Sections 5, 6) has been devised, which leaves only the term [34]:
$$u^{pHD} \propto 2\left|E_{ref}\right|\left|E_{NF}\right|\cos(\phi_{NF}-\phi_{ref}). \tag{5}$$
Furthermore, as we will detail in Section 5, it is possible to extract both the amplitude and phase of radiation from the near-field in the pHD method. Results generated using pHD are not influenced by background radiation. By raster scanning the sample while keeping the positions of the AFM and optics fixed, one is able to extract the signal from sHD, HD, or pHD on a pixel-by-pixel basis and construct an image.
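The role of the individual interference terms can be illustrated with a short numerical sketch. The snippet below is only an illustration under stated assumptions: the phasor amplitudes and phases are hypothetical values chosen for the example, and the function `detected_terms` is not part of any instrument software. It evaluates the near-field cross terms of the squared field sum and shows that the background term (sHD) changes when the background phase changes, whereas the reference term (pHD) does not.

```python
import cmath
import math

def detected_terms(e_ref, e_bg, e_nf):
    """Near-field cross terms of |E_ref + E_BG + E_NF|^2 (arbitrary units).

    The term quadratic in E_NF is dropped, as in the text.
    """
    bg_term = 2.0 * (e_bg.conjugate() * e_nf).real    # background x near-field
    ref_term = 2.0 * (e_ref.conjugate() * e_nf).real  # reference x near-field
    return {"sHD": bg_term, "HD": bg_term + ref_term, "pHD": ref_term}

# Hypothetical phasors: a weak near-field, a stronger background,
# and an even stronger reference field.
E_NF = cmath.rect(1e-3, 0.4)
E_REF = cmath.rect(1.0, 0.0)

for bg_phase in (0.0, math.pi / 2):
    E_BG = cmath.rect(0.1, bg_phase)
    terms = detected_terms(E_REF, E_BG, E_NF)
    print(f"background phase {bg_phase:.2f} rad:", terms)
```

In this toy model the HD value still contains the background term, which is why the HD signal is only approximately background-free when the reference field dominates.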
In order to gain access to time-resolved information we use pulsed laser sources, Fig. 1(a). Radiation from one channel is sent to the AFM to probe the sample's momentary state in the near-field (purple in Fig. 1(a)). A second illumination channel is used to pump (or perturb) the sample at a well-controlled time delay, $\Delta t_{ps}$, preceding the probing event (red in Fig. 1(a)); the role of the pump is to transiently alter the state of the sample. We utilize two digital boxcars [40] to measure the pump-induced change of the near-field signal (Section 4; Appendix B) by simultaneously collecting the signals $s_{R}^{X}$ just before, and $s_{P}^{X}$ at $\Delta t_{ps}$ after, the pump pulses. We then plot their normalized difference, which is non-zero only if the pump transiently modifies the response of the sample at a given pixel. In our notation, the superscript $X$ designates the method of time-resolved detection, i.e. pHD for pseudoheterodyne detection, sHD for self-homodyne detection, and HD for homodyne detection.

Time-Resolved Near-Field Studies of VO2
In this work we investigated thin films of vanadium dioxide (VO2), a correlated electron material that undergoes an insulator-to-metal transition (IMT) above room temperature. The highly oriented VO2 films on [001]$_R$ TiO2 substrates, as well as polycrystalline samples on Al2O3 substrates, were fabricated by the pulsed-laser deposition method; details of thin film fabrication and characterization have been reported elsewhere [17,41]. Static near-field imaging works have shown that VO2 experiences a percolative phase transition with coexisting insulating and metallic states amidst the IMT [17,19,41]. The transition temperature of VO2 films can be tuned by epitaxial strain [42]. In general, compressive (tensile) strain along $c_R$ yields a $T_{IMT}$ lower (higher) than in bulk [41–43]. Bulk crystals and unstrained polycrystalline films on sapphire substrates usually have an IMT close to $T_{IMT}$ = 340 K. Films on [001]$_R$ TiO2 substrates are compressively strained along $c_R$, leading to $T_{IMT}$ < 340 K [41]. Topographic corrugations, or "buckles", locally relieve the strain in samples grown on TiO2 [001]$_R$. This creates a gradual increase in $T_{IMT}$ in mesoscopic regions in the proximity of the center of the buckles. The mid-infrared optical response of the highly inhomogeneous IMT in VO2 films grown on Al2O3 has been previously characterized with static s-SNOM [17]. The character of emergent domains can be classified as being in the random field Ising universality class, at least in a narrow temperature range surrounding the IMT [44]. Data obtained by ultrafast [8–10,45] and nanoscale [17] methods have also provided insights into the long-standing debate regarding the roles of electronic or structural effects in the IMT, possibly revealing the existence of a monoclinic metallic state [46]. Pioneering studies of nanoscale dynamics in VO2 have recently been published [30,31].
In the present work we have investigated VO2 thin films using an s-SNOM that we have adapted for transient pump-probe experiments. In all measurements presented in this manuscript, near-infrared radiation with 1.55 μm center wavelength and 15 nm bandwidth was used as the probe at a repetition rate of 600 kHz. For the pumping channel we used 1 mW of 780 nm radiation at a repetition rate of 300 kHz. The data were collected at a time delay of $\Delta t_{ps}$ = 300 ps, which is much longer than the approximately 100 fs duration of the pump and probe pulses. This late time delay was chosen because the peak pump-induced change in reflectivity observed by Hilton et al. [7] was not fully formed until $\Delta t_{ps} \approx 300$ ps. Figures 1(b) and 1(c) show Tr-sHD and Tr-pHD data, respectively, collected at the same time delay. With Tr-sHD a clear contrast is observed along buckles in our film, Fig. 1(b), whereas this is not the case in data taken on a similar region of the VO2 thin film using Tr-pHD under identical pumping conditions, Fig. 1(c). We emphasize that no pump-induced features above our noise floor are observed in the data in Fig. 1(c). A detailed analysis of the results generated with the Tr-sHD method, including the response displayed in Fig. 1, is presented in Appendix C.
Fig. 1. a) The ultrafast probe beam (purple) is focused onto the apex of an AFM probe at a precise time delay following a perturbation caused by a second ultrafast pump beam (red). Static infrared image collected with the Tr-pHD method using the 5th harmonic of the tip-tapping frequency with a pulsed laser source, obtained on a representative 10x10 μm² region. This image reveals metallic regions (gold) due to the compressive strain of the substrate as well as insulating regions (blue), where the film is strain relieved. b) Tr-sHD results obtained on the VO2/TiO2 [001] sample in a 5x5 μm² region at the pump-probe time delay $\Delta t_{ps}$ = 300 ps. c) Tr-pHD results obtained on the VO2/TiO2 [001] sample in a 5x5 μm² region at the same time delay as in panel (b).
Fig. 2. a) sHD method. b) HD method, where a reference arm is added. c) pHD method, where the reference arm position is modulated at a frequency M. d-f) Signals acquired using the detection methods in a-c. d) sHD signal, which shows peaks at high harmonics of the tip tapping frequency, nΩ. e) HD signal, which shows that the magnitudes of the peaks at nΩ are enhanced. f) pHD signal, which shows that the peak at nΩ has returned to its sHD value. Additional peaks appear at the sum and difference frequencies between the high harmonics of the tip tapping frequency and the reference arm modulation, nΩ ± NM. g) Schematic of the pulses involved and relevant time scales. In the schematic we show the individual pump (red), probe (purple) and reference (blue) pulses on the femtosecond timescale. A much longer time delay, $\Delta t_{ss}$, which is the inverse of the repetition rate of the laser system, is indicated by the dashed line. The dashed line separates the first (ON) event, where both the pump and probe pulses arrive at the sample, and a second (OFF) event, where only the probe pulse arrives at the sample. This process is periodically repeated, and data are collected by separately integrating the detected voltage from many ON and OFF events. In the case of the HD and pHD methods, radiation in the reference arm (blue) temporally overlaps with the probe radiation. In the case of the pHD method, the time delay between reference and probe light, $\Delta t_{rs}$, is modulated sinusoidally at a frequency M.

Methods for Time-Resolved Near-Field Detection
We now proceed to develop a detailed comparison between the aforementioned data-acquisition protocols. Within the sHD method, Fig. 2(a), one utilizes a 50/50 beam splitter to collect back-scattered radiation from the AFM probe (Neaspec GmbH) and guide this radiation to a detector. The AFM probe is tapped at a frequency Ω in the immediate proximity of the sample, which creates observable peaks at nΩ (with n = 1, 2, 3, etc.) when the detected signal is plotted against frequency, Fig. 2(d). The HD method, Fig. 2(b), requires the addition of a reference arm configured in an asymmetric Michelson interferometer scheme. The path length difference between the reference arm and the backscattered radiation from the AFM probe is set to zero such that both pulse trains interfere constructively, which enhances the signal at nΩ, Fig. 2(e). The pHD method requires the path-length difference between the reference arm and sample to be modulated at a second frequency, M, Fig. 2(c). In pHD, the signal contained in the peaks at nΩ is partly transferred to sidebands that are offset from the tapping harmonics by multiples of the reference modulation frequency, at nΩ ± NM (where N = 1, 2, etc.), Fig. 2(f). The suppression of far-field background is strongly affected by the choice of imaging method, as qualitatively outlined in Section 2 and quantitatively modeled in Appendix C.
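The appearance of sidebands at nΩ ± NM can be reproduced with a minimal numerical sketch. The model below is an assumption made for illustration, not the authors' analysis: it keeps only the interference term between one tapping harmonic of the near-field and the reference field, uses arbitrary hypothetical frequencies (400 tapping cycles and 16 modulation cycles per record), and sets the phase-modulation depth `gamma` to zero for HD and to 2.63 for pHD.

```python
import math

def detector_signal(gamma, n_samples=4096, tap_cycles=400, mod_cycles=16,
                    harmonic=2, dphi=0.6):
    """Toy interference term cos(n*Omega*t) * cos(dphi - gamma*cos(M*t)).

    gamma is the phase-modulation depth of the reference arm (0 for HD);
    all frequencies are given as integer cycles per record (hypothetical).
    """
    out = []
    for k in range(n_samples):
        theta = 2.0 * math.pi * k / n_samples
        near_field = math.cos(harmonic * tap_cycles * theta)
        out.append(near_field * math.cos(dphi - gamma * math.cos(mod_cycles * theta)))
    return out

def fourier_amplitude(signal, cycles):
    """Amplitude of the Fourier component with `cycles` cycles per record."""
    n = len(signal)
    re = sum(v * math.cos(2.0 * math.pi * cycles * k / n) for k, v in enumerate(signal))
    im = sum(v * math.sin(2.0 * math.pi * cycles * k / n) for k, v in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

carrier = 2 * 400  # n*Omega in cycles per record
for label, gamma in (("HD", 0.0), ("pHD", 2.63)):
    sig = detector_signal(gamma)
    peaks = [fourier_amplitude(sig, carrier + shift) for shift in (-32, -16, 0, 16, 32)]
    print(label, [round(p, 4) for p in peaks])
```

With the modulation switched off, all of the signal sits at nΩ; with the modulation on, part of it moves to the first and second sidebands on both sides of the carrier.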
In Fig. 2(g) we show a schematic representation of the pulses involved and indicate the relevant time scales. To attain the highest possible signal-to-noise ratio of the transient component of the near-field signal, we adapted a boxcar-based approach [41] to time-resolved s-SNOM measurements. In this approach one utilizes a pair of probe pulses. The first (ON) probe pulse follows the pumping event at time delay $\Delta t_{ps}$, marked by the red arrow in Fig. 2(g). This pulse provides a signal associated with the pump-induced state of the sample at a time delay $\Delta t_{ps}$. An electro-optic or acousto-optic modulator is used to eliminate the following pump pulse: an OFF event in our notation. The OFF probe pulse, therefore, arrives at a much later time delay, $\Delta t_{ps} + \Delta t_{ss}$, after the pumping event. Provided the sample has recovered its unperturbed state at $\Delta t_{ps} + \Delta t_{ss}$, the OFF signal contains information about the sample's unperturbed steady state. The intensities from both ON and OFF probe pulses are measured in a photoreceiver, whose response time is faster than the wait time between probe pulses, $\Delta t_{ss}$ (Appendix B), and the output is electronically integrated with a digital boxcar (Zurich UHF-BOX). This process is repeated periodically and the integrated intensity values are registered as discrete data points at half of the repetition rate of the laser system. Standard lock-in demodulation of the boxcar outputs yields the tapping harmonics of both ON and OFF probe pulses, provided that the repetition rate is sufficiently fast to satisfy the Nyquist criterion (Appendix B). The difference in the voltages demodulated from the ON and OFF pulses yields the information on reversible pump-induced changes to the sample.
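The ON/OFF boxcar scheme described above can be sketched numerically as follows. This is a simplified illustration under explicit assumptions: the tapping frequency, the demodulation harmonic, the constant background level, and the 2% pump-induced change are hypothetical values, each boxcar output is modeled as one number per probe pulse, and normalizing the difference by the OFF signal is one possible choice. Only the 600 kHz probe repetition rate is taken from the text.

```python
import math

F_REP = 600e3        # probe repetition rate (Hz), from the text
TAP = 30e3           # hypothetical tapping frequency (Hz)
HARMONIC = 3         # hypothetical demodulation harmonic n
PUMP_CHANGE = 0.02   # hypothetical pump-induced change of the near-field amplitude

def boxcar_values(n_pulses):
    """One integrated value per probe pulse; even pulses are ON, odd pulses are OFF."""
    values = []
    for k in range(n_pulses):
        t = k / F_REP
        amplitude = 1.0 + PUMP_CHANGE if k % 2 == 0 else 1.0
        # 0.5 stands in for an unmodulated background level.
        values.append((t, 0.5 + amplitude * math.cos(2.0 * math.pi * HARMONIC * TAP * t)))
    return values

def lock_in(samples, freq):
    """Demodulated amplitude of time-stamped samples at frequency freq."""
    n = len(samples)
    x = sum(v * math.cos(2.0 * math.pi * freq * t) for t, v in samples)
    y = sum(v * math.sin(2.0 * math.pi * freq * t) for t, v in samples)
    return 2.0 * math.hypot(x, y) / n

def transient_signal(n_pulses=6000):
    """Normalized ON/OFF difference of the demodulated near-field amplitude."""
    freq = HARMONIC * TAP
    stream_rate = F_REP / 2.0  # each stream is sampled at half the repetition rate
    if freq >= stream_rate / 2.0:
        raise ValueError("demodulation frequency violates the Nyquist criterion")
    data = boxcar_values(n_pulses)
    on, off = data[0::2], data[1::2]
    s_on, s_off = lock_in(on, freq), lock_in(off, freq)
    return (s_on - s_off) / s_off

print("recovered pump-induced change:", transient_signal())
```

Because both streams are demodulated in the same way, any contribution that is common to the ON and OFF pulses drops out of the difference.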
Interferometric detection with a pulsed laser requires that the reference pulses temporally overlap with those from the sample on the detector. In Tr-HD, this is accomplished by using a micrometer stage to minimize the temporal mismatch between the tip-sample and reference arms, $\Delta t_{rs}$ in Fig. 2(g), which places their interference at a constructive maximum. In Tr-pHD, the temporal mismatch between probe and reference arms, $\Delta t_{rs}$, is first minimized and then modulated sinusoidally at a frequency M [34]. By detecting the intensity at sidebands of the near-field signal, nΩ ± NM, the Tr-pHD signal collects interference terms between the near-field and reference arm (Eq. (5)), thereby eliminating the multiplicative background contribution. In the limit that nΩ is sufficiently high to render the additive background contribution negligible (Appendix A), results collected with Tr-pHD are guaranteed to be background free.

Pseudoheterodyne Detection for Artifact-Free Time-Resolved Near-Field Imaging
In a static setting, pHD has been reliably used in a wide array of nano-infrared experiments over the past decade [17–19,34,47]. To determine if the pHD method is compatible with pulsed laser sources and time-resolved measurements, we proceed to discuss the quantitative details of the transient pseudoheterodyne method (Tr-pHD).
The first main benefit of the Tr-pHD method is that the influence of background radiation is eliminated, as shown in Section 2. The second main benefit of Tr-pHD is that the amplitude and phase of the scattered field from the tip-sample interaction can be simultaneously extracted, analogously to the case of CW laser sources, as shown in Section 5 [34]. The finite temporal duration of an ultrafast laser pulse, however, implies that there is a finite bandwidth associated with the pulse train. In the case of broadband laser sources, each frequency component of the detected signal is characterized by its own amplitude and phase. It is therefore prudent to examine the extraction of the pHD amplitudes and phases in the case of broadband pulsed laser sources.
In order to evaluate what signals are recorded by a lock-in amplifier, we consider the detected intensity. Note that the full expression for the detected intensity, when a pulsed laser source is used, comprises three periodic events: (1) the tip-tapping motion, (2) the reference arm phase modulation, and (3) the arrival of laser pulses. We show in Appendix B that the periodic train of laser pulses may be neglected in the demodulation of Tr-pHD signals. Therefore, we simply need to evaluate the appropriate Fourier expansion coefficients for the reference arm and tip-sample interaction [34]. The Fourier expansion coefficient of the electric field scattered from the tip-sample system at the $n$-th tapping harmonic is $E_{NF,n}(\omega) = |E_{NF,n}(\omega)|e^{i\phi_{NF,n}(\omega)}$, where we emphasize the dependence of the electric field phasor on optical frequency $\omega$ [28]. To get the Fourier coefficient of the reference arm in the frequency domain we note that a sinusoidal variation in $\Delta t_{rs}$, Fig. 2(g), implies that the spectral phase is modulated as $\omega a_m \cos(Mt)$ according to the Fourier shift theorem, where $a_m$ is the amplitude of the modulation of $\Delta t_{rs}$:
$$E_{ref}(\omega,t) = |E_{ref}(\omega)|e^{i\phi_{ref}(\omega)}e^{i\omega a_m\cos(Mt)} = |E_{ref}(\omega)|e^{i\phi_{ref}(\omega)}\sum_{N=-\infty}^{\infty} i^{N}J_{N}(\omega a_m)e^{iNMt}. \tag{6}$$
We used the Jacobi-Anger expansion in the second half of Eq. (6) to expand the reference arm electric field in terms of harmonics of the reference mirror modulation frequency. The Fourier expansion coefficient of the reference arm may be read directly from Eq. (6) as $E_{ref,N}(\omega) = |E_{ref}(\omega)|J_{N}(\omega a_m)e^{i(\phi_{ref}(\omega)+N\pi/2)}$. Finally, as justified in Appendix B, when the detected voltage is demodulated at frequency nΩ ± NM the output is proportional to the product of the expansion coefficients, $u_{n,N} \propto E_{NF,n}E_{ref,N}^{*} + \mathrm{c.c.}$ In the case of a continuous wave (CW) laser, we evaluate $u_{n,N}$ at a single frequency, $\omega_0$:
$$u_{n,N} \propto |E_{ref}(\omega_0)|\,|E_{NF,n}(\omega_0)|\,J_{N}(\omega_0 a_m)\cos\left(\phi_{NF,n}-\phi_{ref}-\frac{N\pi}{2}\right). \tag{7}$$
Equation (7) is identical to the formula derived by Ocelic et al. for the detected voltage demodulated at frequency nΩ ± NM in [34]. This equation can be further simplified when the first- and second-order Bessel functions are equal, i.e. when $J_1(\omega_0 a_m) = J_2(\omega_0 a_m)$, which happens with $\omega_0 a_m = 2.63$. This condition is satisfied by setting the reference mirror's physical oscillation amplitude to $\Delta l = c a_m/2 = 2.63\,\lambda_0/(4\pi)$, where $\lambda_0 = 2\pi c/\omega_0$. The amplitude of the near-field signal is then recovered by taking:
$$|E_{NF,n}| \propto \sqrt{u_{n,1}^{2} + u_{n,2}^{2}}. \tag{8}$$
Likewise, the phase can be recovered with:
$$\phi_{NF,n} - \phi_{ref} = \mathrm{atan2}\left(u_{n,1},\,-u_{n,2}\right), \tag{9}$$
where atan2 is the four-quadrant inverse tangent. This procedure provides a reliable method for extracting the near-field amplitude and phase in a static setting. We now proceed to analyze the pHD method for signal recovery in measurements employing a broadband pulsed laser. The condition $J_1(\omega a_m) = J_2(\omega a_m)$ cannot be simultaneously satisfied for all $\omega$ due to the finite bandwidth of the pulse train. To determine the voltage recovered with a broadband source, one must average over the frequency content of the laser, weighted by the frequency-dependent response function of the detector element, denoted here $R(\omega)$, for a particular choice of $a_m$ [28]:
$$u_{n,N} \propto \int d\omega\, R(\omega)\,|E_{ref}(\omega)|\,|E_{NF,n}(\omega)|\,J_{N}(\omega a_m)\cos\left(\phi_{NF,n}(\omega)-\phi_{ref}(\omega)-\frac{N\pi}{2}\right). \tag{10}$$
A cursory inspection of Eq. (10) reveals that the finite bandwidth of the laser source introduces several complications. The near-field information collected is necessarily averaged over the bandwidth of the laser pulse, weighted by the spectral content of the detected light and further altered by the spectral responsivity of the detector. The detected response is also weighted by the Bessel functions $J_N(\omega a_m)$, which is a specific consequence of the pHD method.
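The static recovery procedure can be checked with a minimal sketch. Everything in it is an illustrative assumption rather than the authors' implementation: the reference phase is modulated as gamma*cos(Mt), only the reference and near-field interference term of one tapping harmonic is simulated, the frequencies (in cycles per record), the amplitude 1.7 and the phase 2.2 rad are hypothetical, and the mirror-amplitude helper assumes a normal-incidence mirror so that a displacement Δl changes the path by 2Δl. The Bessel functions are evaluated from Bessel's integral so that only the standard library is needed.

```python
import math

def bessel_j(order, z, steps=400):
    """J_order(z) from Bessel's integral (1/pi) * int_0^pi cos(order*tau - z*sin(tau)) dtau."""
    h = math.pi / steps
    total = sum(math.cos(order * (i + 0.5) * h - z * math.sin((i + 0.5) * h))
                for i in range(steps))
    return total * h / math.pi

def modulation_depth(lo=2.0, hi=3.0):
    """Bisection for the depth at which J1 = J2 (close to 2.63)."""
    diff = lambda z: bessel_j(1, z) - bessel_j(2, z)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def mirror_amplitude(wavelength, gamma):
    """Mirror oscillation amplitude for phase depth gamma (normal-incidence assumption)."""
    return gamma * wavelength / (4.0 * math.pi)

def recover(amplitude, dphi, gamma, n_samples=4096, carrier=800, mod=16):
    """Simulate the interference term and recover amplitude and phase from two sidebands."""
    u1 = u2 = 0.0
    for k in range(n_samples):
        theta = 2.0 * math.pi * k / n_samples
        signal = amplitude * math.cos(carrier * theta) * math.cos(dphi - gamma * math.cos(mod * theta))
        u1 += signal * math.cos((carrier + mod) * theta)      # first sideband
        u2 += signal * math.cos((carrier + 2 * mod) * theta)  # second sideband
    u1, u2 = 2.0 * u1 / n_samples, 2.0 * u2 / n_samples
    return math.hypot(u1, u2) / bessel_j(1, gamma), math.atan2(u1, -u2)

GAMMA = modulation_depth()
print("modulation depth:", GAMMA)
print("mirror amplitude at 1.55 um (m):", mirror_amplitude(1.55e-6, GAMMA))
print("recovered amplitude and phase:", recover(1.7, 2.2, GAMMA))
```

The test phase of 2.2 rad lies in the second quadrant, which is where a four-quadrant inverse tangent differs from a plain arctangent of the ratio.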
We will now analyze the uncertainty of both amplitude and phase measurements within the pHD method by considering the worst-case error. To do this, we calculated the pHD amplitude measured with a broadband source, Eq. (10), normalized to the monochromatic value: In our calculations, shown in Fig. 3, we make several simplifying assumptions that yield a worst-case estimate of the error. We consider the case in which the spectral field is a box function in the frequency domain, shown in Fig. 3(a), whose inverse Fourier transform has a pulsed nature (inset). We also neglect second-order, and higher, spectral modifications to the phase so that the reference-arm phase is a constant. In Fig. 3(b) we plot the normalized pHD amplitude, Eq. (11), which is measured with a pulsed laser source, as a function of the ratio of the laser bandwidth to the center frequency, or relative bandwidth, $\Delta\omega/\omega_c$. We note that there are two shortcomings when pHD is used with a pulsed laser source. These are: (1) the amplitude that is recovered with a broadband laser source is less than what would be recovered with a monochromatic laser source; (2) the pHD amplitude recovered from a pulsed laser depends on the relative difference between the near-field phase of the sample and that from the reference arm. However, we observe that both of the aforementioned shortcomings are drastically reduced provided narrowband laser sources are used, as is the case with most experiments. We proceed to quantify the extent to which the bandwidth of the pulse train may contaminate a measurement of the near-field amplitude. In Fig. 3(b), for relative bandwidths less than 0.05 the error is less than 0.5% and will not be observable in most near-field measurements. We therefore conclude that the error remains below typical noise levels in near-field experiments when narrowband laser sources are used. The results presented in Fig. 3 show that the near-field amplitude and phase can be reliably recovered with pHD when a narrowband pulsed laser source is used.
By incorporating a second illumination channel to pump the sample, background-free time-resolved measurements of the near-field amplitude and phase may be carried out (red signal in Figs. 1(a) and 2(g)). When the boxcar approach is used to extract the Tr-pHD amplitude, all common phase changes between the ON and OFF pulses are also canceled. These effects include drift of the reference arm phase relative to that from the sample, as well as static variations in the near-field phase (Section 4; Appendix B; Fig. 2(g)). Thus, the only changes observed in the Tr-pHD amplitude will stem from pump-induced changes to the amplitude and phase of genuine time-resolved near-field features. Therefore, the Tr-pHD procedure provides a reliable method for extracting artifact-free time-resolved near-field amplitude and phase.
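The bandwidth dependence discussed in the context of Fig. 3 can be illustrated with a simplified model. The sketch below is not the calculation behind Fig. 3: it assumes a flat (box) spectrum, a frequency-independent near-field response, and two sidebands weighted by the band-averaged $J_1$ and $J_2$ and by the cosine and sine of a constant phase difference. All function names and the phase convention are assumptions made for illustration.

```python
import math

X0 = 2.63  # modulation depth at the centre frequency, from the text


def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x) for integer n >= 0 (power series)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(n + k))
                  * (x / 2) ** (2 * k + n))
    return total


def band_average(n, rel_bw, samples=401):
    """Midpoint-rule average of J_n(omega * a_m) over a box spectrum.

    rel_bw is the full relative bandwidth, delta_omega / omega_c.
    """
    total = 0.0
    for i in range(samples):
        frac = (i + 0.5) / samples - 0.5  # runs from -1/2 to +1/2
        total += bessel_j(n, X0 * (1.0 + rel_bw * frac))
    return total / samples


def normalized_amplitude(rel_bw, phase):
    """Broadband pHD amplitude divided by its monochromatic value (assumed model)."""
    s1 = band_average(1, rel_bw) * math.cos(phase)
    s2 = band_average(2, rel_bw) * math.sin(phase)
    cw = math.hypot(bessel_j(1, X0) * math.cos(phase),
                    bessel_j(2, X0) * math.sin(phase))
    return math.hypot(s1, s2) / cw


for rel_bw in (0.0, 0.05, 0.2, 0.4):
    print(rel_bw, normalized_amplitude(rel_bw, 0.0),
          normalized_amplitude(rel_bw, math.pi / 2))
```

Within this model both shortcomings noted above appear: the normalized amplitude falls below unity as the bandwidth grows, and the reduction differs between the two phase settings because the band averages of $J_1$ and $J_2$ are no longer equal. At a relative bandwidth of 0.05 the deviation stays within the 0.5% level quoted above.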

Artifact-Free Time-Resolved Near-Field Results in the Near-IR Range
In the previous sections we discussed advantages of the Tr-pHD method. In Fig. 1 we have shown that while we detected a finite Tr-sHD signal in the VO2/TiO2 [001] films, we were unable to reproduce these results using Tr-pHD. We note that long timescales for recovery from the metallic state to the insulating state are characteristic of VO2 films grown on substrates whose thermal conductivity is close to or less than that of the film itself [48]. These substrates include SiO2 [31] and TiO2 [19]. The recovery timescale can extend to hundreds of μs in films grown on these substrates, which supports the view [19] that cumulative heating may be the dominant pump-induced effect at our repetition rate of 300 kHz and may account for the observation of zero pump-induced near-field signal when the probe arrives hundreds of picoseconds after the pump pulse. To overcome this difficulty, we examined VO2 films on substrates with thermal conductivity significantly higher than that of VO2, where the recovery time is much less than 1.5 μs. These latter substrates include Al2O3, MgO, and Au.
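The role of the recovery time relative to the pulse spacing can be made concrete with a short estimate. The sketch below assumes a single-exponential recovery, which is an idealization introduced here and not a model taken from this work; the function names are hypothetical. Only the 300 kHz repetition rate and the two recovery-time scales (hundreds of μs, and below 1.5 μs) come from the text.

```python
import math


def pulse_spacing_us(rep_rate_hz):
    """Time between consecutive laser pulses, in microseconds."""
    return 1e6 / rep_rate_hz


def residual_fraction(tau_us, rep_rate_hz):
    """Fraction of a pump-induced change left one pulse period later.

    Assumes single-exponential recovery with time constant tau_us.
    """
    return math.exp(-pulse_spacing_us(rep_rate_hz) / tau_us)


def steady_state_buildup(tau_us, rep_rate_hz):
    """Geometric-series accumulation factor 1 / (1 - r) for a periodic pump."""
    return 1.0 / (1.0 - residual_fraction(tau_us, rep_rate_hz))


REP_RATE_HZ = 300e3  # repetition rate quoted in the text

for tau_us in (100.0, 1.5):
    print(f"tau = {tau_us} us: residual = {residual_fraction(tau_us, REP_RATE_HZ):.3f}, "
          f"build-up = {steady_state_buildup(tau_us, REP_RATE_HZ):.1f}")
```

With a recovery time of 100 μs almost all of the pump-induced change survives until the next pulse, so the excitation accumulates over many pulses; with a recovery time at or below 1.5 μs the sample has largely relaxed before the next pulse arrives.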
In Fig. 4 we display the results obtained using Tr-pHD for a VO2/Al2O3 film. In Fig. 4(a) we plot the AFM topography. Grains, which are typical of polycrystalline films [17,41,49], are clearly observed in this image. In Fig. 4(b) we plot the static pHD data, where we observe slight variations in near-field signal at grain boundaries, which are probably due to a geometric modification to the local field enhancement. In Fig. 4(c) we plot the Tr-pHD signal, which is the normalized relative difference of the pHD signals taken from the pump-induced and static states. We observe a ~2% homogeneous increase in the Tr-pHD signal at approximately pump-probe overlap, $\Delta t_{ps} \approx 0$. Our pump fluence is below the threshold required to fully excite the IMT. This low-fluence regime is expected to show rapid (sub-ps) decay dynamics according to [30,50], and our results are consistent with these earlier reports. We attribute the pump-induced change in near-field signal at $\Delta t_{ps} \approx 0$ to the injection of free carriers into the conduction band. The pump-induced change to the near-field amplitude, which was collected with Tr-pHD, shows a completely homogeneous response at this time delay, within our noise floor. The difference between our findings and those presented in [30] can stem from a number of factors, including the different wavelength of the probe we utilize, the differences between the crystalline nano-beams utilized in [30] and the granular films in Fig. 4, or the different data acquisition method used. Interestingly, while optical contrast is observed at grain boundaries in the static near-field image, we do not observe features at these locations in the image that displays the measured pump-induced difference in near-field signal. We speculate that the observation of optical contrast in the static image stems from a geometric enhancement of the near-field signal when the AFM probe is inside of the grain boundaries. This enhancement is common to the pumped and reference images that are acquired, so it cancels in the difference signal shown in Fig. 4(c) in the absence of a genuine variation in the carrier density at these locations. We note that the noise level, which is < 0.5% RMS, is well below the observed pump-induced change of 2% that is shown in Fig. 4(c). While the pump-induced contrast is lower than in [30], this likely stems from the difference in probe wavelength used in our work and is in rough agreement for a pump fluence that is near, but below, the threshold needed to excite the IMT. The data presented in Fig. 4, which were collected with the Tr-pHD method, show time-resolved near-field images collected with the s-SNOM technique that are guaranteed to be artifact free.

Outlook and conclusions
In the header of Fig. 5 we briefly outline selected materials and phenomena that may be explored with pulsed laser sources in time-resolved and spectroscopic near-field measurements. At the longest wavelengths, THz s-SNOM is ideally suited to control and interrogate electronic properties of complex materials [4,51], Josephson plasmon resonances in layered superconductors [52], hyperbolic polaritons in topological insulators [53], spin precession in ferromagnets [54] and anti-ferromagnets [55], as well as vibrational [56] and rotational [57] motions in a wide range of systems. The mid-IR spectral range is sensitive to the plasmonic modes in graphene [26,27,47,58,59], hyperbolicity in hexagonal boron nitride [60], phonon resonances [35,41], as well as the electronic properties of many materials [17-19,41]. The pulse duration of mid-IR radiation is typically 40-200 fs, which is sufficient to gain access to timescales where electron-phonon and electron-spin coupling have not yet brought the electronic system into thermal equilibrium [50]. In the visible range, several interesting spectral features can be observed, such as excitonic modes in transition metal dichalcogenides [61,62], plasmonic modes in metals and topological insulators [63], as well as resonances related to interband transitions in insulators and across charge transfer and Mott-Hubbard gaps. The pulse duration of visible radiation, which can be in the range of 4-40 fs, also enables indirect access to resonant modes in the infrared spectral range such as coherent phonons and Raman active modes [64,65]. Additionally, ultrashort light pulses can be used for sub-cycle interrogation of processes excited with carrier envelope phase stable mid-IR [66] and THz pulses [55]. We stress that the efficient background suppression afforded by Tr-pHD with pulsed laser sources may find use in a wide array of spectroscopic measurements in addition to time-resolved control and interrogation of matter at the nanoscale.
In Fig. 5 we also display the calculated near-field signals and additive background contribution in these spectral ranges. These results show that as the demodulation order of the tip-tapping harmonic is increased, the background contribution tends toward zero more quickly than the near-field signal over the entire spectral range plotted. We emphasize that taking higher harmonics in the tapping technique does not eradicate multiplicative background artifacts, as discussed in Section 2 and Appendix C. Advanced techniques, such as Tr-pHD, are required to generate data that are immune to background complications. We note that pulsed laser sources are uniquely qualified for the task of generating radiation across the ultra-broad spectral range displayed in Fig. 5 with a single light source. The extremely high peak power densities, which are commonplace in pulsed laser sources, are ideal for exciting non-linear processes, thereby enabling the generation of radiation spanning from the EUV to the THz. Thus, in addition to the time-resolved studies, which are the focus of our analysis, Tr-pHD with pulsed sources may find great utility in ultra-broadband, static characterization of samples as well. Coupling pulsed sources to an s-SNOM enables novel opportunities for steady-state and time-resolved characterization of samples over an ultra-broad spectral range.
In the visible range, achieving a sufficiently high near-field to background ratio requires that minimal tapping amplitudes and very high harmonics are used, which in turn implies a significant sacrifice of the attainable dynamic range in pristine nanoscale measurements. Additionally, the ultrashort pulse durations imply a significant relative bandwidth, which in turn leads to error in the extracted pHD amplitudes, as we discussed in the context of Fig. 3. More exotic techniques that do not rely on the tapping technique have been demonstrated, where all of the detected radiation stems from the near-field interaction [67,68]. These techniques have the capacity to preserve high S/N ratios, as well as the ultrashort pulse durations of broadband visible radiation, without compromising the high levels of background suppression that are required for proper near-field detection. The techniques in [67,68] may, therefore, eventually provide a significant enhancement to the performance of Tr-SNOM experiments in the visible spectral range. In the mid-IR range the second or third harmonic provides nearly background-free data. Interestingly, in the THz range even demodulation to linear order may provide adequate background suppression in many cases [51]. Under these conditions Tr-pHD can be used for reliable artifact-free time-resolved near-field imaging.
In conclusion, we have critically evaluated various detection protocols for time-resolved near-field measurements. Our modeling and experiments on VO2 films show that the pHD method of acquiring transient pump-probe data is guaranteed to eradicate complications arising from multiplicative background. Furthermore, we demonstrated that for narrowband pulsed laser sources ($\Delta\omega/\omega_c$ < 10%) pHD may be used in the same fashion as with continuous-wave laser sources, with the caveat that the pHD amplitude and phase recovered will be integrated over the bandwidth of the pulsed laser source. The limitations of, as well as novel time-resolved and spectroscopic possibilities using, Tr-pHD were detailed. Finally, we presented time-resolved nano-imaging data for VO2/TiO2 films collected with the Tr-pHD method. The totality of data and analysis presented in this work indicates that Tr-pHD is a powerful tool for static and time-resolved nano-imaging and spectroscopy across a broad spectral range.
between near-field and background information merits confidence that a pHD signal is attributable to the material's response in the near-field. In the case of time-resolved studies, experimental observables constitute a small fraction ($\Delta R/R = O(10^{-2}$-$10^{-6})$) of the overall signal. While the demand for adequate sensitivity in a pump-probe experiment raises the bar for requirements on the signal-to-noise ratios of the near-field signal, adequate suppression of the background radiation cannot be sacrificed to achieve a higher dynamic range. This section is intended to evaluate the possibility for time-resolved near-field signals to compete with transient features from the background contribution.
In the inset of Fig. 5 we show a schematic layout intended to illustrate possible origins of background contributions. The focal plane of the off-axis parabolic mirror, which is used to focus and collect light from the tip-sample interaction, can be brought above (black) or below (green) the plane of the sample (red). In each of these cases, radiation that interacts with the AFM probe can be back-scattered into the detector. As the AFM probe is lifted by a height ΔH, the backscattered radiation experiences a phase shift of magnitude $2\pi\Delta H\cos\theta/\lambda$, where θ is the angle of incidence and λ is the wavelength of probe radiation. Therefore, throughout the tip-tapping cycle, the phase of radiation that is backscattered directly from the probe is modulated as $\phi_{bg}\cos(\Omega t)$, with the tip-tapping frequency of Ω. The background electric field is given by: Where we have again used the Jacobi-Anger expansion in the second half of Eq. (12). Equation (12) shows immediately that background radiation will have a finite value at all harmonics of the tip-tapping frequency. The background electric field can, therefore, couple to the reference arm's electric field and produce a finite pHD signal in high harmonics of the tip-tapping frequency. While this source of background is an issue for static s-SNOM experiments, the background contribution in Eq. (12) is not affected by the pump-probe time delay, $\Delta t_{ps}$ in Fig. 2(g). This background source is, therefore, eliminated in Tr-pHD experiments.
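The statement that a sinusoidally modulated phase produces a finite component at every tapping harmonic follows from the Jacobi-Anger expansion, $e^{i z\cos\vartheta} = \sum_n i^n J_n(z) e^{i n\vartheta}$. The sketch below checks this numerically for a unit-amplitude background field; it is an illustration only, the function names are hypothetical, and the modulation depth of 0.5 rad is an arbitrary value chosen for the example.

```python
import cmath
import math


def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x) for integer n >= 0 (power series)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(n + k))
                  * (x / 2) ** (2 * k + n))
    return total


def harmonic_coefficient(n, phi_bg, samples=256):
    """n-th Fourier coefficient of exp(i * phi_bg * cos(Omega t)) over one tapping period."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples  # Omega * t sampled over one period
        total += cmath.exp(1j * phi_bg * math.cos(theta)) * cmath.exp(-1j * n * theta)
    return total / samples


PHI_BG = 0.5  # hypothetical phase-modulation depth in radians

for n in range(5):
    numeric = harmonic_coefficient(n, PHI_BG)
    analytic = (1j ** n) * bessel_j(n, PHI_BG)
    print(n, abs(numeric), abs(analytic))
```

Each harmonic is nonzero, and its magnitude equals $J_n(\phi_{bg})$, which decreases with harmonic order when the modulation depth is small.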
It is also possible to measure a finite pHD signal from a background contribution that depends on the reflection coefficient of the sample. One situation in which this is possible is shown by the green beam in the inset of Fig. 5, where the focal plane of incident radiation is brought below the plane of the sample. In this case light that is reflected off of the sample, with reflection coefficient $r_{scatt}$, scatters off of the AFM probe shaft and is brought to the detector. By symmetry, the phase shift experienced by this reflected wave will be $-\phi_{bg}\cos(\Omega t)$ throughout the tip-tapping cycle. The background electric field in this case is given by: In this case, it is possible to measure a finite Tr-pHD signal from the background contribution, since $r_{scatt}$ is a function of the pump-probe time delay, $\Delta t_{ps}$. We note, however, that the background contribution is strictly diffraction limited and cannot vary on a deeply sub-wavelength length scale in real space.
In the case that the incident probe radiation is brought into the focal plane of the sample (red in the inset of Fig. 5), both aforementioned background sources contribute. Together with the bona fide scattered field associated with near-field interactions, the total electric field at high harmonics of the tip-tapping frequency is given by: Where we have explicitly noted dependencies on the local spatial coordinate, x, and the pump-probe time delay, $\Delta t_{ps}$. The electric field in Eq. (14), which includes both near-field and background contributions, is mixed with the reference arm to generate the pHD signal in a realistic experimental setting.
In static s-SNOM experiments, a sufficiently high harmonic is taken such that the background contribution becomes negligible with respect to the near-field signal. Starting from Eq. (14), we note that the tapping amplitude, ΔH, is chosen such that $2\pi\Delta H\cos\theta/\lambda \ll 1$, which allows us to Taylor expand the Bessel functions to first order about zero. This gives us the total scattered electric field at the nth harmonic in a typical experimental setting [69]: It can be observed from Eq. (15) that the magnitude of the background term in a pHD signal will scale as $(\Delta H/\lambda)^n$. As the extent of the local evanescent wave that generates the near-field signal is approximately the tip radius, ~20 nm, a tapping amplitude of ΔH ~60 nm is sufficient to provide a large modulation of the near-field signal at an arbitrary wavelength. However, when the tapping amplitude is kept constant, the degree to which the background radiation is affected by the tapping motion is strongly wavelength dependent. We show the background contribution for harmonics S1 (red) through S6 (purple) as a function of wavelength in Fig. 5. The magnitude of the near-field signal as a series of increasing harmonic order was calculated at a wavelength of 10 μm using the lightning rod model [35] and is shown by the dots on the right-hand side of the plot, with the identical color scale as the background contribution. We note that, in the absence of strong resonant enhancement of the near-field signal, the ratio of near-field signal to background signal should be nearly independent of wavelength, provided that the tip remains a good electrical conductor in the frequency range of the probe and that the wavelength remains much larger than the tip radius. The magnitude of the near-field signal relative to the background was normalized by using the experimentally measured ratio of the near-field to background contribution in the second harmonic of the tip-tapping frequency at the probe wavelength of 1.5 μm. This is the ratio of $S_{pHD}$ measured when the AFM probe is in contact with the sample to the value of $S_{pHD}$ when the sample is fully retracted. The ratio of background to near-field contributions will, however, depend critically on the focused spot size as well as geometric factors, so there may be significant error in the absolute comparison.
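The wavelength scaling of the background harmonics can be sketched as follows. This is an illustration under stated assumptions and not the calculation used to produce Fig. 5: the background at the nth harmonic is taken to be proportional to $J_n$ of the phase-modulation depth $2\pi\Delta H\cos\theta/\lambda$, the tapping amplitude is the 60 nm value quoted above, and the 60 degree angle of incidence is a hypothetical value. The function names are also hypothetical.

```python
import math


def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x) for integer n >= 0 (power series)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(n + k))
                  * (x / 2) ** (2 * k + n))
    return total


def tapping_phase(wavelength_nm, tap_nm=60.0, theta_deg=60.0):
    """Phase-modulation depth 2*pi*dH*cos(theta)/lambda (theta_deg is hypothetical)."""
    return 2.0 * math.pi * tap_nm * math.cos(math.radians(theta_deg)) / wavelength_nm


def background_harmonic(n, wavelength_nm, tap_nm=60.0, theta_deg=60.0):
    """Relative background magnitude at the n-th tapping harmonic, |J_n(phase depth)|."""
    return abs(bessel_j(n, tapping_phase(wavelength_nm, tap_nm, theta_deg)))


def small_argument_limit(n, x):
    """Leading term of J_n(x) for x << 1: (x/2)^n / n!."""
    return (x / 2.0) ** n / math.factorial(n)


for wavelength_nm in (500.0, 1500.0, 10000.0):
    row = [background_harmonic(n, wavelength_nm) for n in range(1, 7)]
    print(wavelength_nm, ["%.2e" % value for value in row])
```

Because the leading term of $J_n(x)$ is $(x/2)^n/n!$, the nth background harmonic falls off as $(\Delta H/\lambda)^n$: moving from 1.5 μm to 10 μm suppresses the second-harmonic background by roughly $(10/1.5)^2 \approx 44$ in this model.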

Appendix B: Demodulation of the Pseudoheterodyne Signal using a Periodic pulse train of Femtosecond Light Pulses
In the case of a periodic train of laser pulses, which are emitted with a pulse-to-pulse spacing of $\Delta t_{ss}$, Fig. 2(g), the electric field of a pulsed laser source is generally expressed as: Where the frequency of the laser source is given by ω. The electric field magnitude and phase depend on the absolute time coordinate, t. The sum is over the set of all integers p. In a general case the electric field from the reference arm and the backscattered radiation from the AFM probe can have distinct time-dependent amplitude and phase. Noting that the backscattered radiation is accurately expressed as a Fourier series in terms of the harmonics of the tapping frequency, n, the general form of the electric field phasor from the sample is: In pHD, this is combined with the electric field from the reference arm: The two electric field phasors combine to form an intensity, Where we have introduced the repetition rate of the laser system, $f_{ss} = 1/\Delta t_{ss}$. To illustrate the influence of the periodic laser pulse train we consider a simplified pulse structure. One can derive the Fourier coefficient, $c_p$, of the pulse train. For laser sources with repetition rates of f < 1 GHz, $c_p$ is approximately constant for at least the first thousand harmonics of the laser repetition rate. We note that one must consider a convolution of this intensity with the temporal response of the detector, which evolves on a much slower timescale. Thus, we can safely assume that $c_p$ is a constant for all harmonics of the laser repetition rate that are measured, and it can be ignored in our notation, which involves only proportionalities. One can observe that the detected voltage can be reduced to: For all frequencies, $n\Omega \pm NM$, which surround observable harmonics of the laser's repetition rate. It is then intuitively clear that in the case that a detector with a bandwidth $n\Omega < f_D < f$ is used, only the p = 0 term of this Fourier series survives. The extraction of the pHD signal is then identical to the case with a C.W. laser, so we call this mode of operation quasi-C.W. If a detector with a bandwidth $f_D > f$ is used, the boxcar technique effectively performs a sum over all of the observable harmonics of the laser pulse train where $pT < f_D$. Each data point output by the boxcar integrator is, therefore, centered at zero frequency with a bandwidth $f_B = f/2k$, where k is the number of averages considered in the pulse train and the factor of 2 comes from the Nyquist criterion.
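The claim that the Fourier coefficients of the pulse train are nearly constant over the measured harmonics can be illustrated with an assumed pulse shape. The sketch below takes the intensity envelope of each pulse to be Gaussian, which is an assumption made here and not necessarily the simplified pulse structure considered in this appendix; the 100 fs duration and the function names are likewise hypothetical. The boxcar bandwidth follows the expression $f_B = f/2k$ given above.

```python
import math


def gaussian_pulse_coefficient(p, rep_rate_hz, pulse_fwhm_s):
    """Relative Fourier coefficient c_p / c_0 for a train of Gaussian pulses.

    The Fourier transform of exp(-t^2 / (2 sigma^2)) is proportional to
    exp(-sigma^2 omega^2 / 2), evaluated here at omega = 2 pi p f.
    """
    sigma = pulse_fwhm_s / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    omega = 2.0 * math.pi * p * rep_rate_hz
    return math.exp(-0.5 * (omega * sigma) ** 2)


def boxcar_bandwidth_hz(rep_rate_hz, averages):
    """Effective bandwidth f_B = f / (2 k) of a boxcar average over k pulses."""
    return rep_rate_hz / (2.0 * averages)


REP_RATE_HZ = 300e3      # repetition rate quoted in the main text
PULSE_FWHM_S = 100e-15   # hypothetical pulse duration

print(gaussian_pulse_coefficient(1000, REP_RATE_HZ, PULSE_FWHM_S))
print(gaussian_pulse_coefficient(1000, 1e9, PULSE_FWHM_S))
print(boxcar_bandwidth_hz(REP_RATE_HZ, 1000))
```

For femtosecond pulses the thousandth harmonic of the repetition rate lies far below the inverse pulse duration, so the coefficient is essentially unchanged from its zero-frequency value; the detector response, not the pulse shape, limits which harmonics are observed.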
Where $E_n$ (with n = 0, 1, 2, etc.) is the magnitude of the electric field at the nth harmonic of the tip-tapping frequency and $\phi_n$ is the optical phase of scattered light encoded in the nth harmonic of the tip-tapping frequency. The leading term, $E_0$, is largely unrelated to the tip-sample near-field interaction, so this term may be accurately described as the background electric-field phasor, $E_{BG}$ (Section 2). With increasing harmonic order the background contribution decays rapidly while the near-field contribution does not (Appendix A, Fig. 5). Thus, when high harmonics of the tip-tapping frequency, $E_n$, are accessed by demodulation of the detected intensity, only terms that are proportional to the electric field phasor scattered by the near-field interaction, $E_{NF}$, contribute to the signal [34]. Apart from a select few detection methods, such as electro-optic sampling and photoconductive antenna detection, modern detectors measure the intensity of light rather than the electric field, as emphasized in Eq. (1). Since $E_{BG}$ exceeds the high-harmonic components of the signal by orders of magnitude, the leading-order term in the nth harmonic is [34]: To simplify the equations, we have introduced the phase difference between the near-field and background phasors. In a pump-probe experiment one is exploring the difference between signals collected from the sample's pump-induced states, at a time $\Delta t_{ps}$ following the pump pulse, and its static states. We, therefore, need to add an additional term to Eq. (22) to form the Tr-sHD signal: It can immediately be appreciated that eight independent variables contribute to the Tr-sHD measurement, four of which are from the background electric field phasor. The presence of four BG variables in Eq. (23), which all contain strictly diffraction-limited information, is discomforting. A cursory analysis of Eq. (23) reveals that the most troubling feature is that an sHD experiment is not directly sensitive to the near-field phase. The coupled response between the potentially time-dependent far-field phase and the spatially dependent near-field phase can generate fictitious pump-induced features on the sub-wavelength scale, Fig. 1(b). We note that this problem is not relevant if there are no pump-induced changes to the far-field phase.
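The mechanism by which a pump-induced change in the far-field phase produces a spatially varying artifact can be sketched with a toy model. The code below assumes that the demodulated sHD signal is proportional to the cross term $E_{BG} E_{NF}\cos(\phi_{NF} - \phi_{BG})$ and that the pump changes only the background phase; this is a simplified stand-in for the equations of this appendix, not a reproduction of them. The function names and the 0.04 rad background phase shift are hypothetical; the 0.86 rad static phase is the value assigned to the blue pixel in the model discussed in this appendix.

```python
import math


def shd_signal(e_bg, e_nf, phi_nf, phi_bg):
    """Assumed leading-order sHD cross term: E_BG * E_NF * cos(phi_NF - phi_BG)."""
    return e_bg * e_nf * math.cos(phi_nf - phi_bg)


def tr_shd(phi_nf, delta_phi_bg, e_bg=1.0, e_nf=1.0):
    """Normalized ON/OFF difference when the pump shifts only the background phase."""
    off = shd_signal(e_bg, e_nf, phi_nf, 0.0)
    on = shd_signal(e_bg, e_nf, phi_nf, delta_phi_bg)
    return (on - off) / off


DELTA_PHI_BG = 0.04  # hypothetical pump-induced far-field phase shift, in radians

print("pixel with static phase 0.00 rad:", tr_shd(0.0, DELTA_PHI_BG))
print("pixel with static phase 0.86 rad:", tr_shd(0.86, DELTA_PHI_BG))
```

In this toy model neither pixel has any pump-induced change in its near-field response, yet the pixel with a nonzero static near-field phase shows a first-order apparent signal while the other shows only a second-order one. The apparent contrast between the two pixels is therefore fictitious, and it vanishes when the background phase shift is zero.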
We now consider the case that a reference arm is added, Figs. 2(b) and 2(e). In this case one measures: Where we introduced the electric field from the reference arm [39]. A quantitative estimate is also required to determine the possible contribution of background radiation in results generated with the Tr-HD method. In Fig. 6 and Table 1 we include such an estimate for the near-IR range, calculated using Eq. (22). In the case of Tr-HD (I) we use typical values for the magnitude of the electric field from the reference arm relative to that of the background, as justified below (solid lines). In the case of Tr-HD (II) we set the magnitude of the background electric field equal to zero, which can be accomplished in practice [36,37]. When Tr-HD (II) is combined with two-phase detection [38,47] the results are free from the influence of multiplicative background radiation and can be used to extract the amplitude and phase of the near-field signal, although this has not yet been demonstrated with pulsed laser sources. Other techniques, such as EoS or detection with a photoconductive antenna, are also immune to the multiplicative background contribution and are compatible with rapid nano-imaging in the mid-infrared and terahertz frequency ranges [28,51]. Thus, there is a whole arsenal of nano-optics methods that offer ways to suppress multiplicative background radiation.
We proceed to apply the above considerations to the model scenario shown in Fig. 6(a). This rather extreme, but plausible, scenario shows one single pixel in the field of view that is characterized by a transient response that is different from other pixels in the same field of view. In Fig. 6(b) we show the outcomes of this scenario evaluated with the help of Eq. (23). We see that a large ~4.5% Tr-sHD signal is predicted when the AFM probe is at the blue pixel. Furthermore, using reasonable experimental parameters for the optical constants of VO2, as detailed towards the end of this section, we have arrived at a Tr-sHD signal that is in approximate quantitative agreement with the measured near-infrared Tr-sHD results, Fig. 1(b).
By adding a reference arm it is possible to reduce, or completely suppress, the contribution of multiplicative background radiation. We proceed to calculate the anticipated time-resolved response of the homodyne signal using Eq. (24) with the parameters given in Fig. 6(a). In Fig. 6(c) the solid lines show the result using an experimentally reasonable electric field in the reference arm. Using reasonable intensity values for the reference arm, as described below, we observe that 1) the magnitude of the fictitious time-resolved response at the blue pixel is reduced by a factor of approximately 5, and 2) a finite time-resolved response can now be observed at the red pixel. The artificial response at the blue pixel is, however, anticipated to remain dominant in a typical experimental setting. The dashed lines in Fig. 6(c) show the result of Eq. (24) when the sHD contribution is removed. The artificial time-resolved response at the blue pixel is eliminated only in the limit that the sHD signal is removed, while the time-resolved response at the red pixels remains observable. Thus, we conclude that artifacts in HD data are expected to remain significant in a typical experimental setting. In order to perform background-free detection, it is essential to eliminate the contribution from Tr-sHD detection in the measured signal. In Fig. 6(c) we show the modeling results when a reference arm is added, which is known as homodyne detection, Tr-HD (I) [39]. By considering typical intensity values for the reference arm relative to the sHD contribution, we calculate that the magnitude of the fictitious change at the blue pixel should be reduced by a factor of 5x, as shown by the solid lines. By modulating the amplitude of the reference arm it is possible to fully suppress the contribution of the multiplicative background, which is known as heterodyne detection, Tr-HD (II) [36,37]. It is additionally possible to use two-phase detection with Tr-HD (II) to extract the background-free near-field amplitude and phase [37], although this has not yet been applied to pulsed laser sources. Likewise, when Tr-pHD is used in conjunction with a proper utilization of the tapping technique (Appendix A), as in our current approach, one extracts near-field amplitudes and phases that are guaranteed to be artifact-free (Sections 2 and 5).
In our model the majority of pixels, displayed in red, were assigned a static near-field phase $\phi_r = 0$. We assign the phase of a single pixel, shown in blue, at $\phi_b = 0.86$ rad relative to the static phase of light scattered from the red pixels, which is the measured phase difference between the insulating and metallic states of VO2 at our probe wavelength of 1.5 μm. We also include a pump-induced change in near-field phase at the red pixels. For the magnitude of the Tr-HD change, we used the measured value that 4% of the incident intensity is backscattered by the AFM. The precise value may vary strongly, however, as this depends on many factors including the sample's roughness, the probe beam waist, and the collection efficiency of the off-axis parabolic mirror. Radiation from the reference arm comes from a mirror with 95-98% reflectivity. Therefore, without attenuation, the reference arm intensity outweighs the sHD intensity by approximately 25x. Since the electric field from the reference arm, rather than its intensity, enters into Eq. (24), a fully constructive interference between the tip-sample and reference arms is approximately 5x greater than the sHD value. A fully constructive interference between HD detection and the near-field signal, in the majority of the sample, further implies that $\phi_{ref} = \phi_r$.
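The factors of 25x and 5x quoted above follow from simple arithmetic, sketched below. The sketch assumes that the reference and tip arms receive equal incident intensity, as for a 50/50 beamsplitter; the function name is hypothetical and the 96% reflectivity is an illustrative value chosen within the quoted 95-98% range.

```python
import math


def reference_to_shd_ratios(backscattered_fraction=0.04, mirror_reflectivity=0.96):
    """Intensity and field ratios of the reference arm to the backscattered light.

    Assumes equal incident intensity in both interferometer arms.
    """
    intensity_ratio = mirror_reflectivity / backscattered_fraction
    field_ratio = math.sqrt(intensity_ratio)  # field scales as sqrt(intensity)
    return intensity_ratio, field_ratio


intensity_ratio, field_ratio = reference_to_shd_ratios()
print(f"intensity ratio ~ {intensity_ratio:.0f}x, field ratio ~ {field_ratio:.1f}x")
```

Because the interference term is linear in the reference field, it is the square root of the intensity ratio, about 5, that sets how strongly the reference arm outweighs the sHD contribution.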
The results that are anticipated with the various detection methods, summarized in Table 1, demonstrate that time-resolved signals bearing little to no relationship to the genuine optical constants provided are anticipated with the Tr-sHD and Tr-HD methods. We therefore conclude that the BG contribution to data gathered with Tr-sHD detection could produce the response shown in Fig. 1(b) and can generate observable artifacts in a general setting. We summarize the results of the above discussion in Table 1. The values shown are for the pump-induced changes in a VO2 film in the near-IR range at the moment when the pump and probe arrive at the sample at the same time, $\Delta t_{ps} = 0$. The genuine near-field pump-induced changes in amplitude (Δs/s) and phase (Δφ) are shown in the "actual" column for the red and blue pixels. It can readily be observed that only the pump-induced change in near-field phase at the red pixel is nonzero. The calculated results of the Tr-sHD signal are shown in the third column. We see that, in contrast to the "actual" change in near-field optical constants, the Tr-sHD signal is expected to show contrast at the blue pixel. When Tr-HD (I) is used, false contrast is reduced by a factor of 5x and the time-resolved response of the optical properties of the red pixel becomes observable. When Tr-HD (II) is used, the fictitious signal at the blue pixel is fully suppressed while the finite signal at the red pixel is enhanced, which is in accordance with the "actual" scenario. Finally, when Tr-pHD is used, as discussed in Section 5, measurements are expected to yield accurate results for the near-field amplitude and phase. We conclude that when a detection method that is not affected by multiplicative background radiation, such as Tr-pHD, Tr-HD (II), or EoS, is used together with a sufficiently high harmonic in the tapping technique (Appendix A), results are guaranteed to be artifact-free.
Figures 1(b) and 1(c) summarize our key experimental results. Here we plot data collected for a VO2 film on a TiO2 [001]R substrate. In Fig. 1(b) we plot the Tr-sHD signal. The response in Fig. 1(b) is in quantitative agreement with an artificial response generated from the background radiation. The results of Figs. 1(b) and 1(c) demand a critical evaluation of the data-acquisition protocols for time-resolved near-field measurements.

Fig. 1. Infrared time-resolved nano-imaging experiment and results. a) Diagram showing the experimental apparatus. The ultrafast probe beam (purple) is focused onto the apex of an AFM probe at a precise time delay following a perturbation caused by a second ultrafast pump beam (red). Static infrared image, which was collected with the pHD method using the 5th harmonic of the tip-tapping frequency with a pulsed laser source. This image was obtained on a representative 10x10 μm² region. This image reveals metallic regions (gold) due to the compressive strain of the substrate as well as insulating regions (blue), where the film is strain relieved. b) Tr-sHD results obtained on the VO2/TiO2 [001] sample in a 5x5 μm² region at the pump-probe time delay $\Delta t_{ps}$ = 300 ps. c) Tr-pHD results obtained on the VO2/TiO2 [001] sample in a 5x5 μm² region at the same time delay as in panel (b).

Fig. 2. Schematic of detection methods. a-c) Various detection methods with the radiation from the probe (purple), reference arm (blue) and pump (red) shown. BMS = 50/50 Beamsplitter; RM = Reference Mirror; D = Detector. a) sHD method, where backscattered light from the AFM is steered into the detector. b) HD method, where a reference arm is added. c) pHD method, where the reference arm position is modulated at a frequency M. d-f) Signals acquired using the detection methods in a-c. d) sHD signal, which shows peaks at high harmonics of the tip-tapping frequency nΩ. e) HD signal, which shows that the magnitudes of the peaks at nΩ are enhanced. f) pHD signal, which shows that the peak at nΩ has returned to its sHD value. Additional peaks appear at the sum and/or difference frequencies between the high harmonics of the tip-tapping frequency and the reference arm, nΩ ± NM. g) Schematic of the pulses involved and relevant time scales. In the schematic we show the individual pump (red), probe (purple) and reference (blue) pulses on the femtosecond timescale. A much longer time delay, $\Delta t_{ss}$, which is the inverse of the repetition rate of the laser system, is indicated by the dashed line. The dashed line separates the first (ON) event, where both the pump and probe pulses arrive at the sample, and a second (OFF) event, where only the probe pulse arrives at the sample. This process is periodically repeated, and data are collected by separately integrating the detected voltage from many ON and OFF events. In the case of HD and pHD methods, radiation in the reference arm (blue) temporally overlaps with the probe radiation. In the case of the pHD method, the time delay between reference and probe light, $\Delta t_{rs}$, is modulated sinusoidally at a frequency M.

Fig. 3. Modeling amplitude errors in the Tr-pHD method. a) Spectral field used in our calculation vs. frequency, ω. The Fourier transform of this field is displayed in the inset. b) Near-field amplitude collected with a pulsed laser, normalized to the value that is anticipated for a monochromatic source, s_n/s_cw. We plot this quantity against the relative bandwidth of the laser source. [...] we plot the Tr-pHD signal, which is the normalized relative difference of the pHD signals taken from the pump-induced [...] observe a ~2% homogeneous increase in the Tr-pHD signal at approximately pump-probe overlap, Δt_ps = 0.
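As a rough illustration of a normalized relative difference built from ON and OFF events, the sketch below averages interleaved per-event values and forms (ON − OFF)/OFF. The function name `tr_signal`, the choice of OFF as the normalization, the noise level and the 2% step are assumptions made for this example and are not taken from the acquisition electronics used in the experiment.

```python
import numpy as np

def tr_signal(events, first_is_on=True):
    """Normalized relative difference (ON - OFF) / OFF from interleaved events."""
    events = np.asarray(events, dtype=float)
    on = events[0::2] if first_is_on else events[1::2]
    off = events[1::2] if first_is_on else events[0::2]
    # Integrate (here: average) ON and OFF events separately, then normalize.
    s_on, s_off = on.mean(), off.mean()
    return (s_on - s_off) / s_off

# Synthetic demodulated values: a hypothetical 2% pump-induced increase plus noise.
rng = np.random.default_rng(0)
n_pairs = 5000
events = np.empty(2 * n_pairs)
events[0::2] = 1.02 + 0.05 * rng.standard_normal(n_pairs)   # pump ON
events[1::2] = 1.00 + 0.05 * rng.standard_normal(n_pairs)   # pump OFF

delta = tr_signal(events)
print(f"recovered relative change: {delta:.4f}")
```

Averaging many event pairs before taking the ratio keeps the estimate stable even when the single-event noise is larger than the pump-induced change.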

Fig. 4. Artifact-free near-field data with a pulsed laser source. a) AFM data, which measure the topography, or local height, of the film in a 2 × 2 μm² region. b) pHD data collected with a pulsed laser source, corresponding to the topography in panel (a). c) Tr-pHD data that were collected simultaneously with the data in panels (a) and (b).

Fig. 5. Numerical values of the near-field and background contributions in s-SNOM measurements and the spectroscopic observables that may be explored with Tr-pHD. In the Vis-Near-IR spectral regions the temporal duration of laser pulses, δt_s, is typically greater than 4 fs. In the Mid-IR, typical values of δt_s are greater than 40 fs. In the THz region one usually deals with δt_s greater than 400 fs. Various spectroscopic observables are highlighted. TI = Topological Insulator; SC = Superconducting; TMD = Transition Metal Dichalcogenides; hBN = Hexagonal Boron Nitride; FM = Ferromagnetic; AM = Anti-Ferromagnetic; CT = Charge Transfer; MH = Mott-Hubbard; 2DEG = 2D-Electron Gas. The main panel shows the magnitude of the background electric field phasor (solid lines), calculated as described in Appendix A, for harmonics of the tip-tapping frequency s_1 (red), s_2 (yellow), s_3 (green), s_4 (light blue), s_5 (dark blue) and s_6 (purple). We also show the calculated magnitude of the electric field phasor from the near-field (dots at 10 μm) in the identical color scheme. The inset shows a schematic representation of the scattering processes that yield the background electric fields plotted here and discussed in Appendix A: radiation from the near-field is indicated by the red arrow, radiation that is directly scattered from the tip shaft by the black arrow, and radiation that is scattered off the sample and then by the tip shaft by the green arrow. The near-field contribution to the signal is found to significantly outweigh the background contribution for high harmonics of the tapping frequency throughout the entire spectral range plotted.
We emphasize that the radiation in the reference arm does not interact with the sample; thus the reference field amplitude, E_ref, and the reference phase are properties that cannot depend on the temporal or spatial coordinates of the sample. Therefore, results generated with the Tr-HD signal will be valid if the amplitude of the reference field is much stronger than that of the background field, E_ref ≫ E_BG. [...] are obtained by setting [...] in Eq. (

Fig. 6. Model calculations for the Tr-sHD and Tr-HD signals in the near-IR. a) Schematic showing the AFM probe on a pixelated surface. The dominant area of the sample is in the state indicated with red and is assigned one near-field phase (subscript r). A single pixel is blue and is assigned its own near-field phase (subscript b). The numeric values of these phases are shown in the inset. b) Transient response of the sHD signal at a red pixel (red) and a blue pixel (blue). c) Transient response of the HD signal at a red pixel (red) and a blue pixel (blue). The solid line shows the predictions for a typical ratio of reference arm to sHD intensities. The dashed line shows the case where the sHD intensity is set to zero, in which the fictitious result at the blue pixel is removed.
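The origin of the fictitious response at the blue pixel can be sketched with a toy interference model. This is not a reproduction of Eqs. (23) and (24): it assumes that the demodulated signal is the cross term between a unit near field and the sum of all unmodulated fields, that the background phase simply follows the red majority, and that only red pixels respond to the pump. The phases, the 0.04 rad shift and the reference amplitude are hypothetical.

```python
import numpy as np

def cross_term(phi_nf, phi_bg, e_bg, e_ref, phi_ref):
    """Interference of a unit near field with the sum of all static fields."""
    e_static = e_ref * np.exp(1j * phi_ref) + e_bg * np.exp(1j * phi_bg)
    return 2.0 * np.real(np.exp(1j * phi_nf) * np.conj(e_static))

def relative_change(phi_nf_off, phi_nf_on, phi_bg_off, phi_bg_on,
                    e_bg=1.0, e_ref=0.0, phi_ref=0.0):
    """(ON - OFF) / OFF of the cross term for one pixel."""
    off = cross_term(phi_nf_off, phi_bg_off, e_bg, e_ref, phi_ref)
    on = cross_term(phi_nf_on, phi_bg_on, e_bg, e_ref, phi_ref)
    return (on - off) / off

PHI_RED, PHI_BLUE = 0.5, 0.9   # hypothetical near-field phases (rad)
SHIFT = 0.04                   # hypothetical pump-induced phase shift of red pixels (rad)

# Blue pixel: its own phase is unchanged, but the background phase follows
# the red majority and shifts by SHIFT when the pump is ON.
blue = dict(phi_nf_off=PHI_BLUE, phi_nf_on=PHI_BLUE,
            phi_bg_off=PHI_RED, phi_bg_on=PHI_RED + SHIFT)

shd = relative_change(**blue)                              # no reference arm
hd_typical = relative_change(**blue, e_ref=50.0)           # strong reference, finite background
hd_ideal = relative_change(**blue, e_bg=0.0, e_ref=50.0)   # background removed

print(f"sHD: {shd:.4f}  HD (finite BG): {hd_typical:.5f}  HD (no BG): {hd_ideal:.1e}")
```

Under these assumptions the blue pixel shows an apparent transient without a reference arm, a strongly reduced one when the reference field dominates the background, and none when the background is set to zero.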

Table 1. Model calculations for the Tr-sHD and Tr-HD signals for a VO2 film in the near-IR range. The results of Tr-sHD were calculated using Eq. (23). The results of Tr-HD (I) were calculated using Eq. (24) with realistic magnitudes for the electric field of the reference arm relative to that of the background. The results of Tr-HD (II) were calculated using Eq. (24) with |E_BG| = 0. The values displayed for Tr-pHD can be obtained using [...]. We emphasize that a phase magnitude of 0.04 rad is modest for VO2. The BG phase is simply the area-averaged phase of the near-field pixels, and thus [...]. Here [...] is the Heaviside function, [...] is an arbitrary relaxation time constant, and Δt_ps is the pump-probe time delay.