Image formation properties and inverse imaging problem in aperture based scanning near field optical microscopy

Aperture based scanning near field optical microscopes are important instruments to study light at the nanoscale and to understand the optical functionality of photonic nanostructures. In general, a detected image is affected by both the transverse electric and the transverse magnetic field components of light. The discrimination of the individual field components is challenging, as these four field components are contained within two signals in the case of a polarization-resolved measurement. Here, we develop a methodology to solve the inverse imaging problem and to retrieve the vectorial field components from polarization- and phase-resolved measurements. Our methodology relies on an analysis of the image formation process in aperture based scanning near field optical microscopes. On this basis, we are also able to explain how the relative contributions of the electric and magnetic field components within detected images depend on the probe geometry, its material composition, and the illumination wavelength. This allows the design of probes that are dominantly sensitive either to the electric or to the magnetic field components of light.


I. INTRODUCTION
In nano-optics, the domain of research that deals with the interaction of light with objects having a critical length scale on the order of a few up to hundreds of nanometers, the optical near field is a key quantity. Only in the near field does the fraction of the angular spectrum that is associated with evanescent waves have a notable amplitude. It is especially this fraction of the angular spectrum that carries information about the interplay between the illumination and an object that is smaller than the wavelength. The near field thus contains information that is inaccessible in the far field. In consequence, studying nano-optical structures with traditional microscopes provides only marginal insights. Accessing optical near fields by experimental means is therefore desirable to obtain insights into the optical as well as the material properties of nanostructures.
This triggered the development of scanning near field optical microscopes (SNOM). The technique relies on the perturbation of the near field by a sharp tip. Roughly speaking, in this scattering process evanescent waves are converted to propagating waves that can be detected by classical instruments. This basic idea led to different instrumental implementations.
In an apertureless (also called scattering) SNOM, a tip with an apex size of several nanometers scatters the light in the near field, which can then be collected by traditional optical instruments in the far field. Depending on the geometry of the tip and the experimental setup, the scattered signal can be linked to a specific component of the electric field or to the electric field amplitude [1][2][3][4].
Alternatively, more compact devices based on an aperture version of the SNOM were developed. There, a tapered waveguide is covered with a metallic film. A small opening at the apex locally collects the light. For such an aperture SNOM, which we study in this contribution, the mechanisms that are responsible for the image formation process are complex, and the interpretation of detected images has turned out to be difficult. Usually, the quantity that is measured has been understood only a posteriori. This has been done by comparing measured images with simulated field distributions of selected electromagnetic field components [5][6][7][8][9][10][11][12][13]. This procedure led to no general agreement concerning the contribution of different field components to a detected image. To study the image formation process in detail, experimental investigations were made with carefully tailored electromagnetic fields that allow a clear discrimination of individual electric and magnetic field components in a detected image. Nevertheless, results obtained in different experimental situations remained inconclusive. For example, it has been suggested by Burresi et al. in a seminal work 9 that a standard circular aperture probe is unable to detect magnetic field components and therefore has to be replaced by a split-ring type probe. On the other hand, Denkova et al. 14,15 and Kihm et al. 16,17 concluded that the influence of magnetic field components in their detected images is dominant, despite the circular aperture they used. Eventually, Kohlgraf-Owens et al. 18 found indications for the simultaneous influence of the electric and the magnetic field components in a detected image. Their findings could reconcile the seemingly mutually exclusive statements made before. Along these lines, Feber et al. investigated the relative influence of magnetic and electric field contributions in a detected image 19. In their pioneering contribution they showed that the relative influence of the individual electromagnetic field components depends not only on the probe's aperture diameter but also on the tip's position relative to the sample. However, up to now a physically coherent explanation of the contributions of individual electric and magnetic field components to a detected image has not been given.
To close this gap, we develop here a theory to describe the image formation in an aperture SNOM. The theory is ab initio in the sense that it does not require any prior information from the outcome of a measurement. It is, therefore, truly predictive. Only the information on the experimental geometry is, of course, necessary. At its heart, our theory relies on an eigenmode expansion technique. Eigenmodes are solutions to Maxwell's equations in the absence of sources for specific geometries. The eigenmodes of free space are plane waves, the eigenmodes of periodic optical structures are Bloch waves, and the eigenmodes of waveguides are guided modes.
Here, the nanoaperture at the apex of the tip is treated like a metal coated cylinder with the corresponding transverse geometry. Analysing the supported eigenmodes of this nanoaperture and discussing their interaction with the free space modes enables a deeper understanding of SNOM measurements. The eigenmodes of the nanoaperture depend on the geometry of the tip, the material from which it is made, and the wavelength. It is therefore no surprise that the different configurations studied in the past led to different conclusions on what has been measured. All these aspects emerge from our treatment of the image formation process. Moreover, beyond a merely descriptive work, we demonstrate in an extension to our theory the possibility to extract the complete vectorial electromagnetic field information from phase-resolved measurements.
The paper is divided into two parts. Sections II and III concentrate on the image formation process obtained by non-interferometric measurement techniques, whereas Sec. IV focusses on phase resolved measurements obtained by a heterodyne detection scheme.
The model to describe the image formation process is introduced in Sec. II. It is shown that generally the electric as well as the magnetic field components contribute to a detected image. This makes the interpretation of measurements tremendously difficult. Only for electromagnetic fields with marginal spatial variations across the aperture is a simple discussion of the contributions of different field components possible. This is shown in Sec. III. It requires the aperture of the SNOM probe to be sufficiently small compared to the occurring field variations. In this quasi-static approximation, it is possible to derive simple expressions that allow conclusions on the dominant field component in a specific image. For this purpose, the field distributions of the eigenmodes of the nanoaperture are of key interest. They depend on the wavelength, the geometry of the probe, and the material properties. Geometry and material can be chosen on purpose, so that tips can be identified that dominantly detect magnetic field components at specific wavelengths, whereas other tips can be identified that dominantly detect electric field components. Practical indications on which tip is suitable to measure which field component are also given.
To extract information on the entire electromagnetic field from a measurement without a priori knowledge, it will be shown in Sec. IV that phase-resolved measurements are mandatory.
Such measurement capabilities exist with current state of the art 9,12,13,19-22 . We will provide an algorithm to reconstruct the complete vectorial electromagnetic field information at the position of the tip from such a phase-resolved measurement. Our contribution, therefore,

II. THEORY
Understanding how the investigated field, which carries structural information about the sample, couples into propagative modes in the fiber taper is at the heart of the image formation in an aperture SNOM. In the following, this conversion process will be analysed and we will develop an analytical framework to describe the image formation.
The aperture SNOM probe consists of a tapered dielectric optical fiber surrounded by a metal coating. In order to avoid any side coupling of light into the probe, the thickness of the metal coating is much larger than the skin depth at the wavelengths of interest. A subwavelength aperture is milled at the end of this taper (see Fig. 1). Due to the thick metal coating the coupling process of the investigated field into the tapered fiber occurs only via the aperture at the end of the taper.
This suggests that the aperture itself can be understood as the element that converts the near fields into far fields that are guided by a fiber to the detector. To model the coupling process between the investigated field and the aperture at the apex of the SNOM tip, we treat the aperture as a cylindrical waveguide and calculate its eigenmodes. The excitation of these modes by the external field to be investigated reveals all crucial information about the detected images.
Due to the deep subwavelength size of the aperture, it supports only two propagative modes. With z as the principal propagation direction, these are the mainly x- and mainly y-polarized states of the generalized HE$_{11}$ modes in the aperture (see Fig. 2) 23,24. To conclude from the knowledge of the excitation strengths of these aperture modes on an image measured by a detector, not only the aperture but also the complete measurement setup needs to be considered. We require here that the excitation strengths of the aperture modes are not modified by coupling to any other mode at a later stage of propagation through the system. This requires adiabatic modifications of the geometry of the fiber, e.g. where it is tapered or bent. Then, the modal amplitudes at the position of the detector are linearly related to the ones in the aperture.
The total power guided by the fiber to the detector is proportional to the absolute square of the excitation strengths of the supported modes 25. Generally, the power can be written as $P = \sum_{i=1}^{\infty} |t_i^{\mathrm{detector}}|^2$, assuming orthogonal and power normalized modes. In the considered case solely two modes are excited, which are linearly related to the ones in the aperture. It follows that the guided power at the position of the detector can be written as $P \propto |t_x^{\mathrm{aperture}}|^2 + |t_y^{\mathrm{aperture}}|^2$. Consequently, we identify a detected image $I$ with the guided power at the position of the detector and obtain it from the excitation strengths in the aperture as $I = |t_x^{\mathrm{aperture}}|^2 + |t_y^{\mathrm{aperture}}|^2$.
These excitation strengths depend on the position of the probe relative to the sample. Consequently, a detected image, when the probe scans the sample at constant height, should be written as
$$I(x, y) = |t_x(x, y)|^2 + |t_y(x, y)|^2. \qquad (1)$$
Here the excitation strengths of the mainly linearly polarized modes in the aperture were renamed as $t_{x,y} := t_{x,y}^{\mathrm{aperture}}$. These amplitude coefficients correspond to the projection of the investigated field onto the modes supported by the aperture. It follows that the detected image does not correspond to the investigated field components directly, but rather to the mentioned projection.
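As a minimal sketch of this relation, the snippet below forms an intensity image from two complex excitation strengths sampled on a scan grid. The function name `detected_image` and the 2x2 sample values are hypothetical and serve only as an illustration; they are not measured data.

```python
def detected_image(t_x, t_y):
    """Return I = |t_x|^2 + |t_y|^2 for every scan position.

    t_x, t_y: nested lists (rows of complex numbers) of equal shape,
    holding the excitation strengths of the two aperture modes.
    """
    return [
        [abs(a) ** 2 + abs(b) ** 2 for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(t_x, t_y)
    ]


# Hypothetical excitation strengths on a 2 x 2 scan grid.
t_x = [[1 + 0j, 0.5j], [0j, 1 + 1j]]
t_y = [[0j, 0.5 + 0j], [2j, 0j]]
image = detected_image(t_x, t_y)
```

Note that the phases of the two coefficients drop out of the image, which is why a non-interferometric measurement cannot separate the underlying field components.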
Keeping in mind the assumptions that were made, we have thus reduced the complex problem of the image formation process in aperture SNOM to the question of how the two fundamental modes are excited by the externally investigated field. This remaining problem can be regarded in terms of classical fiber optics and is in direct analogy to the treatment of splicing and coupling losses in optical fibers.
The incident field will transmit power into the two propagative modes in the aperture, but also into the unbound radiation and evanescent modes supported by the aperture. Additionally, it also causes a partial reflection of the incident field. To extract the coupling coefficients of the system, Maxwell's boundary problem has to be solved at the position of the aperture. However, only the excitation coefficients for the guided modes are important.
They will be derived in the following. In this derivation we assume monochromatic fields with a time dependency proportional to $e^{-i\omega t}$.
By formally introducing Dirac's notation, the transverse electromagnetic field of the $m$th forward-propagating eigenmode in the aperture is denoted as $|T_m^+\rangle$, where $m = 1, 2$ corresponds to the mainly x- and mainly y-polarized modes of interest and any higher $m$ stands for evanescent and radiation modes. The Dirac notation should be understood in the sense of a 4-component vector denoting the transverse mode profile as $|T_m^+\rangle = [e_{xm}(x, y), e_{ym}(x, y), h_{xm}(x, y), h_{ym}(x, y)]$, which needs to be computed by a mode solving technique. We do this here by a finite element method. Likewise, the investigated incident and reflected fields are regarded as superpositions of eigenmodes in free space.
We choose in the following a plane wave basis, where every plane wave is written in Dirac notation as $|P^{\pm}_{k_\perp}\rangle$. There, the ± sign denotes either forward or backward propagating waves. These are the incident and the reflected fields, respectively. $k_\perp = (k_x, k_y)$ stands for the unique transverse wave vector of the plane wave.
The boundary problem describing the coupling of the investigated field into the modes supported by the aperture requires continuity of all tangential electric field components $E_\perp$ at the aperture plane $(x, y, z = 0)$, and analogously of the tangential magnetic field components $H_\perp$. This condition, Eqn. 2, equates the superposition of incident and reflected plane waves $|P^{\pm}_{k_\perp}\rangle$ in front of the aperture with the superposition of the modes $|T_m^+\rangle$ inside the aperture, where the latter comprises the discrete sum over the finite number of bound modes together with the integration over the infinite number of radiation modes in the system. Beneficially, Eqn. 2 can be solved by exploiting the unique properties of the modes that are governed by an unconjugated reciprocity 25,29. An inner product between the supported eigenmodes of the structures can be defined as
$$\langle A | B \rangle \propto \iint \left( \mathbf{e}_A \times \mathbf{h}_B - \mathbf{e}_B \times \mathbf{h}_A \right) \cdot \mathbf{n}_z \, dx \, dy,$$
where $\mathbf{n}_z$ is the unit vector in z-direction. The orthogonality relation for modes of the same eigenmodal system reads
$$\langle T_n^- | T_m^+ \rangle = \alpha_m \delta_{nm},$$
with $\alpha_m$ being the normalization factor. On the basis of this inner product, Eqn. 2 can be solved self-consistently, which requires explicit knowledge not only of the limited number of bound modes but also of the unlimited number of unbound radiation modes. The inclusion of the infinitely extended radiation modes on a numerically truncated grid is unfeasible. Therefore, further truncations depending upon the system to be investigated are required.
In the considered case of the image formation in aperture SNOM, neglecting the reflection in Eqn. 2 is a reasonable assumption. This approximation can be considered as a first order perturbative approach to describe the system. Due to the deep subwavelength dimensions of the aperture of the tip, the reflective response of the tip is weak and has a wide spectral bandwidth in the angular plane wave spectrum, which justifies the assumption. It corresponds to the assumption that the field to be measured is not perturbed by the tip of the SNOM.
By neglecting the reflection in Eqn. 2, the problem reduces to the expansion of the investigated field into the modes supported by the aperture. The explicit representation of the investigated field in the plane wave basis is no longer necessary, and thus the incident field $E(x, y), H(x, y)$ will be denoted in Dirac notation as $|EH\rangle$. For the description of the image formation process, the interest lies in the excitation strengths of the two propagative fundamental modes in the aperture. These excitation strengths can be calculated by projecting the investigated field onto the modes of the aperture, $t_{x,y}(x, y) = \langle T_{1,2}^- | EH \rangle$. A detected image is thus described, using Eqn. 1, by
$$I(x, y) = \left| \langle T_1^- | EH \rangle \right|^2 + \left| \langle T_2^- | EH \rangle \right|^2, \qquad (4)$$
where each projection is an overlap integral over the aperture plane in which $E(x, y)$ and $H(x, y)$ are the investigated field components and $e^-_{1,2}(x, y)$, $h^-_{1,2}(x, y)$ are the mode profiles of the two associated backward propagating mainly x- and y-polarized modes in the aperture.
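The projection of the investigated field onto an aperture mode can be illustrated numerically. The sketch below assumes the unconjugated overlap form, i.e. the z-component of (e x H) minus that of (E x h), integrated over the aperture plane with unit prefactor. The Gaussian mode profile, the mode impedance `Z_M` and all function names are hypothetical stand-ins for a mode obtained from a mode solver.

```python
import math


def overlap(mode, field, xs, ys):
    """Riemann-sum approximation of the unconjugated overlap integral
    of [(e x H) - (E x h)] . n_z over the (x, y) grid."""
    dx = xs[1] - xs[0]
    dy = ys[1] - ys[0]
    total = 0j
    for x in xs:
        for y in ys:
            ex, ey, hx, hy = mode(x, y)
            Ex, Ey, Hx, Hy = field(x, y)
            # z-components of the two cross products
            total += (ex * Hy - ey * Hx) - (Ex * hy - Ey * hx)
    return total * dx * dy


Z0 = 376.73   # impedance of free space in ohm
Z_M = 500.0   # hypothetical mode impedance in ohm


def toy_mode_x(x, y):
    """Hypothetical backward-propagating, mainly x-polarized mode:
    Gaussian e_x profile of unit width and h_y = -e_x / Z_M."""
    ex = math.exp(-(x * x + y * y))
    return ex, 0.0, 0.0, -ex / Z_M


def plane_wave(x, y):
    """x-polarized plane wave at normal incidence, constant on the grid."""
    return 1.0, 0.0, 0.0, 1.0 / Z0


grid = [-5.0 + 0.1 * i for i in range(101)]
t_x = overlap(toy_mode_x, plane_wave, grid, grid)
```

For this toy case the integral can be done by hand and gives pi * (1/Z0 + 1/Z_M), which shows that the electric and the magnetic field component enter the same coefficient.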
From Eqn. 4 it can be seen that the information on the investigated field components in a detected image is encrypted through the overlap-integral equations. In general, a direct identification of investigated field components in a detected image is impossible, which makes the interpretation of measurements tremendously difficult.
The result can be regarded in terms of classical microscopy, where the image is formed correspondingly as the convolution between the point-spread function and the field of the investigated structure of interest in the case of an isoplanatic condition 30,31 . However, in contrast to classical microscopy, the point spread function in aperture based SNOM is completely vectorial in nature, encoding not only electric but also magnetic field components in a detected image. In this way, the propagative modes in the aperture can be interpreted as the coherent vectorial point spread function of the scanning near field optical microscope, defining the possible resolution (see Fig. 2).
It becomes evident that the dominant influence of either the electric or the magnetic field components in a detected image depends not only on the investigated field but also on the electromagnetic properties of the two modes accessible in the aperture. In the following, these influences will be analysed more precisely.
As discussed in Sec. II, the distinction of individual field components contributing to a detected image is generally impossible due to the encryption through the overlap integrals of Eqn. 4. To get a deeper insight into the relation between the investigated field components and a detected image, the problem can be considered in a quasi-static limit. It only requires the investigated field to be constant across the aperture. For sufficiently small apertures this assumption can always be satisfied. With this assumption, the overlap integral of Eqn. 4 describing a detected image can be rewritten: the constant field components can be pulled out of the integrals, so that the excitation strength of the mainly x-polarized mode in the aperture becomes a sum of the field components weighted by the integrals of the individual mode components over the aperture (Eqn. 5). In Eqn. 5 two integrals vanish due to the odd symmetry of the corresponding mode components (see Fig. 2). In a next step, the value $h_{1y}$ of the magnetic mode component is expressed through the value $e_{1x}$ of the electric mode component. At this point the mode impedance factor $Z_M$ is introduced, denoting the ratio between the electric and magnetic mode components. By repeating the calculations for the mainly y-polarized mode accessible in the aperture, one finds that $t_x$ is a superposition of $E_x$ and $H_y$, whereas $t_y$ is a superposition of $E_y$ and $H_x$ (Eqn. 6). This result is in agreement with the results that can be found in the literature 5,6,8-17. The resulting image has two individual contributions, which are called amplitude term and phase term in the following. The amplitude term is $|H_\perp|^2 + |Z_M|^{-2} |E_\perp|^2$, whereas the phase term depends on the phase differences $(\varphi_{H_y} - \varphi_{E_x} - \varphi_{Z_M})$ and $(\varphi_{H_x} - \varphi_{E_y} - \varphi_{Z_M})$. Since these parts influence a detected image differently, their properties will be discussed individually. In a first step, the phase term will be neglected.
By analysing the influence of $|H_\perp|^2 + |Z_M|^{-2} |E_\perp|^2$ on the image formation, the simultaneous influence of electric and magnetic field components becomes obvious. To analyse the relative weight of electric and magnetic field components contributing to a detected image, the ratio $\Gamma = |Z_M|^{-2} |E_\perp|^2 / |H_\perp|^2$ is evaluated. By identifying $|E_\perp| / |H_\perp|$ as the transverse field impedance $|Z_\perp|$, the following result is obtained: $\Gamma = |Z_\perp|^2 / |Z_M|^2$. It is the ratio between the impedance of the investigated field and the mode impedance that determines the relative influence of the individual electric and magnetic field components in a detected image.
Since the mode impedance |Z M | depends upon the geometry and the material of the tip, the detection properties of an aperture SNOM are strongly influenced by the choice of the probe.
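The impedance ratio can be evaluated in a few lines. The sketch below is only an illustration: the two mode impedances are hypothetical placeholder values and do not represent any particular coating or wavelength, and the helper names `gamma` and `dominant_component` are introduced here for convenience.

```python
Z0 = 376.73  # impedance of free space in ohm


def gamma(z_field, z_mode):
    """Ratio of the electric to the magnetic contribution,
    Gamma = |Z_perp|^2 / |Z_M|^2."""
    return abs(z_field) ** 2 / abs(z_mode) ** 2


def dominant_component(z_field, z_mode):
    """Name the field type that dominates the amplitude term."""
    g = gamma(z_field, z_mode)
    if g > 1.0:
        return "electric"
    if g < 1.0:
        return "magnetic"
    return "balanced"


# Hypothetical complex mode impedances of two different probes (ohm).
high_impedance_probe = 1200.0 + 300.0j
low_impedance_probe = 90.0 - 40.0j

result_high = dominant_component(Z0, high_impedance_probe)
result_low = dominant_component(Z0, low_impedance_probe)
```

A probe whose mode impedance exceeds the field impedance is thus predominantly sensitive to the magnetic field, and vice versa.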
Assuming that an electromagnetic field with the impedance of free space, $Z_0 \approx 377\,\Omega$, is investigated at a wavelength of λ ≈ 650 nm, the image detected with a gold coated tip would be strongly influenced by magnetic field components (see Fig. 3). The opposite holds for an aluminum coated tip, for which a detected image would be dominated by electric field components.
Generally, for electromagnetic fields associated with a sample, the investigated field impedance $|Z_\perp(x, y)|$ is a locally varying quantity that depends on the local optical properties of the measured structure. In a measurement, this is equivalent to a local variation of the dominant influence of either the electric or the magnetic field components in a detected image.
This behaviour and its influences in the image formation process can serve as a possible explanation for the contradicting statements that can be found in the literature concerning which component of the field has been actually measured.
Additionally, an unexpected influence on a measurement can be caused by the phase term within the image formation process. Here, local phase differences between the local electromagnetic field components influence a measurement. This corresponds to an intrinsic coupling of electric and magnetic field components in a detected image and can be interpreted as an interference term. It is to be understood not as an interference between different fields, but rather between different components of the same field. This effect can lead to unexpected features in a detected image, making it difficult to interpret.
However, for investigated fields that additionally obey a slowly varying envelope approximation, the phase differences $(\varphi_{H_y} - \varphi_{E_x} - \varphi_{Z_M})$ and $(\varphi_{H_x} - \varphi_{E_y} - \varphi_{Z_M})$ remain approximately constant across the scanning area. The phase term would then affect a detected image only by a decrease in contrast. Although this seems to be a strong restriction, it is actually valid in many practical situations.
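The split of a quasi-static signal into an amplitude term and a phase (interference) term can be checked numerically. The sketch below assumes that the coefficient of the mainly x-polarized mode is proportional to H_y + E_x / Z_M. The relative sign and the phase convention of Z_M depend on the mode normalization and are assumptions made here, as are the field values.

```python
import cmath
import math


def split_signal(E, H, Z_M):
    """Split |H + E / Z_M|^2 into an amplitude and an interference part."""
    a = H
    b = E / Z_M
    amplitude_term = abs(a) ** 2 + abs(b) ** 2
    # Interference between the magnetic and the electric contribution.
    phase_term = 2.0 * abs(a) * abs(b) * math.cos(cmath.phase(a) - cmath.phase(b))
    return amplitude_term, phase_term


# Hypothetical local field values and mode impedance.
E_x = 1.0 * cmath.exp(0.3j)
H_y = 2.0e-3 * cmath.exp(1.1j)
Z_M = 500.0 * cmath.exp(0.2j)

amplitude_term, phase_term = split_signal(E_x, H_y, Z_M)
total = abs(H_y + E_x / Z_M) ** 2
```

If the phase difference is the same at every scan position, the interference part scales with the local product of the field moduli rather than adding independent structure to the image.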

IV. INVERSE IMAGING PROBLEM
In the previous sections, the model to describe the image formation process in aperture SNOM was introduced and the fundamental encryption of the electromagnetic field components in a detected image was discussed. To go one step further, we present in the following an algorithm to reconstruct the complete vectorial electromagnetic field information from a phase resolved measurement of the two excited eigenmodes. Such a measurement technique was developed in recent years and enables the measurement of a polarization-resolved signal in both amplitude and phase by using a heterodyne detection scheme 20. The establishment of a heterodyne detection scheme for aperture based scanning near field measurements led to outstanding results 9,12,13,19,21,22.

Polarization and phase resolved signal
In the context of the developed theory, it is natural to assume that the polarization- and phase-resolved measured signal corresponds to the excitation strengths $t_{x,y}(x, y)$ of either the mainly x- or the mainly y-polarized mode accessible in the aperture. Within the quasi-static limit, the recorded signals then correspond either to a superposition of $E_x$ and $H_y$ or to a superposition of $E_y$ and $H_x$ (see Eqn. 6).
This result coincides with experimental conclusions presented in 19,21. However, due to imperfections in the experimental setup, like fiber bends, a polarization coupling can occur, causing a mixing of the individual detection paths. This problem can be solved by a prior calibration of the setup with linearly polarized light. By independently inspecting linearly x- and y-polarized fields, the Jones matrix of the SNOM can be determined.
From the knowledge of these matrix entries in amplitude and phase, the polarization distortion introduced by the measurement system can be analysed, and it is possible to remove the distortions numerically to infer the actual quantities of interest. Eventually, such a calibration allows direct access to the complex, polarization resolved excitation coefficients $t_{x,y}(x, y) = \langle T_{1,2}^- | EH \rangle$ (Eqn. 9). A similar formula was shown in 19 for the direct explanation of measurement results based on reciprocity theory [32][33][34]. However, there further assumptions had to be made to retrieve the fields $e_{1,2}(x, y)$, $h_{1,2}(x, y)$, which naturally emerge from our treatment of the image formation process in aperture SNOM.
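The numerical removal of the polarization mixing can be sketched as a 2x2 matrix inversion. The example assumes that the setup acts linearly on the two coefficients, so that the two calibration measurements directly give the columns of the Jones matrix. All function names and numbers are hypothetical.

```python
def jones_from_calibration(response_x, response_y):
    """Build the Jones matrix from the two-channel responses measured
    for purely x-polarized and purely y-polarized calibration fields.
    Each response is a pair (channel_1, channel_2) of complex numbers."""
    return [
        [response_x[0], response_y[0]],
        [response_x[1], response_y[1]],
    ]


def invert_2x2(m):
    """Invert a complex 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-15:
        raise ValueError("Jones matrix is singular, calibration unusable")
    return [[d / det, -b / det], [-c / det, a / det]]


def undo_mixing(jones, signal):
    """Recover (t_x, t_y) from the measured two-channel signal."""
    inv = invert_2x2(jones)
    return (
        inv[0][0] * signal[0] + inv[0][1] * signal[1],
        inv[1][0] * signal[0] + inv[1][1] * signal[1],
    )


# Hypothetical calibration responses and one measured pixel.
jones = jones_from_calibration((1.0 + 0j, 0.2 + 0j), (0.1j, 0.9 + 0j))
t_true = (0.5 + 0.5j, -1.0j)
measured = (
    jones[0][0] * t_true[0] + jones[0][1] * t_true[1],
    jones[1][0] * t_true[0] + jones[1][1] * t_true[1],
)
t_recovered = undo_mixing(jones, measured)
```

In a real scan the same inverse matrix would be applied to every pixel, since the calibration describes the setup and not the sample.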

Full vectorial field reconstruction
In the following we present an algorithm to reconstruct the investigated field components from a phase resolved measurement. From a single measurement, only the two amplitude coefficients belonging to the two propagative modes in the aperture can be accessed. To infer the actual field at the probe, all the non-accessible amplitude coefficients $t_{3,\dots}$ of the radiation modes would also have to be known.
However, all the information carried by the infinite number of the remaining coefficients $t_{3,\dots}$ is lost and cannot be retrieved while measuring only two mode coefficients. Thus, vectorial field reconstruction seems to be impossible. The pertinent question is how the information missing from an aperture SNOM measurement can be retrieved so that the investigated field can be reconstructed. The main idea suggested to solve the problem is to exploit the fact that the two amplitude coefficients $t_{x,y}$ are accessed on a spatial grid $(x, y)$. Thus, knowledge of this spatial dependency can be used for the vectorial field reconstruction. Mathematically, Eqn. 9 is a simple convolution integral, which can be rewritten in the Fourier domain by means of the convolution theorem. By introducing the quantities $M_{ik}(k_\perp)$, Eqn. 9 can be written in the Fourier domain as a linear system of two equations (Eqns. 11) that relates the Fourier-transformed coefficients $\tilde{t}_{x,y}(k_\perp)$ to the field components $\tilde{E}_x(k_\perp)$ and $\tilde{E}_y(k_\perp)$. Here, quantities $\tilde{j}(k_\perp)$ are the Fourier representations of the respective prior ones $j(r_\perp)$.
The individual $M_{ik}(k_\perp)$ can be interpreted as the vectorial transfer function of the SNOM tip, determining not only the resolution but also the achievable contrast. These quantities are completely determined by the two fundamental modes accessible in the aperture.
The system of Eqns. 11 can be inverted independently for every transverse wave vector $k_\perp$, resulting in the two independent field components $\tilde{E}_x(k_\perp)$ and $\tilde{E}_y(k_\perp)$ in Fourier representation. From this information the remaining four field components $\tilde{E}_z(k_\perp)$, $\tilde{H}_x(k_\perp)$, $\tilde{H}_y(k_\perp)$ and $\tilde{H}_z(k_\perp)$ can be derived using Maxwell's equations for homogeneous free space. By inverse Fourier transforming the field components, the investigated field can be reconstructed. The achievable resolution in the reconstructed field components is limited by the spatial frequencies accessible in the different transfer functions $M_{ik}(k_\perp)$.
The presented algorithm is capable of a fully vectorial reconstruction of the investigated field components and is, in terms of the developed theory, the only way to extract the information obtained by an aperture SNOM measurement without any a priori knowledge. This approach to evaluating phase resolved near field measurements can give novel insights in nano-optical research, because it allows the complete vectorial information to be inferred from a measured image.
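The per-wave-vector inversion and the completion of the field by Maxwell's equations can be sketched for a single transverse wave vector. The sketch assumes a 2x2 transfer matrix, a field that propagates or decays along +z with time dependence exp(-i omega t), and SI units. The function names and the matrix entries in the usage lines are hypothetical; real entries would follow from the Fourier transforms of the aperture modes.

```python
import cmath
import math

MU0 = 4.0e-7 * math.pi   # vacuum permeability in H/m (classical value)
C0 = 299792458.0         # speed of light in vacuum in m/s


def solve_2x2(M, t):
    """Solve M [E_x, E_y]^T = [t_x, t_y]^T for one k_perp (Cramer's rule)."""
    (m_xx, m_xy), (m_yx, m_yy) = M
    det = m_xx * m_yy - m_xy * m_yx
    if abs(det) == 0.0:
        raise ValueError("transfer matrix is singular at this k_perp")
    e_x = (t[0] * m_yy - m_xy * t[1]) / det
    e_y = (m_xx * t[1] - m_yx * t[0]) / det
    return e_x, e_y


def complete_fields(e_x, e_y, k_x, k_y, wavelength):
    """Derive E_z and H from E_x, E_y for one plane-wave component.

    Uses div E = 0 and H = (k x E) / (omega * mu0). cmath.sqrt returns a
    positive imaginary k_z for evanescent components (decay along +z).
    """
    k0 = 2.0 * math.pi / wavelength
    k_z = cmath.sqrt(k0 ** 2 - k_x ** 2 - k_y ** 2)
    if k_z == 0:
        raise ValueError("grazing component, E_z is undetermined")
    e_z = -(k_x * e_x + k_y * e_y) / k_z
    omega_mu0 = k0 * C0 * MU0
    h_x = (k_y * e_z - k_z * e_y) / omega_mu0
    h_y = (k_z * e_x - k_x * e_z) / omega_mu0
    h_z = (k_x * e_y - k_y * e_x) / omega_mu0
    return {"Ez": e_z, "Hx": h_x, "Hy": h_y, "Hz": h_z, "kz": k_z}


# Usage with hypothetical numbers: a diagonal transfer matrix and a
# normally incident, x-polarized component at 650 nm.
e_x, e_y = solve_2x2([[2.0, 0.0], [0.0, 4.0]], (2.0, 0.0))
fields = complete_fields(e_x, e_y, 0.0, 0.0, 650e-9)
```

For the normally incident component the ratio E_x / H_y reproduces the free-space impedance, which is a convenient sanity check before the inverse Fourier transform is applied to the full set of wave vectors.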
From the numerical point of view, the only problem that is arising concerns the different discretizations of the investigated field and the modes supported by the aperture. Due to the nanoscopic dimensions of the aperture and the macroscopic extension of the investigated fields, the numerical sampling is different and has to be unified for the evaluation of Eqn. 11.
Due to the usual extreme ratio of the two different scales, approaches such as zero padding can cause severe memory issues and prevent the calculation of the investigated field components. To avoid this problem, chirp z-transformations can be used in order to unify the grids without causing memory problems 31,35-37.
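What such a transform computes can be shown with a direct evaluation of the discrete Fourier sum on a freely chosen, finely spaced frequency interval. The sketch below is not the fast chirp z algorithm itself: it evaluates the same samples by the plain O(N·M) sum, which is enough to show that no zero padding of the input is needed. The function name `zoom_dft` and the test signal are hypothetical.

```python
import cmath
import math


def zoom_dft(samples, f_start, f_stop, n_points):
    """Evaluate X(f) = sum_n x[n] exp(-2 pi i f n) at n_points
    frequencies (in cycles per sample) from f_start up to,
    but not including, f_stop."""
    step = (f_stop - f_start) / n_points
    spectrum = []
    for k in range(n_points):
        f = f_start + k * step
        value = sum(
            x * cmath.exp(-2j * math.pi * f * n) for n, x in enumerate(samples)
        )
        spectrum.append(value)
    return spectrum


# Hypothetical signal: a single spatial frequency of 0.1 cycles per
# sample, sampled at 64 points.
signal = [cmath.exp(2j * math.pi * 0.1 * n) for n in range(64)]

# 100 frequency samples inside the narrow band [0.05, 0.15), far finer
# than the 1/64 spacing of an unpadded FFT of the same data.
spectrum = zoom_dft(signal, 0.05, 0.15, 100)
```

A chirp z implementation returns the same values in O((N + M) log(N + M)) operations, which matters for two-dimensional scans.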

V. CONCLUSION
In conclusion, we have presented a theory to describe the image formation process in aperture based scanning near field optical microscopes. Due to the complex mechanisms responsible for the image formation, the vectorial field information of the investigated field is intrinsically encrypted in a detected image. The discrimination of individual field components in a measurement is generally impossible. Only for the situation where the field to be measured can be assumed to be constant across the aperture does an interpretation of a measurement become possible. We predicted the influence of the phase term in a detected image as well as the different detection characteristics for aluminum and gold coated probes. We related all of our results to the ones accessible in the literature and we have been able to give a unified interpretation for the disparate statements presented in the past. With that, we have settled a long standing debate. In an aperture SNOM both the electric and the magnetic field components can be detected, depending on the modal properties of the eigenmodes supported by the nanoaperture. These properties can be tailored and thereby provide the means to design probes that can detect specific field components on demand.
In addition, we provided an answer to the hitherto impossible extraction of complete vectorial field information from phase resolved measurements. The provided algorithm for the inverse imaging problem is the only possibility to interpret detected images without any a priori knowledge about the investigated fields. We believe that our presented work will pave the way for further developments of scanning near field microscopes, which are indispensable tools to study and explore nano-optical phenomena. Since such an instrument is an enabling technology in other fields of research, such as biology and medicine, our work also has an impact on the broader stream of science.