Scattering and three-dimensional imaging in surface topography measuring interference microscopy.

Surface topography measuring interference microscopy is a three-dimensional (3D) imaging technique that provides quantitative analysis of industrial and biomedical specimens. Many different instrument modalities and configurations exist, but they all share the same theoretical foundation. In this paper, we discuss a unified theoretical framework for 3D image (interferogram) formation in interference microscopy. We show how the scattered amplitude is linearly related to the surface topography according to the Born and the Kirchhoff approximations and highlight the main differences and similarities between them. With reference to the Ewald and McCutchen spheres, the relationship between the spatial frequencies that characterize the illuminating and scattered waves and those that characterize the object is defined and formulated as a 3D linear filtering process. It is shown that for the case of near-planar surfaces, the 3D filtering process can be reduced to two dimensions under the small height approximation. However, the unified 3D framework provides significant additional insight into the scanning methods used in interference microscopy, effects such as interferometric defocus and ways to mitigate errors introduced by aberrations of the optical system. Furthermore, it is possible to include the nonlinear effects of multiple scattering in the generalized framework. Finally, we consider the inherent nonlinearities introduced when estimating surface topography from the recorded interferogram.


INTRODUCTION
Surface topography measuring interference microscopy [1] (hereafter, just referred to as interference microscopy), and closely related coherent imaging techniques, such as digital holographic microscopy [2] and optical coherence tomography (OCT) [3], are key tools for biomedical imaging and the surface measurement of engineered materials. In the industrial area, phase-shifting interferometry (PSI) [4] and coherence scanning interferometry (CSI, also known as scanning white-light interferometry [5]) are the two most common modalities of interference microscopy (see Fig. 1) and have been used for high-accuracy three-dimensional (3D) measurements in a broad range of applications, due to their high sensitivity to small variations in object geometry and low measurement noise (subnanometer level) at all system magnifications [1]. Reviews of the basic principles and applications of interference microscopy can be found elsewhere [1,6,7].
Although interference microscopy is a well-established technique, enhancement in measurement capability and general applicability have been driven by continuous advances in the freedom and complexity of new product design, enabled by precision manufacturing and additive manufacturing [8]. In the absence of noise, environmental disturbances and mechanical imperfections, the measurement accuracy of interference microscopy is limited by the imaging model and the inversion algorithm for object reconstruction. Understanding 3D image formation is critical to future advancement of interference microscopy. In this paper, "3D image" refers to the recorded to the linear (Fourier) theory of optical systems as described by Goodman [10].
Imaging models of interference microscopy that consider the effects of a finite 2D pupil have been developed to study instrument response, including nonlinear effects, such as the "batwing effect" near the sharp edge of a step [11] and measurements of rectangular gratings [12]. 2D modeling has also been applied to challenging problems in high NA imaging, including for complex, optically unresolved surface structures, using pupil-plane integration and rigorous diffraction models [13].
A simple and commonly used model for scattering from thin objects or near planar surfaces, such as thin phase gratings, is based on the small height approximation. In this approximate model, the object/surface is treated as a small perturbation from a datum plane; the phase of the field at the datum plane is assumed to be directly proportional to the surface height variation [10]. In this case, the finite resolution of a real optical system will combine scattered components of similar phase and, providing that multiple scattering is negligible, linearity is assured. One way to determine the validity condition of this approximation is known as the Rayleigh criterion [14], according to which the maximum phase variation caused by the surface height variation should not exceed π/2 within the resolution cell, corresponding to a total height variation of λ/8 for normal incidence. In interference microscopy, the small height approximation used in conjunction with the 2D transfer function has recently been coined the 2D elementary Fourier optics (EFO) model [15]. We note here that the Rayleigh criterion is quite stringent, and in practice, it need only apply over a finite region defined by the resolution of the optical system. In this way, the method can be applied in a piecewise manner with a locally varying datum plane, as further discussed in Section 6.
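As a minimal numerical sketch of the Rayleigh criterion stated above, the following snippet computes the height variation at which the reflection phase reaches π/2. The helper name `rayleigh_height_limit` is hypothetical, and the extension to oblique incidence through the phase 2·k0·h·cos θ is an assumption based on the standard reflection geometry; only the normal-incidence value of λ/8 is taken from the text.

```python
import math


def rayleigh_height_limit(wavelength, incidence_deg=0.0, max_phase=math.pi / 2):
    """Height variation h at which the reflection phase 2*k0*h*cos(theta)
    reaches max_phase (hypothetical helper, not taken from the paper)."""
    k0 = 2.0 * math.pi / wavelength          # free-space wavenumber
    theta = math.radians(incidence_deg)      # angle of incidence
    return max_phase / (2.0 * k0 * math.cos(theta))


wavelength_um = 0.6                          # example value only
h_normal = rayleigh_height_limit(wavelength_um)
h_oblique = rayleigh_height_limit(wavelength_um, incidence_deg=30.0)
print(f"normal incidence: {h_normal:.4f} um (lambda/8 = {wavelength_um / 8:.4f} um)")
print(f"30 deg incidence: {h_oblique:.4f} um")
```

Under this assumption the permitted height variation grows with the angle of incidence, which is one way to see why the criterion is most stringent at normal incidence.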

B. Image Formation in Three Dimensions
The application of interference microscopy to a large range of 3D objects [8] provides the motivation for a comprehensive analysis of 3D imaging that inherently accounts for surface height variations greater than the depth of field, as frequently observed in practice. As shown in the following, 3D image formation depends strongly on the light illuminating the object and the scattered light propagating through and recorded by the imaging system.
The fundamental assumption that there is no multiple scattering (also applied in 1D and 2D models) simplifies the 3D imaging process. Under this assumption, scattering and recording are expected to be linear filtering processes applied to an appropriately defined object function. The most widely used linear scattering model for weak scattering objects is the (first) Born approximation [16]. Also, the Kirchhoff approximation (also known as the Kirchhoff theory of scattering or the physical optics method [14]) that has less restrictive conditions for validity has been widely used for scattering from surfaces of materials that have a refractive index much larger than unity.
Under these approximations, the linear filtering process can be characterized by a 3D transfer function that applies to a specific 3D representation of the object. The linear relationship between an object's scattering potential and the holographic recording of the scattering data was first studied by Wolf [17] and, shortly afterwards, by Dändliker and Weiss [18]. 3D reconstruction of the scattering potential based on Wolf's method was demonstrated by Fercher et al. [19]. McCutchen [20] showed that an imaging system can be characterized by the generalized 3D pupil, such that the 3D diffraction pattern that is generated when a lens produces an image of a point source can be calculated from the Fourier transform of the 3D pupil function, corresponding to the 3D amplitude point spread function (PSF) calculated from the Fourier transform of the 3D coherent transfer function (CTF). In the context of confocal microscopy, a series of studies on image formation and the 3D transfer function have been conducted by Sheppard and his colleagues [21–27]. The connection between interferometric imaging and confocal microscopy can be understood by observing that the reference beam in interferometric imaging acts like a synthetic confocal pinhole. 3D microscopic imaging and 3D transfer function theory are summarized elsewhere [28].
Coupland and Lobera [29] compared the 3D transfer function of monochromatic optical tomography, (reflection-mode) confocal microscopy and interference microscopy (using quasimonochromatic illumination) and showed that these methods are equivalent under the Born approximation [16]. For measurement of materials that have strongly reflecting surfaces, such as a perfect conductor, Coupland et al. [30] explicitly showed the derivation of the 3D image formation and effective transfer function of CSI by starting from the equation of potential scattering and considering the boundary conditions of the Kirchhoff approximation. Lehmann and Xie [31,32] used the Kirchhoff approximation to model scattering from a profile of a prismatic rectangular grating and the signal of a CSI instrument with an NA of 0.9. This quasi-3D model is limited to the plane of incidence and implemented by integrating over all incident angles and the spectral component of the light source. This model should yield the same result as the 3D model described in Ref. [30], but for an instrument with a slit pupil. Another quasi-3D model [31,32] uses the small height approximation to describe the scattering process, and uses the so-called Richards-Wolf model that is based on the Debye diffraction integral [33] to describe the diffraction by the aperture of the imaging system and the electric field distribution in the vicinity of the focal region of a lens. Similar to the concept of McCutchen's 3D pupil, high NA can be incorporated in the Debye diffraction integral.

C. Motivation and Aim
Although the major components of the 3D imaging theory for interference microscopy have been developed and documented, the use of the 3D theory in the design and development of new surface-measuring interferometric imaging methods is advancing slowly (see [8] for an overview of the latest advances in CSI). The main reasons include the difficulty of understanding the 3D theory due to its complexity, inconsistencies in the previous derivations, and the missing links between scattering and imaging processes, between the 2D and 3D imaging theories, and among different approximations in the context of interference microscopy.
In this paper, we generalize the theoretical framework for 3D image (interferogram) formation in interference microscopy. This paper brings together the work to date, aligns the results given by Sheppard et al. [25,34–38] and Coupland [29,30], as well as Born and Wolf [16], Beckmann and Spizzichino [14] and Goodman [10], and demonstrates how they are connected. Also discussed are the links between the Born and the Kirchhoff approximations, between Ewald and McCutchen spheres, and the relationship between the spatial frequencies of the scattered waves, the 3D object function and the (interferometric) image.
We consider that the object is illuminated with an angular spectrum of plane waves, each of which gives rise to a spectrum of scattered waves to be collected by the objective lens. In confocal microscopy with a coherent source, the amplitude of the scattered field is summed over the incident angular spectrum, such that images can be calculated from any appropriate model for the scattering process [25,39]. We will show this result is also valid for interference microscopy, where the 3D imaging requires the combination of two or more interferometric measurements of the optical field scattered from an object with different illumination conditions, or equivalently, by measuring the interference observed as the object is scanned through focus, as in CSI.
Throughout the work, we will use the scattering amplitude (defined in Section 2.A) as the thread to link scattering with 3D image formation. Although the analysis will be focused on the linear (i.e., single scattering) regime and the scalar case, it will be briefly shown that the theoretical framework supports scattering models in the nonlinear (i.e., multiple scattering) regime and with considerations of polarization effects. Moreover, an aberrated system can also be described under this framework. The fundamental similarities and differences between the different scanning methods in interference microscopy are explained with the theory of defocus.
Finally, we derive the 2D image formation in interference microscopy within the 3D theoretical framework and give an insight into the relationship of the small height approximation with the Kirchhoff approximation. The linear regime for surface measurement is also discussed.

A. Basis of the Scalar Theory for Scattering
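We start our analysis with the integral equation of potential scattering [16]. The scattered field produced by a scattering medium occupying a volume $\Omega$ in free space (Fig. 2) is expressed as
$$U_s(\mathbf{r}) = \int_{\Omega} V_B(\mathbf{r}')\, U(\mathbf{r}')\, \frac{e^{j k_0 |\mathbf{r}-\mathbf{r}'|}}{4\pi |\mathbf{r}-\mathbf{r}'|}\, \mathrm{d}^3 r', \tag{1}$$
where $j = \sqrt{-1}$, vector $\mathbf{r}$ specifies the location of point $P$ in 3D space, and vector $\mathbf{r}'$ specifies the location of a point inside $\Omega$. The scattering potential of the medium is defined as
$$V_B(\mathbf{r}) = k_0^2\,[n^2(\mathbf{r}) - 1], \tag{2}$$
where $k_0 = 2\pi/\lambda$ is the wavenumber, $\lambda$ is the wavelength in free space, $n(\mathbf{r})$ is the spatial distribution of the refractive index, and $n(\mathbf{r}) = 1$ outside $\Omega$. The total field $U(\mathbf{r})$ can be expressed as the sum of the incident field $U_i(\mathbf{r})$ and the scattered field $U_s(\mathbf{r})$, i.e.,
$$U(\mathbf{r}) = U_i(\mathbf{r}) + U_s(\mathbf{r}). \tag{3}$$
The term $V_B(\mathbf{r})U(\mathbf{r})$ in Eq. (1) can be considered as the distribution of source points in $\Omega$ that emit light. Let us define the origin $O$ of the coordinate system within $\Omega$ (Fig. 2). If $P$ is far away from the origin and $\Omega$ is small compared to the distance $OP$, then
$$|\mathbf{r}-\mathbf{r}'| \approx r - \mathbf{s}\cdot\mathbf{r}', \tag{4}$$
where $\mathbf{s}$ is a unit vector in the direction of $OP$ and $r = |\mathbf{r}|$.

Fig. 2. Light scattered by an inhomogeneity in free space. Observation point P is distant from the scattering object.

In a Cartesian coordinate system, $\mathbf{s} = s_x\hat{\mathbf{x}} + s_y\hat{\mathbf{y}} + s_z\hat{\mathbf{z}}$ and $\mathbf{r} = r_x\hat{\mathbf{x}} + r_y\hat{\mathbf{y}} + r_z\hat{\mathbf{z}}$. The Green's function in the far field can be approximated as
$$\frac{e^{j k_0 |\mathbf{r}-\mathbf{r}'|}}{4\pi |\mathbf{r}-\mathbf{r}'|} \approx \frac{e^{j k_0 r}}{4\pi r}\, e^{-j k_0 \mathbf{s}\cdot\mathbf{r}'}. \tag{5}$$
Substituting Eq. (5) into Eq. (1), we obtain the scattered field in the far field as
$$U_s(\mathbf{r}) = \frac{e^{j k_0 r}}{4\pi r} \int_{\Omega} V_B(\mathbf{r}')\, U(\mathbf{r}')\, e^{-j \mathbf{k}_s\cdot\mathbf{r}'}\, \mathrm{d}^3 r' = \frac{e^{j k_0 r}}{r}\, f(\mathbf{k}_s, \mathbf{k}_i), \tag{6}$$
where $\mathbf{k}_s = k_0\mathbf{s}$ and $\mathbf{k}_i = k_0\mathbf{s}_i$ are the scattered and incident wave vectors, respectively, and
$$f(\mathbf{k}_s, \mathbf{k}_i) = \frac{1}{4\pi} \int_{\Omega} V_B(\mathbf{r}')\, U(\mathbf{r}')\, e^{-j \mathbf{k}_s\cdot\mathbf{r}'}\, \mathrm{d}^3 r' \tag{7}$$
is known as the scattering amplitude [16]. From Eq. (6), it is clear that the scattered field behaves as an outgoing spherical wave, and the integral in Eq. (6) can be recognized as the Fourier transform of the source distribution. The scattering amplitude is generally nonlinear in terms of the scattering potential, and consequently, the relationship between the scattered and incident wave vectors is not unique.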
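The far-field step above rests on the first-order expansion |r − r′| ≈ r − s·r′. As a rough check of how good this is, the sketch below compares the exact and approximate path lengths for one example geometry. The function names, the observation distance and the source point are illustrative assumptions, not values from the paper.

```python
import math


def exact_distance(r_vec, rp_vec):
    """Exact |r - r'| between the observation point and a source point."""
    return math.dist(r_vec, rp_vec)


def far_field_distance(r_vec, rp_vec):
    """First-order far-field estimate r - s.r', with s = r/|r|."""
    r = math.hypot(*r_vec)
    s_hat = [c / r for c in r_vec]
    return r - sum(s * p for s, p in zip(s_hat, rp_vec))


r_obs = (0.0, 0.0, 1.0e4)   # observation point P in micrometres (example)
r_src = (1.0, 2.0, 0.5)     # point inside the scatterer (example)
error = exact_distance(r_obs, r_src) - far_field_distance(r_obs, r_src)
k0 = 2.0 * math.pi / 0.6    # wavenumber for an example wavelength of 0.6 um
print(f"path error = {error:.2e} um, phase error = {k0 * error:.2e} rad")
```

The residual scales as the squared transverse offset of the source point divided by twice the observation distance, so the approximation improves as P recedes or as the scatterer shrinks.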

B. Born Approximation
Equation (7) does not have a closed-form solution, as the total field in $\Omega$ is dependent on the scattered field, but with specific approximations it can be linearized with sufficient accuracy to gain useful insight into the measurement process. A widely used approximation is the (first) Born approximation, a perturbation method [16], in which the incident wave is considered unaltered during its propagation through the inhomogeneity $\Omega$ (Fig. 2). Accordingly, the major conditions for the validity of the Born approximation are that there is weak scattering, i.e., that the refractive index contrast $[n^2(\mathbf{r}) - 1]$ is small and/or the size of the scattering object is limited, such that the phase perturbation at all points within $\Omega$ is also small. Under the Born approximation, the total field in Eq. (7) can be replaced by the incident field, i.e.,
$$U(\mathbf{r}) \approx U_i(\mathbf{r}). \tag{8}$$
Considering a monochromatic plane wave $U_i(\mathbf{r}) = e^{j\mathbf{k}_i\cdot\mathbf{r}}$ incident on the scattering object, and substituting Eq. (8) into Eq. (7), the scattering amplitude under the Born approximation is expressed as
$$f_B(\mathbf{K}_e) = \frac{1}{4\pi} \int_{\Omega} V_B(\mathbf{r}')\, e^{-j\mathbf{K}_e\cdot\mathbf{r}'}\, \mathrm{d}^3 r', \tag{9}$$
where
$$\mathbf{K}_e = \mathbf{k}_s - \mathbf{k}_i \tag{10}$$
is called the Ewald vector (to be discussed in Section 3) and implies the Bragg condition for diffraction [10]. From Eq. (9), we know that the scattering amplitude can be calculated as the Fourier components of the scattering potential evaluated at spatial frequency $\mathbf{K}_e$.

C. Kirchhoff Approximation for Surface Scattering
The Kirchhoff approximation has been extensively used for solving forward and inverse surface scattering problems [14,38,40].
Using the integral theorem of Helmholtz and Kirchhoff, the scattered field can be expressed as a surface integral in which Kirchhoff's boundary conditions can be applied [14]. The first boundary condition approximates the total field at a surface point $\mathbf{r}_s$ as the sum of the incident field and the reflected field that is determined by the Kirchhoff (amplitude) reflection coefficient $R$ of the tangent plane at that point, thus,
$$U(\mathbf{r}_s) = (1 + R)\, U_i(\mathbf{r}_s). \tag{11}$$
The tangent-plane approximation is equivalent to dividing a surface into small segments that are assumed to be locally flat. The choice of $R$ will be discussed in more detail in Section 2.D. The second boundary condition assumes that the normal derivative of the field is
$$\frac{\partial U(\mathbf{r}_s)}{\partial n} = j\,(1 - R)\,(\mathbf{k}_i\cdot\hat{\mathbf{n}})\, U_i(\mathbf{r}_s), \tag{12}$$
where $\hat{\mathbf{n}}$ is the normal to the surface at $\mathbf{r}_s$. The major validity condition of the Kirchhoff approximation is that the radius of curvature of any surface irregularity must be significantly greater than a wavelength. A surface that contains many sharp edges and sharp points will reduce the modeling accuracy [14]. Other conditions for validity of the Kirchhoff approximation are listed below.
1) The Kirchhoff reflection coefficient needs to be constant. This condition is fulfilled in the case of a perfect conductor at any angle of incidence. Alternatively, if the angles of incidence are not too large (e.g., less than 45°) and/or the refractive index contrast between the material and the surrounding medium is small, R is approximately a constant (see Section 2.D).
2) Shadowing and multiple scattering effects are negligible. Alternatively, a second-order Kirchhoff approximation and an appropriate shading function need to be considered [41–43].
3) The observation of the scattered field must be distant from the scattering object.
The validity of the Kirchhoff approximation for surface scattering has been studied extensively by many researchers in the areas of acoustics and optics, such as [43–45].
A widely cited expression for surface scattering under the Kirchhoff approximation is sometimes referred to as the Beckmann-Kirchhoff solution [14]. This solution follows from the principle of stationary phase, such that only the specular reflection from the tangent-plane has a major contribution to the scattered field. The solution can be simplified using spatial frequencies [24,25,38,40], such that (in a similar manner to Eq. (21) of [25]) the scattering amplitude can be written where F K (r) is referred to as the "foil model" of the surface [30] and is expressed as where δ() is a 1D Dirac delta function that follows the surface height Z s (r x , r y ) of a homogeneous material as a function of the lateral position. W(r) is an appropriate shading function that may increase the accuracy of the Kirchhoff approximation, e.g., by reducing the impact of sharp points or by taking account of the shadowing effects at large angles of incidence (e.g., at 70 • [46]). Similar to Eq. (9), the scattering amplitude is the weighted Fourier components of the foil model evaluated at spatial frequency K e . We will revisit the Kirchhoff weighting term in Section 2.D. Here, the calculated scattering amplitude of a prismatic random surface is shown in Fig. 3. The surface is assumed to be perfectly conducting and is space-limited by a Gaussian shading function. A monochromatic illumination with λ = 0.6 µm is considered. The magnitude of the scattering amplitude f K (K e ) is calculated for all K e within the half-circle, which has a radius of 2k 0 (related to the Ewald limiting sphere that will be discussed in Section 3.A). To access all values of f K (K e ) within the half-circle (in the monochromatic case), we should allow the illumination and observation to be made for polar angles within ±90 • with respect toẑ and for all azimuthal angles. 
Usually, due to limited angles of illumination and observation, the scattering amplitude can only be physically measured within certain areas of the half-circle in K-space (see Section 3.A).

D. Link between the Kirchhoff and Born Approximations
Considering the profile of a prismatic surface [such as that in Fig. 3(a)], with the plane of incidence in the plane of the paper, the Kirchhoff reflection coefficient R can be replaced by the Fresnel reflection coefficients. For a perfect conductor, the Fresnel reflection coefficients for s and p polarizations only differ in phase (π ), as In these circumstances, the scattered field is always in the plane of incidence, and depolarization will not occur if the incident wave is purely s -or p-polarized. However, in general, a non-flat surface depolarizes. Considering an unpolarized incident wave and applying the principle of stationary phase, the standard Beckmann-Kirchhoff solution for a 3D surface [14] has been derived by setting R = 1 for a perfect conductor to simplify the complex depolarization problem. As stated in Ref. [14], the choice of R = 1 was based on the fact that the reflected wave must have the same amplitude as the incident wave. R can also be interpreted as an effective Fresnel reflection coefficient given by The effective reflection coefficient has been considered in the context of confocal microscopy with an axially symmetric system, where the object is illuminated by an angular spectrum of plane waves [47]. For a perfect conductor, R e = 1. Note that the negative sign in Eq. (16) is an artifact of the conventional choice of coordinates, as for normal incidence, there is no physical difference between s and p polarizations. For normal incidence from air to a medium, where n s is the refractive index of the scattering medium. If the condition is satisfied (where θ i is the angle of incidence), then

Research Article
The accuracy of the approximation in Eq. (19) is discussed elsewhere [47]. Now, recalling Eqs. (13) and (14), letting the shading function equal unity and replacing the Kirchhoff reflection coefficient R by the effective reflection coefficient R e , we have (20) By considering that the surface is flat and level, the Fourier spectrum will be constant along the K z axis and zero elsewhere. However, f K (K e ) is not constant along K z due to the Kirchhoff weighting factor. The Ewald vectors (K e ) aligned with the K z axis are associated with specular reflection, such that K e ·ẑ = 2k 0 cos θ s , where θ s is the angle subtended by the scattered wave vector and the K z axis. A perfect conductor should reflect radiation independent of angle. The dependency of the scattering amplitude on (cos θ s ) −1 arises as the object is defined by the foil model of the surface that can be considered as a single layer of atoms. This is consistent with that observed in the context of imaging confocal microscopy [21,36].
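The statement that specular reflection from a flat, level surface places the Ewald vector on the K z axis, with K e · ẑ = 2k 0 cos θ s, can be checked with a few lines of arithmetic. The sketch below assumes a surface normal along +z and a plane of incidence in x-z; the function names and the example wavelength are illustrative, not taken from the paper.

```python
import math


def plane_wave_vectors(k0, theta_deg):
    """Incident and specularly reflected wave vectors for a flat, level
    surface (normal along +z), with the plane of incidence in x-z."""
    t = math.radians(theta_deg)
    k_i = (k0 * math.sin(t), 0.0, -k0 * math.cos(t))   # travelling towards -z
    k_s = (k0 * math.sin(t), 0.0, k0 * math.cos(t))    # reflected towards +z
    return k_i, k_s


def ewald_vector(k_s, k_i):
    """K_e = k_s - k_i, component by component."""
    return tuple(s - i for s, i in zip(k_s, k_i))


k0 = 2.0 * math.pi / 0.6                    # example wavelength of 0.6 um
for theta in (0.0, 30.0, 60.0):
    k_i, k_s = plane_wave_vectors(k0, theta)
    K_e = ewald_vector(k_s, k_i)
    expected_kz = 2.0 * k0 * math.cos(math.radians(theta))
    print(f"theta = {theta:4.1f} deg: K_e = {K_e}, 2*k0*cos(theta) = {expected_kz:.4f}")
```

The lateral components cancel for every angle, so varying the angle of incidence only moves the sampled point along the K z axis.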
Using the differentiation property of the Fourier transformation [48], we have where F{ } denotes a Fourier transform and H step ( ) denotes a Heaviside step function. Now, we can convert the foil representation of the object to an equivalent bulk material [36], defined in the region r z ≤ Z s (r x , r y ), by rewriting Eq. (20) as (22) Similarly, by taking the term K 2 e inside the Fourier integral of Eq. (22), we have where is a form of scattering potential that is appropriate for largeangle backward scattering and can be considered to represent a dipole layer following the surface as a result of the Laplacian operator [36,38]. The Born scattering potential V B (r) defined in Eq. (2) represents the bulk material and is often used for studying small-angle forward scattering under the Born approximation. The difference of the object representations is illustrated in Fig. 4. The Kirchhoff scattering potential is more realistic for large-angle scattering, as backscattering occurs as a result of changes in refractive index, rather than the value of the refractive index itself [38]. For a dielectric material, we consider the effective reflection coefficient in Eq. (19), which is assumed to be a constant at the surface. Consequently, we have For a weak scattering potential, n s is close to unity, as assumed in the Born approximation. Taking the first term of the binomial expansion of ln n(r), we have Up to this point, we have shown that the expressions for the scattering amplitudes under the Born and Kirchhoff approximations, as given by Eqs. (9) and (23), have the same linear form-a Fourier transform of a scattering potential. The Born scattering potential, given by Eq. (2), is a linear function of the medium perturbation, i.e., the volumetric distribution of the refractive index contrast. As the Laplacian is a linear operator, from Eqs. 
(24) and (25), we know that the Kirchhoff scattering potential V K (r) is linear in the effective reflection coefficient, which is in general a nonlinear function of the refractive index contrast. As shown by Eq. (26), an exception is the case of a weak scattering object, where V K (r) is linear in the Born scattering potential V B (r) and, therefore, also in the refractive index contrast. This result is consistent with that in the context of acoustic imaging [49].
From Eqs. (9) and (23), we understand that the object functions can be partially reconstructed by an inverse Fourier transform of the scattering amplitude, as In Section 4, we will show how image formation in interference microscopy depends on the scattering amplitude. First, however, we consider how the scattered and recorded planewave components are explained using the graphical methods of Ewald and McCutchen.

EWALD SPHERE AND MCCUTCHEN SPHERE
The Ewald sphere construction [16], which originated from the study of x-ray scattering from crystalline structures, can be used to provide an elegant visualization of the scattering process under the Born and the Kirchhoff approximations.
McCutchen's generalized 3D pupil of an imaging system, also known as the McCutchen sphere, is closely related to the Ewald sphere [37]. These geometrical constructions significantly simplify the understanding of the mechanism of scattering and 3D imaging.

A. Ewald Sphere Construction
From Eq. (9), we know that under the Born approximation, for incidence with a wave vector k i , all possible wave vectors of the scattered fields (k s ) are determined by the Ewald vector K e = k s − k i . For elastic scattering, |k s | = |k i | = k 0 . Consequently, all possible scattered wave vectors due to a planewave incidence are associated with Ewald sphere ε 1 (Fig. 5), which is a spherical shell in K-space. The Ewald sphere has a radius of k 0 , is centred at −k i and passes through the origin A. For all possible incident wave vectors, as characterized by the spherical shell ε 0 , all Ewald spheres are bounded within the so-called Ewald limiting sphere E L with a radius of 2k 0 (Fig. 5).
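The geometry of the Ewald construction can be verified numerically: for elastic scattering every Ewald vector lies on a sphere of radius k 0 centred at −k i, and none exceeds the limiting radius 2k 0. The following sketch uses an illustrative angle convention (polar angle measured from +z) and example angles that are assumptions rather than values from the paper.

```python
import math


def wave_vector(k0, polar_deg, azimuth_deg=0.0):
    """Wave vector of magnitude k0; polar angle measured from +z."""
    t, p = math.radians(polar_deg), math.radians(azimuth_deg)
    return (k0 * math.sin(t) * math.cos(p),
            k0 * math.sin(t) * math.sin(p),
            k0 * math.cos(t))


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


k0 = 2.0 * math.pi / 0.6                  # example wavelength of 0.6 um
k_i = wave_vector(k0, 160.0)              # incident wave travelling towards -z
for polar in (0.0, 20.0, 45.0, 80.0):
    k_s = wave_vector(k0, polar, azimuth_deg=30.0)
    K_e = tuple(s - i for s, i in zip(k_s, k_i))
    # Distance of K_e from the sphere centre at -k_i equals |k_s| = k0.
    on_shell = norm(tuple(K + i for K, i in zip(K_e, k_i)))
    print(f"|K_e| = {norm(K_e):.3f}, |K_e + k_i| = {on_shell:.3f}, 2*k0 = {2 * k0:.3f}")

# Exact backscattering (k_s = -k_i) reaches the limiting sphere |K_e| = 2*k0.
K_back = tuple(-2.0 * c for c in k_i)
print(f"backscatter |K_e| = {norm(K_back):.3f}")
```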
The Ewald sphere can be expressed by a 1D Dirac delta function asG Then, Eq. (9) can be written as a filtering operation applied to the Fourier transform of the scattering potential, Similarly, under the Kirchhoff approximation, Eq. (13) can be written as  Using the Ewald sphere construction, the effects of wavelength and the angles of illumination and observation on far-field scattering can be intuitively understood (Fig. 6). For simplicity, the effects are illustrated for the case where the scattered waves are only observed in the plane of incidence, corresponding to an optical system with a slit pupil. If the incident wave has a shorter wavelength, the Ewald sphere will have a larger radius. Illuminating the object with multiple wavelengths will broaden the shell ε 1 [ Fig. 6(a)]. Allowing multiple angles of incidence [ Fig. 6(b)] generates more Ewald spheres and allows more scattered wave vectors. Letting the angles of illumination and observation (from the same side of the object) lie within the range of ±90 • [as in Fig. 6(c)] or ±45 • [ Fig. 6(d)], the scattering amplitudes that can be physically measured are specified by the spatial frequency vectors only within the highlighted region of the Ewald limiting sphere in Figs. 6(c) and 6(d). In principle, more information to characterize the object can be gained by allowing more wavelengths and/or more angles of illumination and observation.

B. 3D Pupil and McCutchen Sphere
McCutchen introduced the generalized 3D pupil [20]. In similarity to the 2D case, the 3D CTF of an imaging system is a scaled version of the 3D pupil. In k-space (i.e., the space of wave vectors), an ideal 3D pupil is a full spherical shell having its center at the origin A' and a radius of k 0 [see Fig. 7(a)]. Under diffraction-limited conditions, this spherical shell can be expressed as the imaginary part of the Fourier transform of a free-space spherical wave [37],

(31) As we are only interested in far-field scattering and imaging, we consider only the on-shell components, which implies the Bragg diffraction condition. The real part of the Fourier transform in Eq. (31) represents the off-shell case and is related to evanescent waves [37]. The positive and negative signs correspond to outgoing and ingoing waves, respectively.
Here, we refer to G̃(k) as the McCutchen sphere, which is similar to the concept of the Ewald sphere, but the latter intersects the origin of the reciprocal space of the object, i.e., K-space (point A in Figs. 5 and 7). There is a translational shift by the illumination wave vector k_i between the coordinate systems of k- and K-spaces, i.e., k = K + k_i [see Fig. 7(a)].
For an interference microscope subject to a finite NA, the 3D pupil function is a truncated McCutchen sphere [see Fig. 7(b)]. The 3D pupil determines the maximum angles for illumination and observation. For the observation we havẽ whereẑ is parallel to the optical axis. For illumination we havẽ In general, the NA values can be different for illumination and observation. The illumination NA can be limited by the condenser aperture or the objective aperture, whichever is smaller (see Fig. 1). The observation NA is solely determined by the objective aperture. Moreover, additional phase terms can be added to account for lens aberrations and defocus [50].
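A minimal sketch of the NA truncation described above is given below. It assumes imaging in air, so that the acceptance half-angle satisfies sin θ max = NA, and it assumes that k z is measured along the optical axis in the direction of propagation. The helper names `in_na_pupil` and `accessible_k_range` are hypothetical, and the bounds returned by the second function apply only when illumination and observation share the same NA in reflection.

```python
import math


def in_na_pupil(k_vec, k0, na):
    """True if a wave vector on the k0 shell lies within the cone accepted
    by an objective of numerical aperture `na` (in air, axis along z)."""
    return k_vec[2] / k0 >= math.sqrt(1.0 - na ** 2)


def accessible_k_range(k0, na):
    """Bounds on K = k_s - k_i in reflection, for equal illumination and
    observation NA: (max lateral |K|, min axial K_z, max axial K_z)."""
    k_lateral_max = 2.0 * k0 * na
    k_axial_min = 2.0 * k0 * math.sqrt(1.0 - na ** 2)
    k_axial_max = 2.0 * k0
    return k_lateral_max, k_axial_min, k_axial_max


wavelength_um = 0.6                       # example value only
k0 = 2.0 * math.pi / wavelength_um
na = 0.55                                 # example objective NA
half_angle = math.degrees(math.asin(na))

k_inside = (k0 * math.sin(math.radians(30.0)), 0.0, k0 * math.cos(math.radians(30.0)))
k_outside = (k0 * math.sin(math.radians(40.0)), 0.0, k0 * math.cos(math.radians(40.0)))
print(f"acceptance half-angle = {half_angle:.2f} deg")
print("30 deg accepted:", in_na_pupil(k_inside, k0, na))
print("40 deg accepted:", in_na_pupil(k_outside, k0, na))
print("K-space bounds (lateral max, axial min, axial max):", accessible_k_range(k0, na))
```

Raising the NA widens the lateral bandwidth and lowers the minimum axial frequency, which is the region of the Ewald limiting sphere that the instrument can physically sample.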
The 3D pupil given by Eqs. (31) or (32) has a uniform angular apodization, corresponding to the Herschel condition [51] (see Fig. 8). A weighting term √(k · ẑ/k 0 ) can be introduced to the 3D pupil function for a perfect aplanatic case. Alternatively, multiplying by a weighting term k · ẑ/k 0 will result in a 3D pupil function that has a uniformly distributed projection in the pupil plane (i.e., the back focal plane of the objective lens), corresponding to a (commonly assumed) top-hat 2D pupil function and 2D CTF. The projection can be calculated by integrating the 3D pupil along the optical axis [20].

LINK BETWEEN 3D INTERFEROMETRIC IMAGING AND SCATTERING
A. Coherent Demodulation of the Scattered Field
Interferometric techniques can be used to measure the phase information within the scattered field. The measurement process can be considered as a coherent demodulation through superposition of the scattered wave U s (r) with a reference wave U r (r) that originates from the same source. The intensity measured by a square-law detector for a single incident plane wave is expressed as where the asterisk (*) denotes the complex conjugation. The first two terms in Eq. (34) represent the intensity of the light scattered from the surface and the reference mirror, respectively. In an interference microscope, the term |U s (r)| 2 represents the image that can be obtained with a conventional wide-field microscope. The reference wave is first reflected by the beam splitter and then by the reference mirror (Fig. 9). For plane-wave incidence, we let the incident wave U i (r) = e j k i ·r . As k r is effectively equal to k i , we can replace the reference wave U r (r) by U i (r) in Eq. (34). The interferogram is associated with the real part of the interference term, where The modulation of the interferogram is proportional to the amplitude of the scattered wave, and the phase of the fringe is determined by the phase difference between the scattered and reference waves. Note that the first two terms in Eq. (34) are not of interest in this analysis and can be separated from the interference term in the spatial frequency domain in most practical cases (see Fig. 2 in Ref. [2]).
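The coherent demodulation described above can be illustrated at a single detector pixel. The sketch assumes that the detected intensity is |U s + U r |², which expands into the two background terms plus twice the real part of U s U r *. The amplitudes and phases are arbitrary example values and the function names are hypothetical.

```python
import cmath


def detected_intensity(u_s, u_r):
    """Square-law detection of the superposed scattered and reference fields."""
    return abs(u_s + u_r) ** 2


def intensity_terms(u_s, u_r):
    """Split the intensity into object, reference and interference terms."""
    object_term = abs(u_s) ** 2
    reference_term = abs(u_r) ** 2
    interference = 2.0 * (u_s * u_r.conjugate()).real
    return object_term, reference_term, interference


u_r = cmath.rect(1.0, 0.0)                  # reference wave (example amplitude)
for phase in (0.0, cmath.pi / 2, cmath.pi):
    u_s = cmath.rect(0.3, phase)            # scattered wave (example amplitude)
    obj, ref, fringe = intensity_terms(u_s, u_r)
    total = detected_intensity(u_s, u_r)
    print(f"phase = {phase:.3f} rad: I = {total:.4f}, "
          f"background = {obj + ref:.4f}, interference = {fringe:+.4f}")
```

The interference term swings between ±2|U s ||U r | as the phase difference changes, while the background stays fixed, which is why the fringe modulation tracks the scattered amplitude and the fringe phase tracks the phase difference.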

B. Partially Coherent Imaging
With an extended source and a finite illumination pupil (larger than zero), the field on the surface would always be partially coherent. The final interferogram can be considered as a superposition of the interferograms resulted from a set of incident plane waves k (m=1,2,3...) i on the square-law detector, as Considering Eq. (35), the Fourier transform of the interferogram is where ξ represents the spatial frequencies of the 3D interferogram and the two terms on the right-hand side are symmetrical about the origin in the spatial frequency domain. Taking the Fourier transform of the final interferogram and substituting Eq. (38) into Eq. (37), we obtaiñ The summation of the complex-valued terms implies a coherent superposition of the signal in the 3D pupil. The result is also valid for the case of a polychromatic source.

C. Image Formation in Terms of Scattering
Considering a monochromatic plane wave $U_i(\mathbf{r}) = e^{j\mathbf{k}_i\cdot\mathbf{r}}$ incident on the object, in the far field we obtain the 3D angular spectrum of the (propagating) scattered field through the Fourier transform of Eq. (6), as given by Eq. (40). All possible scattering wave vectors are constrained by $\tilde{G}(\mathbf{k}_s)$ in a theoretically ideal situation, or by $\tilde{G}_{NA}(\mathbf{k}_s)$ in a practical interference microscope. In the following analysis, we will only consider the practical case. As discussed in the preceding section, in the linear regime of scattering, e.g., under the Born approximation, $f(\mathbf{k}_s, \mathbf{k}_i)$ can be replaced by $f_B(\mathbf{k}_s - \mathbf{k}_i)$, which can be calculated as the filtered spectrum of the Born scattering potential [see Eq. (29)]. Therefore, considering that $\mathbf{k}_s = \mathbf{K}_e + \mathbf{k}_i$, we can rewrite Eq. (40) as Eq. (41). Let us recall Eq. (36) and take the Fourier transform of the interference term $O(\mathbf{r})$ to obtain Eq. (42). Letting $\boldsymbol{\xi} = \mathbf{K}$, combining Eqs. (41) and (42), and considering the case of an illumination with a set of plane waves as described by Eq. (39), the spectrum of $O(\mathbf{r})$ can be expressed as in Eq. (43). It is clear from Eq. (43) that the 3D interferogram records the filtered scattering potential under the Born approximation. The summation implies a coherent superposition of the Fourier spectrum of the scattering potential. In fact, the scattering amplitude in Eq. (40) can be calculated based on any appropriate scattering model. For example, under the Kirchhoff approximation for surface scattering, we can replace $f(\mathbf{k}_s, \mathbf{k}_i)$ in Eq. (40) by $f_K(\mathbf{k}_s - \mathbf{k}_i)$ in Eq. (30) and obtain Eq. (44). This result is identical to Eq. (30) in Ref. [30], which was derived in a different manner.
In the nonlinear regime where multiple scattering is prevalent, the spectrum of the scattered field is not simply a filtered Fourier spectrum of the object function. Instead, for a specific incident plane wave, the scattering amplitude $f(\mathbf{k}_s, \mathbf{k}_i)$ can be obtained by using numerical techniques to solve Maxwell's equations exactly, such as rigorous coupled wave analysis [52], finite-element methods [53], and boundary-source methods [54]. With such scattering models, considering an illumination with a set of plane waves, the spectrum of the interference term can be calculated using Eq. (45), where $\boldsymbol{\xi} = \mathbf{k}_s - \mathbf{k}_i$ no longer represents the spatial frequency of the object function, i.e., $\boldsymbol{\xi} \neq \mathbf{K}$. A rigorous interference microscopy model based on Eq. (45) has been demonstrated elsewhere [55].

D. 3D Transfer Function
The imaging capability of an interference microscope can be comprehensively characterized by its 3D transfer function [29,30,56]. It is important to understand that a transfer function applied to a specific representation of an object is an effective transfer function, and it differs slightly from the CTF or OTF, which transfers optical amplitude or intensity. Here, we will show the 3D volume transfer function that applies to the Born scattering potential and the 3D surface transfer function (STF) that applies to the foil model of a surface. From the 3D image (i.e., 3D interferogram) the surface topography can be calculated from an appropriate reconstruction algorithm [1].
Assuming an extended monochromatic source, e.g., a red light-emitting diode (LED), is used in a wide-field interference microscope, the image of the source can be projected onto the back focal plane of the objective lens through the Köhler illumination configuration [1]. Ideally, the objective aperture should be fully filled, such that the illumination pupil is maximized and equal to the observation pupil that is dependent on the objective NA. In the opposite case, if there is a point source in the objective aperture, the illumination would be coherent. For everything between these two extremes, the objective aperture is partially filled. If the condenser NA is smaller than the objective NA, then for a Michelson or Mirau objective, the illumination pupil is determined by the condenser NA; and for a Linnik system, the illumination pupil for the reference path is the product of the condenser pupil and the pupil of the objective for the reference path. The illumination pupil can also be apodized and characterized by a suitable distribution function. For example, in case of a Mirau objective, the central obscuration due to the reference mirror will influence the apodization of both the illumination and observation pupils. For an instrument using an incandescent lamp, the apodization is dependent on the filament.
With the extended source, each point in the illumination pupil can be considered as an independent point source generating a single plane wave specified by $\mathbf{k}_i$. Using the sifting property of the Dirac delta function, the summation term in Eq. (43) can be written as a convolution integral [Eq. (46)], in which the 3D illumination pupil function has the same expression as $\tilde{G}_{NA}(\mathbf{K})$ but is dependent on the condenser NA if it is smaller than the objective NA. For a fully filled illumination pupil, the illumination pupil function is equal to $\tilde{G}_{NA}(\mathbf{K})$. If a broadband source is used, such as a white-light LED, which is characterized by a normalized power spectral density $S(k_0)$ (with dimensions of inverse wavenumber), then we have Eq. (47). The 3D volume transfer function with respect to the Born scattering potential, $\tilde{H}_B(\mathbf{K})$, is defined in Eq. (48). The 3D STF with respect to the foil model of a surface under the Kirchhoff approximation, $\tilde{H}_K(\mathbf{K})$, is defined in Eq. (49). The 3D transfer function is in general complex-valued. Its magnitude weights the Fourier components of the object function, its bandwidth characterizes the measurement resolution [50], and its phase is related to optical aberrations of the instrument [56]. An example of the 3D STF is shown in Fig. 10. The difference between $\tilde{H}_K(\mathbf{K})$ and $\tilde{H}_B(\mathbf{K})$ is expected to be small because the Kirchhoff weighting term in Eq. (49) changes slowly within the passband. The 2D STF (corresponding to the 2D pupil at the back focal plane of the objective lens) is the in-pupil-plane projection of the 3D STF [Figs. 10(b) and 10(e)] and is calculated by integrating $\tilde{H}_K(\mathbf{K})$ with respect to $K_z$ [20]. In Fig. 10(c), the unnormalized on-axis transfer function shows that a high NA broadens the bandwidth in the axial direction and shifts the peak toward lower spatial frequencies. Consequently, the coherence envelope of the fringe along the z-direction will be narrowed, which improves the optical sectioning capability, and the fringe spacing will be broadened.
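The dependence of the axial bandwidth on NA can be illustrated with a simple geometric estimate. The sketch below is not the transfer function itself. It assumes a monochromatic source, a reflection geometry, a fully filled pupil and a dry objective, and it considers only on-axis spatial frequencies ($K_x = K_y = 0$), for which $\mathbf{K} = \mathbf{k}_s - \mathbf{k}_i$ gives $|K_z| = 2k_0\cos\theta$ with $\sin\theta \le \mathrm{NA}$. The function name `axial_passband` is hypothetical.

```python
import math

def axial_passband(wavelength, na):
    """On-axis |K_z| limits (rad per length unit) for back-scattered light.

    Assumes K = k_s - k_i with k_s the mirror image of k_i about the surface
    normal, so |K_z| = 2*k0*cos(theta) for 0 <= theta <= asin(NA).
    """
    if not 0.0 < na < 1.0:
        raise ValueError("this sketch assumes a dry objective with 0 < NA < 1")
    k0 = 2.0 * math.pi / wavelength
    return 2.0 * k0 * math.sqrt(1.0 - na * na), 2.0 * k0

# Wavelength of 0.6 um as used for the simulations of Fig. 13; the NA values are illustrative.
for na in (0.15, 0.4, 0.7):
    lo, hi = axial_passband(0.6, na)
    print(f"NA = {na:.2f}: |K_z| from {lo:.2f} to {hi:.2f} rad/um (width {hi - lo:.2f})")
```

Under these assumptions the upper limit stays at $2k_0$ while the lower limit moves toward smaller $K_z$ as the NA grows, which is consistent with the broadened axial bandwidth and the shift toward lower spatial frequencies noted for Fig. 10(c).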
If we know the geometry of the object surface accurately, it is possible to retrieve the 3D STF of an interference microscope (Fig. 11) by dividing the spectrum of the interferogram by that of the foil model [Eq. (50)]. The most convenient way to measure the 3D STF is to measure a precision microsphere [57,58]. Through inversion of the phase of the measured 3D STF, it is possible to "repair" the 3D interferogram and reduce the systematic error when measuring complex surfaces, featuring varying slopes and spatial frequencies, to the order of 10 nm [56].
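The spectral division described above can be sketched on a handful of frequency samples. This is an illustrative toy, not the authors' calibration procedure: the function name `estimate_transfer_function`, the relative threshold used to skip frequencies where the object spectrum carries little energy, and all numerical values are assumptions.

```python
import cmath

def estimate_transfer_function(image_spectrum, object_spectrum, rel_threshold=1e-3):
    """Sample-by-sample spectral division, skipping poorly probed frequencies."""
    peak = max(abs(f) for f in object_spectrum)
    estimate = []
    for o, f in zip(image_spectrum, object_spectrum):
        if abs(f) > rel_threshold * peak:
            estimate.append(o / f)     # transfer function sample = image / object
        else:
            estimate.append(None)      # the object does not probe this frequency
    return estimate

# Hypothetical complex transfer function samples and object (foil model) spectrum.
true_stf = [1.0 + 0j, cmath.rect(0.8, 0.1), 0.5 + 0j, cmath.rect(0.3, -0.2)]
object_spectrum = [1.0 + 0j, 0.5j, 0j, 0.2 + 0j]
image_spectrum = [h * f for h, f in zip(true_stf, object_spectrum)]

recovered = estimate_transfer_function(image_spectrum, object_spectrum)
print(recovered)
```

The third sample is left undetermined because the object spectrum vanishes there. This illustrates why a calibration object whose spectrum covers the passband, such as a sphere presenting a wide range of slopes, is a convenient choice.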

REFERENCE DEFOCUS AND SCANNING IN INTERFERENCE MICROSCOPY
The ability to model imaging processes in 3D provides further insight into the effects of defocus and scanning in methods used in interference microscopy. The term "microscopic focus" used here refers to the focus of the objective lens. Features of an object located exactly at the microscopic focus will be imaged with the optimal intensity contrast that is characterized by the term |U s (r)| 2 in Eq. (34). On the other hand, the term "interferometric focus" refers to the plane of focus of the reference mirror.
Ideally, the reference mirror should be placed in the plane conjugate to the focal plane of the objective lens (see Fig. 9). In this case, the system works at the best focus condition, and the 3D transfer function is dependent on the expression in Eq. (46). However, if the reference mirror is shifted (along the axial direction) by a distance $z$ away from the conjugate focal plane of the objective lens, an offset is introduced to the argument of the free-space Green's function in Eq. (31), such that it represents a source irradiating the reference wave from the new position [50]. Rewriting Eq. (31), we obtain the defocused 3D pupil function for the reference path [Eq. (51)], where the far-field Green's function is applied [see Eq. (5)]. The defocused 3D pupil function subject to the NA is given by Eq. (52). In the defocused case, Eq. (46) is rewritten as Eq. (53). An interference microscope working at the $z \neq 0$ condition will have a transfer function dependent on Eq. (53). Defocus causes a reduction of the passband and fringes in the phase of the 3D transfer function, and consequently, a broadening and axial shift of the 3D PSF (Fig. 12). In such a case, even if the object is located at the best microscopic focus position, the interferogram will not be generated with the best contrast, and measurement accuracy will decline. More discussion on this topic and the experimental proof can be found in Ref. [50]; however, we note here that the negative impact of interferometric defocus is considerably stronger in an interference microscope using a high NA than in one using a low NA (Fig. 12).
Here, we consider defocus effects that are apparent in the scanning methods of certain interference microscopes. In practice, object scan and reference scan [59] are the two common ways to acquire a 3D image in an interference microscope [8]. Object scan refers to scanning the object through the focal plane by either moving the object or the objective lens such that the microscopic and interferometric focuses are coincident throughout the scan. In a reference scan, only the interferometric focus is changed due to the motion of the reference mirror. However, moving the reference mirror changes z and causes defocus as discussed, and its effect will change throughout the scan. We can expect errors due to reference scanning to be significant when imaging a surface with height variations larger than the depth of field.

Many systems are designed with a reference scan mode. In an interference microscope that uses a small system NA to measure objects with height variations smaller than the depth of field, a reference scan may be used without too much loss of image quality and is widely used in OCT. It is common in Fourier-domain OCT or wavelength-scanning interferometry [3] to further introduce defocus to avoid overlapping of the dc component and the interferogram. In this case, there may be a benefit to using the full 3D model to take into account focus effects.
Finally, we note that the effects of defocus are inherent in traditional PSI, which effectively employs reference scanning over a limited range. In PSI, field-dependent focus effects are known to be a significant issue for surface departures approaching or larger than the depth of field. For this reason, PSI is most prevalent in laser Fizeau systems with a small system NA, while in interference microscopes, PSI is often limited to shallow surfaces, given the limited depth of field at high NA. Here again, a 3D model has the potential for broadening the range of acceptable surface height variations.

IMAGING UNDER THE SMALL HEIGHT APPROXIMATION
As mentioned in Section 1.A, a simple and widely used scattering model for a thin object or a near planar surface is based on the small height approximation, where the actual 3D surface topography is represented as a 2D, complex-valued phase object [10,15]. This approximation is implicit in most basic treatments of full-field interferometric measurements (see Section 7).
In the traditional 2D treatment [10,15], far-field surface scattering is calculated through the propagation of the 2D angular spectrum of the field at the mean level plane, expressed as $e^{j\phi(r_x, r_y; \mathbf{k}_i)}$, where $\phi(r_x, r_y; \mathbf{k}_i) = 2\mathbf{k}_i\cdot\hat{\mathbf{z}}\,Z_s(r_x, r_y)$, $Z_s(r_x, r_y)$ is the surface height function equivalent to that in Eq. (14), and the factor of 2 results from the doubled path length due to reflection. 2D image formation is then obtained through an inverse Fourier transform of the filtered 2D angular spectrum of the surface field. For coherent illumination, the filtering operation is characterized by a 2D CTF (usually assumed to be uniformly apodized) [15]. In fact, the small height approximation can be considered as a special case of the Kirchhoff approximation, in which a surface is considered to be made up of locally flat segments. Here, we show that 2D interferometric imaging under the small height approximation (i.e., the EFO model) can be derived based on the foil model of the surface under the generalized theoretical framework.
Considering a perfectly conducting surface with a small height variation, the surface can be represented by the foil model of a flat plane with an additional phase term [Eq. (55)], where the constant $Z_0$ is the offset between the datum plane and the mean plane of the surface. The datum plane is usually chosen to be the plane of the microscopic focus. The spectrum of the complex-valued foil model is given by Eq. (56), where $Z_0 = 0$ corresponds to the case where the surface is located at the microscopic focus. Evaluation of the integral in Eq. (56) gives the 2D angular spectrum of the approximated field at the mean plane.
Replacing the real-valued foil model in Eq. (44) by the complex-valued foil model of Eq. (55), we obtain the spectrum of the 3D interferogram term under the small height approximation [Eq. (57)]. The summation in Eq. (57) needs to be calculated for each incident wave vector due to the dependency of the complex-valued foil model on $\mathbf{k}_i$. In contrast, the Fourier spectrum of the real-valued foil model, $\tilde{F}_K(\mathbf{K})$, can be taken outside of the summation, such that the calculation of Eq. (44) is more convenient than that of Eq. (57). However, if only the 2D interferogram of the object surface at the datum plane ($Z_0 = 0$) is of interest, which is reasonable under the small height approximation, the 2D spectrum of the interferogram can be obtained by integrating Eq. (57) with respect to $K_z$, as given by Eq. (58). Evaluation of the integral in Eq. (58) gives the 2D STF for a specific $\mathbf{k}_i$. This result is the same as that from the EFO model with pupil-plane integration [15], with the exception of the Kirchhoff weighting factor resulting from the foil representation of the surface, and we note that this factor disappears if the object is represented by the Born/Kirchhoff scattering potential, as discussed in Section 2.D.
As image formation under the small height approximation can be derived from the Beckmann-Kirchhoff solution, the validity conditions for the Kirchhoff approximation should also apply (discussed in Section 2.C). In addition, the Rayleigh criterion (phase variation less than π/2, corresponding to a total height variation of λ/8 for normal incidence) suggests when it is reasonable to represent the 3D surface topography by a 2D complex-valued foil model in Eq. (55).
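The λ/8 limit follows from the round-trip phase $2k_0 \cdot \mathrm{PV} = 4\pi\,\mathrm{PV}/\lambda$ at normal incidence, which reaches π/2 when PV = λ/8. The short check below applies this to the four grating amplitudes of Fig. 13. It is a sketch with hypothetical function names, and it treats the boundary case PV = λ/8 as satisfying the criterion (with a small numerical tolerance), in line with how the λ/8 grating is discussed in the text.

```python
import math

def round_trip_phase(pv_height, wavelength):
    """Phase variation (rad) for normal incidence: 2 * k0 * PV = 4*pi*PV/lambda."""
    return 2.0 * (2.0 * math.pi / wavelength) * pv_height

def satisfies_rayleigh(pv_height, wavelength, tol=1e-9):
    """True when the phase variation does not exceed pi/2 (boundary included)."""
    return round_trip_phase(pv_height, wavelength) <= math.pi / 2.0 + tol

wavelength_um = 0.6  # source wavelength quoted for Fig. 13
gratings = {
    "lambda/32": wavelength_um / 32,
    "lambda/8": wavelength_um / 8,
    "lambda/2": wavelength_um / 2,
    "2*lambda": 2 * wavelength_um,
}
verdicts = {label: satisfies_rayleigh(pv, wavelength_um) for label, pv in gratings.items()}
for label, pv in gratings.items():
    print(f"{label}: phase = {round_trip_phase(pv, wavelength_um) / math.pi:.3f} pi, "
          f"within criterion: {verdicts[label]}")
```

For oblique illumination the phase variation depends on the incident wave vector, so the threshold computed here applies only to the normal-incidence case stated in the text.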
The fundamental differences between the 3D imaging (under the Kirchhoff approximation) and the 2D imaging (under the small height approximation) are evident from Fig. 13, by comparison of the Fourier spectra of the interferograms resulting from four prismatic sinusoidal gratings with different amplitudes.
If a surface satisfies the validity conditions for the Kirchhoff approximation and the Rayleigh criterion, such as those gratings with peak-to-valley (PV) amplitudes of λ/32 [Fig. 13(a)] and λ/8 [Fig. 13(d)], then we expect that $\tilde{O}_P(\mathbf{K}) \approx \tilde{O}_K(\mathbf{K})$. Consequently, $|\tilde{O}_K(\mathbf{K})|$ must only contain continuous vertical lines within the passband of the instrument (as $|\tilde{F}$ …). We conclude that the small height approximation is valid to predict the interferogram for both surfaces. The spectrum of the λ/32 grating has a strong zero-order component plus peaks at the fundamental frequency [Fig. 13(c)]. The spectrum of the λ/8 grating has noticeable but small second-harmonic components [Fig. 13(f)].
As the amplitude of the surface increases, the amplitude of the harmonics grows rapidly, as shown in Figs. 13(g)-13(i). As the PV is beyond the Rayleigh criterion, we begin to see discontinuities in the 3D spectrum [ Fig. 13(h)] and discrepancies between the 2D spectra in Fig. 13(i). The accuracy of the interferogram predicted by the small height approximation is, therefore, reduced.
If the grating amplitude is increased further to 2λ [ Fig. 13(j)], then the effect of the NA and the true 3D nature of the imaging process become apparent. In Fig. 13(k), a "splintering" of the diffracted orders in the K z direction is observed. Consequently, significant discrepancy between 2D spectra is observed in Fig. 13(l). In this case, the small height approximation should be used with caution (e.g., in a piecewise fashion) when modeling the interferogram generation in interference microscopy.
Fig. 13. Spectra of the interferogram $\tilde{O}_K(\mathbf{K})$ of four perfectly conducting sinusoidal gratings. The simulated interference microscope uses a monochromatic source (λ = 0.6 µm) and a slit pupil with an NA of 0.7. The gratings have a period of 5 µm, and the PV amplitudes are (a) λ/32, (d) λ/8, (g) λ/2, and (j) 2λ, respectively. The $K_x$-$K_z$ cross sections of the spectra and the $K_z$-integrated projections are shown below the corresponding surface profiles. $|\tilde{O}_P^{(2D)}(K_x)|$ calculated using Eq. (58) for each surface is plotted in red circles for comparison.

For normal incidence with a single plane wave, corresponding to spatially coherent illumination, the expression of the 2D interferogram term becomes much simpler, as given by Eq. (59). For partially coherent illumination, with an extended source and with a spread of wavelengths, Eq. (58) can be simplified by using an equivalent wavenumber $k_e$ to consider the effect of the NA [15], such that $\phi(r_x, r_y; k_e) = 2k_e Z_s(r_x, r_y)$, where $2k_e$ can be obtained from the $K_z$ coordinate of the center of gravity of $|\tilde{H}_K(\mathbf{K})|$ [see Fig. 10(a)]. By using the equivalent wavenumber, we obtain Eq. (61). For a thin sinusoidal surface, the strength of the $n$th order in the Fourier transform in Eq. (61) is given by a Bessel function of the first kind, $J_n$ [10]. Calculation of the image formation using Eqs. (59) and (61) can be more computationally efficient in comparison to the 3D model, but at the cost of additional loss in accuracy due to the required approximations. In Section 7, we will discuss the surface reconstruction process that is closely related to the small height approximation and the linear regime for surface measurement.
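The Bessel-function statement for a thin sinusoidal surface can be verified numerically through the Jacobi-Anger expansion, $e^{ja\sin\theta} = \sum_n J_n(a)\,e^{jn\theta}$. The sketch below compares a direct discrete Fourier coefficient of the phase object with a power-series evaluation of $J_n$. It assumes normal incidence and, for simplicity, $k_e = 2\pi/\lambda$ (the NA correction of the equivalent wavenumber is neglected), so the phase amplitude is $a = k_e \cdot \mathrm{PV}$. The function names are hypothetical.

```python
import cmath
import math

def bessel_j(order, x, terms=30):
    """J_order(x) for integer order >= 0 from its power series (fine for small x)."""
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + order))
        * (x / 2.0) ** (2 * m + order)
        for m in range(terms)
    )

def diffraction_order(order, phase_amplitude, samples=256):
    """Fourier coefficient of exp(j*a*sin(theta)) at harmonic `order`."""
    total = 0j
    for q in range(samples):
        theta = 2.0 * math.pi * q / samples
        total += cmath.exp(1j * (phase_amplitude * math.sin(theta) - order * theta))
    return total / samples

for label, pv_over_lambda in (("lambda/32", 1 / 32), ("lambda/8", 1 / 8)):
    a = 2.0 * math.pi * pv_over_lambda  # a = k_e * PV with k_e = 2*pi/lambda (assumed)
    strengths = [abs(diffraction_order(n, a)) for n in range(4)]
    print(label, [round(s, 4) for s in strengths])
```

With these assumptions the second order is negligible for the λ/32 grating and small but visible for the λ/8 grating, in qualitative agreement with the description of Figs. 13(c) and 13(f).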

FROM INTERFEROGRAM TO SURFACE TOPOGRAPHY
From the 3D interferogram (which is a stack of sequentially acquired 2D interferograms), the surface topography can be reconstructed. Most reconstruction algorithms assume that the surface height is linearly proportional to the phase of the fringes (along the axial direction) [15,60]. This is, of course, the basic assumption of the small height approximation or the complex-valued foil model discussed in Section 6. In real-world applications, surface topographies, such as those of freeform optics, are often characterized by their power spectral density functions. It is convenient to use the instrument transfer function (ITF), defined as the ratio of Fourier components for the measured and true topographies [56], to characterize the measurement when it is treated as a linear filtering process, and to provide a method to quantify the metrological characteristic "topography fidelity" [61].
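The definition of the ITF as a ratio of Fourier components can be illustrated with a single spatial frequency. In the sketch below, the true profile is a sinusoid and the "measured" profile is the same sinusoid attenuated by a hypothetical factor of 0.8. The function names, the profile length and all values are assumptions for illustration only, and the ratio is meaningful only where the true profile has a nonzero component and the measurement behaves linearly.

```python
import cmath
import math

def fourier_component(profile, cycles):
    """Complex Fourier coefficient of a sampled profile at `cycles` per record length."""
    n = len(profile)
    return sum(z * cmath.exp(-2j * math.pi * cycles * q / n)
               for q, z in enumerate(profile)) / n

def itf_at(measured, true, cycles):
    """Ratio of the measured to the true Fourier amplitude at one spatial frequency."""
    return abs(fourier_component(measured, cycles)) / abs(fourier_component(true, cycles))

n = 128
true_height = [10.0 * math.sin(2.0 * math.pi * 4 * q / n) for q in range(n)]  # nm
measured_height = [0.8 * z for z in true_height]  # hypothetical instrument attenuation

ratio = itf_at(measured_height, true_height, cycles=4)
print(f"ITF at 4 cycles per record: {ratio:.3f}")
```

Repeating the ratio over the frequencies present in a calibration surface gives a sampled ITF curve.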
The ITF can only be used within the linear regime of surface measurements. However, inferring the surface topography from the interferogram may introduce nonlinearities. As discussed in Section 6, in the cases of the λ/32 and λ/8 gratings (neglecting the weak second-harmonic term of the latter), we can expect the phase of the interferogram to reflect the mean height within the resolution cell, which is characterized by the (interferometric) PSF. This is the regime when the surface measurement can be considered a linear process. For the λ/2 grating, as the amplitudes of the higher harmonics are considerably reduced due to the modulation of the transfer function, harmonic distortion will inevitably occur; the phase of the interferogram is no longer directly proportional to the surface height, and the surface measurement process becomes nonlinear (example surface measurements with nonlinearity effects can be found elsewhere [58]). Amplification of the high-frequency response within the passband (i.e., by flattening the transfer function) may effectively reduce the nonlinearity, but at the cost of increasing noise [58]. For the 2λ grating, the higher harmonics begin to fall outside of the instrument passband, causing more significant harmonic distortion and increasing nonlinearity in the surface measurement.
In summary, although the interferogram generation can be characterized by a linear filtering process, inferring the surface topography from the interferogram will introduce nonlinearity. We note that the filtering process can be applied either by using the 3D theory or in a piecewise fashion, using the 2D (EFO) model over regions that deviate from a plane by less than the depth of field and coherence length of the source. We consider that the linearity of surface measurement is assured if the more stringent Rayleigh criterion is observed (PV < λ/8) over surface regions of interest. In this case, given sufficient sampling of the image by the camera, the ITF can be approximated by the magnitude of the $K_z$-integrated, 2D projection of the 3D STF (see Fig. 10). Further discussion of the ITF can be found elsewhere [15,60].

SUMMARY
In this paper, we provide a unified theoretical framework for 3D image (interferogram) formation in surface topography measuring interference microscopy (interference microscopy). The links between the Born and the Kirchhoff approximations, between Ewald and McCutchen spheres, and between the spatial frequencies of the wave, the object, and the interferogram have been discussed.
We have shown that surface scattering, characterized by the scattering amplitude, plays a pivotal role in the 3D image formation in interference microscopy. Once the scattering amplitude corresponding to a specific incident plane wave is known, the 3D image can be synthesized. Under the Born and the Kirchhoff approximations, scattering can be considered as a linear filtering operation applied to a specific model of the object, and imaging is a holographic recording of the scattered field. Therefore, the imaging process is also linearly related to the object model and can be characterized by an effective transfer function. In the nonlinear regime, where multiple scattering is not negligible or the Kirchhoff approximation is no longer valid, the scattering amplitude can be calculated by using numerical techniques to solve Maxwell's equations exactly. In this case, the 3D image must be calculated iteratively for each incident plane wave. In addition, the effects of defocus on the transfer function and PSF are discussed. The fundamental difference between the different scanning implementations in interference microscopy has been explained with the theory of defocus.
We have also derived the 2D image formation in interference microscopy, within the generalized framework, by treating the small height approximation as a special case of the Kirchhoff approximation. Finally, the linear regime for surface measurement and the relationship between the ITF and the 3D STF have been discussed.
This theoretical framework is not limited to surface topography measuring interference microscopy but can be applied to other interferometric/holographic imaging techniques, such as digital holographic microscopy and OCT, for analysis of industrial and biomedical specimens.