Regularized pseudo-phase imaging for inspecting and sensing nanoscale features

Recovering tiny nanoscale features using a general optical imaging system is challenging because of the poor signal-to-noise ratio (SNR). Rayleigh scattering implies that the detectable signal of an object of size $d$ illuminated by light of wavelength $\lambda$ is proportional to $d^6/\lambda^4$, which may be several orders of magnitude weaker than that of additive and multiplicative perturbations in the background. In this article, we address this fundamental issue by introducing the regularized pseudo-phase, an observation quantity for polychromatic visible light microscopy that appears to be more sensitive than conventional intensity images for characterizing nanoscale features. We achieve a significant improvement in SNR without making any changes to the imaging hardware. In addition, this framework not only retains the advantages of conventional denoising techniques, but also endows this new measurand (i.e., the pseudo-phase) with an explicit physical meaning analogous to optical phase. Experiments on a NIST reference material 8820 sample demonstrate that we can measure nanoscale defects, minute amounts of tilt in patterned samples, and severely noise-polluted nanostructure profiles with the pseudo-phase framework even when using a low-cost bright-field microscope.

angular momentum, and the orbital angular momentum [12,13], are only observable as functions or derivatives of irradiance. A primary issue with irradiance is that many different noise sources can easily degrade the fidelity of the information it carries. For example, the charge-coupled device (CCD) suffers from readout noise, cosmic radiation noise, reset noise, thermal noise, spatial nonuniformity in the quantum efficiency, photon shot noise, demosaicing noise, quantization noise, and even post-image-capture effects [14]. These noise sources can distort the original signature. Hence, any function or derivative based on the irradiance inevitably propagates the errors from these noise sources. Because the math for error propagation is specific to the chosen function or derivative, we seek to engineer the function or derivative to minimize the effects of noise and thereby surpass conventional methods in SNR and sensitivity.
In this article, we show how to extract weak signals from otherwise noisy data in a general imaging instrument by introducing and implementing a regularized and physically meaningful pseudo-phase retrieval process. The regularization and the physical interpretation of the pseudo-phase are key advances compared to prior work [15,16]. Our framework is capable of partially eliminating the adverse effects of imperfections in the imaging components and significantly improving the SNR and contrast of in-focus samples. These advantages make it an ideal technique to inspect and sense ultra-small events that are difficult to observe by conventional intensity-based techniques, including the optical detection of killer defects [5], the sensing of nanoscale tilting of wafer surfaces in flip chip packages [17], and the general reconstruction of nanostructures [10,11]. Here, we should mention that the term "general imaging instrument" is not limited to Abbe imaging-based instruments (e.g., optical microscopy and transmission electron microscopy), but may also apply to other imaging techniques that can provide fine through-focus images.

Fig. 1. Schematic of the imaging system in epi-mode. OBJ, objective; BS, beam splitter; CCD, charge-coupled device. In the simulation, we fix the wavelength, numerical aperture of the objective, and transverse magnification at 405 nm, 0.9, and 100, respectively. The focal lengths are $f_1$ = 12 mm and $f_2$ = 1200 mm.

Defining a figure of merit (FOM) to quantify the rejection of noise from out-of-focus sources
Starting from the scattering field distribution at plane S [18,19], by applying a curl operation and by assuming that the point M is far away from plane S, the field at point M can be formulated as [20]

[Eq. (1): far-field diffraction integral over plane S; not recoverable from the source]

where $\mathbf{e}_r$ is the unit vector pointing from a point $N(x_1, y_1)$ on the plane S to the observation point M. $k_0$ and $\mathbf{n}$ are the wave number and the normal of surface S directed toward the left half space, respectively. The superscripts I and O indicate the image plane and object space at plane S, respectively. The integral kernel of Eq. (1) is exactly the form of a spherical wave with polarization state determined by [...]. $|\mathbf{r}|$ is the distance between points N and M. $\mathbf{E}$ indicates the electric field. With Eq. (1), we can calculate the image field resulting from an arbitrary field distribution on plane S by summing the spherical waves coming from all the discretized point sources on plane S. This is referred to as the m-theory [18] because a spherical wave component is equivalent to an electric field due to a harmonically oscillating magnetic dipole (MD) of moment $\mathbf{n} \times \mathbf{E}^O(N)$.

As illustrated in Fig. 1, we can consider an Abbe imaging system in an epi-illumination mode and trace each ray stemming from an MD in the plane S in the framework of the generalized Jones calculus. One can derive the image field (corresponding to an on-axis MD) in the vicinity of the first principal focus of the detection lens [20,21]

[Eq. (2): the three components of the image field, each combining the integrals $L_1$, $L_2$, and $L_3$ with $\cos(2\varphi_I)$, $\sin(2\varphi_I)$, $\cos\varphi_I$, and $\sin\varphi_I$; the full expression is not recoverable from the source]
$\varphi_I$ is the polar angle of an observation point $\mathbf{r}_I = (|\mathbf{r}_I|\cos\varphi_I, |\mathbf{r}_I|\sin\varphi_I, z_I)$ in the image plane. $L_1$ to $L_3$ are integrals of the form

[Eq. (3): definitions of $L_1$, $L_2$, and $L_3$; not recoverable from the source]

where $\lambda$ and $f_2$ are the wavelength in free space and the focal length of the detection lens, respectively. $z_0$ and $z_1$ are the z-coordinates of a source point N on the plane S and an observation point M in the image space, respectively. $\theta_1$ and $\theta_2$ are the off-axis angle and convergence angle of a specific ray following geometrical optics, respectively. $\alpha$ is the maximum of $\theta_2$. The sine condition ($\sin\theta_1/\sin\theta_2 = f_2/f_1$) holds for this model. From Eqs. (2) and (3), we find that the out-of-focus images depend on $z_0$, the parameter that represents the offset of the plane S along the z axis.

Because any image is a coherent superposition of the image fields of all the MDs in a coherent microscope, analyzing the through-focus property of typical MD arrays aids in determining the system characteristics. We arrange each MD array as a square in a plane parallel to $x_1$-$y_1$ with MDs equally spaced in the form of P × P, where P is the number of MDs along the edge of the square. Here, P ranges from 1 to 5; the corresponding arrangements are illustrated in Fig. 2(d). We define a FOM to characterize the concentration of scattering energy at the focal plane ($z_I = 0$) for an arbitrary MD array:

[Eq. (4): definition of the FOM; not recoverable from the source]

Here, $N_x$ and $N_y$ are the numbers of pixels along the x and y directions, respectively. $L_{ij}$ is the pixel distance defined as [...] pixel. $L_{\max}$ is the maximal pixel distance, as illustrated in Fig. 2. $I^O_{z}$, $I^O_{z-\delta}$, and $I^O_{z+\delta}$ are the images obtained at positions $z_0$, $z_0 - \delta$, and $z_0 + \delta$, respectively. $\delta$ is a small offset quantity that is fixed at 1 μm.
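The sine condition can be checked against the simulation parameters with a short calculation. This is only a sketch: it assumes that $f_1$ is the focal length of the objective, that the objective works in air (so the largest value of $\sin\theta_1$ equals the NA of 0.9), and that the focal lengths are those given in the Fig. 1 caption. The variable names are illustrative.

```python
import math

f1_mm, f2_mm = 12.0, 1200.0   # focal lengths from the Fig. 1 caption
numerical_aperture = 0.9      # objective NA; assumes imaging in air (n = 1)

# Transverse magnification implied by the two focal lengths.
magnification = f2_mm / f1_mm

# Sine condition: sin(theta_1) / sin(theta_2) = f2 / f1, so the largest
# convergence angle alpha on the image side follows from the largest
# off-axis angle on the object side.
sin_theta1_max = numerical_aperture
sin_alpha = sin_theta1_max / magnification
alpha_deg = math.degrees(math.asin(sin_alpha))

print(f"magnification = {magnification:.0f}")
print(f"sin(alpha) = {sin_alpha:.4f}, alpha = {alpha_deg:.3f} deg")
```

Under these assumptions the focal-length ratio reproduces the stated magnification of 100, and the image-side cone is very narrow, about half a degree.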
To study the effect of significantly out-of-focus MDs, we investigated the dependence of the FOM on $z_0$, the z coordinate of the plane where the MDs are situated, by evenly sampling values from −10 mm to 10 mm in 0.67 mm steps and from −0.5 mm to 0.5 mm in 0.005 mm steps in a simulation with a monochromatic source. For a specific MD array, the amplitude and initial phase of each MD are randomly chosen in the ranges of 0 to 1 V/m and 0 to π, respectively. After the selection, however, the properties of each MD do not change with the variation of $z_0$. We fix the wavelength, numerical aperture of the objective, and system magnification at 405 nm, 0.9, and 100, respectively. As shown in Figs. 2(f) and 2(g), the normalized FOM reaches its maximum only at the best focal position for all the MD arrays with various amplitudes and initial phases. It is small for large out-of-focus offsets. For other wavelengths, the curves have shapes similar to those in Figs. 2(f) and 2(g), i.e., a peak at the center but with different noise tails. This indicates that we can effectively eliminate the effect of MDs due to an off-focus artifact (e.g., dust, system defects, or undesired background noise) using incoherent polychromatic light because of the superposition of intensities, whereas this may not always occur with coherent polychromatic light because of the superposition of fields. Because the in-focus sample retains its high FOM after processing, the contrast of the in-focus sample is improved. This shows that we can engineer a function of the through-focus images to boost the SNR. Interestingly, we find that the central difference of the through-focus images required in the definition of our FOM is akin to the right-hand term of the transport-of-intensity equation (TIE) [22-26]. This indicates that the phase obtained by solving the TIE is potentially more robust to systematic errors.
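The benefit of a central difference can be illustrated with a small numerical sketch. It does not reproduce the FOM of Eq. (4); it only shows, under the assumption of a static additive background that is identical in both defocused frames, that the difference of the frames at $z_0 + \delta$ and $z_0 - \delta$ removes that background. The synthetic Gaussian spots, array size, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def spot(width, n=64):
    """Synthetic Gaussian spot standing in for a feature near focus (assumed shape)."""
    y, x = np.mgrid[:n, :n] - n / 2
    return np.exp(-(x**2 + y**2) / (2.0 * width**2))

# Hypothetical through-focus frames: the spot changes width across focus.
I_minus = spot(4.0)   # frame at z0 - delta
I_plus = spot(5.0)    # frame at z0 + delta

# Static additive background common to both frames, larger than the signal itself.
background = rng.uniform(0.0, 5.0, size=I_minus.shape)

delta = 1.0  # focus step in micrometres, matching the 1 um offset quoted in the text
dIdz_clean = (I_plus - I_minus) / (2.0 * delta)
dIdz_noisy = ((I_plus + background) - (I_minus + background)) / (2.0 * delta)

# The common-mode background drops out of the central difference
# (up to floating-point rounding).
residual = np.max(np.abs(dIdz_noisy - dIdz_clean))
print(f"largest residual after differencing: {residual:.2e}")
```

Noise that differs from frame to frame, such as shot noise, does not cancel in this way, which is one reason the regularization discussed in the next section is still needed.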
This direct link between the SNR improvement and the phase of samples motivated us to explore the inherent operator characteristic of the TIE and the impact of systematic errors on the reconstructed phase.

Regularized, matrix-based, non-iterative TIE solution
[Eq. (5), the TIE, and the text introducing it are not recoverable from the source] [...] Poisson equation [22]. We formulate and solve the Poisson problem in matrix form [Eq. (6) not recoverable from the source]
and T is a tri-diagonal matrix with non-zero elements [...]. [Eq. (7) and the surrounding text are not recoverable from the source] [...] and taking the divergence, we obtain another Poisson equation, which we can solve using the same procedure to obtain the phase matrix φ. However, in Eq. (7), we find that instability can arise because of division by small $S(i, i) + S(j, j)$. Systematic and random errors in the matrix G exacerbate this issue. These errors can arise either from noise in the measurement itself or from approximating the through-focus derivative by a central difference. We therefore modify Eq. (7) by multiplying the right-hand side by a regularization filter function $\omega_1(i, j)$ that is akin to that of Tikhonov (or Phillips) regularization [27,28], which is given by

[Eq. (8): definition of $\omega_1(i, j)$; not recoverable from the source]

where $\varepsilon_1$ is a regularization factor that depends on the noise level ρ in the raw intensity images. Similarly, there is instability from division by small [...] are small compared to $\rho^2$ and q, respectively, and unity weight when they are large compared to $\rho^2$ and q, respectively.

The right-hand term of Eq. (5) is the through-focus difference, which partially removes the effect of large off-focus MDs on the final images and enhances the in-focus events. Moreover, the subtraction mitigates the impact of common-mode noise, such as spatially varying additive noise (e.g., thermal and flicker noise [14]). The regularized pseudo-phase retrieval further suppresses the instability arising from both the operator characteristic and the systematic errors. In addition, the pseudo-phase retrieval reduces the effect of spatially varying multiplicative noise (e.g., illumination intensity and photo-response non-uniformity [14]) because of the division by $I^O_z(i, j)$. Therefore, the phase extracted using the proposed regularization is a more accurate and sensitive parameter than amplitude.
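A regularized, matrix-based Poisson solve of the kind described above can be sketched as follows. This is not the authors' code [29]. It assumes Dirichlet boundaries, unit grid spacing, the standard second-difference matrix for T, and a generic Tikhonov-style filter $\omega = s^2/(s^2 + \varepsilon)$ with $s = S(i, i) + S(j, j)$; the published Eqs. (6)-(8) may differ in detail. The names `second_difference`, `solve_poisson`, and `eps` are hypothetical.

```python
import numpy as np

def second_difference(n):
    """Tri-diagonal second-difference matrix (assumed form of T, Dirichlet boundaries)."""
    return (np.diag(-2.0 * np.ones(n))
            + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1))

def solve_poisson(G, eps=0.0):
    """Solve T_r @ psi + psi @ T_c = G without iteration.

    T_r and T_c are diagonalized once; in the eigenbasis the equation becomes
    an element-wise division by s = S_r(i, i) + S_c(j, j). A Tikhonov-style
    filter omega = s**2 / (s**2 + eps) damps the components with small |s|.
    """
    n_rows, n_cols = G.shape
    s_r, Q_r = np.linalg.eigh(second_difference(n_rows))
    s_c, Q_c = np.linalg.eigh(second_difference(n_cols))
    G_hat = Q_r.T @ G @ Q_c
    s = s_r[:, None] + s_c[None, :]
    # omega / s, written as s / (s**2 + eps) so that no small number is divided by.
    psi_hat = G_hat * s / (s**2 + eps)
    return Q_r @ psi_hat @ Q_c.T

# Usage on synthetic data: build G from a known psi, then invert.
rng = np.random.default_rng(1)
psi_true = rng.normal(size=(32, 48))
G = second_difference(32) @ psi_true + psi_true @ second_difference(48)
psi_exact = solve_poisson(G)           # eps = 0 recovers psi_true
psi_reg = solve_poisson(G, eps=1e-2)   # eps > 0 damps slowly varying components
```

With `eps = 0` the solve is exact for noise-free data. A positive `eps` trades a small bias for stability against errors in G, which is the role the text assigns to the regularization filter.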
Our TIE solver code is available online in MATLAB and Python formats [29]. We optimized the MATLAB version for speed. After a 90 ms one-time initialization, it takes on average 220 ms per image to extract the pseudo-phase of a sequence of 1024 × 1024 pixel images using double precision on the central processing unit (CPU) of a desktop computer with an Intel i7-4770 3.4 GHz 4-core processor. Using single precision and a 1344-core GTX-670 graphics processing unit (GPU) on the same desktop computer, the average run time dropped by an order of magnitude to 20 ms per image, which is fast enough for most real-time applications. However, we recommend using double precision to minimize the accumulation of numerical rounding errors, together with a GPU optimized for fast calculations in double precision. Interested readers can download the code related to references [15,16] and compare the results for appropriately collected experimental images with the results obtained with our method. We believe that the regularization technique in Eq. (8) is the key to the success of our method in significantly suppressing the measurement errors, when compared with those prior works. We should mention that those solvers are still valuable because they solve, for instance, the problem of boundary conditions, which was not examined in this paper. Thus, those solutions may be combined with our technique to further improve the results.
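The recommendation to prefer double precision can be motivated with a generic example of rounding-error accumulation. The sketch below is unrelated to the authors' solver. It sums one million copies of 0.1 sequentially in single and in double precision and compares both totals with the expected value; the array length and the value 0.1 are arbitrary choices.

```python
import numpy as np

values = np.full(1_000_000, 0.1)
expected = 0.1 * values.size

# np.cumsum accumulates sequentially, so rounding errors build up term by term.
total64 = float(np.cumsum(values, dtype=np.float64)[-1])
total32 = float(np.cumsum(values.astype(np.float32), dtype=np.float32)[-1])

err64 = abs(total64 - expected)
err32 = abs(total32 - expected)
print(f"float64 error: {err64:.3e}")
print(f"float32 error: {err32:.3e}")
```

The same mechanism acts on the long chains of multiply-accumulate operations inside a matrix-based solver, which is why the gap between the two precisions can matter there.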

Optical system assessment and nanoscale defect inspection
To validate our proposed pseudo-phase imaging method, we used an artifact designed for calibrating scanning electron microscopes, namely the NIST reference material (RM) 8820, as the optical testing sample in this paper. NIST fabricated RM 8820 samples on 200-mm silicon (Si) wafers using 193 nm deep-ultraviolet lithography and dry etching. The patterns were made in the top amorphous Si layer, which sits on a 2-nm-thick SiO$_2$ etch stop layer on the Si wafer. The sample includes numerous patterns such as scatterometry targets and 1D and 2D arrays of periodic patterns. Before fabricating the patterns, NIST determined the thickness of the amorphous Si layer to be 97.3 nm with a standard deviation (SD) of 1.6 nm [30,31].
We used a low-cost metallurgical microscope (ME520TC, Amscope Inc.) that is composed of a 24 V, 100 W halogen lamp, a 50X magnification / 0.75 numerical aperture (NA) infinity-corrected plan achromatic objective, and a low-cost 14 MP USB 3.0 microscope digital camera (MU1403B, Amscope Inc.). The camera is equipped with a 0.5X reduction lens and an infrared (IR) filter to capture the digital images of the RM 8820 artifact in the bright-field epi-illumination mode. The dynamic range and SNR of the CMOS camera are 65.3 dB and 35.5 dB, respectively. We did not use any polarization components.
We captured an image of the blank area in the RM 8820 sample to evaluate the performance of the entire system. Ideally, the measured image should be a uniform intensity map, given the ultra-small SD (1.6 nm) of the sample thickness and the valid wavelength range (380-650 nm) of the IR-filtered camera. However, as shown in Fig. 3(a), the measured image is extremely noisy; see the large bright specks enclosed by white dotted boxes in Fig. 3(a). Further, the intensity clearly decreases from the center to the edge. The mean values of the entire image (3286 × 3286 pixels) and the central region (1643 × 1643 pixels) are 0.59 and 0.74, respectively. This indicates that the illumination is non-uniform and/or that there are relatively strong aberrations in the optical system. By applying the proposed regularized algorithm to the intensity images, we obtain a nearly uniform pseudo-phase distribution map. The method has removed the spatial non-uniformities of the light source and detector. Further, the method removed all the bright specks, which could be dust on optical components inside the microscope or on the sample. Moreover, in the pseudo-phase, we can clearly observe some bright and some dark spots due to nanoscale defects on the sample [32], most of which are overwhelmed in the intensity image.

To make a quantitative comparison between the intensity and pseudo-phase images, we define a fractional contrast $C$ in terms of $S_{\mathrm{peak}}$ and $S_{\mathrm{mean}}$ [equation not recoverable from the source], where $S_{\mathrm{peak}}$ is the peak value of a local area in the image and $S_{\mathrm{mean}}$ is the mean value of the entire image. The range of $C$ is therefore 0 to 1, and a larger $C$ corresponds to a higher contrast of a local signal. We then selected 12 nanoscale defects, marked A to L in Fig. 3(b), and computed the corresponding fractional contrast $C$ in Fig. 3(c) for both the intensity and pseudo-phase images.
As expected, the contrast of each nanoscale defect in the phase image is far larger than that in the intensity image. The nearly identical and high contrasts of all the defects in the phase image result from the ultra-small background signal, i.e., $S_{\mathrm{mean}} = -2.13 \times 10^{-4}$, which demonstrates that the proposed method can suppress the systematic and random errors with minimal alteration of the signals of nanoscale physical objects around the focal plane. We confirmed that these 12 defects are related to the sample by translating the sample laterally and observing that their positions in the image moved accordingly.
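The effect of a near-zero background on the contrast can be sketched numerically. The exact definition of $C$ is not legible in the text above, so the function below assumes a normalized form, $C = |S_{\mathrm{peak}} - S_{\mathrm{mean}}| / (|S_{\mathrm{peak}}| + |S_{\mathrm{mean}}|)$, chosen only because it is bounded between 0 and 1 as stated. The two background means are the values quoted in the text; the two peak values are hypothetical.

```python
def fractional_contrast(s_peak, s_mean):
    """Assumed contrast measure in [0, 1]; not necessarily the paper's exact definition."""
    denom = abs(s_peak) + abs(s_mean)
    return 0.0 if denom == 0 else abs(s_peak - s_mean) / denom

# Intensity image: mean 0.59 (quoted in the text); the peak value 0.80 is hypothetical.
c_intensity = fractional_contrast(0.80, 0.59)

# Pseudo-phase image: mean -2.13e-4 (quoted in the text); the peak 0.05 is hypothetical.
c_phase = fractional_contrast(0.05, -2.13e-4)

print(f"intensity contrast:    {c_intensity:.3f}")
print(f"pseudo-phase contrast: {c_phase:.3f}")
```

With this assumed form, any peak that is large compared with the background mean yields a contrast close to 1 regardless of its exact height, which is consistent with the nearly identical contrasts reported for the pseudo-phase image.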

Ultra-sensitive and robust nanopattern surface tilting sensor
When inspecting nanoscale patterns, it is vital to ensure that the surface of the sample is homogeneous and flat. This is because the amplitude of light scattered by a deep-subwavelength nanostructure with primary dimension $d$ from a beam of unpolarized light of wavelength $\lambda$ is proportional to $d^3/\lambda^2$, which is so weak that image distortion induced by sample tilt can easily overwhelm it. This is critical for flip chip packages in semiconductor packaging [17]. In this section, we demonstrate that the proposed pseudo-phase imaging method can provide an ultra-sensitive measurement of the tilt of a patterned surface even in a highly noisy environment. This suggests a new route to highly accurate wafer alignment. The method is much more efficient than point-to-point or scanning probe-based techniques [33].

We first investigate the effect of a tilted sample on the far-field image for a high-NA optical system. The system magnification and NA in the simulation are set to those of the low-cost microscope. A silicon nanopattern consisting of identical cross-type elements with the dimensions P = 1 μm, W = 250 nm, and H = 97 nm, as shown in Fig. 4(a), is used to model the in-focus intensity image by the method in Sec. 2. We define a rotation axis [the black dotted line in Figs. 4(a)-4(c)] that overlaps with one of the diagonals of the pattern, as shown in Fig. 4(a). We rotate the nanopattern by 2° about the rotation axis and present the corresponding in-focus image in Fig. 4(b). The intensity distributions on the two sides of the rotation axis clearly differ. The intensities on the B and C slices for the tilted case do not retain any central symmetry, but rather show a slight damping tendency in both the horizontal and vertical directions; see curves B and C in Fig. 4(d) and compare them to those in Fig. 4(e), which is for 0° tilt.
Moreover, the damping directions for the B and C curves are opposite, which indicates that we could use the B and C slices to estimate the direction and the degree of tilting. We then compute the phase by the method in Sec. 2 for the tilted sample, as shown in Fig. 4(c). We present three slices in Fig. 4(f) at the same positions in the phase image as in the intensity image. We observe strong damping tendencies for both the B and C curves in Fig. 4(f) compared to the constant trend for the case of 0° tilt in Fig. 4(g). The damping directions are opposite to those of the intensity; see Figs. 4(d) and 4(f). We define a degree of damping (DD)

$\mathrm{DD} = \dfrac{\max(S_P) - \min(S_P)}{\max\left(\left|\max(S_P) - S_{\mathrm{back\_mean}}\right|,\ \left|\min(S_P) - S_{\mathrm{back\_mean}}\right|\right)},$

where $S_P$ and $S_{\mathrm{back\_mean}}$ represent the signal corresponding to the patterned regions and the mean value of the background (non-patterned area), respectively. The computed DDs for the intensity and the phase B curves in Figs. 4(d) and 4(f) are 14.1% and 43.5%, respectively, which indicates that the pseudo-phase is about three times more sensitive than the intensity to the tilting angle.

In this section, we use the top row of the four-line pattern (4LP) area of the RM 8820 sample, as shown in Fig. 5. We selected this row because each column has the same geometrical dimensions and, unlike the other rows of the 4LP area, this row has no intentional line edge roughness. We show the optical intensity map in Fig. 6(a). Figure 6(b) presents the intensity curve for an arbitrary slice across all the patterns. There is only a weak damping tendency from left to right (the blue dotted line) that is akin to the curve B in Fig. 4(f). The weak damping tendency in the intensity image may arise from both the small tilting angle and the systematic and random errors of the system. By applying the proposed method, we obtain the pseudo-phase map in Fig. 6(c) and the slice in Fig. 6(b), in which we can see a strong damping tendency from left to right (the solid red curve). The computed DD of the pseudo-phase is 0.6061, which is about seven times larger than that of the intensity (0.0841).

To estimate an upper limit of the sensitivity of the pseudo-phase to surface tilting, the RM 8820 sample is placed on top of a Ø1" kinematic mirror mount VM1 with vertical drives (THORLABS, Inc., Newton, NJ, USA). Two fine adjustment screws that offer 1/4° of angular adjustment per revolution give the VM1 high rotational sensitivity. There are 50 divisions per revolution. Therefore, each minor division represents 0.005° (0.25°/50), or 18 arcseconds, of rotation. The original tilting angles ($\theta_x$, $\theta_y$) of the mount are unknown, where $\theta_x$ and $\theta_y$ are defined as the angles between the main axes [L1 and L2 in Fig. 7(a)] of the mount and the horizontal plane; see Fig. 7(a). The axis L1 is pre-aligned perpendicular to the long axis of the 4LP pattern on the RM 8820 sample. We first measure the intensity images of the 4LP patterned region with 0 amplitude (no intentional line edge roughness) at the original position of the mount, then rotate the screw by only one division for $\theta_x$ and repeat the measurement. We take a horizontal slice akin to the one in Fig. 6(a) for the two sets of intensity images and present the corresponding intensity distributions in Fig. 7(b), in which we can hardly see any difference due to the 18 arcsecond change in $\theta_x$. By computing the pseudo-phase and taking the same slices, we obtain two quite different curves, as shown in Fig. 7(c), indicating that we can easily sense the ultra-small tilting angle. Note that the width of Fig. 7(a) is 125 μm. Thus, the maximal vertical displacement due to the 18 arcsecond angle is only 125 μm × sin(0.005°) = 10.9 nm. This clearly demonstrates that the pseudo-phase is an extremely sensitive and robust parameter for sensing nanoscale vertical inhomogeneity even in an extremely noisy environment.
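The figures quoted in this section can be checked with a few lines of arithmetic. The sketch below also includes a degree-of-damping function. Its form, the range of the pattern signal divided by the largest deviation from the background mean, is an assumption reconstructed from the description above, and the sample values passed to it are hypothetical.

```python
import math

# Angular resolution of the mount: 1/4 degree per revolution, 50 divisions per revolution.
deg_per_division = 0.25 / 50
arcsec_per_division = deg_per_division * 3600

# Largest vertical displacement across the 125 um field of view for one division of tilt.
field_width_nm = 125e3
max_displacement_nm = field_width_nm * math.sin(math.radians(deg_per_division))

# Sensitivity ratios between pseudo-phase and intensity, from the DD values in the text.
simulated_ratio = 43.5 / 14.1       # simulation, B curves of Figs. 4(d) and 4(f)
measured_ratio = 0.6061 / 0.0841    # experiment, slices in Fig. 6(b)

def degree_of_damping(pattern_signal, background_mean):
    """Assumed DD: signal range over the largest deviation from the background mean."""
    hi, lo = max(pattern_signal), min(pattern_signal)
    return (hi - lo) / max(abs(hi - background_mean), abs(lo - background_mean))

# Hypothetical peak values decaying from left to right over a background of 0.1.
dd_example = degree_of_damping([1.0, 0.9, 0.8, 0.7], 0.1)

print(f"{deg_per_division} deg per division = {arcsec_per_division:.1f} arcsec")
print(f"max vertical displacement = {max_displacement_nm:.1f} nm")
print(f"sensitivity ratios: simulated {simulated_ratio:.1f}, measured {measured_ratio:.1f}")
```

The arithmetic agrees with the values stated in the text: 18 arcseconds per division, a displacement of about 10.9 nm, and sensitivity ratios of roughly three (simulation) and seven (experiment).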

Reconstruction of nanostructure profiles
In this section, we show how the pseudo-phase can help in recovering the profiles of nanostructures from a set of heavily contaminated intensity images. For demonstration, we use the scatterometry line targets on the RM 8820 sample, which consist of a series of periodically arranged lines. We choose a target with a linewidth of 50 nm and a pitch of 500 nm. The length of an individual line is 50 μm, and the array of lines forms a square area; see the inset of Fig. 5. We place the sample on top of the VM1 mount and align it using the method in Sec. 3.2 to reduce the tilting angle as much as possible. An intensity image of the target is captured by the low-cost microscope with the same objective lens as before (NA = 0.75). As shown in Fig. 8(a), the in-focus image of the scatterometry target is highly contaminated by the systematic and random errors in the microscope: the specks and other noise heavily distort and overwhelm the profile of the deep sub-wavelength lines. From the analysis and definition of the regularized pseudo-phase in Sec. 2, we know that the pseudo-phase is inherently robust to the out-of-focus MDs and the errors caused by the operator instability. Thus, the pseudo-phase image should present a higher SNR and contrast for the line array. We therefore capture a set of through-focus images to compute the pseudo-phase image of the scatterometry target. We present the result in Fig. 8(b). We can roughly see each individual line in the phase map. Most importantly, we can also reconstruct the signal associated with the nanoscale defects (marks A, B, and C) on the pattern, whereas in the intensity map, the measurement errors severely overwhelm the signals at the same positions. We present the intensity and phase curves corresponding to a horizontal slice (the blue dotted lines in Fig. 8) at the same pixel position in both the intensity and pseudo-phase maps of the scatterometry target in Fig. 8(c), in which we clearly observe that the central region of the phase curve is periodic and each cycle represents the correct pitch of the pattern, whereas the intensity curve looks like random noise.

In summary, we proposed and experimentally evaluated a regularized matrix-based imaging framework to improve the SNR and enhance the contrast when extracting the signature of nanoscale structures in an optical imaging system. We formulated the process as the reconstruction of the pseudo-phase of the sample, which has a more direct physical significance than other approaches. We used the NIST RM 8820 standard artifact to verify the feasibility of the proposed method. We successfully recovered nanoscale defects, sensed extremely small angles of sample tilt and nanoscale vertical inhomogeneity, and reconstructed scatterometry targets with deep sub-wavelength features. These results demonstrate that the regularized pseudo-phase is potentially less sensitive to system errors and more sensitive to the micro- and nanoscale variation of the target sample than intensity images. Although transparent samples were not studied in this paper, we expect that the improvement in sensitivity will also be realized in transmission mode. We believe that this new imaging framework paves an economical and highly efficient way to re-equip conventional optical instruments for exploring the small world.